Web Development with Webmachine for Erlang

Origin and Aims
Webmachine is a REST framework built on top of Mochiweb, a web server written in Erlang. It aims at easily building a web service with correct HTTP semantics while not having to deal with HTTP in the business logic. Webmachine was refactored out of Riak as explained in the talk "Webmachine: a Practical Executable Model of HTTP" by one of its creators, Justin Sheehy. Since it was the initial implementation, this article mostly discusses the Erlang one and is focused on writing a REST service with Webmachine for Erlang.

Principles
The framework implements some default behaviour and the application has to implement a given set of functions which will be called by Webmachine (Hollywood Principle). All functions have the same signature which is:


 * Result: contains mostly true/false which act as signals for the decision core in the framework code.


 * ReqData: stores the request and response. Every function can read and modify those and pass them on to the next function.


 * Context: is ignored by the framework. In this parameter the application can store internal state for processing of one request.

Webmachine enforces a side effect free programming model because of the underlying implementation in Erlang. The only possibility to modify state inside another function is to pass an argument to it. With this in mind, a Webmachine application can be seen as a set of functions which call each other with their computed values.

Structure of a Webmachine Application
An application consists of one ore more resources. These are Erlang modules which usually implement framework callbacks and internal functions. The resources are located in the /src folder. For every application there is also a dispatch.conf file which configures the routes of an URL accessing request to a resource.

Building a REST Service
The running example will be a paper repository. It provides an API to store, change, retrieve and delete papers with their title and author.

The operations which are implemented in the example are the following:

The example will be developed in a test driven manner. The API tests are written with the excellent Python library Requests by Kenneth Reitz. Erlang unit tests are written with EUnit.

All source code can be found in a Github repository.

What we expect from the fresh Application
A barebones Webmachine application listens to port 8888 and returns a short html response stating "Hello world".

Running the test with python -m unittest api_tests.py -v

should produce a failing assertion. If this test succeeds, it means that there is already an application running on that port.

Install and run a basic Webmachine Application
For this tutorial a working installation of Erlang is needed. There are precompiled binaries available and it is also accesible via package managers.

After this, clone the Webmachine code repository from github.

git clone git://github.com/basho/webmachine.git

Then let it generate the skeleton code for our application which we will call prp – a short hand for paper repository and the namespace used for this tutorial code.

webmachine/scripts/new_webmachine.sh prp

After generating the code we build and run it: make ./start.sh

Note: Everytime the code is built, the changed modules are reloaded into the Erlang VM with hot code reloading. Because a full run of make checks all dependencies with rebar and is therefore comparatively slow. To speed up develpment it is recommended to run rebar compile

if no new dependencies were added and just the written Webmachine application needs to be compile.

Now the test should pass.

Adding the testrunner to the Makefile
For convenience it is recommended to add the api_tests.py to a new test directory at the application root. mkdir test The directory structure should look like:

tree -L 1 . ├── Makefile ├── deps ├── ebin ├── include ├── priv ├── rebar ├── rebar.config ├── src ├── start.sh └── test

In addition the testrunner is added to the Makefile as a new target:

Now type make api-tests to execute the test suite.

Configure Routes
For every request a Webmachine application decides which function in which module it has to call. This configuration is done in a file in priv/dispatch.conf. The syntax for this configuration can be found at the documentation.

In priv/dispatch.conf:

The first line tells Webmachine to look into the resource file prp_paper.erl to generate a response for a request to /paper/id. In addition, for a URL path /paper/foobar the part after the second slash, in this case "foobar" is accessible via the atom 'id'. With this help we can later get the id of a paper via:

Adding a storage layer
The service which is built in this tutorial requires a storage. A simple abstraction layer exists around it whose creation is not discussed in this tutorial. It uses an in-memory Mnesia database, a distributed database written in Erlang which is part of the Erlang distribution as well so there is no need to install drivers and databases.

Please get the source code from the Github repository. The relevant files are in src/prp_schema.erl and priv/prp_datatypes.hrl. The corresponding unit tests are located in test/prp_schema_tests.erl.

Writing a first test
At first a test method is added to the TestPaperAPI class:

Running the test via

make api_tests

should fail.

Implementing a GET response in HTML
As configured in dispatch.conf the request gets routed to src/prp_paper.erl. On application startup the init/1 method is called. It fills the in-memory Mnesia database wich is used for testing with inital data to test with (Fixtures).

The function content_types_provided/2 dispatches the request to that module to a function – in this case to_html/2. Erlang functions are referenced using the function_name/arity notation. to_html/2 means the function to_html with two arguments.

Turn on tracing for Debugging
Webmachine has a simple but effective way to debug an application. It is possible to trace the decisions made in the Webmachine core to get an idea where things go wrong. Let's go back to src/prp_paper.erl and turn on tracing. The init/1 function should now be:

src/prp_paper.erl

It is also necessary to add a new dispatch target to look at the traces via a web browser because Webmachine renders the decision flow into an HTML Canvas.

Add to dispatch.conf:

Now it is possibility to look at a trace after another run of the testsuite. It should look like the example trace in figure 1.

Making sure that not existing papers return a 404
If an entity doesn't exist it is necessary to return the status code 404 Not Found.

Running that test shows us that by default, Webmachine states that each resource is available. To change that behaviour we implement resource_exists/2 and return true/false if a paper exists.

src/prp_paper.erl Now, the test should pass.

Responding in different formats
In Webmachine it is possible to respond in different formats depending on the format desired by the client. The negotiation takes place in Mochiweb.

Requesting JSON
In addition to the already described response in HTML format a JSON response is implemented. A test describes the wanted behaviour.

As I am using an outdated version of the requests library the response is not parsed as JSON but interpreted as a string, which is not the way one should to it. In a current version of requests it is possible to use response.json to get the parsed JSON and compare it to a Python dictionary.

Implement a JSON Response for GET /paper/id
At first register a JSON producing function (in this case to_json/2) for a client requesting this format. It is specified in content_types_provided/2.

src/prp_resource.erl

If a client wishes another format than html or json it would get the proper HTML status code 415 Unsupported Media Type. To encode the paper data as JSON the mochijson library is used.

Now all tests should pass.

A PUT to /paper/id should add or modify a Paper with the given id
With PUT one can add a certain entity to a well-known location or modify it. This request should be idempotent which means that each identical request has the same effect as the first one. It is also important to note that an existing entity will be modified. If it is not possible than the response should be 409 Conflict. To make the application watch for conflicts is_conflict/2 must be implemented. In this example no conflict can occur. Every client is allowed to make any modification and the typical use case of non matching version numbers does not apply.

The wished behaviour is descibed in the following tests: A PUT should create a resource and update an existing one.

Running this test produces an assertion error because the application does not allow a PUT to this resource. To fix this it is necessary to modify allowed_methods/2 in src/prp_paper.erl.

In addition a handler for the PUT request has to be registered in content_types_accepted/2 in src/prp_paper.erl:

The explicit adding of JSON to the response body produces a 200 OK with the freshly created entity. If you prefer a 204 No Content just return RD instead of Resp.

The helper function paper2json/2 can also be used in to_json/2 follow the Don't Repeat Yourself principle. The function should be refactored to:

After another make api-tests all tests should be fulfilled.

A Note on Types
In this tutorial type specifications are used for custom helper functions. It is not necessary in Erlang and is just added to make the examples more clear. The notation reads as the following:

In a later section the type wm_reqdata is used which represents a request and its response. As Webmachine currently does not include full type specifications it is necessary to add a type declaration to deps/webmachine/wm_reqdata.hrl.

Or as a quick fix this declaration can be added into src/prp_paper.erl:

Respond with 201 Created instead of 200 OK
If the response should have no content but only a signal that the resource was created (status code 201 Created) it is necessary to explicitly add this information to the response. In src/prp_paper.erl modify from_json/2 to the following:

DELETE /paper/id should delete the record
A test for a permanent deletion of a paper with a given id could look like the following one. First, a paper is created, than deleted and than it should be not found by the application.

This test should fail bacause the status code of resp1 is 405 Method Not Allowed.

Implementing a DELETE
To make a DELETE succesful is required to first allow this HTTP method for a resource, in this case in src/prp_paper.erl.

After this a make api-tests fails with a status code 500 Internal Server Error because a DELETE is allowed but the Webmachine callback delete_resource/2 is not exported from the module. To fulfill Webmachine's expectation the function could be implemented like this the following:

The returned status (first entry in returned triple) marks if the delete was successful. If it was not, the application returns the status code 500 Internal Server Error.

POST to /paper adds a new Paper
While PUT is fine to add a paper with an id specified by the client it does not allow clients to just add a paper without knowledge about its URI. For this a POST is used because accoring to this table of RESTful HTTP methods a POST to a collection (in this case /paper) adds a new entity to it.

Like for the other methods a test is used to specify the wanted behaviour.

Because it is not known in the test which id the created paper has, we just assert that the content is not empty. With proper JSON parsing it would be much easier.

Adding a route to /paper
To /priv/dispatch.conf we add a route to allow a request to /paper: It is absolutely necessary to put that behind the first rule because the rules are matched from top to bottom which will result in a no-set id from the request path. The whole file should be like:

Implementing a POST which creates a Resource
Webmachine treats a POST to an URI like a PUT to this location if it is specified that a POST creates a resource. Therefore the function post_is_create/2 has to be implemented in src/prp_paper.erl and has to return true. In addition create_path/2 has to be written which produces the path onto which the POST is treated like a PUT. Please note that because post_is_create/2 returns true the response code automatically is set to 201 Created and the location header is set.

Unfortunately Webmachine does not rematch the dispatch rules on the created path which means that the created id is not accessible via wrq:path_info(id, RD) because this funciton still works with a normal PUT but returns undefined if the request is a POST. This means we have to modify the handling from_json/2 in src/prp_paper.erl.

A run of the test shows that this implementation was not sufficient. The test of the PUT still runs but POST fails with a 404 Not Found instead of a 200 OK. The reason is that per default a POST to a non existing resource is forbidden.



This problem is solved by implementing allow_missing_post/2 in src/prp_paper.erl.

Now all tests should pass.

Making sure that POST/PUT are well formed
Webmachine has a framework function malformed_request/2 which is called too erly (decision node B9 in the diagram ) to put in any application logic.

Any match out of data given by the API user might lead to a badmatch exception whch should be avoided because the crashing module will only be restarted if the number of crashes in a given interval are less that what is specified in its supervisor. With Webmachine the default is 10 times in 10 seconds (see line 57 in prp_sup.erl). Therefore one should check user input in production.

As an example it is shown how to check if data is provided to save.

The test should fail.

An implementation of how to check if a title is provided (remember, the paper consist of an id and a title) could look like the following code listing:

Now all test should pass again.

Are POST and PUT equal now?
It is also worth noting that because the recently implemented POST behaves like a PUT to the path created by create_path/2. But with one major difference: On a POST, create_path/2 is always called which means that a POST to /paper/id will be applied to /paper/generated_id. A test confirms this hypothesis.

Full Source Code Listing
test/api_tests.py

priv/dispatch.conf

src/prp_paper.erl

Ports to other Platforms
Webmachine originally is written in Erlang and can only be used from Erlang code. For using its priniciples it was ported to various other platforms.

Porting a Webmachine Application to Cowboy
In late 2011 the new HTTP server Cowboy was publicly released and gained a lot of momentum in the Erlang community because of its performance and its clean, small and modular codebase. The Cowboy authors reimplemented Webmachine's HTTP handling which makes it possible to easily port a Webmachine application to Cowboy.

The only thing to be changed is the handling of strings. Since Cowboy strictly uses binary strings (hence no list of chars) some type conversions are no longer necessary and string processing is sped up. In addition it may provide a generally even better performance as seen in a benchmark.

Sinatra and Webmachine
Sinatra can be considered a framework which addresses a similar class of problems – building a REST API. Since it is implemented in Ruby its interface and programming model is much different because it is essentially a domain-specific language. But both frameworks have in common that they are small and generic which gives a lot of flexibility and performance compared to full-stack frameworks. They should be used when there are specific needs and no scaffolding or MVC architecture is desired.

A main difference between Sinatra and Webmachine is the rich ecosystem of libraries and examples from the Ruby community while the Erlang world lacks a lot of reusable open source libraries e.g. for image processing, common cloud provider API and so forth.

Leveraging the Erlang Platform
Webmachine has some clear advantages for some use cases because of its Erlang implementation. As Erlang is intended for use in high performance, distributed and fault tolerant applications , a Webmachine application benefits of the already done work by the Erlang VM and the Open Telecom Platform (OTP), a collection of middleware libraries and applications. As seen in the code example there is little error handling because of the Erlang philosophy of "let it crash": A seldom error should not be handled in the code because it decreases readability and increases its complexity – it should just produce an uncatched runtime error. This optimistic programming can suprisingly lead to fewer errors and downtimes because of the use of supervisors. In Erlang, a supervisor is a process which starts, stops and monitors child processes. Child processes can be workers or other supervisors to build supervision trees. Processes are threads of execution inside the Erlang Virtual Machine called BEAM with separated heaps which are scheduled from the SMP emulator within the runtime. This means since Erlang programs are modeled in processes which send each other asynchronous messages, the runtime automatically utilizes all physical processors or processor cores. In addition the VM makes all I/O calls asynchronous and schedules other processes while waiting for the operating system.

In figure 3 is shown that a Webmachine application starts a supervisor (in this case prp_sup, which is defined in src/prp_sup.erl). The supervisor starts and monitors a process which interfaces between the Mochiweb server and Webmachine (mochiweb_webmachine) which starts and supervises all request acceptors. This means for each request an Erlang process is started which handles it and a crash is isolated in that process so it only affects one user and keeps the whole application running. But chances are high that the user with the crashing request doesn't notice the crash at all – the supervisor process, which traps that crash, captures the valuable state of the crashing process and restarts it with that information so nothing gets lost.

Conclusion
Webmachine is a great entry point to Erlang because it is a small and focused framework. Its aim to let users easily write a correctly behaving REST API is met. The graphical tracing to see how a request was handled in the framwework plus the user defined callbacks is a great way to debug and helps in many situations because it provides transparency to make the framework less magical. If Erlang might be not the best implementation choice a variety of Ports to other platforms are available.

It also is a mature framwork which is constantly tested due to the use in Riak and the backing from the company Basho.

However, there are some difficulties during development: The error messages are often only the erlang exception plus a complete print of the whole ReqData (the state passed in to every function) object which is clumsy, deeply nested, and mostly not helping at all. In addition it lacks of a variety of good examples and a sound documentation. Therefore it is often necessary to test the behaviour of the framework which makes a good testing toolchain essential (which holds true for every software project).

Another drawback is the inconsistency of types used in Webmachine and Mochiweb which makes type conversions often necessary. It would be great if Webmachine did type conversions to circumvent Mochiweb's non uniformity. It should provide bitstrings (String in a binary container via the bit syntax) which makes pattern matching in strings easier and string operations faster compared to the implementation of strings as lists of characters instead of a vector/array of characters which would allow $$O(1)$$ access instead of $$O(n)$$. But this datatype issue can be avoided by using the Cowboy webserver instead of Mochiweb.

Webmachine related documents online









 * – A blogpost explaining Bishop

Erlang online resources

 * – a good resource to learn Erlang



Books on Erlang/OTP