Bluish Coder

Programming Languages, Martial Arts and Computers. The Weblog of Chris Double.


2011-04-24

Converting C Programs to ATS

I tend to use ATS as a low level programming language - a better, safer, C. Often I start with a C program and slowly add ATS features as I go. I do this so I can utilize existing C code, write my higher level functionality in ATS, and slowly convert parts of the C to ATS to add compile time safety and resource utilisation checks. ATS facilitates this by allowing C to be embedded in ATS programs directly.

http-server.c

Recently I needed an HTTP server that I could embed in a program to serve simple pages and proxy requests, with modifications, to another server. I decided on using libevent since it has HTTP functionality integrated with the event system. The libevent 2 distribution has an http-server sample program and I started with a slightly modified version of that. The C sample can be compiled with:

gcc -o http-server http-server.c -levent

Running it with a path to a docroot as an argument starts an HTTP server on port 8080. Requests are served with the files from the docroot. The special URL '/dump' prints details of the request on standard output.

http-server.dats

The first step in embedding this in my ATS program was to create http-server.dats (pretty-printed html), which contains the same C code embedded in an ATS program. The C include files are enclosed as follows:

%{^
#include <stdio.h>
...
#include <event2/event.h>
%}

The '%{^' marker tells ATS that this is C code that should be placed at the top of any compiled ATS code. The rest of the C code was embedded as-is using '%{':

%{
/* Try to guess a good content-type for 'path' */
static const char *
guess_content_type(const char *path)
{
    ...
}
...
%}

Code in between the '%{...%}' markers is inserted in the compiled ATS code directly. I made a small change to remove the C main function and factor parts of it out into an http_server function so I could write a main in ATS that calls it. The ATS code to call the C code looks like:

extern fun syntax():void = "mac#syntax"
extern fun http_server(docroot: string):int = "mac#http_server"
exception Error of ()

implement main(argc, argv) =
  if argc < 2 then
    syntax()
  else
    if http_server(argv[1]) = 0 then () else $raise Error ()

The 'extern fun' definitions are what allow the ATS code to call the C code. Instead of providing a body for these functions I assign a string such as "mac#syntax". This tells ATS that the function is implemented in C, with the name of the C function given after the #. The 'mac' before the # gives instructions to ATS about how to use the C function. More on the # syntax is available in this mailing list post.

The program can be compiled and tested with:

atscc -o http-server http-server.dats -levent

There isn't much advantage in this program vs the C program. There's the slight benefit of the compile time checking that argv isn't referenced out of bounds but that's it. However, now that it's embedded in ATS we can start converting code.

http-server2.dats

The first function I tackled was http_server. The resulting ATS code is in http-server2.dats (pretty-printed html). For this code I need to call libevent functions from ATS. libevent uses C structures to hold state. These structures are abstract in that C code can't look inside them; all access is via libevent functions. The structures used by http_server are:

  • event_base
  • evhttp
  • evhttp_request

libevent has functions to manage the allocation and freeing of these structures. If you've been reading my other ATS posts on safer C usage you'll know that these can be represented as abstract viewtypes. The type wrapper for event_base looks like:

absviewtype event_base (l:addr)
viewtypedef event_base0 = [l:addr | l >= null ] event_base l
viewtypedef event_base1 = [l:addr | l >  null ] event_base l

This defines event_base as an abstract view type with an address, l. It's effectively a C pointer of type event_base*. The two viewtypedef statements define aliases for an event_base type that can be NULL (event_base0) and an event_base type that cannot be NULL (event_base1). Defining these typedefs makes it easier to tell which functions accept NULL objects and which don't. It also allows defining functions without having to have universal type quantifiers everywhere (e.g. the {l:addr} in function definitions). Similar wrappers are done for evhttp and evhttp_request.

event_base objects are created and destroyed with event_base_new and event_base_free. Events are dispatched using event_base_dispatch. The ATS definitions for these look like:

extern fun event_base_new(): event_base0 = "mac#event_base_new"
extern fun event_base_free (p: event_base1):void = "mac#event_base_free"
extern fun event_base_dispatch (base: !event_base1):int = "mac#event_base_dispatch"

This says that event_base_new returns a possibly NULL event_base and event_base_free takes a non-NULL event_base. Abstract viewtypes are linear objects, which means the compile time type system checks that they are destroyed and that they aren't used after destruction. event_base_free consumes the linear type (for it not to consume the type, the argument would have to be declared with a !, like p: !event_base1). You can see this in the definition of event_base_dispatch, which has the ! annotation on its argument to say it doesn't consume the type.

Code to create and destroy an event_base will look like:

val base = event_base_new()
val () = assert_errmsg(~base, "event_base_new failed")
...
val _  = event_base_dispatch(base)
...
val () = event_base_free(base)

Note the assert_errmsg call. The ~ operator returns true if base is not NULL. So the assert checks that base is non-NULL and the type system tracks this. It knows that base from then on is a non-NULL pointer (i.e. the event_base1 typedef). This allows it to be passed to event_base_free. Without this assert (or some other check for non-NULL-ness) there would be a compile error.
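For comparison, here is a rough C sketch of the same lifecycle. The body of struct event_base, the live_bases counter and the run_once helper are stand-ins invented for illustration so that the code compiles without libevent; only the three function prototypes match the real library. The point is that a C compiler accepts this just as happily if the NULL check or the event_base_free call is deleted, whereas the ATS version would fail to type check.

```c
#include <assert.h>
#include <stdlib.h>

/* --- Stand-ins for libevent (assumptions, for illustration only) --- */
struct event_base { int dispatched; };

int live_bases = 0;  /* sketch-only count of bases allocated but not freed */

struct event_base *event_base_new(void) {
    struct event_base *b = calloc(1, sizeof *b);
    if (b != NULL)
        live_bases++;
    return b;                        /* may be NULL, like event_base0 */
}

int event_base_dispatch(struct event_base *b) {
    b->dispatched = 1;               /* the real call runs the event loop */
    return 0;
}

void event_base_free(struct event_base *b) {
    free(b);
    live_bases--;
}
/* --- End of stand-ins --- */

/* Hypothetical helper mirroring the ATS snippet above. */
int run_once(void) {
    struct event_base *base = event_base_new();

    /* Run time check only: nothing forces the programmer to write it. */
    assert(base != NULL && "event_base_new failed");

    int rc = event_base_dispatch(base);

    /* Forgetting this line, or using base after it, still compiles in C. */
    event_base_free(base);
    return rc;
}
```

With libevent installed, the stand-in section can be replaced by #include <event2/event.h> and the program linked with -levent.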

Similar wrappers are done for the other libevent functions that http_server uses. evhttp_set_cb and evhttp_set_gencb are a little different however. Their C definitions look like:

void evhttp_set_gencb(struct evhttp *http,
    void (*cb)(struct evhttp_request *, void *), void *arg);
int evhttp_set_cb(struct evhttp *http, const char *path,
    void (*cb)(struct evhttp_request *, void *), void *cb_arg);

They take a C function as a callback and an argument to pass to that C function. The callback is called by libevent when a particular URL is accessed. The argument is typed as a void* but we can do better in ATS. Here's the ATS definitions:

typedef evhttp_callback (t1:viewtype) = (!evhttp_request1, !t1) -<fun1> void
extern fun evhttp_set_cb {a:viewtype} (http: !evhttp1,
                                       path: string,
                                       callback: evhttp_callback (a),
                                       arg: !a): int = "mac#evhttp_set_cb"
extern fun evhttp_set_gencb {a:viewtype} (http: !evhttp1,
                                          callback: evhttp_callback (a),
                                          arg: !a): void = "mac#evhttp_set_gencb"

The typedef defines an alias for referring to the callback function. It is parameterized over the type of the argument. The definition states that an evhttp_callback is a C function (the fun1, see Functions in ATS) that takes two arguments: a non-NULL evhttp_request object that is not consumed, and an object of type t1, where t1 is any viewtype, which is also not consumed. The function returns void. An evhttp_callback (event_base) is therefore a C function that takes an event_base as the second argument.

evhttp_set_cb is a polymorphic function. The arg parameter can be any viewtype. The callback parameter is an evhttp_callback parameterized over this same type. This means that the callback must accept as an argument the same type as the argument we pass to evhttp_set_cb. evhttp_set_cb and evhttp_set_gencb are called like:

 val _ = evhttp_set_cb {ptr} (http, "/dump", dump_request_cb, null)
 val () = evhttp_set_gencb {string} (http, send_document_cb, docroot)

Note that we pass the argument type (ptr and string in the example) as a static argument in the call so ATS knows which type to use in the polymorphic function. Sometimes ATS can infer this and the argument is not needed but in this case ATS' type inference can't do it. The ATS wrappers for dump_request_cb and send_document_cb are:

extern fun dump_request_cb (request: !evhttp_request1, arg: !ptr): void = "mac#dump_request_cb"
extern fun send_document_cb (request: !evhttp_request1, arg: !string): void = "mac#send_document_cb"

http-server2.dats adds an additional callback to cause the server to exit. The server exit is done by breaking out of the event loop using event_base_loopexit. That function needs an event_base so I pass this as the callback argument. In this case there is no C function to wrap so I create the callback function as an ATS anonymous function:

val _ = evhttp_set_cb {event_base1} (http,
                                     "/quit",
                                     lam (req, arg) => ignore(event_base_loopexit(arg, null)),
                                     base)

The equivalent callback function in C would need to cast the void* argument to an event_base*. I don't need to do this in ATS as the callback argument is correctly typed thanks to the definition of evhttp_set_cb and evhttp_callback as described above.
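A minimal sketch of what that C callback might look like follows. The quit_cb name, the body of struct event_base and the body of event_base_loopexit are stand-ins made up so the example compiles without libevent; the event_base_loopexit prototype and the callback shape are the ones libevent declares. Nothing at compile time ties the void* back to the type that was registered, so registering quit_cb with, say, the docroot string would compile just as well.

```c
#include <assert.h>
#include <stddef.h>

struct timeval;                  /* opaque here, only used through a pointer */
struct evhttp_request;           /* opaque, as in libevent */

/* Stand-in definition (assumption): the real struct is opaque. */
struct event_base { int exit_requested; };

/* Stand-in body (assumption) with the same prototype as libevent's. */
int event_base_loopexit(struct event_base *base, const struct timeval *tv) {
    (void)tv;
    base->exit_requested = 1;
    return 0;
}

/* Hypothetical callback with the shape evhttp_set_cb expects:
   void (*cb)(struct evhttp_request *, void *). */
void quit_cb(struct evhttp_request *req, void *arg) {
    /* Unchecked conversion: the compiler cannot tell whether arg really
       points at an event_base. */
    struct event_base *base = arg;

    (void)req;
    assert(base != NULL);        /* run time check only */
    (void)event_base_loopexit(base, NULL);
}
```

In the ATS version the static argument {event_base1} plays the role of the conversion on the first line of quit_cb, except that it is checked at compile time against both the callback and the value passed in.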

The ignore call is to help make the ATS code a bit more readable. event_base_loopexit returns an integer but the callback returns void. We need to consume the return value of event_base_loopexit. In ATS this needs to be done using the verbose syntax let val _ = ... in () end. ignore is a macro that hides this:

macdef ignore (x) = let val _ = ,(x) in () end
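For readers coming from C, the closest counterpart is probably a cast to void. C drops an unused result silently, so the sketch below only documents intent rather than satisfying the type checker. IGNORE, request_exit, on_quit and the exit_requests counter are hypothetical names used for illustration.

```c
/* Evaluate x for its side effects and drop the result on purpose. */
#define IGNORE(x) ((void)(x))

int exit_requests = 0;   /* sketch-only counter so the effect is observable */

/* Hypothetical stand-in for a call that reports status through an int
   the caller does not need. */
int request_exit(void) {
    exit_requests++;
    return 0;
}

/* A void callback body, similar in spirit to the ATS lambda above. */
void on_quit(void) {
    IGNORE(request_exit());
}
```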

This version of http-server2.dats provides a little bit more type safety than the C version. It still utilizes a lot of the C code. There is no overhead from ATS, since all the wrappers are defined in terms of the existing C functions. They provide additional checking at compile time and no extra run time code is generated. Although the wrappers look verbose, they'd usually be in a libevent ATS module. The resulting ATS code is a little smaller and clearer if you ignore the wrappers the sample code includes.

Next steps

The next steps would be to start converting the existing functions, or to add extra functionality in ATS (like I did with the quit callback). I'll do a follow-up post on anything interesting or new found during the remaining conversion. Otherwise you might like to try it as an exercise in learning ATS.

One interesting thing to look at would be how to handle the very callback oriented libevent code. This can make code difficult to follow. Writing libevent code that does HTTP requests looks something like:

fun handle_result(result: string) = ...
...
val () = http_get("http://www.bluishcoder.co.nz", handle_result)

I'd much prefer to write something like the following and have the compiler break the code up into the callback style:

val result = http_get("http://www.bluishcoder.co.nz")

Languages that have continuations or coroutines make this sort of thing easy. It'd be nice to be able to factor code like this to be easier to read and use in ATS. If anyone has any ideas I'd love to hear them.


