Bluish Coder

2015-11-04

A quick look at the Pony Programming Language

Pony is a new programming language described on their site as "an open-source, object-oriented, actor-model, capabilities-secure, high performance programming language."

It has some interesting features and is different enough from existing popular programming languages to make it a nice diversion to experiment with. Some features include:

lightweight actor based concurrency with M:N threading, mapping multiple language level threads to operating system threads.
strong static typing with generics
data-race free. The type system ensures at compile time that a concurrent program can never have data races.
deadlock free. There are no locking mechanisms exposed to the user so there are no deadlocks.
capabilities exposed to the type system to allow compile time enforcing of such things as objects that have no other references to them, immutable values, reference values, etc.
lightweight C FFI

This post is an outline of my initial experiments with the language, including pitfalls to be aware of.

Installing

Pony can be installed from git and run from the build directory:

$ git clone https://github.com/CausalityLtd/ponyc
$ cd ponyc
$ make config=release
$ export PATH=`pwd`/build/release:$PATH
$ ponyc --help

Run tests with:

$ make config=release test

Some of the Pony standard packages dynamically load shared libraries. If they're not installed this will be reflected in build failures during the tests. The required libraries on a Linux based machine are openssl and pcre2-8. To build Pony itself llvm version 3.6 needs to be installed. There is an llvm37 branch on github that works on Linux but is awaiting some llvm37 fixes before it is merged into master.

Pony can be installed in a default location, or using prefix to install it somewhere else:

$ make config=release prefix=/home/user/pony install

One catch is that running ponyc requires it to find the Pony runtime library libponyrt.a for linking purposes. This might not be found if installed somewhere that it doesn't expect. This can be resolved by setting the environment variable LIBRARY_PATH to the directory where libponyrt.a resides. I had to do this for the Nix Pony package.

Compiling Pony programs

A basic "Hello World" application looks like:

actor Main
  new create(env: Env) =>
    env.out.print("hello world")

Place this in a main.pony file in a directory and compile:

$ mkdir hello
$ cat >hello/main.pony
  actor Main
   new create(env: Env) =>
     env.out.print("hello world")
$ ponyc hello
$ ./hello1
hello world

ponyc requires a directory as an argument and it compiles the *.pony files in that directory. It generates an executable based on the directory name, with a number appended if needed to prevent a name clash with the directory. The program starts executing by creating a Main actor and passing it an Env object allowing access to command line arguments, standard input/output, etc. The Main actor can then create other actors or do whatever required for program execution.

Actors

Actors are the method of concurrency in Pony. An actor is like a normal object in that it can have state and methods. It can also have behaviours. A behaviour is a method that when called is executed asynchronously. It returns immediately and is queued to be run on an actor local queue. When the actor has nothing to do (not running an existing method or behaviour) it will pop the oldest queued behaviour and run that. An actor can only run one behaviour at a time - this means there needs to be no locking within the behaviour since access to actor local state is serialized. For this reason it's useful to think of an actor as a unit of sequential execution. Parallelism is achieved by utilising multiple actors.
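The queueing described above can be modelled outside of Pony. The following is a minimal Python sketch, not how the Pony runtime is implemented: it assumes one OS thread per actor (Pony uses an M:N scheduler instead) and the Actor and Counter names are made up for illustration. It shows a behaviour call returning immediately while the actor runs queued behaviours one at a time, so the actor-local state needs no lock.

```python
import queue
import threading


class Actor:
    """Toy actor: behaviour calls are queued and run one at a time."""

    def __init__(self):
        self._mailbox = queue.Queue()  # unbounded FIFO, the actor-local queue
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, behaviour, *args):
        # Returns immediately; the behaviour runs later on the actor's thread.
        self._mailbox.put((behaviour, args))

    def _run(self):
        while True:
            behaviour, args = self._mailbox.get()  # oldest message first
            behaviour(*args)
            self._mailbox.task_done()

    def join(self):
        # Block until every queued behaviour has finished.
        self._mailbox.join()


class Counter(Actor):
    def __init__(self):
        super().__init__()
        self.total = 0  # actor-local state, only touched by the actor's thread

    def add(self, n):
        # The "behaviour": asynchronous, returns nothing to the caller.
        self.send(self._add, n)

    def _add(self, n):
        self.total += n  # no lock needed, execution is serialized


counter = Counter()
for i in range(1000):
    counter.add(i)
counter.join()
print(counter.total)  # 499500
```

The join call exists only so the sketch can wait for the result. A Pony program has no equivalent here; the runtime exits once no actor has work left.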

To compare the difference between a standard object and an actor I'll use the following program:

class Logger
  let _env: Env
  let _prefix: String

  new create(env: Env, prefix: String) =>
    _env = env
    _prefix = prefix

  fun log(msg: String, delay: U32) =>
    @sleep[I32](delay)
    _env.out.print(_prefix + ": " + msg)

actor Main
  new create(env: Env) =>
    let l1 = Logger.create(env, "logger 1")
    let l2 = Logger.create(env, "logger 2")

    l1.log("one", 3)
    l2.log("two", 1)
    l1.log("three", 3)
    l2.log("four", 1)

This creates a class called Logger that on construction takes an Env to use to output log messages and a string prefix to prepend to a message. It has a log method that will log a message to standard output after sleeping for a number of seconds given by delay. The unusual syntax for the sleep call is the syntax for calling the sleep C function using the Pony FFI. I'll cover this later.

The Main actor creates two loggers and logs twice to each one with a different delay. Because Logger is a standard class, its log calls are synchronous. Running this results in a delay of three seconds before the first log line is output, a delay of one second before the second line, a delay of three seconds before the third line and finally a delay of one second before the final line. Everything happens on the single Pony thread that runs the Main actor's create constructor. Pony runs this on a single operating system thread. Total elapsed time is the sum of the delays.

Compile and build with:

$ mkdir clogger
$ cat >clogger/main.pony
  ..contents of program above...
$ ponyc clogger
$ time ./clogger1
  logger 1: one
  logger 2: two
  logger 1: three
  logger 2: four

  real  0m8.093s
  user  0m0.116s
  sys   0m0.132s

Changing the Logger class to an actor and making the log method a behaviour will result in the logging happening asynchronously. The changes are:

actor Logger
  let _env: Env
  let _prefix: String

  new create(env: Env, prefix: String) =>
    _env = env
    _prefix = prefix

  be log(msg: String, delay: U32) =>
    @sleep[I32](delay)
    _env.out.print(_prefix + ": " + msg)

Nothing else in the program changes. I've just changed class to actor and fun to be. Now when the Main actor calls log it will add the behaviour call to the actor's queue and immediately return. Each Logger instance is running in its own Pony thread and will be mapped to an operating system thread if possible. On a multiple core machine this should mean each actor's behaviour is running on a different core.

Compiling and running gives:

$ mkdir alogger
$ cat >alogger/main.pony
  ..contents of program above...
$ ponyc alogger
$ time ./alogger1
  logger 2: two
  logger 2: four
  logger 1: one
  logger 1: three

  real  0m6.113s
  user  0m0.164s
  sys   0m0.084s

Notice that the total elapsed time is now six seconds. This is the sum of the delays in the calls to log in the first Logger instance. The second instance is running on another OS thread so executes in parallel. Each log call immediately returns and is queued to run. The delays on the second Logger instance are shorter so they appear first. The two log calls on the second Logger run sequentially as behaviours on a single actor instance are executed in order. The log calls for the first Logger instance run after their delay, again sequentially for the calls within that actor.
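The arithmetic behind the two timings and the output order can be checked with a few lines of Python. This is only a back-of-the-envelope model: it assumes the two Logger actors really do run in parallel and ignores scheduling overhead.

```python
# (logger, message, delay in seconds), in the order Main makes the calls.
calls = [
    ("logger 1", "one", 3),
    ("logger 2", "two", 1),
    ("logger 1", "three", 3),
    ("logger 2", "four", 1),
]

# class version: every log call blocks Main, so the delays add up.
sequential_total = sum(delay for _, _, delay in calls)

# actor version: each logger drains its own queue in order,
# and the two loggers are assumed to run in parallel.
clock = {}     # per-logger elapsed time
finished = []  # (finish time, message)
for name, msg, delay in calls:
    clock[name] = clock.get(name, 0) + delay
    finished.append((clock[name], msg))

parallel_total = max(clock.values())
order = [msg for _, msg in sorted(finished)]

print(sequential_total)  # 8
print(parallel_total)    # 6
print(order)             # ['two', 'four', 'one', 'three']
```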

Capabilities

Pony uses reference capabilities to allow safe concurrent access to objects. In practice this means annotating types with a tag to indicate how 'sharable' an object is. For data to be passed to another actor it must be safe for that actor to use without data races. Reference capabilities allow enforcing this at compile time. There are defaults for most types so you don't need to annotate everything. Notice that none of the examples I've done so far use any capability annotations. I'll go through a few examples here but won't be exhaustive. The Pony tutorial has coverage of the combinations and defaults.

val and ref

A val capability is for value types. They are immutable and therefore anyone can read from them at any time. val objects can be passed to actors and used concurrently. Primitives like U32 are val by default. This is why none of the primitive arguments to behaviours in the previous examples needed annotation.

A ref capability is for references to mutable data structures. They can be read from and written to and can have multiple aliases. You can't share these with other actors as that would potentially cause data races. Classes are ref by default.

This is an example of passing a val to another actor:

actor Doer
  be do1(n: U32) =>
    None

actor Main
  new create(env: Env) =>
    let a = Doer.create()
    let n: U32 = 5
    a.do1(n)

As U32 is a primitive it defaults to a val reference capability. It is immutable and can be read by anyone at any time so this compiles without problem. This example fails to compile however:

class Foo
  let n: U32 = 5

actor Doer
  be do1(n: Foo) =>
    None

actor Main
  new create(env: Env) =>
    let a = Doer.create()
    let b = Foo.create()
    a.do1(b)

The error is:

main.pony:5:13: this parameter must be sendable (iso, val or tag)
  be do1(n: Foo) =>
            ^

class defaults to the ref capability which can be read, written and aliased. It can't be used to send to another actor as there's no guarantee that it won't be modified by any other object holding a reference to it. The iso and tag capabilities mentioned in the error message are other capability types.

iso is for single references to data structures that can be read from and written to. The type system guarantees that only one reference exists to the object. It is short for 'isolated'.

tag is for identification only. Objects of capability tag cannot be read from or written to. They can only be used for object identity or, if they are an Actor, calling behaviours on them. Actors default to tag capabilities. Calling behaviours is safe as behaviour running is serialized for the actor instance and they don't return data.
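As a quick summary of the rules so far, here is a small Python lookup. It is not a model of Pony's type checker, and the names are made up for illustration. It just restates the defaults mentioned above and the "sendable" set from the compiler error.

```python
# Capabilities the compiler error lists as sendable to another actor.
SENDABLE = {"iso", "val", "tag"}

# Defaults mentioned in this post.
DEFAULT_CAPABILITY = {
    "primitive": "val",  # e.g. U32
    "class": "ref",
    "actor": "tag",
}


def can_send(capability):
    """True if a reference with this capability may be passed to a behaviour."""
    return capability in SENDABLE


for kind, cap in DEFAULT_CAPABILITY.items():
    print(kind, cap, can_send(cap))
# primitive val True
# class ref False
# actor tag True
```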

To get the previous example to work we can force the Foo object to be of type val if it can be immutable:

class Foo 
  let n: U32 = 5

actor Doer
  be do1(n: Foo val) =>
    None

actor Main
  new create(env: Env) =>
    let a = Doer.create()
    let b: Foo val = Foo.create()
    a.do1(b)

ref and iso

Let's modify the example so we can change the value of the Foo object to demonstrate moving a mutable reference from one actor to another:

class Foo
  var n: U32 = 5

  fun ref set(m: U32) =>
    n = m

  fun print(env: Env) =>
    env.out.print(n.string())

actor Doer
  be do1(env:Env, n: Foo iso) =>
    n.print(env)

actor Main
  new create(env: Env) =>
    let a = Doer.create()
    let b = Foo.create()
    a.do1(env, b)

In this example the do1 behaviour now requires an iso reference capability. As mentioned previously, iso means only one reference to the object exists therefore it is safe to read and write. But where we create the instance of Foo we have a reference to it in the variable b. Passing it as an argument to do1 effectively aliases it. The compile time error is:

main.pony:18:16: argument not a subtype of parameter
    a.do1(env, b)
               ^
main.pony:11:19: parameter type: Foo iso
  be do1(env:Env, n: Foo iso) =>

main.pony:18:16: argument type: Foo iso!
a.do1(env, b)
           ^

This error states that do1 requires a Foo iso parameter whereas it is being passed a Foo iso!. The ! at the end means that it is an alias to another variable. Even though class objects are ref by default, Pony has inferred the capability for b as iso as we didn't declare a type for b and we are passing it to a function that wants an iso. However as it has an alias it can't be used as an iso therefore it's an error.

One way of avoiding the aliasing is to pass the result of the create call directly:

actor Main
  new create(env: Env) =>
    let a = Doer.create()
    a.do1(env, Foo.create())

There is no alias here so it compiles fine.

If we do want to have an initial reference to it, say to set a value first, we can tell the type system that we are consuming the existing reference and will no longer use it. This is what the consume keyword is for:

actor Main
  new create(env: Env) =>
    let a = Doer.create()
    let b = Foo.create()
    b.set(42)
    a.do1(env, consume b)
    // b.set(0)

This now compiles. Uncommenting the use of b after the do1 call will be a compile error as we've consumed b and it no longer exists. In this case the error would be:

main.pony:20:5: can't use a consumed local in an expression
    b.set(0)
    ^
main.pony:20:6: invalid left hand side
    b.set(0)

consume is more often used for passing iso objects around. To pass it to another object you need to consume the existing reference to it. This becomes problematic if you are consuming a field of an object. Modifying the example so that the Foo is stored as a field of Main shows the problem:

actor Main
  var b: Foo iso = Foo.create()

  new create(env: Env) =>
    let a = Doer.create()
    b.set(42)
    a.do1(env, consume b)

The error is:

main.pony:20:16: consume must take 'this', a local, or a parameter
    a.do1(env, consume b)
               ^

b can't be consumed as it's a field of Main. It can't be left consumed - it must have a valid Foo iso object stored in it. In Pony assignment returns the old value of the variable being assigned to. This allows assigning a new value to the field and returning the old value in one operation and avoiding leaving the field in an invalid state:

new create(env: Env) =>
  let a = Doer.create()
  b.set(42)
  a.do1(env, b = Foo.create())

b gets a new value of a new instance of Foo and do1 gets passed the old value.
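The "assign and get the old value back" idea can be imitated in Python with a helper method. Python has no capabilities and does not enforce uniqueness, so this is only an analogy; replace_b is a hypothetical name.

```python
class Foo:
    def __init__(self, n=5):
        self.n = n


class Main:
    def __init__(self):
        self.b = Foo()

    def replace_b(self, new_value):
        # Mimics Pony's `b = new_value` expression: store the new value
        # and hand back the old one, so the field is never left empty.
        old, self.b = self.b, new_value
        return old


main = Main()
main.b.n = 42
sent = main.replace_b(Foo())  # what do1 would receive
print(sent.n, main.b.n)  # 42 5
```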

There's a lot more to capabilities and the capabilities section of the tutorial covers a lot. Although there are sane defaults it feels like 'capability tutorials' will be the Pony equivalent of 'Monad tutorials' in other languages for a while. When I was first learning ATS I spent a lot of time floundering with function annotations to get things to compile, trying random changes, until I learnt how it worked. I'm probably at that stage with capabilities at the moment and I hope it becomes clearer as I write more Pony programs.

Pattern Matching

Pony has many of the concepts of most modern functional programming languages. Matching on values is allowed:

let x: U32 = 2
match x
  | 1 => "one"
  | 2 => "two"
else
  "3"
end

Union types with capturing:

type Data is (U32 | String | None)
....
match x
| None => "None"
| 1 => "one"
| let u: U32 => "A number that is not one: " + u.string()
| let s: String => "A string: " + s
end

Enumerations are a bit verbose in that you have to use primitive to define each variant of the enumeration first:

primitive Red
primitive Blue
primitive Green

type Colour is (Red | Blue | Green)
...
let x: Colour = Red
match x
| Red => "Red"
| Blue => "Blue"
| Green => "Green"
end

C FFI

Pony has an easy to use C FFI. I showed an example of this previously:

@sleep[I32](delay)

The @ signifies that this is a C FFI function call. The type in the brackets is the return type of the C function call. The types of the arguments must match what the actual C function expects. Errors here will crash the program. Pony allows specifying the type of an FFI function in advance so argument types are checked. For sleep it would be:

use @sleep[I32](n: U32)
...
@sleep(10)

Note that it's no longer necessary to specify the return type at the call point as it's already been defined in the declaration.

If the C function is part of a library already linked into the Pony executable then there is no need for a use statement to define the library file to link against. sleep is part of libc so it isn't needed. In the cases where you need to link against a specific library then the use statement is used in this manner:

use "lib:foo"

The addressof keyword is used to pass pointers to C code. It can be used for passing out parameters of primitive types:

var n: U32 = 0
@dosomething[None](addressof n)
env.out.print("Result: " + n.string())

Callbacks

The FFI allows passing Pony functions to C for the C code to later call back. The syntax for this looks like:

let foo = Foo.create()
@callmeback[None](addressof foo.method, foo)

Calling C code example

A working example for the following C function in a cbffi.c file:

void do_callback(void (*func)(void* this, char* s), void* this) {
    func(this, "hello world");
}

The Pony code to use this is:

use "lib:cbffi"

class Foo
  let prefix: String
  let env: Env

  new create(e: Env, p: String) =>
    prefix = p
    env = e

  fun display(msg: Pointer[U8]) =>
    env.out.print(prefix + ":" + String.copy_cstring(msg))

actor Main
  new create(env: Env) =>
    let foo = Foo.create(env, "From Pony")
    @do_callback[None](addressof foo.display, foo)

Note that the display function takes a Pointer[U8] as an argument. Pointer[U8] is a generic type with U8 being the parameter. In this case it is the C string that the C function passes. Pony String types are an object with fields so C doesn't pass it directly. The String type has a couple of constructor functions that take Pointer[U8] as input and return a Pony String - the one used here, copy_cstring, makes a copy of the C string passed in.

Compile with:

$ mkdir cb
$ cat >cb/main.pony
  ...Pony code...
$ cat >cb/cbffi.c
  ...C code...
$ gcc -fPIC -shared -o libcbffi.so cb/cbffi.c
$ LIBRARY_PATH=. ponyc cb
$ LD_LIBRARY_PATH=. ./cb1
  From Pony:hello world

Here LIBRARY_PATH is set to find the shared library during compiling and linking. To run the generated executable LD_LIBRARY_PATH is used to find the shared library at runtime.

It's also possible to link against static C libraries:

$ rm libcbffi.so
$ gcc -c -o libcbffi.o cb/cbffi.c
$ ar -q libcbffi.a libcbffi.o
$ LIBRARY_PATH=. ponyc cb
$ ./cb1
  From Pony:hello world

Things to look out for

While writing Pony code I came across a couple of things to be aware of. Each actor has its own garbage collector but it runs only between behaviour calls. If a behaviour runs for a long time, never calling another actor behaviour, then it can be a while before garbage is collected. An example of where this can happen is a simple Main actor where everything is done in the default constructor and never calls another actor. Benchmarks can be an example here. No GC will occur and you can get an OOM (Out of Memory) situation.

Another is that there is no backpressure handling for behaviour calls on an actor. The message queues are unbounded so if a producer sends messages to an actor at a faster rate than it processes them then it will eventually OOM. This can occur if you have the message sender tied to an external process. For example a TCP listener that uses sockets and translates the data to a message to an actor. If the external users of the TCP interface (a webserver for example) are sending data faster than the actor can handle the messages then OOM will occur. Slides from the Pony developers indicate that backpressure is on their radar to look at.
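To make the term concrete, this is what backpressure looks like with a bounded queue in Python. It is a generic illustration and not something Pony provides: once the mailbox is full the producer is told so, instead of the queue growing until memory runs out.

```python
import queue

mailbox = queue.Queue(maxsize=3)  # bounded, unlike a Pony actor's queue

accepted, rejected = 0, 0
for message in range(10):  # fast producer, nobody is consuming yet
    try:
        mailbox.put_nowait(message)
        accepted += 1
    except queue.Full:
        rejected += 1  # the producer learns it has to slow down

print(accepted, rejected)  # 3 7
```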

As usual with a new programming language there is a lack of libraries and library documentation. Expect to look through the Pony source code to find examples of how to do things. The tutorial is great though - even though parts are incomplete - and is on github.

There is a --docs command line argument that can be used to parse docstrings in Pony libraries and produce documentation in markdown format. For example:

$ cd packages
$ ponyc --docs collections
$ ls collections-docs/

Conclusion

This has only been a quick overview of some features of Pony. There's more to it. Some places to get more Pony information:

Pony website
Tutorial
/r/ponylang
#ponylang on irc.freenode.net
Mailing List
Online Sandbox to try Pony in a browser

Tags: pony

2015-09-14

Using Freenet for Static Websites

This website is generated from markdown to static HTML and I mirror it on Freenet. Data on Freenet slowly disappears if it is not regularly requested and this happens to parts of the mirror of my blog since many posts have a small target audience and the cross section of Freenet users and that target audience results in a low number of requests.

I've thought about changing the clearnet site so it is a thin proxy in front of a Freenet node and retrieves the data from Freenet. This enables all the clearnet requests to contribute to the healing of the Freenet data. It also means an update to the site on Freenet will automatically be reflected in the clearnet version.

The recent announcement of Neocities mirroring their sites on IPFS prompted me to try this on Freenet to see how viable it was.

I've been able to get something working and this site is now being served directly from the Freenet data with nginx acting as a caching reverse proxy. Performance is acceptable. Taking this approach has a security tradeoff in that I've had to lock down internal node pages that may allow manipulating the node directly. See the end of this post for details on this.

Freenet has an API called FCP that allows retrieval of content programmatically. I thought about writing a simple HTTP proxy server that would retrieve the requests from Freenet via FCP and send them back to the requester. I didn't want to invest too much effort into a proof of concept so I looked to see if there are existing tools to do this.

SCGIPublisher is a plugin for Freenet that provides an SCGI interface to Freenet content using a whitelist to expose only the desired data. It expects to be exposing actual Freenet URIs and keys. I want to hide all this behind my standard domain and I couldn't work out how to prevent the filtering of data and rewriting of URLs that it does. An example of SCGIPublisher usage is d6.gnutella2.info - it's a proxy that provides access to a number of Freenet sites from clearnet.

Freenet already has a built in proxy, FProxy. It does filtering of the requested data to remove JavaScript and detect potentially malicious file formats. If I could disable this filtering and use nginx as a reverse proxy I'd be able to get what I wanted without writing any code. It turns out this can be disabled by doing the following:

In the Freenet node Configuration/Web Interface menu, set "Disable progress page when loading pages?" to false.
In the same menu, set "Maximum size of transparent pass-through in the web interface where we cannot show progress" and "Maximum size for transparent pass-through in the web interface where we can show a progress bar" to something higher than the maximum file size of the site you are exposing. Without this the user will receive an HTML page instead of the required content if the content is large.
Append "?forcedownload=true" to all requested URLs.

With this setup an nginx reverse proxy can be created that uses the Freenet node web interface as the upstream. Unfortunately setting ?forcedownload=true results in Freenet not sending the mime type for the content so I had to create a lookup table in nginx to compute the mime type. This table looks like:

map $uri $custom_content_type {
     default         "text/html";
     ~(.*\.xml)$  "text/xml";
     ~(.*\.rss)$  "application/rss+xml";
     ~(.*\.png)$  "image/png";
     ~(.*\.gif)$  "image/gif";
     ~(.*\.jpg)$  "image/jpeg";
     ~(.*\.pdf)$  "application/pdf";
     ..etc...
 }
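For anyone not used to nginx map blocks, the lookup this table performs is roughly the following, sketched in Python. Only the entries shown above are included; the elided ones stay elided.

```python
import re

# Same patterns as the nginx map above: first match wins, else the default.
CONTENT_TYPES = [
    (r"\.xml$", "text/xml"),
    (r"\.rss$", "application/rss+xml"),
    (r"\.png$", "image/png"),
    (r"\.gif$", "image/gif"),
    (r"\.jpg$", "image/jpeg"),
    (r"\.pdf$", "application/pdf"),
]


def custom_content_type(uri):
    """Return the mime type nginx would put in $custom_content_type."""
    for pattern, mime in CONTENT_TYPES:
        if re.search(pattern, uri):
            return mime
    return "text/html"  # the map's default


print(custom_content_type("/feed.rss"))       # application/rss+xml
print(custom_content_type("/images/a.png"))   # image/png
print(custom_content_type("/2015/11/post/"))  # text/html
```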

In the server section of the configuration I set up some high timeout values to cater for the initial slowness of the Freenet node. I intercept the 404 and 500 error pages to display some static HTML error messages. This stops the Freenet proxy error pages from having internal links allowing doing things on the node.

server {
    listen       80;
    server_name proxy.example.com;

    proxy_intercept_errors on;
    error_page 404 /404.html;
    error_page 500 /500.html;
    proxy_connect_timeout 300;
    proxy_send_timeout 300;
    proxy_read_timeout 300;
    send_timeout 300;
    try_files $uri $uri/index.html;
    location /404.html {
        root /var/www/html/;
        allow all;
        internal;
    }
    location /500.html {
        root /var/www/html/;
        allow all;
        internal;
    }
    ... location blocks ...
}

Following that come the location blocks. These are hardcoded for the Freenet keys being exposed to prevent the proxy being used to browse any Freenet site. I've shortened the actual key below with .... to keep the example short.

This block hides headers returned by the Freenet proxy, adds the ?forcedownload=true query parameter and sets the proxy_pass to go to the Freenet node with the hardcoded key.

location /freenet/USK@..../bluishcoder/ {
        proxy_intercept_errors on;

        index index.html;
        proxy_redirect
            ~^/freenet:USK@..../bluishcoder/(?<edition>[0-9]+)/(.*)\?forcedownload=true$
            /freenet/USK@..../bluishcoder/$edition/$2;
        set $args forcedownload=true;
        proxy_hide_header Content-Type;
        proxy_hide_header Content-Transfer-Encoding;
        proxy_hide_header Content-Disposition;
        proxy_hide_header X-Content-Type-Options;
        proxy_hide_header X-Content-Security-Policy;
        proxy_hide_header X-Webkit-Csp;
        proxy_hide_header Content-Security-Policy;
        proxy_hide_header Cache-Control;
        proxy_hide_header Pragma;
        add_header Content-Type $custom_content_type;
        error_page 301 302 307 =200 @redir;

        proxy_pass http://127.0.0.1:8888/USK@..../bluishcoder/;
}

USK keys have an edition number that is incremented every time a site is updated. I bookmark the USK key in the node which results in it subscribing for updates and it will automatically pick up the latest edition, even if a request in the nginx configuration file is coded to use a specific edition number.

I don't include the USK edition number in the proxy_pass request here to make handling edition updates easier. If a request is made for an edition where a later one is available the node will send a 301 redirect. The Location header in the redirect is of the format freenet:USK@.../bluishcoder/edition/.... The edition is a numeric value of the latest edition number. The proxy_redirect statement rewrites this into the URL format that our location block uses.

At this point the error_page statement is hit which converts the 301 location moved response to a 200 (HTTP OK response) and passes it to a redir block:

location @redir {
  set $args forcedownload=true;
  proxy_hide_header Content-Type;
  proxy_hide_header Content-Transfer-Encoding;
  proxy_hide_header Content-Disposition;
  proxy_hide_header X-Content-Type-Options;
  proxy_hide_header X-Content-Security-Policy;
  proxy_hide_header X-Webkit-Csp;
  proxy_hide_header Content-Security-Policy;
  proxy_hide_header Cache-Control;
  proxy_hide_header Pragma;
  add_header Content-Type $custom_content_type;

  set $foo $upstream_http_location;
  proxy_pass http://127.0.0.1:8888$foo;
}

This block saves the original upstream HTTP Location header (The rewritten one in our own URL format) and passes it back to the proxy to get the data. In this way updated USK editions are handled even though requests are made for earlier editions.
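The rewrite done by the proxy_redirect rule in the location block can be tried out in isolation. This Python sketch applies the same regular expression to a redirect target. USK@EXAMPLEKEY is a made-up stand-in for the real key, which stays elided, and the input follows the pattern in the nginx rule (with the leading slash).

```python
import re

KEY = "USK@EXAMPLEKEY"  # hypothetical placeholder for the elided key

# Python spells named groups (?P<name>...) where nginx/PCRE allows (?<name>...).
pattern = re.compile(
    r"^/freenet:" + re.escape(KEY)
    + r"/bluishcoder/(?P<edition>[0-9]+)/(.*)\?forcedownload=true$"
)


def rewrite_location(location):
    """Turn the node's redirect target into the URL format of our location block."""
    m = pattern.match(location)
    if m is None:
        return location  # leave anything else untouched
    # group(2) is the path after the edition, the same capture nginx calls $2
    return "/freenet/%s/bluishcoder/%s/%s" % (KEY, m.group("edition"), m.group(2))


print(rewrite_location(
    "/freenet:USK@EXAMPLEKEY/bluishcoder/18/index.html?forcedownload=true"))
# /freenet/USK@EXAMPLEKEY/bluishcoder/18/index.html
```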

With this in place we have a Freenet proxy for whitelisted URLs that works with requests like http://proxy.example.com/freenet/USK@.../bluishcoder/17/.... Putting another reverse proxy in front of this allows standard clearnet URLs to be used:

server {
    listen       80;
    server_name example.com;

    proxy_intercept_errors on;
    error_page 404 /404.html;
    error_page 500 /500.html;
    proxy_connect_timeout 300;
    proxy_send_timeout 300;
    proxy_read_timeout 300;
    send_timeout 300;
    try_files $uri $uri/index.html;
    location /404.html {
        root /var/www/html/;
        allow all;
        internal;
    }
    location /500.html {
        root /var/www/html/;
        allow all;
        internal;
    }

    location / {
        index index.html;
        proxy_pass http://proxy.example.com/freenet/USK@..../bluishcoder/17/;
    }
}

Now requests to http://example.com proxy to the Freenet USK and avoid the internal filtering and progress pages.

I've got this site running with this setup and it works pretty well. I use nginx proxy caching to cache requests for a period of time to ease the load on the Freenet node. There are some rough edges. The first request for a large file, video for example, takes a while as it has to download the entire video - it can't stream it. Once it is cached by nginx then streaming to other requests is fine.

Taking this approach got things working quickly but I think in the long term it would be better to take the approach of writing a proxy that utilizes FCP as described earlier. This would enable using the mime type that Freenet knows for the files, avoiding the manual table in the nginx configuration, and would avoid any possible security issues from accidentally leaking internal Freenet proxy pages. The current setup helps prove the approach is viable however.

Security Tradeoffs

If an internal node page somehow became available to the user they may be able to access any Freenet URL, download files to any location on the machine and upload any files on the machine. They can reconfigure the node and create connections to other nodes.

To restrict the damage they can do I've changed the following settings on the node:

In Configuration/Core Settings change the "Directories downloading is allowed to" to empty. This prevents the node being used to download files to disk.
In Configuration/Core Settings change the "Directories uploading is allowed from" to empty. This prevents the node being used to upload files.
In Configuration/Web Interface change "Public Gateway Mode" to true. This will prevent the user from being able to change any node settings. You should configure an IP address for an admin user to access the node with full administration settings. Alternatively, and this is what I do, enable it for all users. If I want to temporarily administer the node I shut it down from a shell, edit freenet.ini to change fproxy.publicGatewayMode to false, and restart.

These issues would go away if a proxy that uses FCP, or a Freenet plugin that does similar, was created.

Building Erlang for Android

Support for building Erlang on Android is provided in the standard Erlang source.

Build setup

I use the Erlang git version for building. Cloning can be done with:

git clone https://github.com/erlang/otp

In the xcomp directory of the cloned repository there is an erl-xcomp-arm-android.conf file that contains details for cross compiling to Android. This can be modified per the instructions in the Erlang cross compilation documentation but the defaults are probably fine.

Some environment variables need to be set to locate the Android NDK:

export NDK_ROOT=/path/to/ndk/root
export NDK_PLAT=android-21

The NDK_PLAT environment variable identifies the Android API version to use for building. Note that android-21 corresponds to Android 5.0 (Lollipop); KitKat is android-19 (See STABLE-APIS.html).

Add the path to the Android version of the gcc compiler:

export PATH=$PATH:$NDK_ROOT/toolchains/arm-linux-androideabi-4.8/prebuilt/linux-x86_64/bin

When building from the git repository an initial step of generating configuration files needs to be done. This requires autoconf version 2.59. If autoconf2.59 is the command that runs this version on your system, you may need to change some symlinks or defaults for your OS, or you can edit the otp_build file to replace occurrences of autoconf with autoconf2.59.

Building

Building from git requires generating build configuration files first:

./otp_build autoconf

Once generated, run the configure step. This will configure both a host version of Erlang for bootstrapping and the Android version:

./otp_build configure --xcomp-conf=xcomp/erl-xcomp-arm-android.conf

Build a bootstrap system of Erlang for the host machine, followed by one for the Android target:

./otp_build boot -a

Installing

Installing to an Android device involves running a build step to copy the files to a temporary directory, running a script to change paths to the directory where the installation will be on the Android device, and pushing the final result to the device.

In the following series of commands I use /tmp/erlang as the temporary directory on the host system and /data/local/tmp/erlang as the install directory on the Android device. The directory /data/local/tmp is writable on non-rooted Android KitKat devices. It's useful for testing.

./otp_build release -a /tmp/erlang
cd /tmp/erlang
./Install -cross -minimal /data/local/tmp/erlang

One of the files, bin/epmd, is a symlink, which adb push has problems with. Before copying the files below I delete the file, and I manually recreate the symlink after the push:

adb shell mkdir /data/local/tmp/erlang
cd /tmp
rm erlang/bin/epmd
adb push erlang /data/local/tmp/erlang/
adb shell ln -s /data/local/tmp/erlang/erts-6.4.1/bin/epmd \
                /data/local/tmp/erlang/bin/epmd

The adb commands assume the device is already connected and can be accessed via adb.
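
To check whether the release tree contains other symlinks that would need the same treatment, something like the following can be run on the host before deleting anything. This is a sketch rather than part of the Erlang tooling: list_symlinks is a hypothetical helper, and it assumes find and readlink are available on the host.

```shell
# Hypothetical helper: print every symlink under a directory together with
# its target, so each one can be recreated on the device after the push.
list_symlinks() {
    find "$1" -type l | while IFS= read -r link; do
        printf '%s -> %s\n' "$link" "$(readlink "$link")"
    done
}
```

For the layout used above this would be called as list_symlinks /tmp/erlang.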

Running

Once the final push completes Erlang can be run via adb shell or a terminal application on the device:

$ adb shell
$ cd /data/local/tmp/erlang
$ sh bin/erl
Eshell V6.4.1  (abort with ^G)
1>

You may get an error about sed not being found. This is due to a sed command run on the first argument of the erl shell script. A workaround for this is to build an Android version of sed and install it along with Erlang.

Networking

I've tested with some basic Erlang functionality and it works fine. Some tweaks need to be made for networking however. The method Erlang uses for DNS lookup doesn't work on Android. By default it's using native system calls. With a configuration file it can be changed to use its own internal DNS method. Create a file with the following contents:

{lookup, [file,dns]}.
{nameserver, {8,8,8,8}}.

In this case the nameserver for DNS lookups is hardcoded to Google's DNS. Ideally this would be looked up somehow using Android functionality for whatever is configured on the phone but this works for test cases. Push this file to the device and run erl with it passed as an argument like so (inetrc is the name I used for the file in this case):

$ adb push inetrc /data/local/tmp/erlang/
$ adb shell
$ cd /data/local/tmp/erlang
$ sh bin/erl -kernel inetrc '"./inetrc"'

Network examples should now work:

1> inets:start().
ok
2> inet_res:getbyname("www.example.com",a).
{ok,{hostent,"www.example.com",[],inet,4,[{93,184,216,34}]}}
3> httpc:request(get, {"http://bluishcoder.co.nz/index.html", []}, [], []).
{ok,...}

More information on the inetrc file format is available in the Erlang documentation.
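
Since the nameserver is hardcoded, it can be convenient to generate the file for whichever server suits the network being tested. A minimal sketch, assuming an IPv4 address in dotted form; make_inetrc is a hypothetical helper name, not something shipped with Erlang:

```shell
# Hypothetical helper: print an inetrc for the given IPv4 nameserver.
# Erlang writes IPv4 addresses as tuples, so 8.8.8.8 becomes {8,8,8,8}.
make_inetrc() {
    tuple=$(printf '%s' "$1" | tr '.' ',')
    printf '{lookup, [file,dns]}.\n'
    printf '{nameserver, {%s}}.\n' "$tuple"
}
```

Running make_inetrc 8.8.8.8 > inetrc reproduces the file shown earlier, and the result is pushed to the device in the same way.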

Conclusion

This showed that a basic installation of Erlang works on Android. I've also tested on a Firefox OS phone with root access. An interesting project would be to install Erlang on either a Firefox OS or an Android AOSP build as a system service and write phone services in Erlang as a test for an Erlang based device.

Tags: erlang

2015-03-24

Contributing to Servo

Servo is a web browser engine written in the Rust programming language. It is being developed by Mozilla. Servo is open source and the project is developed on github.

I was looking for a small project to do some Rust programming and Servo being written in Rust seemed likely to have tasks that were small enough to do in my spare time yet be useful contributions to the project. This post outlines how I built Servo, found issues to work on, and got them merged.

Preparing Servo

The Servo README has details on the pre-requisites needed. Installing the pre-requisites and cloning the repository on Ubuntu was:

$ sudo apt-get install curl freeglut3-dev \
   libfreetype6-dev libgl1-mesa-dri libglib2.0-dev xorg-dev \
   msttcorefonts gperf g++ cmake python-virtualenv \
   libssl-dev libbz2-dev libosmesa6-dev 
...
$ git clone https://github.com/servo/servo

Building Rust

The Rust programming language has been fairly volatile in terms of language and library changes. Servo deals with this by requiring a specific git commit of the Rust compiler to build. The Servo source is periodically updated for new Rust versions. The commit id for Rust that is required to build is stored in the rust-snapshot-hash file in the Servo repository.

If the Rust compiler isn't installed already there are two options for building Servo. The first is to build the required version of Rust yourself, as outlined below. The second is to let the Servo build system, mach, download a binary snapshot and use that. The second option may make things easier when starting out; if you choose it, skip the rest of this section.

$ cat servo/rust-snapshot-hash
d3c49d2140fc65e8bb7d7cf25bfe74dda6ce5ecf/rustc-1.0.0-dev
$ git clone https://github.com/rust-lang/rust
$ cd rust
$ git checkout -b servo d3c49d2140fc65e8bb7d7cf25bfe74dda6ce5ecf
$ ./configure --prefix=/home/myuser/rust
$ make
$ make install

Note that I configure Rust to be installed in a directory off my home directory. I do this out of preference to enable managing different Rust versions. The build will take a long time and once built you need to add the prefix directories to the PATH:

$ export PATH=$PATH:/home/myuser/rust/bin
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/myuser/rust/lib
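
The commit id passed to git checkout above was copied by hand from rust-snapshot-hash. It can also be extracted with a small helper. This sketch assumes the file keeps the commit/version layout shown in the cat output; snapshot_commit is a hypothetical name.

```shell
# Hypothetical helper: print the commit id stored in a rust-snapshot-hash
# style file, i.e. everything before the first "/" on the first line.
snapshot_commit() {
    head -n 1 "$1" | cut -d/ -f1
}
```

From the rust directory the checkout step then becomes git checkout -b servo "$(snapshot_commit ../servo/rust-snapshot-hash)".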

Building Servo

There is a configuration file used by the Servo build system to store information on what Rust compiler to use, whether to use a system wide Cargo (Rust package manager) install and various paths. This file, .servobuild, should exist in the root of the Servo source that was cloned. There is a sample file that can be used as a template. The values I used were:

[tools]
system-rust = true
system-cargo = false

[build]
android = false
debug-mozjs = false

If you want to use a downloaded binary snapshot of Rust to build Servo you should set the system-rust setting to false. With it set to true as above it will expect to find a Rust of the correct version in the path.

Servo uses the mach command line interface that is used to build Firefox. Once the .servobuild is created then Servo can be built with:

$ ./mach build

Servo can be run with:

$ ./mach run http://bluishcoder.co.nz

To run the test suite:

$ ./mach test

Finding something to work on

The github issue list has three useful labels for finding work. They are:

For my first task I searched for E-easy issues that were not currently assigned (using the C-assigned label). I commented in the issue asking if I could work on it and it was then assigned to me by a Servo maintainer.

Submitting the Fix

Fixing the issue involved:

Fork the Servo repository on github.
Clone my fork locally and make the changes required to the source in a branch I created for the issue I was working on.
Commit the changes locally and push them to my fork on github.
Raise a pull request for my branch.

Raising the pull request runs a couple of automated actions on the Servo repository. The first is an automated response thanking you for the changes followed by a link to the external critic review system.

Reviews

The Servo project uses the Critic review tool. This will contain data from your pull request and any reviews made by Servo reviewers.

To address reviews I made the required changes and committed them to my local branch as separate commits using the fixup flag to git commit. This associates the new commit with the original commit that contained the change. It allows easier squashing later.

$ git commit --fixup=<commit id of original commit>

The changes are then pushed to the github fork and the previously made pull request is automatically updated. The Critic review tool also automatically picks up the change and will associate the fix with the relevant lines in the review.

With some back and forth the changes get approved and a request might be made to squash the commits. If fixup was used to record the review changes then they will be squashed into the correct commits when you rebase:

$ git fetch origin
$ git rebase -i --autosquash origin/master

Force pushing this to the fork will result in the pull request being updated. When the reviewer marks this as r+ the merge to master will start automatically, along with a build and test runs. If test failures happen these get added to the pull request and the review process starts again. If tests pass and it merges then it will be closed and the task is done.

A full overview of the process is available on the github wiki under Github and Critic PR handling 101.
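
To see what fixup and autosquash do without touching a real branch, the workflow can be tried in a throwaway repository. The sketch below is illustrative only: demo_autosquash, the file name and the commit messages are made up for the example. It uses an interactive rebase because older Git releases only apply --autosquash in that mode, and it sets GIT_SEQUENCE_EDITOR to true so the generated todo list is accepted without opening an editor.

```shell
# Demonstrate --fixup and --autosquash in a throwaway repository.
# Prints the commit subjects that remain after the rebase, newest first.
demo_autosquash() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        # Placeholder identity so commits work without global configuration.
        git config user.name "Example"
        git config user.email "example@example.com"
        git config commit.gpgsign false

        echo base > file.txt
        git add file.txt
        git commit -q -m "Base commit"

        echo feature >> file.txt
        git commit -q -a -m "Add feature"

        # A review fix, recorded against the commit it amends.
        echo "review fix" >> file.txt
        git commit -q -a --fixup=HEAD

        # Fold the fixup commit into "Add feature".
        GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2 >/dev/null

        git log --format=%s
    )
    rm -rf "$repo"
}
```

Calling demo_autosquash should list only the two original commit subjects, with the review fix folded into the commit it was recorded against.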

Conclusion

The process overhead of committing to Servo is quite low. There are plenty of small tasks that don't require a deep knowledge of Rust. The first task I worked on was basically a search/replace. The second was more involved, implementing view-source protocol and text/plain handling. The latter allows the following to work in Servo:

$ ./mach run view-source:http://bluishcoder.co.nz
$ ./mach run http://cd.pn/plainttext.txt

The main issues I encountered working with Rust and Servo were:

Compiling Servo is quite slow. Even changing private functions in a module would result in other modules rebuilding. I assume this is due to cross module inlining.
I'd hoped to get away from intermittent test failures like there are in Gecko but there seems to be the occasional intermittent reftest failure.

The things I liked:

Very helpful Servo maintainers on IRC and in github/review comments.
Typechecking in Rust helped find errors early.
I found it easier comparing Servo code to HTML specifications and following them together than I do in Gecko.

I hope to contribute more as time permits.

Tags: mozilla rust servo

2015-03-03

Firefox Media Source Extensions Update

This is an update on some recent work on the Media Source Extensions API in Firefox. There has been a lot of work done on MSE and the underlying media framework by Gecko developers and this update just covers some of the telemetry and exposed debug data that I've been involved with implementing.

Telemetry

Mozilla has a telemetry system to get data on how Firefox behaves in the real world. We've added some MSE video stats to telemetry to help identify usage patterns and possible issues.

Bug 1119947 added information on what state an MSE video is in when the video is unloaded. The intent of this is to find out if users are exiting videos due to slow buffering or seeking. The data is available on telemetry.mozilla.org under the VIDEO_MSE_UNLOAD_STATE category. This has five states:

0 = ended, 1 = paused, 2 = stalled, 3 = seeking, 4 = other

The data provides a count of the number of times a video was unloaded for each state. If a large number of users were exiting during the stalled state then we might have an issue with videos stalling too often. Looking at current stats on beta 37 we see about 3% unloading on stall with 14% on ended and 57% on other. The 'other' represents unloading during normal playback.
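
The percentages above are simply each bucket's count divided by the total number of unloads. A small sketch of that arithmetic follows; both function names are hypothetical, and the counts would have to be read off the telemetry dashboard by hand.

```shell
# Map a VIDEO_MSE_UNLOAD_STATE bucket number to the state name listed above.
mse_unload_state() {
    case $1 in
        0) echo ended ;;
        1) echo paused ;;
        2) echo stalled ;;
        3) echo seeking ;;
        4) echo other ;;
        *) echo "unknown state: $1" >&2; return 1 ;;
    esac
}

# Print the percentage (rounded down) of unloads that fell in one bucket.
# Usage: mse_unload_share BUCKET COUNT0 COUNT1 COUNT2 COUNT3 COUNT4
mse_unload_share() {
    bucket=$1
    shift
    total=0
    wanted=0
    i=0
    for count in "$@"; do
        total=$((total + count))
        if [ "$i" -eq "$bucket" ]; then
            wanted=$count
        fi
        i=$((i + 1))
    done
    # Avoid dividing by zero when no unloads were recorded.
    [ "$total" -gt 0 ] || return 1
    echo $((wanted * 100 / total))
}
```

Passing 2 as the bucket, followed by the five counts, prints the share of unloads that happened while stalled.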

Bug 1127646 will add additional data to get:

Join Latency - time between video load and video playback for autoplay videos
Mean Time Between Rebuffering - play time between rebuffering hiccups

This will be useful for determining performance of MSE for sites like YouTube. The bug is going through the review/comment stage and when landed the data will be viewable at telemetry.mozilla.org.

about:media plugin

While developing the Media Source Extensions support in Firefox we found it useful to have a page displaying internal debug data about active MSE videos.

In particular it was good to be able to get a view of what buffered data the MSE JavaScript API had and what our internal Media Source C++ code stored. This helped track down issues involving switching buffers, memory size of resources and other similar things.

The internal data is displayed in an about:media page. Originally the page was hard coded in the browser but :gavin suggested moving it to an addon. The addon is now located at https://github.com/doublec/aboutmedia. That repository includes the aboutmedia.xpi which can be installed directly in Firefox. Once installed you can go to about:media to view data on any MSE videos.

To test this, visit a video that has MSE support in a nightly build with the about:config preferences media.mediasource.enabled and media.mediasource.mp4.enabled set to true. Let the video play for a short time then visit about:media in another tab. You should see something like:

https://www.youtube.com/watch?v=3V7wWemZ_cs
  mediasource:https://www.youtube.com/6b23ac42-19ff-4165-8c04-422970b3d0fb
    currentTime: 101.40625
    SourceBuffer 0
      start=0 end=14.93043
    SourceBuffer 1
      start=0 end=15

    Internal Data:
      Dumping data for reader 7f9d85ef1800:
        Dumping Audio Track Decoders: - mLastAudioTime: 7.732243
          Reader 1: 7f9d75cba800 ranges=[(10.007800, 14.930430)] active=false size=79880
          Reader 0: 7f9d85e88000 ranges=[(0.000000, 10.007800)] active=false size=160246
        Dumping Video Track Decoders - mLastVideoTime: 7.000000
          Reader 1: 7f9d75cbd800 ranges=[(10.000000, 15.000000)] active=false size=184613
          Reader 0: 7f9d85985000 ranges=[(0.000000, 10.000000)] active=false size=1281914

The first portion of the displayed data shows the JS API view of the data buffered:

currentTime: 101.40625
  SourceBuffer 0
    start=0 end=14.93043
  SourceBuffer 1
    start=0 end=15

This shows two SourceBuffer objects. One containing data from 0-14.9 seconds and the other 0-15 seconds. One of these will be video data and the other audio. The currentTime attribute of the video is 101.4 seconds. Since there is no buffered data for this range the video is likely buffering. I captured this data just after seeking while it was waiting for data from the seeked point.
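
The reasoning above, that the playback position falls outside every buffered range, can be written as a simple check. This is a sketch for working with values copied from about:media by hand; in_buffered_range is a hypothetical helper, and awk is used because shell arithmetic is integer only.

```shell
# Succeed if TIME lies inside the range START..END (all in seconds).
# Usage: in_buffered_range TIME START END
in_buffered_range() {
    awk -v t="$1" -v s="$2" -v e="$3" \
        'BEGIN { exit !((t + 0) >= (s + 0) && (t + 0) <= (e + 0)) }'
}
```

With the values shown, in_buffered_range 101.40625 0 14.93043 and in_buffered_range 101.40625 0 15 both fail, which matches the video waiting for data.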

The second portion of the displayed data shows information on the C++ objects implementing media source:

Dumping data for reader 7f9d85ef1800:
  Dumping Audio Track Decoders: - mLastAudioTime: 7.732243
    Reader 1: 7f9d75cba800 ranges=[(10.007800, 14.930430)] active=false size=79880
    Reader 0: 7f9d85e88000 ranges=[(0.000000, 10.007800)] active=false size=160246
  Dumping Video Track Decoders - mLastVideoTime: 7.000000
    Reader 1: 7f9d75cbd800 ranges=[(10.000000, 15.000000)] active=false size=184613
    Reader 0: 7f9d85985000 ranges=[(0.000000, 10.000000)] active=false size=1281914

A reader is an instance of the MediaSourceReader C++ class. That reader holds two SourceBufferDecoder C++ instances. One for audio and the other for video. Looking at the video decoder it has two readers associated with it. These readers are instances of a derived class of MediaDecoderReader which are tasked with the job of reading frames from a particular video format (WebM, MP4, etc).

The two readers have buffered data covering 0-10 seconds and 10-15 seconds respectively. Neither is 'active', meaning neither is currently the video stream used for playback. This will be because we just started a seek. You can view how buffer switching works by watching which of these become active as the video plays. The size is the amount of data in bytes that the reader is holding in memory. mLastVideoTime is the presentation time of the last processed video frame.

MSE videos will have data evicted as they are played. The size threshold for eviction defaults to 75MB and can be changed with the media.mediasource.eviction_threshold variable in about:config. When data is appended via the appendBuffer method on a SourceBuffer an eviction routine is run. If data greater than the threshold is held then we start removing portions of data held in the readers. This will be noticed in about:media by the start and end ranges being trimmed or readers being removed entirely.
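
A rough way to see how much data the readers currently hold is to add up the size fields from a copied about:media dump. This is a sketch only: total_reader_bytes is a hypothetical helper, and whether the eviction routine totals the sizes in exactly this way is an assumption.

```shell
# Sum the size= fields of an about:media dump read from standard input
# and print the total number of bytes.
total_reader_bytes() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^size=[0-9]+$/)
                total += substr($i, 6)   # skip the 5 characters of "size="
    }
    END { printf "%d\n", total }'
}
```

Saving the dump to a file and running total_reader_bytes < dump.txt gives a figure that can be compared with the configured threshold.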

This internal data is most useful for Firefox media developers. If you encounter stalls playing videos or unusual buffer switching behaviour then copy/pasting the data from about:media in a bug report can help with tracking the problem down. If you are developing an MSE player then the information may also be useful to find out why the Firefox implementation may not be behaving how you expect.

The source of the addon is on github and relies on a chrome only debug method, mozDebugReaderData on MediaSource. Patches to improve the data and functionality are welcome.

Status

Media Source Extensions is still in progress in Firefox and can be tested on Nightly, Aurora and Beta builds. The current plan is to enable support limited to YouTube only in Firefox 37 on Windows and Mac OS X for MP4 videos. Other platforms, video formats and wider site usage will be enabled in future versions as the implementation improves.

To track work on the API you can follow the MSE bug in Bugzilla.

Tags: mozilla


This site is accessible over Tor as hidden service 6vp5u25g4izec5c37wv52skvecikld6kysvsivnl6sdg6q7wy25lixad.onion, or Freenet using key:
USK@1ORdIvjL2H1bZblJcP8hu2LjjKtVB-rVzp8mLty~5N4,8hL85otZBbq0geDsSKkBK4sKESL2SrNVecFZz9NxGVQ,AQACAAE/bluishcoder/-61/
