Bluish Coder

Programming Languages, Martial Arts and Computers. The Weblog of Chris Double.


2016-07-18

Borrowing in Pony

The 'TL;DR' of this post on how to borrow internal fields of iso objects in Pony is:

To borrow fields internal to an iso object, recover the object to a ref (or other valid capability), perform the operations using the field, then consume the object back to an iso.

Read on to find out why.

In this post I use the term borrowing to describe the process of taking a pointer or reference internal to some object, using it, then returning it. An example from C would be something like:

typedef struct foo foo;

foo*  new_foo(void);
void* get_bar(foo* f);
void  delete_foo(foo* f);

...
foo*  f = new_foo();
void* b = get_bar(f);
...
delete_foo(f);

Here a new foo is created and a pointer to a bar object is obtained from it. This pointer is to data internal to foo. It's important not to use it after foo is deleted as it will be a dangling pointer. While holding the bar pointer you have an alias to something internal to foo. This makes it difficult to share foo with other threads or reason about data races. The foo object could change or free the bar data at any time without the holder of the borrowed pointer knowing, leaving it with a dangling pointer or invalid data. I go through a real world case of this in my article on using C in the ATS programming language.
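Dangling pointers need manual memory management, but the aliasing half of the problem shows up in any language with shared mutable references. As a rough Python analogue (the Foo class and its reset method are invented purely for illustration), the holder of a borrowed reference sees the owner change the data underneath it:

```python
class Foo:
    """Hypothetical owner of some internal mutable state."""

    def __init__(self):
        self._bar = [1, 2, 3]

    def get_bar(self):
        # Hands out an alias to internal data, like get_bar() in the C example.
        return self._bar

    def reset(self):
        # The owner mutates its internals; it cannot know who else holds an alias.
        self._bar.clear()


f = Foo()
b = f.get_bar()      # borrow a reference to the internals
before = list(b)     # what the borrower saw at first
f.reset()            # the owner changes the data
after = list(b)      # the borrower's view changed without it doing anything
print(before, after)
```

Nothing in the language stops b from being used after reset, which is the kind of mistake the rest of this post shows Pony ruling out at compile time.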

Pony has the concept of a reference to an object where only one pointer to that object exists. It can't be aliased and nothing else can read or write to that object but the current reference to it. This is the iso reference capability. Capabilities are 'deep' in Pony, rather than 'shallow'. This means that the reference capability of an alias to an object affects the reference capabilities of fields of that object as seen by that alias. The description of this is in the viewpoint adaption section of the Pony tutorial.

The following is a Pony equivalent of the previous C example:

class Foo
  let bar: Bar ref
...
let f: Foo ref = Foo.create()
let b: Bar ref = f.bar

The reference capability of f determines the reference capability of bar as seen by f. In this case f is a ref (the default of class objects) which according to the viewpoint adaption table means that bar as seen by f is also a ref. Intuitively this makes sense - a ref signifies multiple read/write aliases can exist therefore getting a read/write alias to something internal to the object is no issue. A ref is not sendable so cannot be accessed from multiple threads.

If f is an iso then things change:

class Foo
  let bar: Bar ref
...
let f: Foo iso = recover iso Foo.create() end
let b: Bar tag = f.bar

Now bar as seen by f is a tag. A tag can be aliased but cannot be used to read from or write to the object. Only comparing object identity and calling behaviours are allowed. Again this is intuitive. If we have a non-aliasable reference to an object (f being iso here) then we can't alias internally to the object either. Doing so would mean that the object could be changed on one thread and the internals modified on another giving a data race.

The viewpoint adaption table shows that given an iso f it's very difficult to get a bar that you can write to. The following read only access to bar is ok:

class Foo
  let bar: Bar val
...
let f: Foo iso = recover iso Foo.create() end
let b: Bar val = f.bar

Here bar is a val. This allows multiple aliases, sendable across threads, but only read access is provided. Nothing can write to it. According to viewpoint adaption, bar as seen by f is a val. It makes sense that given a non-aliasable reference to an object, anything within that object that is immutable is safe to borrow since it cannot be changed. What if bar is itself an iso?

class Foo
  let bar: Bar iso = recover iso Bar end
...
let f: Foo iso = recover iso Foo.create() end
let b: Bar iso = f.bar

This won't compile. Viewpoint adaption shows that bar as seen by f is an iso. The assignment to b doesn't typecheck because it's aliasing an iso and iso reference capabilities don't allow aliasing. The usual solution when a field isn't involved is to consume the original but it won't work here. The contents of an object's field can't be consumed because the field would then be left in an undefined state. A Foo object that doesn't have a valid bar is not really a Foo. To get access to bar externally from Foo the destructive read syntax is required:

class Foo
  var bar: Bar iso = recover iso Bar end
...
let f: Foo iso = recover iso Foo.create() end
let b: Bar iso = f.bar = recover iso Bar end

This results in f.bar being set to a new instance of Bar so it's never in an undefined state. The old value of f.bar is then assigned to b. This is safe as there are no aliases to it anymore due to the first part of the assignment being done first.
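As a compact summary, the viewpoint adaption results used in the examples so far can be written as a lookup table. This is a Python sketch rather than Pony, it covers only the four combinations discussed above (the Pony tutorial has the complete table), and the names SEEN_AS and seen_as are invented here:

```python
# (capability of the alias, declared capability of the field)
#   -> capability of the field as seen through that alias.
# Only the combinations discussed in this post are listed.
SEEN_AS = {
    ("ref", "ref"): "ref",   # read/write alias to the internals is fine
    ("iso", "ref"): "tag",   # identity only, no reading or writing
    ("iso", "val"): "val",   # immutable data is safe to share
    ("iso", "iso"): "iso",   # seen as iso, so it can't simply be aliased
}


def seen_as(origin, field):
    """Look up how a field appears when read through an alias."""
    return SEEN_AS[(origin, field)]


for (origin, field), result in SEEN_AS.items():
    print(f"{field} field seen by {origin}: {result}")
```

The ("iso", "iso") entry is the case that needed the destructive read: the field is seen as an iso, and an iso can't be aliased by a plain assignment.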

What if the internal field is a ref and we really want to access it as a ref? This is possible using recover. As described in the tutorial, one of the uses for recover is:

"Extract" a mutable field from an iso and return it as an iso.

This looks like:

class Foo
  let bar: Bar ref
... 
let f: Foo iso = recover iso Foo end
let f' = recover iso
           let f'': Foo ref = consume f
           let b: Bar ref = f''.bar
           consume f''
         end

Inside the recover block f is consumed and returned as a ref. The f alias to the object no longer exists at this point and we have the same object but as a ref capability in f''. bar as seen by f'' is a ref according to viewpoint adaption and can now be used within the recover block as a ref. When the recover block ends the f'' alias is consumed and returned out of the block as an iso again in f'.

This works because inside the recover block only sendable values from the enclosing scope can be accessed (i.e. val, iso, or tag). When exiting the block all aliases except for the object being returned are destroyed. There can be many aliases to bar within the block but none of them can leak out. Multiple aliases to f'' can be created also and they are not going to be leaked either. At the end of the block only one can be returned and by consuming it the compiler knows that there are no more aliases to it so it is safe to make it an iso.

To show how the ref aliases created within the recover block can't escape, here's an example of an erroneous attempt to assign the f' alias to an object in the outer scope:

class Baz
  var a: (Foo ref | None) = None
  var b: (Foo ref | None) = None

  fun ref set(x: Foo ref) =>
    a = x
    b = x

class Bar

class Foo
  let bar: Bar ref = Bar

var baz: Baz iso = recover iso Baz end
var f: Foo iso = recover iso Foo end
f = recover iso
      let f': Foo ref = consume f
      baz.set(f')
      let b: Bar ref = f'.bar
      consume f'
    end

If this were to compile then baz would contain two references to the f' object which is then consumed as an iso. f would contain what it thinks is a non-aliasable reference but baz would actually hold two additional references to it. This fails to compile at this line:

main.pony:20:18: receiver type is not a subtype of target type
          baz.set(f')
                 ^
Info:
main.pony:20:11: receiver type: Baz iso!
              baz.set(f')
              ^
main.pony:5:3: target type: Baz ref
      fun ref set(x: Foo ref) =>
      ^
main.pony:20:18: this would be possible if the arguments and return value were all sendable
              baz.set(f')
                     ^

baz is an iso so is allowed to be accessed from within the recover block. But the set method on it expects a ref receiver. This doesn't work because the receiver of a method of an object is also an implicit argument to that method and therefore needs to be aliased. In this way it's not possible to store data created within the recover block in something passed into the recover block externally. No aliases can be leaked and the compiler can track things easily.

There is something called automatic receiver recovery that is alluded to in the error message ("this would be possible...") which states that if the arguments were sendable then it is possible for the compiler to work out that it's ok to call a ref method on an iso object. Our ref arguments are not sendable which is why this doesn't kick in.

A real world example of where all this comes up is using the Pony net/http package. A user on IRC posted the following code snippet:

use "net/http"
class MyRequestHandler is RequestHandler

  let env: Env

  new val create(env': Env) =>
    env = env'

  fun val apply(request: Payload iso): Any =>
    for (k, v) in request.headers().pairs() do
      env.out.print(k)
      env.out.print(v)
    end

    let r = Payload.response(200)
    r.add_chunk("Woot")
    (consume request).respond(consume r)

The code attempts to iterate over the HTTP request headers and print them out. It fails in the request.headers().pairs() call, complaining that tag is not a subtype of box in the result of headers() when calling pairs(). Looking at the Payload class definition shows:

class iso Payload
  let _headers: Map[String, String] = _headers.create()

  fun headers(): this->Map[String, String] =>
    _headers

In the example code request is an iso and the headers function is a box (the default for fun). The return value of headers uses an arrow type. It reads as "return a Map[String, String] with the reference capability of _headers as seen by this". In this example this is the request object which is iso. _headers is a ref according to the class definition. So it's returning a ref as seen by an iso which according to viewpoint adaption is a tag.

This makes sense as we're getting a reference to the internal field of an iso object. As explained previously this must be a tag to prevent data races. This means that pairs() can't be called on the result as tag doesn't allow function calls. pairs() is a box method which is why the error message refers to tag not being a subtype of box.

To borrow the headers correctly we can use the same approach as earlier, a recover block:

fun val apply(request: Payload iso): Any =>
  let request'' = recover iso
    let request': Payload ref = consume request
    for (k, v) in request'.headers().pairs() do
      env.out.print(k)
      env.out.print(v)
    end
    consume request'
  end
  let r = Payload.response(200)
  r.add_chunk("Woot")
  (consume request'').respond(consume r)

In short, to borrow fields internal to an iso object, recover the object to a ref (or other valid capability), perform the operations using the field, then consume the object back to an iso.

Tags: pony 

2016-07-14

Concurrency in Wasp Lisp

Wasp Lisp has a lightweight co-operative threading model that allows programming in an Actor style. It's possible to serialize Wasp values and send them to other processes and machines to be deserialized and run. MOSREF uses this to compile Lisp code on the console process and send the bytecode to drone processes to execute. This allows drones to operate without the Lisp compiler present.

Spawning threads

Threads are created using the spawn function. It takes the function to run as a thread as an argument:

(spawn (lambda () (print "Hello World\n")))

Communication between threads is done using queues. A queue is an unbounded channel that can have many senders but only one receiver. The function send adds data to the queue and wait receives data. If there is no data in the queue then wait blocks. Input/Output in Wasp Lisp is done using the same wait/send mechanism making it easy to pipeline data from console and file output to sockets.

Implementing Actors

A basic Actor can be implemented like the following:

(define (actor1)
  (define counter 0)
  (define chan (make-queue))

  (define (loop)
    (define msg (wait chan))
    (cond
      ((eq? msg 'inc)
        (set! counter (+ 1 counter)))
      ((eq? msg 'dec)
        (set! counter (- counter 1)))
      ((and (list? msg) (eq? (car msg) 'get))
       (send counter (cadr msg))))
    (loop))

  (spawn loop)
  chan)

actor1 is a function that contains a counter holding a numeric value. It creates chan, a queue for holding messages, spawns a thread to run loop and returns chan so messages can be queued for loop to process.

loop waits for a message on chan. This is a blocking call and the thread will go idle until a message is queued. It processes the message, incrementing or decrementing the counter as requested. An additional message, get, can be used to get the value of the counter. That message also includes a channel object to place the result in. loop recursively calls itself to continue.

A sample interaction is:

>> (define a1 (actor1))
>> (define result (make-queue))
>> (send (list 'get result) a1)
>> (wait result)
:: 0
>> (send 'inc a1)
>> (send (list 'get result) a1)
>> (wait result)
:: 1

This creates an actor and a queue to receive results. It asks for the current value of the actor, increments it, then asks again.
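For comparison, the same shape can be sketched in Python using only the standard library. This is an analogue, not Wasp code: queue.Queue stands in for a Wasp queue, an OS thread stands in for a co-operative Wasp thread, and a ("get", reply_queue) tuple stands in for the (list 'get result) message.

```python
import queue
import threading


def actor1():
    """Python rendering of actor1: returns the channel used to talk to it."""
    counter = 0
    chan = queue.Queue()              # unbounded, like a Wasp queue

    def loop():
        nonlocal counter
        while True:
            msg = chan.get()          # blocks until a message arrives, like wait
            if msg == "inc":
                counter += 1
            elif msg == "dec":
                counter -= 1
            elif isinstance(msg, tuple) and msg[0] == "get":
                msg[1].put(counter)   # reply on the channel carried in the message

    # Daemon thread so the sketch exits when the main thread finishes.
    threading.Thread(target=loop, daemon=True).start()
    return chan


a1 = actor1()
result = queue.Queue()
a1.put(("get", result))
first = result.get()
a1.put("inc")
a1.put(("get", result))
second = result.get()
print(first, second)
```

As in the Wasp session, the first query returns 0 and the second returns 1, because the single loop processes messages strictly in the order they were queued.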

Updating an Actor

It's possible to update the code for an Actor without stopping the application. Running in a Lisp REPL means you can change functions on the fly but you can't change the internal implementation of a running loop from the REPL if that loop is internal to a function. A way around this is to provide the Actor with the means to receive a function as a message that performs the update. Here is an example of an updatable actor:

(define (actor3)
  (define counter 0)
  (define chan (make-queue))

  (define (loop chan)
    (define msg (wait chan))
    (cond
      ((eq? msg 'inc)
        (set! counter (+ 1 counter)))
      ((eq? msg 'dec)
        (set! counter (- counter 1)))
      ((and (list? msg) (eq? (car msg) 'get))
       (send counter (cadr msg)))
      ((function? msg)
       (return ((msg counter) chan))))
    (loop chan))

  (spawn loop chan)
  chan)

This code contains an additional branch in the cond to check if the message is a function. If it is then that function is called passing the current value of the counter. It is expected to return a function which will be the new loop to call. This can contain any code and effectively updates the entire actor with new functionality. An example update function to change the messages to increment/decrement by two is:

(define (update oldstate)
  (define counter (* oldstate 2))
  (define (loop chan)
    (define msg (wait chan))
    (cond
      ((eq? msg 'inc)
        (set! counter (+ 2 counter)))
      ((eq? msg 'dec)
        (set! counter (- counter 2)))
      ((and (list? msg) (eq? (car msg) 'get))
       (send counter (cadr msg)))
      ((function? msg)
       (return ((msg counter) chan))))
   (loop chan))
  loop)

An example interaction of the actor and upgrading it is:

>> (define a3 (actor3))
>> (define result (make-queue))
>> (send 'inc a3)
>> (send (list 'get result) a3)
>> (wait result)
:: 1
>> (send update a3)   ;; Updating the actor here
>> (send (list 'get result) a3)
>> (wait result)
:: 2                  ;; This shows the new counter value that 'update' changed
>> (send 'inc a3)
>> (send (list 'get result) a3)
>> (wait result)
:: 4                  ;; Amount is now incrementing by two
>> (send 'inc a3)
>> (send (list 'get result) a3)
>> (wait result)
:: 6

This is a variant of Joe Armstrong's Erlang Universal Server allowing a server to be updated to do anything.
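The same trick can be sketched in Python, again as an analogue rather than Wasp code. The make_loop and ask helpers are invented for this sketch; update mirrors the Lisp update function by doubling the counter and switching the step to two.

```python
import queue
import threading


def make_loop(counter, step):
    """Build a message loop with its own counter and increment step."""
    def loop(chan):
        nonlocal counter
        while True:
            msg = chan.get()
            if msg == "inc":
                counter += step
            elif msg == "dec":
                counter -= step
            elif isinstance(msg, tuple) and msg[0] == "get":
                msg[1].put(counter)
            elif callable(msg):
                # Hand the current state to the update function and
                # switch to whatever loop it returns.
                return msg(counter)(chan)
    return loop


def actor3():
    chan = queue.Queue()
    threading.Thread(target=make_loop(0, 1), args=(chan,), daemon=True).start()
    return chan


def update(oldstate):
    # Double the counter and step by two from now on.
    return make_loop(oldstate * 2, 2)


def ask(actor):
    """Send a get message and wait for the reply."""
    reply = queue.Queue()
    actor.put(("get", reply))
    return reply.get()


a3 = actor3()
a3.put("inc")
v1 = ask(a3)
a3.put(update)      # updating the actor here
v2 = ask(a3)
a3.put("inc")
v3 = ask(a3)
print(v1, v2, v3)
```

The three values printed match the Wasp session: 1, then 2 after the update, then 4. One difference worth noting is that Python does not eliminate tail calls, so each update adds a stack frame here, whereas the Lisp version uses return to keep the call in tail position.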

Filters

An idiom when programming in an Actor or coroutine style is to write small processes that take an input, modify it in some way, and send it to another process to do something else. A program becomes a chain or pipeline of these individual processes. Wasp Lisp calls these small units of functionality filters. They are described in filter.ms as:

"A process that waits for data from an input channel, and sends data to an output channel. Filters are constructed using a constructor function, then wired together using either the input-chain or output-chain functions."

This is an example of a line filter from the Wasp source code:

(define-filter (line-filter)
  (define buf (make-string 80)) 

  (define (parse)
    (forever
      (define next (string-read-line! buf))
      (if next (send next out)
               (return))))

  (define (line-loop)
    (forever
      (define next (wait-input in))
      (cond 
        ((string? next)
         (string-append! buf next)
         (parse))
        ((eq? next 'close)
         (return))
        (else
          (send-output next out)))))

  (line-loop)

  (send-output buf out)
  (send-output 'close out))

A line-filter receives strings of bytes on the input channel and outputs a complete line on the output channel when it has one. It does this by appending received bytes onto a string buffer and checking if that buffer contains a line. If it does it removes the line data from the buffer and sends it to the output channel. It then continues to wait for data on the input channel. An example of usage:

>> (import "lib/filter")
>> (import "lib/line-filter")
>> (define q (make-queue))
>> (define lines (input-chain q (line-filter)))
>> (spawn (lambda () (forever (print (wait lines)))))

>> (send "hello" q)
>> (send "world\n" q)
helloworld
>> (send "foo\nbar" q)
foo
>> (send "baz\n" q)
barbaz

This creates a queue, q, for input data. It creates a chain containing only one filter, the line-filter. It returns the output channel which contains the filtered data. Data placed in q is retrieved by the line filter and when a line is received it is sent to the output channel. A thread is spawned to loop forever printing any lines from the output channel. Notice in the manual sending of data to the channel q that output is only printed by the spawned thread when a line is completed.
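The buffering logic at the heart of the line filter is small enough to sketch on its own. The following Python version is an illustration only: it leaves out the channels and the 'close handling, and the LineFilter class and its feed method are names invented here.

```python
class LineFilter:
    """Buffers incoming text and emits only complete lines."""

    def __init__(self):
        self.buf = ""

    def feed(self, chunk):
        """Append chunk to the buffer and return any completed lines."""
        self.buf += chunk
        lines = []
        while "\n" in self.buf:
            # Split off one complete line; the remainder stays buffered.
            line, self.buf = self.buf.split("\n", 1)
            lines.append(line)
        return lines


lf = LineFilter()
out = [lf.feed(s) for s in ["hello", "world\n", "foo\nbar", "baz\n"]]
print(out)
```

Feeding it the same four strings as the session above gives nothing for "hello", then helloworld, then foo, then barbaz, with "bar" held in the buffer until its newline arrives.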

Wasp Lisp comes with some default filters for parsing s-expressions, encrypting and decrypting data and fuzzing data amongst other things. Scott Dunlop wrote about coroutines and filters on the Wasp blog.

Sending data to other OS processes

Some Wasp values can be serialized and deserialized. This provides a way to send values to other Wasp instances running in different OS processes or machines. Lisp objects are serialized using freeze and deserialized using thaw.

The following server function starts a TCP server on port 10000. Clients connected to it send it Lisp objects, which it prints to standard output on the server process.

(import "lib/tcp-server")

(define (server)
  (define server-output (current-output))

  (define (acceptor)
    (forever
      (define data (wait))
      (with-output server-output
        (print (format (thaw data)))
        (print "\n"))))

  (spawn-tcp-server 10000 acceptor))

The acceptor function is called with its current input and output bound to the TCP stream. For this reason we capture the value of current-output before it is bound so we can output to the server console rather than to the TCP stream. A sample test:

;; On server 
>> (server)

;; On client
>> (define s (tcp-connect "127.0.0.1" 10000))
>> (send (freeze "foo") s)

;; On Server
"foo"

;; On Client
>> (send (freeze 66) s)

;; On Server
66

;; On Client
>> (send (freeze '(one (two three))) s)

;; On Server
(one (two three))

Notice that all i/o is done using the 'send' and 'wait' channel operators. This means we can use a filter to do the freezing/thawing automatically and Wasp has a freeze-filter and thaw-filter that do this. The server becomes:

(import "lib/tcp-server")
(import "lib/package-filter")
(import "lib/filter")
(import "lib/format-filter")

(define (server2)             
  (define server-output (current-output))

  (define (acceptor)
    (define chan (input-chain (current-input)
                              (thaw-filter)
                              (format-filter)))
    (forever
      (define data (wait chan))
      (print* data "\n")))

  (spawn-tcp-server 10000 acceptor))

Usage from a client is:

>> (import "lib/filter")
>> (import "lib/package-filter")

>> (define s (tcp-connect "127.0.0.1" 10000))
>> (define chan (output-chain s (freeze-filter)))
>> (send "hello" chan)
>> (send '(one (two three)) chan)

Through the use of the thaw/freeze filter there is no need to manually call freeze and thaw.

Sending bytecode to other processes

Unfortunately it's not possible to freeze or thaw closures or functions. It is possible however to assemble Lisp to bytecode and send that. This enables sending new functions across OS processes and is how MOSREF is able to compile Lisp on the console and send it to the drone. This example will compile a function from source to bytecode and run it:

>> (define code '((print "Hello World\n")))
>> (define proc (assemble (optimize (compile code))))
>> (proc)
Hello World

The result of assemble can be frozen, sent somewhere and thawed:

>> (define x (freeze (assemble (optimize (compile '((print "Hello World\n")))))))
>> (define y (thaw x))
>> (y)
Hello World
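Python happens to have a similar split, which may help make the idea concrete. This is an analogy and not how Wasp works internally: pickle refuses to serialize a lambda, but a compiled code object can be serialized with the standard marshal module and executed after being loaded. The "<sent>" filename is an arbitrary label.

```python
import marshal
import pickle

# A lambda can't be pickled: pickle stores functions by name,
# and a lambda has no importable name.
try:
    pickle.dumps(lambda: "Hello World")
    lambda_pickled = True
except Exception:
    lambda_pickled = False

# Compile source to a code object (bytecode), then serialize that instead.
code = compile('print("Hello World")', "<sent>", "exec")
frozen = marshal.dumps(code)      # bytes that could be written to a socket

# Receiving side: deserialize and run.
thawed = marshal.loads(frozen)
exec(thawed)
```

One caveat with the analogy: the marshal format is specific to a Python version, so both ends need the same interpreter version.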

Using this we can have an upgradable server process:

(define (server3)
  (define server-output (current-output))

  (define (acceptor)
    (define chan (input-chain (current-input)
                              (thaw-filter)))

    (define (loop chan)
      (define data (wait chan))
      (cond
        ((function? data)
          (return ((data) chan)))
        (else
          (print* "OLD: " (format data) "\n")
          (return (loop chan)))))
     (loop chan))

  (spawn-tcp-server 10000 acceptor))

This will display the data sent to the server prefixed by "OLD: " unless it is sent a function, in which case it calls that function to obtain the new server loop and runs it. An upgraded server loop to prefix with "NEW: " is:

(define (new-server3)
  (assemble
    (optimize
      (compile
        '((define (loop chan)
            (define data (wait chan))
            (cond
              ((function? data)
                (return ((data) chan)))
              (else
                (print* "NEW: " (format data) "\n")
                (return (loop chan))))))))))

We can't send a function directly so this compiles the new loop from source and returns the compiled procedure. This can be frozen, sent to the server and it will execute it as the new loop. An example interaction:

;; On Server
>> (server3)

;; On Client
>> (define s (tcp-connect "127.0.0.1" 10000))
>> (define chan (output-chain s (freeze-filter)))
>> (send '(one (two three)) chan)

;; On Server
OLD: (one (two three))

;; On Client
>> (send (new-server3) chan)
>> (send '(one (two three)) chan)

;; On Server
>> NEW: (one (two three))

Why not send the source to the server process and have it eval it? The approach of sending the bytecode allows the server process to skip including the Lisp compiler. The Wasp VM includes an interpreter and deserializer - the compiler and other libraries are all in Lisp. A Wasp executable consists of the VM stub with bytecode appended to the end of it. On execution it looks for the bytecode, deserializes it and runs it. This provides a minimal program that can have functionality added by sending it bytecode as needed.

An aside on tail call optimization

It's important that a process loop is tail recursive otherwise each call through the loop will increase stack size and eventually exhaust memory. The following is not tail recursive in Wasp Lisp, even though it looks like it should be:

(define (test1 chan)
  (define msg (wait chan))
  (cond
    ((eq msg 'foo)
      (test1 chan))
    ((eq msg 'bar)
      (test1 chan))
    (else
      (test1 chan))))

This is because the recursive call to 'test1' compiles down to bytecode that looks like:

(newf)
(ldg eq)
(arg)
(ldg msg)
(arg)
(ldc bar)
(arg)
(call)
(jf false-47) ;; If the msg is not 'bar then jump to false-47
...
false-47
(newf)
(ldg test1)
(arg)
(ldg chan)
(arg)
(call)        ;; recursively call 'test1'
done-46
done-44
(retn)        ;; return from function 'test1'

The stack frame for test1 is not exited (the retn instruction) until after the recursive call is done. Compare this to the obvious tail recursive case:

(newf)
(ldg wait)
(arg)
(ldg chan)
(arg)
(call)
(stg msg)
(newf)
(ldg test2)
(arg)
(ldg chan)
(arg)
(tail)

Note the tail instruction. This does an immediate jump rather than a call so a retn is not necessary. The call stack does not grow. The difference between the two cases is due to the way the Wasp Lisp compiler generates the instructions and optimizes looking for tail calls. The instructions generated can be viewed using:

(define x '(define (test2 chan)
             (define msg (wait chan))
             (test2 chan)))
(define code (compile x))
(for-each (lambda (x) (print* (format x) "\n")) code)

Using compile shows the first pass which does not look for tail calls:

(newf)
(ldg test2)
(arg)
(ldg chan)
(arg)
(call)
(retn)

Notice the call followed by retn. This is the sequence that optimize looks for to generate the tail instruction:

(define x '(define (test2 chan)
             (define msg (wait chan))
             (test2 chan)))
(define code (optimize (compile x)))
(for-each (lambda (x) (print* (format x) "\n")) code)
...
(newf)
(ldg test2)
(arg)
(ldg chan)
(arg)
(tail)

Looking back at the instructions for test1 the call is followed by a jump or a label before retn so the optimizer misses it. This can be worked around by doing an explicit return statement:

(define (test3 chan)
  (define msg (wait chan))
  (cond
    ((eq msg 'foo)
      (return (test1 chan)))
    ((eq msg 'bar)
      (return (test1 chan)))
    (else
      (return (test1 chan)))))

The code in the cond branches now compiles to the following, which is a tail call:

(jf false-93)
(newf)
(ldg test1)
(arg)
(ldg chan)
(arg)
(tail)
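The behaviour described in this section can be modelled with a few lines of Python. This is a toy model based only on the description above (a call immediately followed by retn becomes tail), not a port of the Wasp optimizer, and it treats instructions as plain strings copied from the listings:

```python
def optimize(code):
    """Replace a (call) that is immediately followed by (retn) with (tail)."""
    out = []
    i = 0
    while i < len(code):
        if code[i] == "(call)" and i + 1 < len(code) and code[i + 1] == "(retn)":
            out.append("(tail)")
            i += 2          # the retn is no longer needed
        else:
            out.append(code[i])
            i += 1
    return out


# test2: the call is directly followed by retn, so it becomes a tail call.
test2 = ["(newf)", "(ldg test2)", "(arg)", "(ldg chan)", "(arg)",
         "(call)", "(retn)"]

# End of test1: labels sit between the call and the retn, so nothing matches.
test1_end = ["(newf)", "(ldg test1)", "(arg)", "(ldg chan)", "(arg)",
             "(call)", "done-46", "done-44", "(retn)"]

print(optimize(test2))
print(optimize(test1_end))
```

The first listing ends in (tail) after optimizing while the second comes back unchanged, which is why the explicit return is needed inside cond branches.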

Some things to note

  • The Wasp VM is single threaded and non-preemptive. Threads yield to the scheduler explicitly using yield or implicitly when doing i/o or waiting on a queue.
  • The bytecode is cross platform. Serialized objects on one architecture can be deserialized on another.
  • The Wasp VM history comes from Mosquito Lisp and MOSREF - a penetration testing platform.
  • It's written in C with some GNU extensions (nested functions are used in the VM).

This post came about from exploring the difference between Actor programming in the Pony programming language and in a dynamic language where the Actor model isn't explicit. The programming style is similar in that pipelines of calls to actors to transform data are a common idiom.

Wasp Lisp isn't actively developed anymore but the author, Scott Dunlop, still processes pull requests and monitors it. I like to use it for projects and tinker with it as it's an interesting little cross platform lisp. MOSREF is useful as a way to access and maintain servers of different architectures, aside from its use as a penetration testing tool.

Some other Wasp resources:

Tags: waspvm 

2016-06-05

Building Static Wasp Lisp Binaries on Linux

Wasp Lisp builds binaries that are linked dynamically to glibc. This ties the binary to specific versions of Linux. It's usually not possible to run on an OS with older glibc versions than what it was compiled against. I wanted to be able to run a single binary of Wasp Lisp and MOSREF drones on new Ubuntu versions and some machines with an older version of Ubuntu. To do this I needed to have the libc linked statically.

Changing Wasp Lisp to statically link glibc doesn't work though. Some networking routines in glibc require dynamic linking. If glibc is statically linked then networking doesn't work.

The solution I opted for is to use musl libc instead of glibc. This is a libc that was designed to be statically linked. Building Wasp Lisp binaries with musl required:

  • Building musl libc
  • Building libevent using musl libc headers
  • Building Wasp Lisp against musl and libevent

Building musl libc

Building musl libc requires using git to clone the repository and following the standard configure, make, make install invocations. The bin directory for the musl tools is added to the PATH:

$ git clone git://git.musl-libc.org/musl
$ cd musl
$ ./configure
$ make
$ sudo make install
$ export PATH=$PATH:/usr/local/musl/bin/

Building libevent

Building libevent with musl requires using the musl-gcc command which was installed by the previous step. This invokes GCC with the required options to use musl. The following steps perform the build:

$ wget https://github.com/libevent/libevent/releases/download/release-2.0.22-stable/libevent-2.0.22-stable.tar.gz
$ tar xvf libevent-2.0.22-stable.tar.gz
$ cd libevent-2.0.22-stable/
$ ./configure --prefix=/tmp/musl/usr CC=musl-gcc --enable-static --disable-shared
$ make
$ make install

Building Wasp Lisp

The Wasp VM source requires a change to the Makefile.cf to use static linking for all libraries. This changes:

EXEFLAGS += -Wl,-Bstatic $(STATICLIBS) -Wl,-Bdynamic $(DYNAMICLIBS)

to:

EXEFLAGS += -static $(STATICLIBS) $(DYNAMICLIBS)

I've made this change in the static branch of my GitHub fork. This branch also includes some other changes from the official repository for real number support. Building with musl and libevent is done with:

$ git clone https://github.com/doublec/WaspVM --branch static
$ cd WaspVM
$ CC=musl-gcc CFLAGS="-I /tmp/musl/usr/include -L /tmp/musl/usr/lib" make repl

This builds Wasp and drops directly into the Lisp REPL. The following confirms a static binary:

$ ldd wasp
not a dynamic executable

Building MOSREF

The stub generated is also static and can be used to build static drones:

$ cd mod
$ ../waspc -exe ../mosref bin/mosref
$ chmod +x ../mosref
$ ../mosref
console> set addr=xx.xx.xx.xx
console> set port=8000
console> drone mydrone foo linux-x86_64
Drone executable created.

The generated drone should run on a wider range of Linux versions than the non-static build at the cost of a larger size. I rename the static waspvm-linux-x86_64 stub to waspvm-musl-x86_64 so I can generate dynamically linked or static drones as needed from the MOSREF console by using linux-x86_64 or musl-x86_64 respectively.

Tags: waspvm 

2016-05-11

Exploring actors in Pony

Pony is an actor oriented programming language. In Pony, actors are objects that can send and receive messages asynchronously while processing their received messages sequentially and in parallel with other actors processing their messages. They are the unit of concurrency in the language. Each actor is similar to a lightweight thread of execution in languages that support those.

For background on the Actor model of computation there are a lot of papers at the erights.org actor page. They make good background reading. In this post I'm going to go through some things I've learnt while learning Pony and using actors in some small projects.

An actor is defined very similarly to a class. The following class definition creates a counter that can be incremented and decremented:

class Counter
  var count: U32

  new create(start: U32) =>
    count = start

  fun ref inc() =>
    count = count + 1

  fun ref dec() =>
    count = count - 1

  fun get(): U32 =>
    count

actor Main
  new create(env: Env) =>
    let c1 = Counter(0)
    c1.inc()
    c1.inc()
    env.out.print(c1.get().string())

The first thing to note here is the actor called Main. Every Pony program has an actor called Main that is the entry point for the program. This actor is instantiated by the Pony runtime and the constructor is expected to perform the program operations in a similar manner to how the main function works in the C programming language.

The Counter class is created in the constructor of Main and incremented a couple of times. All this happens in a single thread of control. Because it operates within a single thread there is no concurrent access to the state held by the counter. This makes it safe to call the inc, dec and get methods. The order of operations is well defined. We can only pass the counter instance to another thread if we give up any aliases to it so we can ensure that it can be safely used elsewhere or if we make it immutable so that nothing can change it at any time.

Behaviours

If we want to use a Counter from multiple threads but still allow modification then making it an actor is an option. This can be done by changing the class keyword to actor and the methods to behaviours:

actor Counter
  var count: U32

  new create(start: U32) =>
    count = start

  be inc() =>
    count = count + 1

  be dec() =>
    count = count - 1

  be display(out:OutStream) =>
    out.print(count.string())

actor Main
  new create(env: Env) =>
    let c1 = Counter(0)
    c1.inc()
    c1.inc()
    c1.display(env.out)

A behaviour is introduced with the be keyword. It is like a function except that it is asynchronous. When a behaviour is called it is not executed immediately.

Internally each actor has a queue for holding messages. Each behaviour call on an actor puts a message in that queue to run that behaviour at some future point in time. The actor runs a message loop that pops a message off the queue and runs the associated behaviour. When the behaviour completes executing, the actor runs the next one in its queue. If there are none left to run then the actor is idle until a behaviour is called. During this idle period it can perform garbage collection. The Pony runtime has a scheduler that uses operating system threads to execute actor behaviours. In this way multiple behaviours for different actors can be running on many OS threads at the same time.

The behaviours that are queued for an individual actor are executed sequentially. Two behaviours for the same actor will never run concurrently. This means that within a behaviour the actor has exclusive access to its internal state. There is no need for locks or guards to control access. For this reason it helps to think of actors as a unit of sequentiality rather than of parallelism. See the actors section of the tutorial for more on this.

The main change with the conversion of the counter class is there is no longer a get method. It's replaced by a display behaviour that outputs the string. get was removed because behaviours are executed asynchronously so they cannot return the result of the function - they've returned to the caller before the body of the behaviour is executed. They always return the object the behaviour was called on. This makes chaining behaviour calls possible:

    let c1 = Counter(0)
    c1.inc()
      .inc()
      .display(env.out)

tag reference capability

A class defaults to a reference capability of ref. An actor defaults to tag. A tag only allows object identification. No read or write operations are allowed but you can alias tag objects and you can pass them to other actors. This is safe since the holder of a tag alias can't view or modify the state. It can call behaviours on it though. This is safe because behaviours are queued for sequential processing at a future point in time - access to the state of the actor is serialized through behaviours.
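As a rough illustration of what a tag alias permits, the following sketch uses a stripped-down counter rather than the full example above and calls a behaviour through a tag reference. The commented-out line is my assumption of the kind of access the compiler rejects, since reading a field requires more than identity:

actor Counter
  var count: U32 = 0

  be inc() =>
    count = count + 1

actor Main
  new create(env: Env) =>
    let c: Counter tag = Counter
    // Sending a message only needs the identity of the actor.
    c.inc()
    // Reading state through a tag is rejected by the compiler:
    // env.out.print(c.count.string())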

Simulating return values

How do you deal with returning values from behaviours if they don't support return values? One approach is to pass an object to the behaviour that it uses as a callback with the result. For the counter example this could look like:

actor Counter
  ...as before...

  be get(cb: {(U32)} iso) =>
    cb(count)

actor Main
  new create(env: Env) =>
    let c1 = Counter(0)
    c1.inc()
      .inc()
      .get(recover lambda (x:U32)(env) => env.out.print(x.string()) end end)

Here the get behaviour receives a closure as an argument. Main calls it with a closure that prints the value out. When get is executed asynchronously it's safe for it to pass the count value to the closure. The closure can't modify it. The closure itself has an iso reference capability so nothing else but the behaviour is accessing it.

This approach leads to a very 'callback' style of programming. It can feel like programming in continuation passing style at times. It requires careful design when dealing with error handling. Pony includes a promises library to help manage this.

Promises

The promises library provides the ability to pass callbacks, handle errors and chain promises together to make it easier to manage callback style programming. The counter example converted to use promises looks like:

use "promises"

actor Counter
  ...as before...

  be get(p: Promise[U32]) =>
    p(count) 

actor Main
  new create(env: Env) =>
    let c1 = Counter(0)
    let p = Promise[U32]
    c1.inc()
      .inc()
      .get(p)
    p.next[None](recover lambda ref(x:U32)(env) => env.out.print(x.string()) end end)

The get method has been changed to take a Promise[U32]. The Promise type is a generic type and here it is indexed over the U32 value that it will be provided with. In the Main actor a promise is created and passed to get. Then the next method is called on the promise to tell it what to do when a value is provided to it. In this case it's the same closure as in the previous example so there's not much of a win here.

What promises do provide though is a way to handle failure. A callback used in the promise can raise an error and the promise will try the next operation in the chain. Chained promises can manipulate values as they're passed down the chain to form a pipeline of operations.

The boilerplate to create the promise and pass it to the behaviour can be hidden by a method on the actor:

actor Counter
  ...as before...

  be myget(p: Promise[U32]) =>
    p(count)

  fun tag get(): Promise[U32] =>
    let p = Promise[U32]
    myget(p)
    p

actor Main
  new create(env: Env) =>
    let c1 = Counter(0)
    c1.inc()
      .inc()
      .get().next[None](recover lambda ref(x:U32)(env) => env.out.print(x.string()) end end)

In this example the get method creates the promise and passes it to the behaviour then returns the promise. The caller can then use method chaining to call next on the promise to perform the action.

Notice that the get method has a tag reference capability. This is required to allow other actors to call it. A reference to an actor has the tag capability so only behaviours and tag methods can be called with it. A tag method can't modify internal state - all it can do is call behaviours on the actor - so this is safe to be called externally. It would be a compile error if the method attempted to view or modify actor state.

The following demonstrates promise chaining:

actor Counter
  ...as before...

  fun tag get_string(): Promise[String] =>
    get().next[String](object iso
                         fun ref apply(x:U32): String => x.string()
                       end)

actor Main
  new create(env: Env) =>
    let c1 = Counter(0)
    c1.inc()
      .inc()
      .get_string().next[Main](recover this~print(env.out) end)

  be print(out:OutStream, s: String) =>
    out.print(s)

In this case we want a String from the behaviour call. The get_string method on Counter calls get and chains the next callback to be one that returns a result of type String. It just does a conversion by calling the string method. I use an object literal here instead of a closure for clarity.

The caller in Main calls get_string and chains the returned promise with another callback. This callback uses partial application to call the print behaviour on Main to print the string. The next call uses Main to parameterize the promise result as calling the print behaviour returns the receiver - in this case Main.

The result of this is that when the get behaviour is executed it calls the first promise in the chain to return the result. That converts the U32 to a String. The next promise in the chain is then called which calls print on the Main actor. That behaviour gets queued and eventually run to output the result.

Which is best to use, promises or callbacks? It depends on what the objects are doing. For single return values with an error case, promises are a good approach. For objects that need to call back multiple times, a callback or notifier object may be a better choice. For an example of the latter, see the net package's use of various notifier classes like TCPConnectionNotify to provide notification of different states in the TCP connection lifetime:

interface TCPConnectionNotify
  fun ref accepted(conn: TCPConnection ref)
  fun ref connecting(conn: TCPConnection ref, count: U32)
  fun ref connected(conn: TCPConnection ref)
  fun ref connect_failed(conn: TCPConnection ref)
  fun ref auth_failed(conn: TCPConnection ref)
  fun ref sent(conn: TCPConnection ref, data: ByteSeq): ByteSeq ?
  fun ref sentv(conn: TCPConnection ref, data: ByteSeqIter): ByteSeqIter ?
  fun ref received(conn: TCPConnection ref, data: Array[U8] iso)
  fun ref expect(conn: TCPConnection ref, qty: USize): USize
  fun ref closed(conn: TCPConnection ref)

Sendable objects

As behaviours are sent asynchronously, the arguments to those behaviours must be sendable. The passing and sharing section of the tutorial makes the distinction between 'passing' and 'sharing' objects.

In 'passing' an object from one actor to another the originating actor is giving up ownership. It can no longer access the object after giving it to the receiving actor. This is the iso reference capability. The sending actor must consume it when passing it to the receiver:

actor Main
  new create(env: Env) =>
    let a = recover iso String end
    let b = Something
    b.doit(consume a)

actor Something
  be doit(s: String iso) =>
    s.append("Hello World")

In 'sharing' an object you want both the originating actor and the receiver (and any others) to be able to read from the object. Nothing should be able to write to it. This is the val reference capability:

class Data
  var count: U32 = 0

  fun ref inc() =>
    count = count + 1

actor Main
  new create(env: Env) =>
    let a: Data trn = recover trn Data end
    a.inc()

    let d: Data val = consume a
    let s1 = Something(env.out)
    let s2 = Something(env.out)
    s1.doit(d)
    s2.doit(d)

actor Something
  let _out: OutStream

  new create(out: OutStream) =>
    _out = out

  be doit(d: Data val) =>
    _out.print("Got " + d.count.string())

This example has a Data class with an integer count field. In the Main actor we create an instance as a trn reference capability. This is used for objects that you want to write to initially but give out immutable access to later. While we hold the mutable trn reference we increment it and then consume it to get an immutable val reference capability for it. The old a alias is no longer usable at this point - no writeable aliases to the object exist. Because it is immutable we can now pass it to as many actors as we want and they get read only access to the object's fields and methods.

Another sharable type is the tag reference capability. This provides only identity access to an object. A receiver of a tag object can't read or write fields but it can call behaviours. It is the reference capability used for actors and is what you use to pass actors around. The previous sharing example uses this to pass the env.out object around. The OutStream is an actor.

It's important to keep in mind that creating aliases of objects doesn't copy the object. It's a new variable pointing to the same object. There is no copy operation involved in passing the Data val objects around. Although the capability is called 'val' it is not a 'value object'. The two Something actors have the same Data object in terms of object identity. The val only means 'immutable'.
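A small sketch of this, using Pony's is operator for identity comparison. The empty Thing class is just a placeholder for the example:

class Thing

actor Main
  new create(env: Env) =>
    let d1: Thing val = recover val Thing end
    // d2 is a second alias to the same object, not a copy.
    let d2: Thing val = d1
    // Identity comparison: both names refer to one object.
    env.out.print((d1 is d2).string())

This should print true, since the assignment creates an alias rather than a copy.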

The reference capabilities and the checking by the type system is what allows avoiding copies to be safe in the presence of multiple actors. This does mean that if you have a ref object you can't pass it to an actor. This is a compile error:

actor Something
  be doit(s: String ref) =>
    None

Only val, iso and tag can be used as an argument to a behaviour. This will also fail to compile:

actor Main
  new create(env: Env) =>
    let a = recover ref String end
    a.append("Hello World")
    let b = Something
    b.doit(a)

actor Something
  be doit(s: String val) =>
    None

Here we have a String ref and are trying to pass it to a behaviour expecting a String val. It is not possible to convert a ref to a val. A ref provides read and write access to the object. Multiple aliases to the same ref object can exist within a single actor. This is safe because behaviour execution within an actor is sequential. All of these aliases would need to be consumed to safely get a val alias. The type system doesn't (and probably couldn't) prove that all aliases are consumed at the time of converting to a val so it is not possible to do the conversion.

This is a case where a copy is needed:

actor Main
  new create(env: Env) =>
    let a = recover ref String end
    a.append("Hello World")
    let b = Something
    b.doit(a.clone())

The clone method on String returns a String iso^ which is automatically convertible to a String val by virtue of the fact that it has no aliases (the ^ part of the type). See capability subtyping for details.

Cloning creates a copy distinct from the original. The two have different identities, and making the copy is a less efficient operation than sharing. It's worthwhile examining the data being passed around to see if it's possible to avoid holding multiple references to it and to use the strictest reference capability to avoid aliasing.
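One way to avoid the copy, when the string can be built in one place before it is shared, is to construct it inside a recover block so that it is a val from the start. This is a sketch under that assumption, reusing the Something actor from above:

actor Main
  new create(env: Env) =>
    // s is a String ref inside the block, so it can be mutated.
    // No alias to s escapes the block, so the result can be lifted to val.
    let a: String val = recover val
      let s = String
      s.append("Hello World")
      s
    end
    let b = Something
    b.doit(a)

actor Something
  be doit(s: String val) =>
    None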

Blocking operations

Pony has no blocking operations (outside of using the C FFI). In languages like Erlang it's common to do a blocking receive within a function to wait for a message and operate on it. In Pony this is implicitly done by actors in their event loop, hidden from the programmer. A behaviour call queues the message and it is executed when it's popped off the queue. You can't block for a message within the body of a behaviour itself.

This results in having to change the programming mode from "wait for data and do something" to "notify me of data when it's available". Instead of blocking for N seconds within a behaviour you create a timer to notify the actor of something 'N' seconds later.

The Pony standard library is structured in this way to use notifier objects, callbacks and promises to make programming in this style easier.
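As a sketch of the timer approach, the following uses the Timers actor and the TimerNotify interface from the standard library's time package. The Notify class name and the five second delay are made up for the example, and the exact signatures may differ between Pony versions:

use "time"

class Notify is TimerNotify
  let _out: OutStream

  new iso create(out: OutStream) =>
    _out = out

  // Called by the Timers actor when the timer expires.
  fun ref apply(timer: Timer, count: U64): Bool =>
    _out.print("5 seconds have passed")
    // Returning false means the timer is not rescheduled.
    false

actor Main
  new create(env: Env) =>
    let timers = Timers
    // The expiration is given in nanoseconds.
    let timer = Timer(Notify(env.out), 5_000_000_000)
    timers(consume timer)

The constructor of Main returns straight away. Nothing blocks while waiting for the timer - the notifier is simply called later.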

Causal Messaging

Data races can be difficult to avoid in the presence of asynchronous execution of threads. Pony has a message ordering guarantee to make the following type of code safe:

actor Counter
  let _out: OutStream
  var count: U32 = 0

  new create(out: OutStream) =>
    _out = out

  be inc() =>
    count = count + 1

  be dec() =>
    count = count - 1
    if count == 0 then _out.print("counter is destroyed") end

actor Something
  be doit(counter: Counter) =>
    counter.dec()

actor Main
  new create(env: Env) =>
    let c1 = Counter(env.out)
    let c2 = Something
    c1.inc()
    c2.doit(c1)

In this example the Counter object does something when the count is decremented to zero. In the Main actor a counter is created, incremented and passed to another actor where it is decremented. The inc and doit calls are on different actors and therefore are executed asynchronously. It's important that the inc call executes before the dec call in the doit behaviour of the other actor.

Pony has a message ordering guarantee, called 'causal messaging', to ensure this ordering happens. This is described in a forum thread discussion as:

[...] Pony makes a messaging order guarantee that's much stronger than is typical for the actor model. It guarantees causal messaging. That is, any message that is a "cause" of another message (i.e. was sent or received by an actor prior to the message in question) is guaranteed to arrive before the "effect" if they have the same destination.

For more information the paper Fully Concurrent Garbage Collection of Actors on Many-Core Machines goes into detail about how it works.

Garbage Collection

Actors have their own garbage collection heap. Garbage collection occurs between behaviour calls on the actor. This allows GC to occur for an actor without interrupting execution of other actors. The runtime detects when it is no longer possible for an actor to receive messages and will garbage collect the actor itself. This can avoid the need to implement a 'poison pill' protocol whereby the actor receives a message to say it can terminate.

Even with this automatic actor garbage detection in place there are times when it is necessary to implement a shutdown protocol. An actor may be receiving notification callbacks - the actor sending the callbacks needs to be told to stop sending the messages so the system can detect that the receiver can be garbage collected. In my IMAP Idle monitor I use dispose methods to cancel timers or close TCP connections. A Pony library class called a Custodian holds a collection of actors to be disposed and calls dispose on each one when its own dispose behaviour is called. This results in the runtime detecting that none of the actors in the application can receive messages anymore and the entire application terminates.

One thing to be careful of with garbage collection only happening between behaviour calls is that a long running behaviour will not GC during its execution. Simple benchmark applications that do everything in the Main actor's constructor exhibit this. They use large amounts of memory due to no GC happening if the benchmark doesn't call a behaviour on the actor.
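A possible way around this, sketched here with a made-up step behaviour, is to break the long running loop into a chain of behaviour calls so that the runtime gets a chance to collect garbage between iterations. Whether this helps in practice depends on the workload:

actor Main
  new create(env: Env) =>
    // Start the loop and return immediately.
    step(0)

  be step(i: U64) =>
    // Stand-in for work that allocates: a temporary buffer per iteration.
    let scratch = String(1024)
    scratch.append("some temporary data")
    // Queue the next iteration as a new message to this actor.
    // GC for the actor can run between behaviour invocations.
    if i < 1_000_000 then
      step(i + 1)
    end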

Tags: pony 

2016-05-04

Bang, Hat and Arrow in Pony

If you've looked at Pony programming language code you've probably seen use of punctuation symbols in places and wondered what they meant. This post is an attempt to explain three of those - the bang, hat and arrow (!, ^ and -> respectively). Note that this is my understanding based on usage, reading the tutorials and watching videos so there may be errors. I welcome corrections!

Bang

The bang symbol (otherwise known as an exclamation mark) combined with a type name can be thought of as the type of an alias of the given type. Having an alias of an object means having another reference to that object. So an alias to a String iso is of type String iso!. This matters mostly in generic code which will be explained later but it does come up in error messages.

If you see ! in an error message like "iso! is not a subtype of iso" this means you are probably trying to assign an object that cannot be aliased without first consuming it.

If you see ! in a type declaration in code like "let foo: A!" then you can read this as "replace A! with a type that can safely hold an alias to A". If A is a String iso then A! would be a String tag, for example (following the rules for aliased substitution).
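A small sketch of the difference between aliasing and consuming, with variable names chosen just for the example:

actor Main
  new create(env: Env) =>
    let a: String iso = recover iso String end
    // An alias of an iso can only be held as a tag.
    let b: String tag = a
    // This would be an error, as iso! is not a subtype of iso:
    // let c: String iso = a
    // Consuming a gives up the original reference, so this is fine.
    let d: String iso = consume a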

Bang in errors

The following code demonstrates something that is often encountered by first time Pony users:

class Something

actor Main
  new create(env: Env) =>
    let a = recover iso Something end
    Foo(a)

actor Foo
  new create(s: Something iso) =>
    None

Here we have a class called Something. A new instance of it is created in the Main actor with reference capability iso. A new Foo actor is created passing this instance to it. This will fail to compile as we are aliasing the Something object held in a. a holds a reference to it and the variable s holding the argument to the Foo constructor is holding a reference to it at the same time. Objects with a reference capability of iso cannot have more than one reference to them. The error from the compiler will look like:

Error:
e1/main.pony:6:9: argument not a subtype of parameter
    Foo(a)
        ^
    Info:
    e1/main.pony:9:14: parameter type: Something iso
      new create(s: Something iso) =>
                 ^
    e1/main.pony:6:9: argument type: Something iso!
        Foo(a)
            ^
    e1/main.pony:1:1: Something iso! is not a subtype of Something iso: iso! is not a subtype of iso

This error states that the expected type of the parameter for the Foo constructor is of type Something iso but the type that we passed is a Something iso!. It further explains things by noting that Something iso! is not a subtype of Something iso because iso! is not a subtype of iso.

Armed with the knowledge that the bang symbol means the type for an alias this can be read as the argument passed was an alias to a Something iso. This is an error as iso cannot be aliased - this is what iso! is not a subtype of iso means. The subtyping relationship for aliases is outlined in the Capability Subtyping section of the tutorial.

The code can be fixed by consuming a so it is no longer aliased:

let a = recover iso Something end
Foo(consume a)

Bang in generics

The other place where you'll see the alias type is in generic code. The following non-generic code compiles fine:

class Something
  let a: U8

  new create(x: U8) =>
    a = x

actor Main
  new create(env: Env) =>
    let aint = Something(42)

U8 defaults to val reference capability which can be aliased. This allows the assignment to the field a in Something which is aliasing the x object. If we make this a generic so that any type can be used then it fails to compile:

class Something[A]
  let a: A

  new create(x: A) =>
    a = x

actor Main
  new create(env: Env) =>
    let aint = Something[U8](42)

The error is:

Error:
e3/main.pony:5:7: right side must be a subtype of left side
    a = x
      ^
    Info:
    e3/main.pony:4:17: right side type: A #any !
      new create(x: A) =>
                    ^
    e3/main.pony:5:5: left side type: A #any
        a = x
        ^
    e3/main.pony:4:17: A #any ! is not a subtype of A #any: the subtype has no constraint
      new create(x: A) =>
                ^

For now, ignore the #any in the error message. I'll expand on this later but it's informing us that the type A is unconstrained and can have any reference capability.

The error states that x is an A! but a is an A and A! is not a subtype of A so the assignment cannot happen.

This occurs because A is unconstrained. It can be any reference capability. Therefore the code must be able to be compiled under the assumption that the most restrictive reference capability can be used. It works fine with val, which can be aliased, but not with iso which cannot. Therefore the generic code cannot be compiled. You can see how iso would fail by expanding a version using String iso:

class Something
  let a: String  iso

  new create(x: String iso) =>
    a = x

actor Main
  new create(env: Env) =>
    let aint = Something(recover iso String end)

The error is:

Error:
e5/main.pony:5:7: right side must be a subtype of left side
    a = x
      ^
    Info:
    e5/main.pony:4:17: right side type: String iso!
      new create(x: String iso) =>
                    ^
    e5/main.pony:2:10: left side type: String iso
      let a: String  iso
             ^
    e5/main.pony:4:17: String iso! is not a subtype of String iso: iso! is not a subtype of iso
      new create(x: String iso) =>
                    ^

This is the same error that the generic code is giving us. The generic code can be fixed in a few ways. The first is to constrain the type so that it is a specific reference capability that works. Here it is changed to val:

class Something[A: Any val]
  let a: A

  new create(x: A) =>
    a = x

actor Main
  new create(env: Env) =>
    let aint = Something[U8](42)

The A: Any val syntax constrains the type parameter to be a subtype of the type after the :. In this case, any type with a reference capability of val. This won't work if you want to be able to use any aliasable type (eg ref as well as val):

class Something[A: Any val]
  let a: A

  new create(x: A) =>
    a = x

actor Main
  new create(env: Env) =>
    let aint = Something[U8](42)
    let bint = Something[String ref](recover ref String end)

The error here is obvious in that we are trying to pass a ref parameter to a function expecting a val. Pony generics solve this by allowing code to be polymorphic over the reference capability. There are specific annotations for classes of reference capabilities. They are:

#read  = { ref, val, box }                = Anything you can read from
#send  = { iso, val, tag }                = Anything you can send to an actor
#share = { val, tag }                     = Anything you can send to more than one actor
#any   = { iso, trn, ref, val, box, tag } = Default of a constraint
#alias = { ref, val, box, tag }           = Set of capabilities that alias as themselves (used by compiler)

A version that will work for ref, val and box becomes:

class Something[A: Any #read]
  let a: A

  new create(x: A) =>
    a = x

actor Main
  new create(env: Env) =>
    let aint = Something[U8](42)
    let bint = Something[String ref](recover ref String end)

But what if you want it to work with non-aliasable types like iso? A solution is to consume the parameter:

class Something[A]
  let a: A

  new create(x: A) =>
    a = consume x

actor Main
  new create(env: Env) =>
    let aint = Something[U8](42)
    let bint = Something[String ref](recover ref String end)
    let cint = Something[String iso](recover iso String end)

Another solution is to declare the field type to be A! instead of A. In the String iso case using A means String iso which cannot hold an alias. Using A! means String iso! which should be read as "a type that can safely alias a String iso". Looking at the Aliased substitution table this is a tag:

class Something[A]
  let a: A!

  new create(x: A) =>
    a = x

actor Main
  new create(env: Env) =>
    let aint = Something[U8](42)
    let bint = Something[String ref](recover ref String end)
    let cint = Something[String iso](recover iso String end)

In this case we are using ! to tell the compiler to use a reference capability that works for whatever the type of A is. An iso becomes a tag, a trn becomes a box, a ref stays a ref, etc.

Hat

The hat symbol (or ^) denotes an ephemeral type. It's the type of an object that is not assigned to a variable. consume x is used to prevent aliasing of x but at the point of being consumed and before it is assigned to anything else, what type is it? If x is type A then the type of consume x is A^. Constructors always return an ephemeral type as they create objects and return them but they aren't yet assigned to anything.

The following example creates a Box type that acts like a single-element array of String iso objects. A value can be stored and updated. A utility function Foo.doit takes a String iso as an argument. It's stubbed out since it doesn't need to do anything for the example. The main code creates a Box, updates it, and calls the utility function on it.

class Box
  var a: String iso

  new create(x: String iso) =>
    a = consume x

  fun ref update(x: String iso): String iso =>
    let b = a = consume x
    consume b

primitive Foo
  fun doit(s: String iso) =>
    None

actor Main
  new create(env: Env) =>
    let a = Box(recover iso String end)
    let b = a.update(recover iso String end)
    Foo.doit(consume b)

Some things to note based on prior discussion. The create method consumes the argument to prevent aliasing. The update function also consumes the argument to prevent aliasing. It uses the destructive read syntax to assign the argument x to the field a and assign the old value of a to b to avoid aliasing. Unfortunately this example fails to compile:

Error:
f3/main.pony:19:14: argument not a subtype of parameter
    Foo.doit(consume b)
             ^
    Info:
    f3/main.pony:12:12: parameter type: String iso
      fun doit(s: String iso) =>
               ^
    f3/main.pony:19:14: argument type: String iso!
        Foo.doit(consume b)
                 ^
    f3/main.pony:7:34: String iso! is not a subtype of String iso: iso! is not a subtype of iso
      fun ref update(x: String iso): String iso =>
                                     ^

From the discussion previously on ! this error tells us that we are aliasing b. We can narrow it down by explicitly declaring the type of b:

let b: String iso = a.update(recover iso String end)

The error is due to update returning a String iso. We are consuming b and returning it as a String iso which then gets aliased when assigned to b in the main routine. Changing the return type to use hat resolves the issue. consume b returns the ephemeral type which is an object with no variable referencing it. It is safe to assign to a String iso so the change compiles:

class Box
  var a: String iso

  new create(x: String iso) =>
    a = consume x

  fun ref update(x: String iso): String iso^ =>
    let b = a = consume x
    consume b

primitive Foo
  fun doit(s: String iso) =>
    None

actor Main
  new create(env: Env) =>
    let a = Box(recover iso String end)
    let b = a.update(recover iso String end)
    Foo.doit(consume b)

Another approach would be to return a String iso but change doit to take a String tag (the type that can alias a String iso). This compiles but because doit now takes String tag it is limited in what it can do with the string. The approach of using an ephemeral type allows obtaining the mutable object from the Box.

Hat in parameters

Sometimes you'll see hat in parameter lists. The Array builtin has an init constructor that looks like:

new init(from: A^, len: USize)

This initializes an array so that all elements are the from value. To explore how this works, here's a smaller example that does something similar:

class Box[A]
  var a: A
  var b: A

  new create(x: A^) =>
    a = x
    b = x

Without the hat in A^ there is an error due to aliasing. We can't assign x to both a and b in case A is an iso. With the hat it compiles. This is because an ephemeral reference capability is a way of saying "a reference capability that, when aliased, results in the base reference capability". So a String iso^ can be assigned to a String iso, a String ref^ can be assigned to a String ref, etc. This means the generic class itself compiles. Using it for a String iso will fail due to aliasing, but it can be used for other reference capability types. Compare this to a plain x: A where the generic class itself won't compile since a String iso can't be assigned to another String iso due to aliasing.

Code demonstrating this is:

actor Main
  new create(env: Env) =>
    let a = Box[String iso](recover iso String end)
    let c = Box[String ref](recover ref String end)
    let e = Box[String val](recover val String end)

Arrow

The arrow syntax (or ->) is known as viewpoint adapter types and is related to viewpoint adaption.

Arrow in error messages

Viewpoint adaption defines what the reference capability of a field looks like to some caller based on the reference capability of the object the field is being read from. This is important to maintain the reference capability guarantees. A val object should not be able to access an iso field as iso or it breaks the constraints of val - it should be immutable but obtaining it as iso allows mutation of the field. There is a table in viewpoint adaption that shows what the mapping is.

An example of an error that can occur by ignoring viewpoint adaption is in the following code:

class Something
  var a: String iso

  new create() =>
    a = recover iso String end

  fun doit(s: String) =>
    a.append(s) 

actor Main
  new create(env: Env) =>
    let a = Something
    a.doit("hello")

The error here is calling append on the a field in the doit method. By default methods have a receiver reference capability of box. Anything that happens inside the method cannot affect the state of the object. This is why you see methods that modify object fields start with fun ref - it's to change the receiver reference capability to something mutable. Even though the field a is iso and therefore mutable, because we are inside a box method it appears as a non-mutable reference capability. The viewpoint adaption table shows that a box origin with an iso field gives a tag type. So a looks like a String tag within the method. The compiler gives:

Error:
v/main.pony:8:13: receiver type is not a subtype of target type
    a.append(s) 
            ^
    Info:
    v/main.pony:8:5: receiver type: this->String iso!
        a.append(s) 
        ^
    ny/ponyc/packages/builtin/string.pony:622:3: target type: String ref
      fun ref append(seq: ReadSeq[U8], offset: USize = 0, len: USize = -1)
      ^
    v/main.pony:2:10: String tag is not a subtype of String ref: tag is not a subtype of ref
      var a: String iso
             ^

The 'receiver type' of this->String iso! is an example of arrow usage. It describes an object of type String iso! (an alias to a String iso) as seen by an origin of this. The reference capability of this in a method is the receiver reference capability of that method - in this case box. So this->String iso! is String tag. That's why the last line of the error description refers to String tag.

The solution here is to change the reference capability for the method to something that allows mutation:

fun ref doit(s: String) =>
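For reference, this is the earlier example with only that one change applied. It is a sketch that assumes append does not return a non-sendable value, as in the signature shown in the error message, so that the call is allowed on the iso field:

```pony
class Something
  var a: String iso

  new create() =>
    a = recover iso String end

  // The receiver is now ref, so this->String iso is String iso
  // rather than String tag, and append can be called on the field.
  fun ref doit(s: String) =>
    a.append(s)

actor Main
  new create(env: Env) =>
    let a = Something
    a.doit("hello")
```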

Arrow in type declarations

When writing generic code it's sometimes required to be explicit in what viewpoint adaption to use for generic types. Returning to the Box example used previously we'll make it generic and make it usable for any reference capability:

class Box[A]
  var a: A

  new create(x: A) =>
    a = consume x

  fun apply(): this->A! =>
    a

  fun ref update(x: A): A^ =>
    let b = a = consume x
    consume b

  fun clone(): Box[this->A!] =>
    Box[this->A!](a)

Notice the use of this->A! in the return type of apply. We want to return what is held in the Box. If it is a Box[String val] val then we can return a String val since it is immutable and the box is immutable. If it is a Box[String ref] val we still want to return a String val, not a String ref. The latter would allow modifying an immutable box. If it's a Box[String ref] ref then it's safe to return a String ref. This is what the arrow type handles for us. The this refers to the reference capability of the object. The A! refers to the field type - note that it is being aliased here so we want a type that can hold an alias to an A. The viewpoint adaption gives the resulting reference capability of the type.

Looking up the table of viewpoint adaption gives:

Box[String val] val => val->val => val => String val
Box[String ref] val => val->ref => val => String val
Box[String ref] ref => ref->ref => ref => String ref
Box[String iso] ref => ref->iso => iso => String iso
Box[String ref] iso => iso->ref => tag => String tag

That last one is interesting in that Box[String ref] iso says that only one reference to the Box can exist. If we allowed a String ref to be obtained from it then this condition would break, since the string could be modified through both the original reference to the box and the returned reference. This is why the viewpoint adaption gives a String tag. A tag only allows identity operations so it's safe to have this type of alias of an iso.

Note that the table above gives the mapping for this->A. Because the return type is this->A! it has to be a type that can hold an alias to the type given by the table. So we have another mapping:

String val! => String val
String ref! => String ref
String iso! => String tag

In this way a Box[String iso] ref will give out a String tag - the only safe way of aliasing the original string in the box.
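One way to check these mappings is to annotate the result of apply with the expected type and let the compiler confirm it. The sketch below repeats a cut-down Box with only create and apply so that it stands alone. The variable names are hypothetical, and the annotations follow the tables above:

```pony
class Box[A]
  var a: A

  new create(x: A) =>
    a = consume x

  fun apply(): this->A! =>
    a

actor Main
  new create(env: Env) =>
    // Box[String ref] ref: ref->ref is ref, so apply gives a String ref.
    let mutable_box: Box[String ref] ref =
      Box[String ref](recover ref String end)
    let s1: String ref = mutable_box()
    s1.append("hello")

    // Box[String ref] val: val->ref is val, so apply gives a String val.
    let frozen_box: Box[String ref] val =
      recover val Box[String ref](recover ref String end) end
    let s2: String val = frozen_box()
    env.out.print(s2)

    // Box[String iso] ref: ref->iso is iso, and an alias of an iso is a tag.
    let iso_box: Box[String iso] ref =
      Box[String iso](recover iso String end)
    let s3: String tag = iso_box()
```

Changing one of the annotations, for example declaring s2 as String ref, is a quick way to see the compiler report the adapted type.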

The other use of an arrow type in this example is in the clone function. This must do a shallow copy of the object. It returns a new Box holding a reference to the same value. Because we need to alias the value the same constraints as described for the apply method exist. We want to return a Box[this->A!] to ensure the value object for that box instance is a safe alias to the original. For a Box[String iso] ref this returns a Box[String tag] for example.

The following code can be used with the Box class above to test it:

primitive Foo
  fun doit(s: String tag) =>
    None

actor Main
  new create(env: Env) =>
    let a = Box[String iso](recover iso String end)
    let b = a.clone()
    Foo.doit(b())

    let c = Box[String ref](recover ref String end)
    let d = c.clone()
    Foo.doit(d())

    let e = Box[String val](recover val String end)
    let f = e.clone()
    Foo.doit(f())

A->B arrows

Arrow types don't need to always use this on the receiver side. They can use an explicit reference capability like box->A or they can use another parameterized type. Examples of this are in some of the library code:

class ArrayValues[A, B: Array[A] #read] is Iterator[B->A]

An ArrayValues is returned by the values method on Array. It's an iterator over the objects in the array. The B->A syntax means that the type of the generic argument to Iterator is of type "A as seen by B" using viewpoint adaption. It's not an iterator over A, it's an iterator over "A as seen by B". This allows iteration over arrays whether they are val or ref and produces a compatible type for the Iterator that works with both.
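A minimal sketch of what this means in practice, assuming the values method on Array behaves as described above. The array names are hypothetical. The same element type, String ref, is seen differently depending on the reference capability of the array being iterated:

```pony
actor Main
  new create(env: Env) =>
    // Array[String ref] ref: the iterator yields ref->String ref, a String ref.
    let mutable: Array[String ref] ref = Array[String ref]
    mutable.push(recover ref String end)
    for s in mutable.values() do
      s.append("x") // allowed because s is a String ref
    end

    // Array[String ref] val: the iterator yields val->String ref, a String val.
    let frozen: Array[String ref] val =
      recover val
        let tmp = Array[String ref]
        tmp.push(recover ref String end)
        tmp
      end
    for s in frozen.values() do
      env.out.print(s) // s is a String val, so it can be printed directly
    end
```

Calling s.append inside the second loop should fail to compile, because the elements of an immutable array are seen as immutable.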

Conclusion

Most of the functionality described here is among the more esoteric parts of Pony. You mainly run into it when using generics. The best current reference for generics is a video by Sylvan Clebsch for the virtual Pony users group - Writing Generic Code.

A good way to learn is to try some of the examples in this post and play around with them. Try aliasing, using different types, different reference capabilities and see what happens. The Pony library code, Array.pony for example, is a useful reference.

Tags: pony 



