Bluish Coder

Programming Languages, Martials Arts and Computers. The Weblog of Chris Double.


2016-07-14

Concurrency in Wasp Lisp

Wasp Lisp has a light weight co-operative threading model that's allows programming in an Actor style. It's possible to serialize Wasp values and send them to other processes and machines to be deserialized and run. MOSREF uses this to compile Lisp code on the console process and send the bytecode to drone processes to execute. This allows drones to operate without the Lisp compiler present.

Spawning threads

Threads are created using the spawn function. It takes the function to run as a thread as an argument:

(spawn (lambda () (print "Hello World\n")))

Communication between threads is done using queues. A queue is an unbounded channel that can have many senders but only one receiver. The function send adds data to the queue and wait receives data. If there is no data in the queue then wait blocks. Input/Output in Wasp Lisp is done using the same wait/send mechanism making it easy to pipeline data from console and file output to sockets.

Implementing Actors

A basic Actor can be implemented like the following:

(define (actor1)
  (define counter 0)
  (define chan (make-queue))

  (define (loop)
    (define msg (wait chan))
    (cond
      ((eq? msg 'inc)
        (set! counter (+ 1 counter)))
      ((eq? msg 'dec)
        (set! counter (- 1 counter)))
      ((and (list? msg) (eq? (car msg) 'get))
       (send counter (cadr msg))))
    (loop))

  (spawn loop)
  chan)

actor1 is a function that contains a counter holding an numeric value. It creates chan, a queue for holding messages, spawns a thread to run loop and returns the chan so messages can be queued for loop to process.

loop waits for a message on chan. This is a blocking call and the thread will go idle until a message is queued. It processes the message, incrementing or decrementing the counter as requested. An additional message, get, can be used to get the value of the counter. That message also includes a channel object to place the result in. loop recursively calls itself to continue.

A sample interaction is:

>> (define a1 (actor1))
>> (define result (make-queue))
>> (send (list 'get result) a1)
>> (wait result)
:: 0
>> (send 'inc a1)
>> (send (list 'get result) a1)
>> (wait result)
:: 1

This creates an actor and a queue to receive results. It asks for the current value of the actor, increments it, then asks again.

Updating an Actor

It's possible to update the code for an Actor without stopping the application. Running in a Lisp REPL means you can change functions on the fly but you can't change the internal implementation of a running loop from the REPL if that loop is internal to a function. A way around this is to provide the Actor with the means to receive a function as a message that performs the update. Here is an example of an updatable actor:

(define (actor3)
  (define counter 0)
  (define chan (make-queue))

  (define (loop chan)
    (define msg (wait chan))
    (cond
      ((eq? msg 'inc)
        (set! counter (+ 1 counter)))
      ((eq? msg 'dec)
        (set! counter (- 1 counter)))
      ((and (list? msg) (eq? (car msg) 'get))
       (send counter (cadr msg)))
      ((function? msg)
       (return ((msg counter) chan))))
    (loop chan))

  (spawn loop chan)
  chan)

This code contains an additional branch in the cond to check if the message is a function. If it is then that function is called passing the current value of the counter. It is expected to return a function which will be the new loop to call. This can contain any code and effectively updates the entire actor with new functionality. An example update function to change the messages to increment/decrement by two is:

(define (update oldstate)
  (define counter (* oldstate 2))
  (define (loop chan)
    (define msg (wait chan))
    (cond
      ((eq? msg 'inc)
        (set! counter (+ 2 counter)))
      ((eq? msg 'dec)
        (set! counter (- 2 counter)))
      ((and (list? msg) (eq? (car msg) 'get))
       (send counter (cadr msg)))
      ((function? msg)
       (return ((msg counter) chan))))
   (loop chan))
  loop)

An example interaction of the actor and upgrading it is:

>> (define a3 (actor3))
>> (define result (make-queue))
>> (send 'inc a3)
>> (send (list 'get result) a3)
>> (wait result)
:: 1
>> (send update a3)   ;; Updating the actor here
>> (send (list 'get result) a3)
>> (wait result)
:: 2                  ;; This shows the new counter value that 'update' changed
>> (send 'inc a3)
>> (send (list 'get result) a3)
>> (wait result)
:: 4                  ;; Amount is now incrementing by two
>> (send 'inc a3)
>> (send (list 'get result) a3)
>> (wait result)
:: 6

This is a variant of Joe Armstrong's Erlang Universal Server allowing a server to be updated to do anything.

Filters

An idiom when programming in an Actor or coroutine style is to write small processes that take an input, modify it in some way, and send it to another process to do something else. A program becomes a chain or pipeline of these individual processes. Wasp Lisp calls these small units of functionality filters. They are described in filter.ms as:

A process that waits for data from an input channel, and sends data to an output channel. Filters are constructed using a constructor function, then wired together using either the input-chain or output-chain functions."

This is an example of a line filter from the Wasp source code;

(define-filter (line-filter)
  (define buf (make-string 80)) 

  (define (parse)
    (forever
      (define next (string-read-line! buf))
      (if next (send next out)
               (return))))

  (define (line-loop)
    (forever
      (define next (wait-input in))
      (cond 
        ((string? next)
         (string-append! buf next)
         (parse))
        ((eq? next 'close)
         (return))
        (else
          (send-output next out)))))

  (line-loop)

  (send-output buf out)
  (send-output 'close out))

A line-filter receives strings of bytes on the input channel and outputs a complete line on the output channel when it has one. It does this by appending received bytes onto a string buffer and checking if that buffer contains a line. If it does it removes the line data from the buffer and sends it to the output channel. It then continues to wait for data on the input channel. An example of usage:

>> (import "lib/filter")
>> (import "lib/line-filter")
>> (define q (make-queue))
>> (define lines (input-chain q (line-filter)))
>> (spawn (lambda () (forever (print (wait lines)))))

>> (send "hello" q)
>> (send "world\n" q)
helloworld
>> (send "foo\nbar" q)
foo
>> (send "baz\n" q)
barbaz

This creates a queue, q for input data. It creates a chain containing only one filter, the line-filter. It returns the output channel which contains the filtered data. Data placed in q is retrieved by the line filter and when a line is received it is sent to the output channel. A thread is spawned to loop forever printing any lines from the output channel. Notice in the manual sending of data to the channel q that output is only printed by the spawned thread when a line is completed.

Wasp Lisp comes with some default filters for parsing s-expressions, encrypting and decrypting data and fuzzing data amongst other things. Scott Dunlop wrote about coroutines and filters on the Wasp blog.

Sending data to other OS processes

Some Wasp values can be serialized and deserialized. This provides a way to send values to other wasp instances running in different OS processes or machines. Lisp objects are serialized using freeze and unserialized using thaw.

The following server function starts a TCP server on port 10000. Clients connnected to it send Lisp objects to it and it prints it to the standard output on the server process.

(import "lib/tcp-server")

(define (server)
  (define server-output (current-output))

  (define (acceptor)
    (forever
      (define data (wait))
      (with-output server-output
        (print (format (thaw data)))
        (print "\n"))))

  (spawn-tcp-server 10000 acceptor))

The acceptor function is called with its current input and output bound to the TCP stream. For this reason we capture the value of current-output before it is bound so we can output to the server console rather than to the TCP stream. A sample test:

;; On server 
>> (server)

;; On client
>> (define s (tcp-connect "127.0.0.1" 10000))
>> (send (freeze "foo") s)

;; On Server
"foo"

;; On Client
>> (send (freeze 66) s)

;; On Server
66

;; On Client
>> (send (freeze '(one (two three))) s)

;; On Server
(one (two three))

Notice that all i/o is done using the 'send' and 'wait' channel operators. This means we can use a filter to do the freezing/thawing automatically and Wasp has a freeze-filter and thaw-filter that does this. The server becomes:

(import "lib/tcp-server")
(import "lib/package-filter")
(import "lib/filter")
(import "lib/format-filter")

(define (server2)             
  (define server-output (current-output))

  (define (acceptor)
    (define chan (input-chain (current-input)
                              (thaw-filter)
                              (format-filter)))
    (forever
      (define data (wait chan))
      (print* data "\n")))

  (spawn-tcp-server 10000 acceptor))

Usage from a client is:

>> (import "lib/filter")
>> (import "lib/package-filter")

>> (define s (tcp-connect "127.0.0.1" 10000))
>> (define chan (output-chain s (freeze-filter)))
>> (send "hello" chan)
>> (send '(one (two three)) chan)

Through the use of the thaw/freeze filter there is no need to manually call freeze and thaw.

Sending bytecode to other processes

Unfortunately it's not possible to freeze or thaw closures or functions. It is possible however to assemble Lisp to bytecode and send that. This enables sending new functions across OS processes and is how MOSREF is able to compile Lisp on the console and send it to the drone. This example will compile a function from source to bytecode and run it:

>> (define code '((print "Hello World\n")))
>> (define proc (assemble (optimize (compile code))))
>> (proc)
Hello World

The result of assemble can be frozen, sent somewhere and thawed:

>> (define x (freeze (assemble (optimize (compile '((print "Hello World\n")))))))
>> (define y (thaw x))
>> (y)
Hello World

Using this we can have an upgradable server process:

(define (server3)
  (define server-output (current-output))

  (define (acceptor)
    (define chan (input-chain (current-input)
                              (thaw-filter)))

    (define (loop chan)
      (define data (wait chan))
      (cond
        ((function? data)
          (return ((data) chan)))
        (else
          (print* "OLD: " (format data) "\n")
          (return (loop chan)))))
     (loop chan))

  (spawn-tcp-server 10000 acceptor))

This will display the data sent to the server prefixed by "OLD:" unless it is sent a function. In which case it calls that function as the new server loop. An upgraded server loop to prefix with "NEW: " is:

(define (new-server3)
  (assemble
    (optimize
      (compile
        '((define (loop chan)
            (define data (wait chan))
            (cond
              ((function? data)
                (return ((data) chan)))
              (else
                (print* "NEW: " (format data) "\n")
                (return (loop chan))))))))))

We can't send a function directly so this compiles the new loop from source and returns the compiled procedure. This can be frozen, sent to the server and it will execute it as the new loop. An example interaction:

;; On Server
>> (server3)

;; On Client
>> (define s (tcp-connect "127.0.0.1" 10000))
>> (define chan (output-chain s (freeze-filter)))
>> (send '(one (two three)) chan)

;; On Server
OLD: (one (two three))

;; On Client
>> (send (new-server3) chan)
>> (send '(one (two three)) chan)

;; On Server
>> NEW: (one (two three))

Why not send the source to the server process and have it eval it? The approach of sending the bytecode allows the server process to skip including the Lisp compiler. The Wasp VM includes an interpreter and deserializer - the compiler and other libraries are all in Lisp. A Wasp executable consists of the VM stub with bytecode appended to the end of it. On execution it looks for the bytecode, deserializes it and runs it. This provides a minimal program that can have functionality added by sending it bytecode as needed.

An aside on tail call optimization

It's important that a process loop is tail recursive otherwise each call through the loop will increase stack size and eventually exhaust memory. The following is not tail recursive in Wasp Lisp, even though it looks like it should be:

(define (test1 chan)
  (define msg (wait chan))
  (cond
    ((eq msg 'foo)
      (test1 chan))
    ((eq msg 'bar)
      (test1 chan))
    (else
      (test1 chan))))

This is because the recursive call to 'test1' compiles down to bytecode that looks like:

(newf)
(ldg eq)
(arg)
(ldg msg)
(arg)
(ldc bar)
(arg)
(call)
(jf false-47) ;; If the msg is not 'bar then jump to false-47
...
false-47
(newf)
(ldg test1)
(arg)
(ldg chan)
(arg)
(call)        ;; recursively call 'test1'
done-46
done-44
(retn)        ;; return from function 'test1'

The stack frame for test1 is not exited (the retn instruction) until after the recursive call is done. Compare this to the obvious tail recursive case:

(newf)
(ldg wait)
(arg)
(ldg chan)
(arg)
(call)
(stg msg)
(newf)
(ldg test2)
(arg)
(ldg chan)
(arg)
(tail)

Note that tail instruction. This does an immediate jump rather than a call so a retn is not necessary. The call stack does not grow. The difference between the two cases is due to the way the Wasp Lisp compiler generates the instructions and optimizes looking for tail calls. The instructions generated can be viewed using:

(define x '(define (test2 chan)
             (define msg (wait chan))
             (test2 chan)))
(define code (compile x))
(for-each (lambda (x) (print* (format x) "\n")) code)

Using compile shows the first pass which does not look for tail calls:

(newf)
(ldg test2)
(arg)
(ldg chan)
(arg)
(call)
(retn)

Notice the call followed by retn. This is the sequence that optimize looks for to generate the tail instruction:

(define x '(define (test2 chan)
             (define msg (wait chan))
             (test2 chan)))
(define code (optimize (compile x)))
(for-each (lambda (x) (print* (format x) "\n")) code)
...
(newf)
(ldg test2)
(arg)
(ldg chan)
(arg)
(tail)

Looking back at the instructions for test1 the call is followed by a jump or a label before retn so the optimizer misses it. This can be worked around by doing an explicit return statement:

(define (test3 chan)
  (define msg (wait chan))
  (cond
    ((eq msg 'foo)
      (return (test1 chan)))
    ((eq msg 'bar)
      (return (test1 chan)))
    (else
      (return (test1 chan)))))

The code in the cond branches generates to the following which is now a tail call:

(jf false-93)
(newf)
(ldg test1)
(arg)
(ldg chan)
(arg)
(tail)

Some things to note

The Wasp VM is single threaded and non-preemptive. Threads yield to the scheduler explicitly using yield or implicitly when doing i/o or waiting on a queue. The bytecode is cross platform. Serialized objects on one architecture can be deserialized on another. The Wasp VM history comes from Mosquito Lisp and MOSREF - a penetration testing platform. It's written in C with some GNU extensions (nested functions are used in the VM).

This post came about from exploring the difference in Actor programming in the Pony programming language and a dynamic language where the Actor model isn't explicit. The programming style is similar in that pipelines of calls to actors to transform data is a common idiom.

Wasp Lisp isn't actively developed anymore but the author, Scott Dunlop, still processes pull requests and monitors it. I like to use it for projects and tinker with it as it's an interesting little cross platform lisp. MOSREF is useful as a way to access and maintain servers of different architectures, aside from its use as a penetration testing tool.

Some other Wasp resources:

Tags


This site is accessable over tor as hidden service 6vp5u25g4izec5c37wv52skvecikld6kysvsivnl6sdg6q7wy25lixad.onion, or Freenet using key:
USK@1ORdIvjL2H1bZblJcP8hu2LjjKtVB-rVzp8mLty~5N4,8hL85otZBbq0geDsSKkBK4sKESL2SrNVecFZz9NxGVQ,AQACAAE/bluishcoder/-61/


Tags

Archives
Links