Bluish Coder

Programming Languages, Martials Arts and Computers. The Weblog of Chris Double.


2006-10-29

Compilers and Interpreters in Factor

Writing compilers and interpreters in Factor is quite easy. By writing the grammar using parser combinators, and having those combinators produce an abstract syntax tree (AST), it's easy to write a simple compiler or interpreter for the tree. I've already covered writing simple parser combinators so I'll concentrate on processing the AST for a simple language in this post. A later post might generate a parser for it or it could be an 'exercise for the reader'.

For an example to start with I took the evaluator defined in this Lambda the Ultimate posting and converted it to Factor. For the interpreter I tried two different approaches to compare them. The first was to use pattern matching. For the second I use generic functions and Factor's OO system.

For the compiler I show how to compile to Factor code, which can then be interpreted or compiled by Factor itself, and I also compile to Javascript. This javascript can be run in a standalone Javascript interpreter like Rhino, or in a web browser.

To more easily re-use the AST across the two examples I added the ability to pattern match on tuples to the 'match' contrib library. This can be obtained from the standard Factor darcs repository:

darcs get http://www.factorcode.org/repos

The factor 'latest' boot images can be used to bootstrap from this repository. The instructions on how to do this are on factorcode.org. For an Intel x86 linux machine the steps would be:

darcs get http://www.factorcode.org/repos
cd factor/Factor
make linux-x86   
wget http://www.factorcode.org/images/latest/boot.image.x86
./f boot.image.x86

The examples here should also work with the released Factor 0.85 as long as you replace 'contrib/match/match.factor' with the version from my repository. To load the 'match' library in Factor use:

"contrib/match" require 
USE: match

In the Factor code snippets below I use '=>' to show the result of running Factor words. You shouldn't actually type that. Instead the result will be shown on the Factor stack or you can use '.' to print it.

The code for the examples in this post can be downloaded from interpreter.factor. If you copy it into the 'contrib' directory of your Factor installation you can load it with:

"contrib/interpreter" require

The evaluator that is being implemented has the following description from the Lambda the Ultimate posting:

data Term a where
    Lit  :: Int -> Term Int
    Inc  :: Term Int -> Term Int
    IsZ  :: Term Int -> Term Bool
    If   :: Term Bool -> Term a -> Term a -> Term a
    Pair :: Term a -> Term b -> Term (a,b)
    Fst  :: Term (a,b) -> Term a
    Snd  :: Term (a,b) -> Term b 
  eval :: Term a -> a
eval (Lit i)    = i
eval (Inc t)    = eval t + 1
eval (IsZ t)    = eval t == 0
eval (If b t e) = if eval b then eval t else eval e
eval (Pair a b) = (eval a, eval b)
eval (Fst t)    = fst (eval t)
eval (Snd t)    = snd (eval t)

Basically we have literal values, which in this example are integers. These can be incremented and tested to see if they are zero. A 'pair' can be created and the first and second elements obtained from the pair. An 'if' expression is available for testing an expression and evaluating a true or false clause. These are represented using Factor tuples:

TUPLE: lit i ;
TUPLE: inc t ;
TUPLE: isz t ;
TUPLE: iff b t e ;
TUPLE: pair a b ;
TUPLE: fst t ;
TUPLE: snd t ;

I used the same names for the types and elements as the example but changed 'if' to 'iff' to prevent clashing with Factor's standard 'if' word. An example tree can be created that is the increment of the number 5 with:

5 <lit> <inc>

When evaluated this should produce the result '6'.

The interpreter implementation that uses pattern matching uses 'match-cond'. This requires match variables to be set up which can be used to destructure sequences and tuples. For example:

5 <lit> T{ lit f ?t } match [ ?t ] bind => 5

'T{ lit f 123 }' is the Factor syntax for a tuple. It creates a tuple at parse time and the actual object is stored in the word. This is different from 123 <lit> which will produce the literal object at runtime. '?t' is a matching variable and the 'match' and 'match-cond' words use these to destructure matching elements of tuples and sequences. The match variable can then be used to get the actual value obtained. For more on the Factor pattern matching system you can read the documentation from within Factor with:

\ match help

The Factor implementation of the interpreter using pattern matching is:

: eval1 ( a -- a )
  {
    { T{ lit  f ?i }       [ ?i ] }
    { T{ inc  f ?t }       [ ?t eval1 1+ ] }
    { T{ isz  f ?t }       [ ?t eval1 zero? ] }
    { T{ iff  f ?b ?t ?e } [ ?b eval1 [ ?t ] [ ?e ] if eval1 ] }
    { T{ pair f ?a ?b }    [ ?a eval1 ?b eval1 2array ] }
    { T{ fst  f ?t }       [ ?t eval1 first ] }
    { T{ snd  f ?t }       [ ?t eval1 second ] }
  } match-cond ; 

It is pretty much a direct translation of the Haskell example. An example of using it:

5 <lit> <inc> eval1 => 6

The Lambda the Ultimate post gave a usage example:

stuff = Pair (If (IsZ (Inc (Inc (Lit 5))))
                 (Lit 6)
                 (Lit 7))
             (Fst (Snd (Pair (Lit 42)
                             (Pair (Lit 8) (Lit 9)))))

Translated to Factor it is:

: driver ( -- v )
  5 <lit> <inc> <inc> <isz> 
    6 <lit>
    7 <lit>
  <iff> 
  42 <lit> 8 <lit> 9 <lit> <pair> <pair> <snd> <fst> <pair> ;

Running the 'eval1' word on this produces the answer, { 7 8 }:

driver eval1 => { 7 8 }

Although 'eval1' is quite short and easy to understand it has a disadvantage in that to extend it with new AST types you need to modify the 'eval1' word. Using the Factor OO generic word system you extend the interpreter without having to modify existing code. Here's a generic word approach to the interpreter:

GENERIC: eval2 ( a -- a )
M: lit  eval2 ( a -- a ) lit-i ;
M: inc  eval2 ( a -- a ) inc-t eval2 1+ ;
M: isz  eval2 ( a -- a ) isz-t eval2 zero? ;
M: iff  eval2 ( a -- a ) dup iff-b eval2 [ iff-t ] [ iff-e ] if eval2 ;
M: pair eval2 ( a -- a ) dup pair-a eval2 swap pair-b eval2 2array ;
M: fst  eval2 ( a -- a ) fst-t eval2 first ;
M: snd  eval2 ( a -- a ) snd-t eval2 second ;

The 'M:' word is used to define a method for a generic word. This is similar to how generic functions and methods work in CLOS and Dylan. When a generic word is called the actual method invoked is based on the topmost item of the stack. The first symbol past the 'M:' is the type of that object that that method is for. The second is the name of the generic word the method will be added too. The rest of the definitions should be reasonably clear - they break apart the tuples and return results or call 'eval2' recursively. 'eval2' produces the same result as 'eval1':

driver eval2 => { 7 8 }

Which approach to use is really personal preference. Of the two, 'eval2' is the most efficient. This is because methods can be compiled in Factor whereas the current implementation of 'match-cond' cannot be. So 'eval1' will run in the Factor interpreter whereas 'eval2' will be compiled to native code. This can be seen by trying to compile the examples and running a timed test:

\ eval1 compile => "The word eval1 does not have a stack effect"
\ eval2 compile 
\ eval1 compiled? => f
\ eval2 compiled? => t
[ 1000 [ driver eval1 ] times ] time => 487ms run / 5 ms GC time
[ 1000 [ driver eval2 ] times ] time => 3ms run / 0 ms GC time

There is obviously room for improvement in the 'match' routines! I can say that because I wrote them :)

A compiler follows the same structure as the interpreter but instead of computing the result we generate code. This code is then later fed to the compiler. For the compiler examples I'm only going to show the pattern matching based example. The generic word implementation is very similar and would follow the relevant interpreter implementation.

Factor provides a 'make' word that can be used for dynamically appending to a new sequence. From within a 'make' scope you can use ',' to append an element and '%' to splice in a sequence to the constructed sequence. 'make' can be used for constructing arrays, strings, quotations, etc. The type of the constructed sequence is identified by an exemplar sequence passed to 'make'. For example:

[ 1 , 2 , { 3 4 } % { 5 } , ] { } make => { 1 2 3 4 { 5 } }

The top level 'compile' word will set up a 'make' scope for the AST compile routines to append to. Here's the compiler to Factor:

: (compile1) ( a -- )
  {
    { T{ lit  f ?i }       [ ?i , ] }
    { T{ inc  f ?t }       [ ?t (compile1) \ 1+ , ] }
    { T{ isz  f ?t }       [ ?t (compile1) \ zero? , ] }
    { T{ iff  f ?b ?t ?e } [ ?b (compile1) 
                             [ ?t (compile1) ] [ ] make , 
                             [ ?e (compile1) ] [ ] make , 
                             \ if , ] }
    { T{ pair f ?a ?b }    [ ?a (compile1) ?b (compile1) \ 2array , ] }
    { T{ fst  f ?t }       [ ?t (compile1) \ first , ] }
    { T{ snd  f ?t }       [ ?t (compile1) \ second , ] }
  } match-cond ;  

: compile1 ( a -- quot )
  [ (compile1) ] [ ] make ;

As you can see it is very similar to the interpreter. Instead of returning the result of the AST evaluation '(compile1)' appends the equivalent Factor code to the constructed quotation created in the 'compile1' word.

The implementation for handling 'lit' just appends the numeric value directly. For 'inc' it calls '(compile1)' on the term to be incremented so that gets appended to the quotation. It then appends the Factor '1+' word which will be used to increment it. The '\' is an escape mechanism to tell Factor to store the word itself rather than call it.

The main complication is in the implementation of 'iff'. This word must delay the computation of the two terms of the 'iff' statement. To to this it creates Factor quotations with 'make' and then recursively calls '(compile1)' for the terms. The results of the compilation for this recursive call will be stored in the most recently created quotation rather than the one in the top level 'compile1' word. That is, nested 'make' calls are dynamically scoped.

An example of usage of the compiler:

5 <lit> <inc> compile1 => [ 5 1 + ] 
[ 5 1 + ] call => 6

The 'driver' AST shown previously compiles to Factor correctly:

driver compile1 => [
  5 1+ 1+ zero? [
    6
  ] [ 
    7
  ] if 42 8 9 2array 2array second first 2array
]

This can be compiled to native code using Factor or run in the Factor interpreter:

driver compile1 compile-quot execute => { 7 8 } 
driver compile1 call => { 7 8 }

The final example is a compiler to Javascript. Again it follows the same basic outline of the interpreter. Instead of generating a Factor quotation it uses 'make' to generate a string:

: (compile2) ( a -- )
  {
    { T{ lit  f ?i }       [ ?i number>string % ] }
    { T{ inc  f ?t }       [ ?t (compile2) "+1" % ] }
    { T{ isz  f ?t }       [ ?t (compile2) "== 0" % ] }
    { T{ iff  f ?b ?t ?e } [ 
        "function() {if(" % 
        ?b (compile2) 
        ") {return " % 
        ?t (compile2)
        "} else { return " %
        ?e (compile2)
        "}}()" %
      ] 
    }
    { T{ pair f ?a ?b }    [ "{ first:" % ?a (compile2) ",second:" % ?b (compile2) " }" % ] }
    { T{ fst  f ?t }       [ ?t (compile2) ".first" % ] }
    { T{ snd  f ?t }       [ ?t (compile2) ".second" % ] }
  } match-cond ;  

: compile2 ( a -- quot )
  [ "(" % (compile2) ")" % ] "" make ;

The main complication with this compiler was handling 'if'. The Javascript 'if' has no result so cannot be assigned immediately. So I wrap the 'if' in a function which is called immediately. The two terms of the 'if' use 'return' to return the result from the function. A simple example:

5 <lit> <inc> compile2 => "(5+1)"

And 'driver' also compiles successfully:

driver compile2 => 
"({ 
    first:function() {
      if(5+1+1== 0) {
        return 6
      } else { 
        return 7
      }
    }(),
    second: { 
      first:42,
      second: { first:8,second:9 } 
    }.second.first 
})"

This result evaluates successfully in a browser.

Tags: factor 

2006-10-23

Narrative Javascript beta 1 released

Neil Mix has made Beta 1 of Narrative Javascript available. This release includes an official threading API with support for futures. I'll be revisiting my previous examples of using threading and futures with Narrative Javascript to move things over to this new API.

Tags: javascript 

2006-10-23

Factor 0.85 released

Slava has released Factor 0.85. There are lots of great additions and improvements.

Tags: factor 

2006-10-16

Flapjax

Flapjax is a very interesting looking client side browser framework. It's a superset of Javascript that provides Functional Reactive Programming to the client side browser. From the Flapjax website:

Flapjax is a new programming language designed around the demands of modern, client-based Web applications. Its principal features include:

  • Templating syntax
  • Event-driven, reactive evaluation
  • Persistent data saved on a data store we provide
  • Convenient data sharing
  • Access-control for shared data
  • Interfaces to external Web services

Lambda the Ultimate has a discussion about it and more information from one of the developers can be found here.

Tags: javascript 

2006-10-03

Online Poker in Jeopardy

As those that hang out in #concatenative know I've been playing online poker recently. After a slow start and a bit of book study I started winning some multi table tournaments and doing fairly well. I made more in poker in the last two weeks than I do at my job which is pretty neat. Actually that says more about how much I'm paid at work than my poker skills but never mind :-)

So, as luck would have it, just as I find something I can enjoy doing and get the odd win from, disaster strikes. The US government has passed a bill that bans funding of online gaming via credit cards and bank accounts. It looked for awhile like this bill wasn't going to go ahead but it was tacked onto another 'must pass' bill, which had nothing to do with online gaming, and got through. Slashdot recently covered this.

Most online poker sites are not based in the US but many of the players are. There are a lot of professional poker players that make a living playing on various sites. Their income just got destroyed from the looks of things. Recently, the Wold Championship for Online Poker was held. The winner got $600,000 US. See the results here. The top two placegetters in that list are from the US - they are now legally unable to play it seems, when George Bush signs the bill into law.

This innocent seeming bill has also sent a number of companies into a tailspin on the UK stock exchange. Neteller, a company similar to PayPal, handles much of the online gaming monetary transactions for deposits and withdrawals. Their stock has plummeted with the news. Hopefully they'll recover since that's where my winnings are currently!

I know that things like online access to gambling are quite a thorny issue. Personally I think such issues should be up to the individual and those that have trouble dealing with addiction issues should be helped - but not by punishing those without the problems.

It may seem that the law change is to help those with gambling addictions in the US - so they can't immediately gamble all their credit card money online. But the bill apparently excludes state sponsered gambling. State lotteries, Horse Racing, etc are all still allowed. This points to the real reason for the ban in my opinion - the fact that no taxes are obtained by the US from people playing online games for money through overseas companies.

There is much discussion on the issue in Two Plus Two's Poker Legislation Forum, from the perspective of the players, and in rec.gambling.poker.

Tags: poker 


This site is accessable over tor as hidden service 6vp5u25g4izec5c37wv52skvecikld6kysvsivnl6sdg6q7wy25lixad.onion, or Freenet using key:
USK@1ORdIvjL2H1bZblJcP8hu2LjjKtVB-rVzp8mLty~5N4,8hL85otZBbq0geDsSKkBK4sKESL2SrNVecFZz9NxGVQ,AQACAAE/bluishcoder/-61/


Tags

Archives
Links