Bluish Coder

Programming Languages, Martials Arts and Computers. The Weblog of Chris Double.


2006-12-16

Parser Combinator enhancements

While working on the Factor to Javascript compiler I struck some performance issues with how I was using Parser Combinators.

The problem manifested when parsing word definitions. For example:

: foo ( a b -- c d ) one two ;

This would parse very quickly. But as the identifiers became longer, or more got added, it got much slower:

: foo ( a b -- c d ) oneoneoneoneone twotwotwotwotwo ;

The reason for this is related to how the 'identifier' parsing word was defined. It can be demonstrated with a simple number parser as well:

[ digit? ] satisfy <+>

This is a parser that accepts one or more digits in a sequence. Like all parsers it returns a lazy list of successful parses. If at some point a parser fails to parse something successfully then the lazy list is traversed trying each parse in turn. For the digit parser you can see it creates an item in the list for each possible string of numbers:

"1234567890" [ digit? ] satisfy <+> parse [ . ] leach
  => T{ parse-result f { 49 50 51 52 53 54 55 56 57 48 } { ... } }
     T{ parse-result f { 49 50 51 52 53 54 55 56 57 } { ... } }
     T{ parse-result f { 49 50 51 52 53 54 55 56 } { ... } }
     T{ parse-result f { 49 50 51 52 53 54 55 } { ... } }
     T{ parse-result f { 49 50 51 52 53 54 } { ... } }
     T{ parse-result f { 49 50 51 52 53 } { ... } }
     T{ parse-result f { 49 50 51 52 } { ... } }
     T{ parse-result f { 49 50 51 } { ... } }
     T{ parse-result f { 49 50 } { ... } }
     T{ parse-result f { 49 } { ... } }

These are all valid parse results for the given input string. Most of the time when parsing a number you dont want all those extra results. You either want the entire number of nothing at all. There's no point having all those extra results being backtracked and taking time. By knowing in advance that we are only interested in the all or nothing result we can use a different word to that of <+>. The document I based the Factor parser combinators on outlines the solution to this and I implemented it in Factor. First a parser combinator called 'only-first' allows us to only get the first succesful result:

TUPLE: only-first-parser p1 ;

M: only-first-parser (parse) ( input parser -- list )
  #! Transform a parser into a parser that only yields
  #! the first possibility.
  only-first-parser-p1 parse 1 swap ltake ;

It takes an existing parser and uses 'ltake' to only return the first result in the list of successes. The existing combinators for matching sequences, <*> and <+>, can have alternatives for only matching the longest sequence:

LAZY: &lt;!*> ( parser -- parser ) <*> only-first ;
LAZY: &lt;!+> ( parser -- parser ) <+> only-first ;

Using <!+> instead of <+> in the number parser is now more efficient:

"1234567890" [ digit? ] satisfy &lt;!+> parse [ . ] leach
  => T{ parse-result f { 49 50 51 52 53 54 55 56 57 48 } { ... } }

"1234567890" [ digit? ] satisfy &lt;+> parse list>array length .
  => 10

"1234567890" [ digit? ] satisfy &lt;!+> parse list>array length .
  => 1

By making sure identifiers and numbers used these optimised parser combinators I was able to make parsing of longer word definitions much more efficient. The new words have been added to the 'parser-combinator' module.

Tags: factor 

2006-12-16

Factor interface to Google Search

About a year ago I wrote some Factor code to use the Google SOAP API to run automated Google searches. I wanted to put together the results of a quick search so I fixed the code to work with the latest Factor release. The code is in my repository and will hopefully eventually make to the main Factor repository. You can get it with:

darcs get http://www.bluishcoder.co.nz/repos/factor

An example of usage is:

"mygooglekey" "factor language" google-search

This will return a sequence of 'search-item' tuples. These contain the url, snippet and title of the result from the search request. Only the top 10 results are currently returned.

The 'mygooglekey' must be the Google SOAP API key. The bad news is Google deprecated this api a couple of weeks ago. They are keeping the service running but aren't providing any new keys. The intent is to migrate people to the Ajax search API. Unfortunately this API only runs within browsers so is not much good for automated server side searches. There's some discussion on the Google API forums about this change.

So if you have a Google key for the SOAP service this code might be useful. Otherwise it won't be. If you google around you might be able to find keys...

Here's an example of usage (it may take a few seconds to run): http://factor.bluishcoder.co.nz/search-rankings. It gives the top ten results from a few google searches on Sit and Go Poker Tournament Strategy. It also gives the pagerank of the sites in the search. The pagerank was obtained from the trynt pagerank API service. The pageranks are cached so I don't keep hitting the trynt service.

Tags: factor 

2006-12-12

Compiling Factor to Javascript

A while back I wrote a toy Factor to Javascript compiler to use as an example in my Compilers and Interpreters post. I ended up using a different example in that post so didn't do any further work on it.

Yuuki from #concatenative is writing a Factor tutorial and I thought it might be useful to put something online to try some of the examples so I decided to do more work on it.

The current work in progress is here. Entering a simple Factor expression in the textarea and clicking 'submit' will send the factor code to the server where it is compiled to Javascript. The Javascript is returned to the browser and evaluated. The compiled javascript is displayed as well as the current stack contents.

It's pretty basic at the moment - it's more of a compiler from a stack based language with similar syntax to Factor rather than an actual Factor compiler. I hope to expand on it though. It can handle arrays, quotations, conditionals, some library words, some combinators and defining new words. The built in functions are:

  • dup
  • drop
  • nip
  • over
  • +
  • -
  • *
  • /
  • .
  • call
  • map
  • reduce
  • clear
  • =
  • if
  • t
  • f
  • empty?

Here are some examples you might like to try:

{ 1 2 3 } [ 2 * ] map
{ 1 2 3 } 0 [ + ] reduce
"hello world!" .
: fac dup 1 = [ ] [ dup 1 - fac * ] if ;
5 fac .
: product 1 [ * ] reduce ;
{ 2 3 4 } product
5 { 1 2 3 } [ over + ] map

I'll add more words and functionality over soon, as well as trying some DOM and Ajax functionality. Leave a comment if there's a feature you'd like to see!

Source code is in my repository and will be in the main Factor repository when Slava pulls from it.

Tags: factor 

2006-12-02

Factor gets XML-RPC

Factor has an XML-RPC library thanks to Dan (littledan in #concatenative). Slava has a post demonstrating how to use it to talk to the paste.lisp.org pastebin.

Tags: factor 

2006-11-22

Propeller Forth

Cliff Biffle announced on the Parallax Forums the release of Propeller Forth.

This is an interactive Forth system for the Parallax Propeller Chip. The Propeller is a multi core chip. It has 8 32-bit chips processor on the one chip. Propeller Forth has multitasking words to take advantage of this parallelism.

Tags: forth 


This site is accessable over tor as hidden service 6vp5u25g4izec5c37wv52skvecikld6kysvsivnl6sdg6q7wy25lixad.onion, or Freenet using key:
USK@1ORdIvjL2H1bZblJcP8hu2LjjKtVB-rVzp8mLty~5N4,8hL85otZBbq0geDsSKkBK4sKESL2SrNVecFZz9NxGVQ,AQACAAE/bluishcoder/-61/


Tags

Archives
Links