Factor Partial Continuation Updates and Iterators

2006-03-19

Factor Partial Continuation Updates and Iterators

I've made some changes to the parser combinator library I wrote for Factor. The changes were to make the usage of the library to be more consistent with good Factor style.

The first change was to ensure that items on the stack are accessable in an intuitive manner from within the quotations passed to breset and bshift. When writing combinators in Factor it's important that the internals of the combinator implementation do not affect the callers view of items on the stack. For example:

20 [ drop 2 * ] breset

Recall that the quotation passed to breset has stack effect ( r -- v ). Once the 'r' is dropped the '*' should be expected to operate on the '2' and the '20'. The '20' is accessable as if it was directly under the 'r' on the stack from within the quotation.

In a prior implementation of breset I called the quotation using 'call' after creating a 'dup' of the 'r':

: breset 
  ... ( quot r -- )
  dup rot call ( r v -- )

The problem with this is that from within the quotation there is an extra 'r' above items on the stack before the quotation is called:

20 [ drop 2 * ] breset

The stack on entry to the quotation here is ( 20 r r -- ). Changing the breset implementation to use 'keep' instead of call fixes this problem:

: breset 
  ... ( quot r -- )
  swap keep ( v r -- )

Notice also that the return item from the quotation, 'v', is now below the 'r' and I didn't need to 'dup' it. In general, usage of words like 'keep' will enable combinators to more intuitively access stack items from outside the quotation.

Another way of solving this problem is to use the return stack words '>r' and 'r>' to move items temporarily off the stack so that the quotation being called is imediately above the relevant stack items supplied by the caller.

The second major change was to remove the use of 'with-scope' and namespaces to store the continuation delimiter. I previously stored this in a 'mark' and 'mark-old' variable. Now the implementations of breset and bshift manage these on the stack.

Storing variables in namespaces is seductive. It enables you to avoid sometimes complicated stack management by giving you named variables. Unfortunately it comes with a price. When setting the value of a variable in a namespace it is stored in the top most namespace on the namespace stack. This can be seen with code like this:

SYMBOL: myvar
5 myvar set
myvar get .
 => 5
10 [ myvar set ] keep drop myvar get .
 => 10

As expected setting the myvar variable in the quotation passed to 'keep' results in the global myvar value being set to 10. But if this is run inside a 'with-scope' you get caught by the fact that a new namespace is at the top of the stack:

myvar get .
  => 10
20 [ [ myvar set ] with-scope ] keep drop myvar get .
  => 10

Notice the value is still 10. This is because the variable is set in the new namespace which is dropped off the namespace stack when the 'with-scope' exits. Normally you would be aware of this when you write code, but if you use a combinator that uses 'with-scope' in its implementation, and it then calls your quotation from within that scope then all your variable sets will be lost at the end of the combinator call.

This may be desired, and the reason why the combinator is in a scope, but for many cases it's not the desired behaviour. So as a general rule I try to avoid using variables and with-scope in combinator implementations.

There was also a minor bug in my 'range' example in that it didn't give the correct range if the starting number was anything but '1'. The corrected 'range' implementation is:

: range ( r from to -- n )
 over - 1 + rot [ 
   -rot [ over + pick call drop ] each 2drop f  
 ]  bshift 2nip ;

Note the '2nip' at the end. 'bshift' and 'breset' operate like other continuation combinators in that they restore the stack to what they were before they were called. In this case the 'from' and 'to' were on the stack and we need to 'nip' them off. Simple usage works as before:

[ 1 5 range . f ] breset drop
 => 1
    2
    3
    4
    5

Nested usage works correctly now:

[ dup 1 3 range swap 10 12 range + . f ] breset drop
 => 11
    12
    13
    12
    13
    14
    13
    14
    15

It can be hard to reason about what usage of words like 'range' does. Think of it as returning a single value 'n', and calling the breset multiple times with the value of 'n' for each 'n' in the range. So you'll see that after the first 'range' call I 'swap' to swap the result of the range and get the continuation mark passed to breset back to the top of the stack to call the second 'range'.

You'll notice that the result of calling these snippets always returns 'f' on the stack. This is because the range implementation leaves 'f' on the stack at the end of the bshift quotation. This results in 'breset' exiting with that value.

A question was raised on the Factor mailing list about CLU-style iterators. An example of the usage from one of the links in that message is:

sum: INT := 0;
loop sum := sum + 1.upto!(10); end;
#OUT + sum + '\n'; -- Prints sum of integers from 1 to 10

This has a very direct translation in Factor using range and breset as:

SYMBOL: sum
0 sum set 
[ 1 10 range sum get + sum set f ] breset drop
sum get .
 => 55

Bluish Coder

Factor Partial Continuation Updates and Iterators

Tags