2006-09-19
Lazy file reading in Factor
I've added a couple of new lazy list routines in Factor to do lazy stream input. These are lazy counterparts to the standard 'contents' and 'lines' words.
The new words are 'lcontents' and 'llines'. The first returns a lazy list of all the characters in the file. The second returns a lazy list of all lines in the file. Both of these lazily retrieve from the stream as needed and can be used on files larger than the memory available to Factor:
"test.txt" <file-reader> llines [ . ] leach
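'lcontents' can presumably be used the same way. A minimal sketch, assuming it takes the stream produced by <file-reader> just as 'llines' does and that the same "test.txt" file exists:

```factor
! Print every character of the file, pulling from the stream only as needed
"test.txt" <file-reader> lcontents [ . ] leach
```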
While adding these I noticed that the mapping and subset operations didn't memoize their values. This meant the quotation passed to these operations was called again every time the 'car' or 'cdr' of the cons was requested. I added a <memoized-cons> type that wraps an existing cons and remembers the results of previous calls to 'car', 'cdr' and 'nil?'. 'lmap' and 'lsubset' now automatically wrap their return value in a <memoized-cons>.
Now that they are memoized, the following 'fib' implementation runs very fast:
: fib ( n -- m )
    dup 2 < [
        drop 1
    ] [
        ! add the two previous entries of the lazy "fibs" list
        dup 1 - "fibs" get lnth swap 2 - "fibs" get lnth +
    ] if ;
naturals [ fib ] lmap "fibs" set
5 fib
25 fib
60 fib
This creates a lazy list of Fibonacci numbers. Retrieving the nth item from the list returns the nth Fibonacci number. Because the lazy map operation memoizes each value automatically, subsequent calls to 'fib' are very fast.
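Since 'fib' only indexes into the list, the same value can also be read directly with 'lnth'. A small sketch, assuming the "fibs" variable set above is still in scope and using the same argument order for 'lnth' as the definition of 'fib':

```factor
! Read entry 10 straight from the memoized lazy list and print it
10 "fibs" get lnth .
```

This should print the same number as '10 fib .', and any entries computed along the way stay cached for later lookups.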