2006-12-22
Search and Replace with Parser Combinators
I've added some code to the Factor parser combinator library to allow search and replace using parser combinators. The idea is to use parser combinators for the kinds of tasks that regular expressions are commonly used for. I've also started adding some simple reusable parsers for numbers, strings, etc.
The search word takes a string and a parser off the stack. It returns a sequence with one entry for each substring of the original string that the parser successfully parses. Each entry is the result the parser produced for that substring, not the raw substring itself. For example:
"hello *world* from *factor*" 'bold' search .
=> { "world" "factor" }
"one 100 two 300" 'integer' search .
=> { 100 300 }
"one 100 two 300" 'integer' [ 2 * ] <@ search .
=> { 200 600 }
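To make the scanning behaviour concrete, here is a rough Python model of what search does. It is only a sketch under my own assumptions, not the Factor implementation: a parser is modelled as a function from (text, position) to a (value, next position) pair or None, the leaf parsers are built from regular expressions purely to keep the example short, and the names token, map_result and search are hypothetical.

```python
import re

def token(pattern, convert=str):
    """Leaf parser: parse(text, pos) -> (value, next_pos), or None on failure."""
    compiled = re.compile(pattern)
    def parse(text, pos):
        m = compiled.match(text, pos)
        return (convert(m.group(1)), m.end()) if m else None
    return parse

def map_result(parser, fn):
    """Loose analogue of <@: run parser, then transform its result with fn."""
    def parse(text, pos):
        r = parser(text, pos)
        return (fn(r[0]), r[1]) if r else None
    return parse

def search(text, parser):
    """Try the parser at each position and collect the parse results."""
    results, pos = [], 0
    while pos < len(text):
        r = parser(text, pos)
        if r is not None and r[1] > pos:
            results.append(r[0])
            pos = r[1]   # resume after the consumed substring
        else:
            pos += 1     # no parse here, slide forward one character
    return results

# Assumed stand-ins for the 'bold' and 'integer' parsers used above.
bold = token(r"\*([^*]+)\*")
integer = token(r"([0-9]+)", int)

print(search("hello *world* from *factor*", bold))
print(search("one 100 two 300", integer))
print(search("one 100 two 300", map_result(integer, lambda n: n * 2)))
```

Under these assumptions the three calls print ['world', 'factor'], [100, 300] and [200, 600], mirroring the Factor results above.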
The search* word takes a string and a sequence of parsers off the stack. It returns a sequence of the results for all substrings in the original string that successfully parse with any of the parsers in the sequence. It is functionally the same as combining the parsers with the <|> combinator and passing the result to search.
"hello 123 \"world\" from 456 factor" 'string' 'integer' 2array search* .
=> { 123 "world" 456 }
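The equivalence with <|> can be modelled in Python as well. This is a sketch under my own assumptions rather than the library code: parsers are functions from (text, position) to (value, next position) or None, the alternation here takes the first parser that succeeds (a simplification of <|>), and token, alt, search and search_star are hypothetical names.

```python
import re

def token(pattern, convert=str):
    """Leaf parser: parse(text, pos) -> (value, next_pos), or None on failure."""
    compiled = re.compile(pattern)
    def parse(text, pos):
        m = compiled.match(text, pos)
        return (convert(m.group(1)), m.end()) if m else None
    return parse

def alt(parsers):
    """Simplified stand-in for <|>: the first parser that succeeds wins."""
    def parse(text, pos):
        for p in parsers:
            r = p(text, pos)
            if r is not None:
                return r
        return None
    return parse

def search(text, parser):
    """Try the parser at each position and collect the parse results."""
    results, pos = [], 0
    while pos < len(text):
        r = parser(text, pos)
        if r is not None and r[1] > pos:
            results.append(r[0])
            pos = r[1]
        else:
            pos += 1
    return results

def search_star(text, parsers):
    """Model of search*: search with the alternation of all the parsers."""
    return search(text, alt(parsers))

# Assumed stand-ins for the 'string' and 'integer' parsers used above.
string = token(r'"([^"]*)"')
integer = token(r"([0-9]+)", int)

print(search_star('hello 123 "world" from 456 factor', [string, integer]))
```

With these stand-in parsers the call prints [123, 'world', 456], the same results in the same order as the Factor example.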
The replace word takes a string and a parser off the stack. It returns the original string with every substring that the parser successfully parses replaced by the parser's result for that substring.
"Hello *World*" 'bold' [ "<strong>" swap "</strong>" 3append ] <@ replace .
=> "Hello <strong>World</strong>"
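A Python model of replace looks much like a search that also keeps the text between matches. Again this is a sketch under my own assumptions, not the Factor code: parsers are functions from (text, position) to (value, next position) or None, and token, map_result and replace are hypothetical names.

```python
import re

def token(pattern, convert=str):
    """Leaf parser: parse(text, pos) -> (value, next_pos), or None on failure."""
    compiled = re.compile(pattern)
    def parse(text, pos):
        m = compiled.match(text, pos)
        return (convert(m.group(1)), m.end()) if m else None
    return parse

def map_result(parser, fn):
    """Loose analogue of <@: run parser, then transform its result with fn."""
    def parse(text, pos):
        r = parser(text, pos)
        return (fn(r[0]), r[1]) if r else None
    return parse

def replace(text, parser):
    """Copy text, substituting each parsed substring with the parse result."""
    out, pos = [], 0
    while pos < len(text):
        r = parser(text, pos)
        if r is not None and r[1] > pos:
            out.append(str(r[0]))   # emit the parser's result
            pos = r[1]              # skip the consumed substring
        else:
            out.append(text[pos])   # unmatched character is copied through
            pos += 1
    return "".join(out)

# Assumed stand-in for the 'bold' parser used above.
bold = token(r"\*([^*]+)\*")
strong = map_result(bold, lambda s: "<strong>" + s + "</strong>")

print(replace("Hello *World*", strong))
```

Under these assumptions the call prints Hello <strong>World</strong>, and text that the parser never matches is passed through unchanged.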
The replace* word takes a string and a sequence of parsers off the stack. It folds over the sequence: each parser is applied with replace to the string produced by the previous replace call, starting with the original string. The result is a string in which every substring matched by one of the parsers has been replaced by that parser's result.
"*Hello _World_*"
'bold' [ "<strong>" swap "</strong>" 3append ] <@
'italic' [ "<emphasis>" swap "</emphasis>" 3append ] <@
2array replace* .
=> "<strong>Hello <emphasis>World</emphasis></strong>"
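The fold described above can be sketched in Python with functools.reduce. As before, this is a model under my own assumptions and not the Factor implementation: parsers are functions from (text, position) to (value, next position) or None, and token, map_result, wrap, replace and replace_star are hypothetical names.

```python
import re
from functools import reduce

def token(pattern, convert=str):
    """Leaf parser: parse(text, pos) -> (value, next_pos), or None on failure."""
    compiled = re.compile(pattern)
    def parse(text, pos):
        m = compiled.match(text, pos)
        return (convert(m.group(1)), m.end()) if m else None
    return parse

def map_result(parser, fn):
    """Loose analogue of <@: run parser, then transform its result with fn."""
    def parse(text, pos):
        r = parser(text, pos)
        return (fn(r[0]), r[1]) if r else None
    return parse

def replace(text, parser):
    """Copy text, substituting each parsed substring with the parse result."""
    out, pos = [], 0
    while pos < len(text):
        r = parser(text, pos)
        if r is not None and r[1] > pos:
            out.append(str(r[0]))
            pos = r[1]
        else:
            out.append(text[pos])
            pos += 1
    return "".join(out)

def replace_star(text, parsers):
    """Model of replace*: each parser rewrites the previous parser's output."""
    return reduce(replace, parsers, text)

def wrap(tag):
    """Build a function that wraps a string in <tag>...</tag>."""
    return lambda s: "<" + tag + ">" + s + "</" + tag + ">"

# Assumed stand-ins for the 'bold' and 'italic' parsers used above.
bold = map_result(token(r"\*([^*]+)\*"), wrap("strong"))
italic = map_result(token(r"_([^_]+)_"), wrap("emphasis"))

print(replace_star("*Hello _World_*", [bold, italic]))
```

Because the second parser runs over the output of the first, the order of the sequence matters: here the bold pass produces <strong>Hello _World_</strong>, and the italic pass then rewrites the underscores inside it.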