Bluish Coder

Programming Languages, Martial Arts and Computers. The Weblog of Chris Double.


2008-07-10

Video Bling

Robert O'Callahan has been posting about his 'bling branch' which contains some very nice effects. See his blog posts for more detail:

As soon as I saw these I had to try the effects with a playing video. The video patches apply fine to the svg-integration branch.

The screencast below is from this special build. It starts with a video played using <video>, with a reflection beneath it produced using the tricks from Robert's posts. Shortly after that I switch to a video that has an SVG edge detection filter applied when I mouse over it. Finally there is a video with both effects combined.

You can download the video_bling.ogg file to play, or if you have a <video> enabled browser you can see it below. I've also uploaded it to YouTube.

Tags: mozilla 

2008-07-09

The Video and Audio element patch has landed

The patches in bug 382267 to add support for the WHATWG video and audio elements have been applied to the Firefox mozilla-central repository.

This means you can get the source for Firefox and build it with support for <video> and <audio> by using the configure flag '--enable-media'. Currently the media support is disabled by default so it won't appear in the nightly builds. At some point this will be changed and it will be enabled by default.

The patch that has landed does not yet include a backend decoder. It won't play any videos as a result. That will be fixed when the various backends are landed:

In the meantime you can apply the patches from those bugs to the mozilla-central source to get a video build that decodes video.

Tags: gstreamer 

2008-06-22

Parsing JavaScript with Factor

I've made some more changes to the Parsing Expression Grammar library in Factor. Most of the changes were inspired by things that OMeta can do. The grammar I used for testing is an OMeta-JS grammar for a subset of JavaScript. First the list of changes.

Actions in the EBNF syntax receive an AST (Abstract Syntax Tree) on the stack. The action quotation is expected to have stack effect ( ast -- ast ). It modifies the AST and leaves the new version on the stack. This led to code that looks like this:

expr = lhs '+' rhs => [[ first3 nip + ]]

Nothing wrong with that, but a later change added variables to the EBNF grammar to make it more readable:

expr = lhs:x '+' rhs:y => [[ drop x y + ]]

Code that uses variables a lot ends up with a lot of 'drop' usage as the first word. I made a change recommended by Slava to have the action add the drop automatically depending on the stack effect of the action. So now this code works:

expr = lhs:x '+' rhs:y => [[ x y + ]]

So now if you use variables in a rule, the stack effect of the action should be ( -- ast ). If you don't, it should be ( ast -- ast ).
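For readers who don't know Factor, the same dispatch can be sketched in Python. This is only an illustration, not how peg.ebnf is implemented: `run_action` and its arguments are hypothetical names, and Python parameter names stand in for Factor stack effects.

```python
import inspect

def run_action(action, ast, bindings):
    """Call a semantic action, dropping the raw AST when the action uses variables."""
    params = list(inspect.signature(action).parameters)
    if params == ["ast"]:
        # Analogue of stack effect ( ast -- ast ): transform the raw AST.
        return action(ast)
    # Analogue of ( -- ast ): the raw AST is dropped, only named variables are passed.
    return action(**{name: bindings[name] for name in params})

# expr = lhs '+' rhs => [[ first3 nip + ]]
old_style = run_action(lambda ast: ast[0] + ast[2], [1, "+", 2], {})
# expr = lhs:x '+' rhs:y => [[ x y + ]]
new_style = run_action(lambda x, y: x + y, [1, "+", 2], {"x": 1, "y": 2})
print(old_style, new_style)  # 3 3
```

The point is that the grammar author never writes the 'drop': the shape of the action decides whether it sees the AST.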

I added a way for one EBNF parser to call rules defined in another. This allows creating grammars that are hybrids of existing parsers, or simply reusing common things like string handling expressions. These calls are called 'foreign' calls and appear on the right hand side of a rule in angle brackets. Here is a parser that parses strings between double quotation marks:

EBNF: parse-string
StringBody = (!('"') .)* 
String = '"' StringBody:b '"' => [[ b >string ]]
;EBNF

To call the 'String' rule from another parser:

EBNF: parse-two-strings
TwoStrings = <foreign parse-string String> <foreign parse-string String>
;EBNF

The <foreign> call in this example takes two arguments. The first is the name of an existing EBNF: defined parser. The second is the rule in that parser to invoke. It can also be used like this:

EBNF: parse-two-strings
TwoStrings = <foreign parse-string> <foreign parse-string>
;EBNF

If the first argument is the name of an EBNF: defined parser and no second argument is given, then the main rule of that parser is used. The main rule is the last rule in the parser body. A final way foreign can be used:

: a-token ( -- parser ) "a" token ;

EBNF: parse-abc
abc = <foreign a-token> 'b' 'c'
;EBNF

If the first argument given to foreign is not an EBNF: defined parser, it is assumed that it has stack effect ( -- parser ) and it will be called to return the parser to be used.
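The three forms of foreign can be mimicked with a few Python parser combinators. This is a rough sketch under my own assumptions, not the Factor implementation: a "grammar" here is just a dict of rules whose last entry is the main rule, and `token`, `seq`, `quoted_string` and `foreign` are hypothetical helpers.

```python
def token(s):
    """Parser matching the literal string s."""
    def parse(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def seq(*parsers):
    """Parser running each parser in order and collecting the results in a list."""
    def parse(text, pos):
        results = []
        for p in parsers:
            r = p(text, pos)
            if r is None:
                return None
            value, pos = r
            results.append(value)
        return results, pos
    return parse

def quoted_string(text, pos):
    """Rough analogue of the String rule: text between double quotes."""
    if not text.startswith('"', pos):
        return None
    end = text.find('"', pos + 1)
    return None if end == -1 else (text[pos + 1:end], end + 1)

# A grammar is a dict of rules; the last rule inserted is the main rule.
parse_string = {"String": quoted_string}

def foreign(target, rule=None):
    if isinstance(target, dict):
        # Named rule if given, otherwise the main (last) rule of the grammar.
        name = rule if rule is not None else list(target)[-1]
        return target[name]
    # Not a grammar: assume a zero-argument function that returns a parser.
    return target()

def a_token():
    return token("a")

two_strings = seq(foreign(parse_string, "String"), foreign(parse_string))
abc = seq(foreign(a_token), token("b"), token("c"))

print(two_strings('"hi""there"', 0))  # (['hi', 'there'], 11)
print(abc("abc", 0))                  # (['a', 'b', 'c'], 3)
```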

It is now possible to override the tokenizer in an EBNF defined parser. Usually the sequence to be parsed is an array of characters or a string. Terminals in a rule match successive characters in the array or string. For example:

EBNF: foo
rule = "++" "--"
;EBNF

This parser when run with the string "++--" or the array { CHAR: + CHAR: + CHAR: - CHAR: - } will succeed with an AST of { "++" "--" }. If you want to add whitespace handling to the grammar you need to put it between the terminals:

EBNF: foo
space = (" " | "\r" | "\t" | "\n")
spaces = space* => [[ drop ignore ]]
rule = spaces "++" spaces "--" spaces
;EBNF

In a large grammar this gets tedious and makes the grammar hard to read. Instead you can write a rule to split the input sequence into tokens, and have the grammar operate on these tokens. This is how the previous example might look:

EBNF: foo
space = (" " | "\r" | "\t" | "\n")
spaces = space* => [[ drop ignore ]]
tokenizer = spaces ( "++" | "--" )
rule = "++" "--"
;EBNF

'tokenizer' is the name of a built in rule. Once defined it is called to retrieve the next complete token from the input sequence. So the first part of 'rule' is to try and match "++". It calls the tokenizer to get the next complete token. This ignores spaces until it finds a "++" or "--". It is as if the input sequence for the parser was actually { "++" "--" } instead of the string "++--". With the new tokenizer "...." sequences in the grammar are matched for equality against the token, rather than a string comparison against successive items in the sequence. This can be used to match an AST from a tokenizer:

TUPLE: ast-number value ;
TUPLE: ast-string value ;

EBNF: foo-tokenizer
space = (" " | "\r" | "\t" | "\n")
spaces = space* => [[ drop ignore ]]

number = [0-9]+ => [[ >string string>number ast-number boa ]]
string = <foreign parse-string String> => [[ ast-string boa ]]
operator = ("+" | "-")

token = spaces ( number | string | operator )
tokens = token*
;EBNF

EBNF: foo
tokenizer = <foreign foo-tokenizer token>

number = . ?[ ast-number? ]? => [[ value>> ]]
string = . ?[ ast-string? ]? => [[ value>> ]]

rule = string:a number:b "+" number:c => [[ a b c + 2array ]]
;EBNF

In this example I split the tokenizer into a separate parser and use 'foreign' to call it from the main one. This allows testing of the tokenizer separately:

"123 456 +" foo-tokenizer ast>> .
=> { T{ ast-number f 123 } T{ ast-number f 456 } "+" }
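The idea of a grammar pulling tokens on demand can be sketched in Python as well. This is a simplified illustration rather than a port of the Factor code: it handles only numbers and the "+" and "-" operators, and `next_token`, `terminal`, `number` and `rule` are names I made up for the sketch.

```python
import re
from dataclasses import dataclass

@dataclass
class AstNumber:
    value: int

# Skip leading whitespace, then capture a number or an operator.
TOKEN = re.compile(r"\s*(?:(\d+)|([+-]))")

def next_token(text, pos):
    """Return (token, new_pos) for the next token at pos, or None if there is none."""
    m = TOKEN.match(text, pos)
    if m is None:
        return None
    digits, operator = m.groups()
    tok = AstNumber(int(digits)) if digits is not None else operator
    return tok, m.end()

def terminal(expected):
    """Match one whole token by equality, like a quoted terminal once a tokenizer is set."""
    def parse(text, pos):
        r = next_token(text, pos)
        return r if r is not None and r[0] == expected else None
    return parse

def number(text, pos):
    """Match any single token that is an AstNumber and return its value."""
    r = next_token(text, pos)
    if r is None or not isinstance(r[0], AstNumber):
        return None
    return r[0].value, r[1]

def rule(text, pos=0):
    """Analogue of: rule = number:b "+" number:c => [[ b c + ]]"""
    r = number(text, pos)
    if r is None:
        return None
    b, pos = r
    r = terminal("+")(text, pos)
    if r is None:
        return None
    _, pos = r
    r = number(text, pos)
    if r is None:
        return None
    c, pos = r
    return b + c, pos

print(rule("12 + 30"))  # (42, 7)
```

Each rule asks for a token only when it needs one, so no separate pass builds a token list up front.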

The '.' EBNF production means match a single object in the source sequence. Usually this is a character. With the replacement tokenizer it is either a number object, a string object or a string containing the operator. Using a tokenizer in language grammars makes it easier to deal with whitespace. Defining tokenizers in this way has the advantage of the tokenizer and parser working in one pass. There is no tokenization occurring over the whole string followed by the parse of that result. It tokenizes as it needs to. You can even switch tokenizers multiple times during a grammar. Rules use the tokenizer that was defined lexically before the rule. This is useful in the JavaScript grammar:

EBNF: javascript
tokenizer         = default 
nl                = "\r" "\n" | "\n"

tokenizer         = <foreign tokenize-javascript Tok>
...
End                = !(.)
Name               = . ?[ ast-name?   ]?   => [[ value>> ]] 
Number             = . ?[ ast-number? ]?   => [[ value>> ]]
String             = . ?[ ast-string? ]?   => [[ value>> ]]
RegExp             = . ?[ ast-regexp? ]?   => [[ value>> ]]
SpacesNoNl         = (!(nl) Space)* => [[ ignore ]]
Sc                 = SpacesNoNl (nl | &("}") | End)| ";"

Here the rule 'nl' is defined using the default tokenizer of sequential characters ('default' has the special meaning of the built in tokenizer). This is followed by using the JavaScript tokenizer for the remaining rules. This tokenizer strips out whitespace and newlines. Some rules in the grammar require checking for a newline, in particular the automatic semicolon insertion rule (managed by the 'Sc' rule here). If there is a newline, the semicolon can be optional in places.

"do" Stmt:s "while" "(" Expr:c ")" Sc    => [[ s c ast-do-while boa ]]

Even though the JavaScript tokenizer has removed the newlines, the 'nl' rule can be used to detect them since it is using the default tokenizer. This allows grammars to mix and match the tokenizer as required to make them more readable.
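To make the 'Sc' rule concrete, here is a small Python sketch of the same check working directly on characters. It is a simplification under stated assumptions: `statement_end` is a hypothetical name, 'Space' (elided in the excerpt above) is assumed to mean a space or tab, and the ";" alternative skips whitespace because in the grammar it is read through the JavaScript tokenizer.

```python
def statement_end(text, pos):
    """Return the position after a statement terminator at pos, or None.

    Mirrors: Sc = SpacesNoNl (nl | &("}") | End) | ";"
    """
    p = pos
    # SpacesNoNl: skip whitespace that is not a newline (assumed: space or tab).
    while p < len(text) and text[p] in " \t":
        p += 1
    # nl = "\r" "\n" | "\n", matched on raw characters.
    if text.startswith("\r\n", p):
        return p + 2
    if text.startswith("\n", p):
        return p + 1
    # &("}") is a lookahead: succeed without consuming the brace.
    if text.startswith("}", p):
        return p
    # End = !(.): nothing left to read.
    if p == len(text):
        return p
    # Second alternative: an explicit ";" token, restarting from pos.
    q = pos
    while q < len(text) and text[q] in " \t\r\n":
        q += 1
    if text.startswith(";", q):
        return q + 1
    return None

print(statement_end("\nfoo", 0))  # 1
print(statement_end(" ; x", 0))   # 2
print(statement_end(" x", 0))     # None
```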

The JavaScript grammar is in the peg.javascript.parser vocabulary. The tokenizer is in peg.javascript.tokenizer. You can run it using the 'parse-javascript' word in peg.javascript:

USE: peg.javascript
"var a='hello'; alert(a);" parse-javascript ast>> pprint
T{ ast-begin f
  V{
      T{ ast-var f "a" T{ ast-string f "hello" } }
      T{ ast-call f
        T{ ast-get f "alert" } V{ T{ ast-get f "a" } } }
  }
}

The grammar is only for a subset of a JavaScript-like language. It doesn't handle all of JavaScript yet. I'll continue to work on it as a testbed to improve EBNF. One thing I need to add next is decent error handling of a failed parse.

Tags: factor 

2008-05-27

Firefox HTML5 video and audio update

A week or so ago I updated the Linux build of the gstreamer based HTML5 video implementation with some fixes that make it work nicely with the public sites using <video>. This includes wikimedia and metavid. Videos on those sites give a much better user experience with that build than with previous builds.

Two new backends are in progress by other Kiwi Mozilla team members: a DirectShow backend for Windows being developed by Chris Pearce, and a QuickTime backend for Mac OS X being developed by Matthew Gregan.

The git repository has been updated to include the start of an <audio> element implementation. Currently only the gstreamer backend has that support. Audio plays but there is no support for the 'controls' attribute and therefore no user interface yet. You can build your own with JavaScript though.

The git repository is based on regular imports of the Mozilla CVS repository. This repository tracks Firefox 3 which is close to being released so updates to CVS are few and far between. The video/audio work will not be in Firefox 3. The plan is to have them in a release soon after, scheduled for around the end of the year. If all goes well this will have backends for gstreamer, DirectShow and QuickTime for the relevant platforms.

The Mozilla repository for this release is currently a Mercurial repository called mozilla-central. I've not yet migrated my work over to this repository but will do so soon. My Firefox git repository now has a branch containing a regular import of the mozilla-central Mercurial repository. Moving over to mozilla-central should just be a simple case of merging that branch into the video work.

The bugzilla bugs with the patches for this work will be updated in the next day or so. I wanted to get the refactoring for the <audio> element done before updating them for review.

Tags: mozilla 

2008-05-21

Tamarin Documentation

There have been some great weblog posts recently about the internals of Tamarin. If you're interested in more details about Tamarin, try the following:

For the brave there is an implementation of the Tamarin engine that runs as a scripting engine within Internet Explorer. Called 'Screaming Monkey' it is actually working and an alpha release is available. Screaming Monkey is based on the Tamarin Central engine, rather than the Tamarin Tracing engine.

Tags: tamarin 


This site is accessible over Tor as hidden service 6vp5u25g4izec5c37wv52skvecikld6kysvsivnl6sdg6q7wy25lixad.onion, or Freenet using key:
USK@1ORdIvjL2H1bZblJcP8hu2LjjKtVB-rVzp8mLty~5N4,8hL85otZBbq0geDsSKkBK4sKESL2SrNVecFZz9NxGVQ,AQACAAE/bluishcoder/-61/

