Bluish Coder

Programming Languages, Martial Arts and Computers. The Weblog of Chris Double.


2010-12-14

Simple Ur/Web Example

I've been admiring Ur/Web from afar for a while now and I've decided to dive into it and try it out on a project.

Ur/Web is described at the website as:

Ur is a programming language in the tradition of ML and Haskell, but featuring a significantly richer type system. Ur is functional, pure, statically-typed, and strict. Ur supports a powerful kind of metaprogramming based on row types.

Ur/Web is Ur plus a special standard library and associated rules for parsing and optimization. Ur/Web supports construction of dynamic web applications backed by SQL databases. The signature of the standard library is such that well-typed Ur/Web programs "don't go wrong" in a very broad sense.

Basically it's a programming language and system for building web applications, with a rich type system that prevents common web programming errors from occurring.

The compiled Ur/Web applications do not use a garbage collector and are supposed to be very efficient in RAM usage. Ur/Web can also be used for client side programming: it supports threads in the web browser, functional reactive programming for client side updates, and RPC calls from the browser to the server as well as from the server to the browser. Ur/Web has a good reference manual and an excellent online demo/tutorial.

For the project I want to write I need to be able to spawn command line processes and get the output. Marc Weber has written a library for Ur/Web that does this called uw-process.

I started with a simple example program, using uw-process, to get a taste of Ur/Web. The application displays a single page showing the current time on the server as returned by the date command. The time updates every second using Ur/Web's functional reactive programming system.

The clock.ur file implements the application. The getTime function uses uw-process to run date and return its standard output as a string:

fun getTime () =
  result <- Process.exec "date" (textBlob "") 100;
  return (Process.blobText (Process.blob (result)))

Process.exec takes three arguments: the command to run ("date" in this case), a text blob containing the standard input to supply to the command (an empty string here), and the maximum length of data to take from the standard output.

The main function is what will be executed when the web application is run:

fun main () =
  s <- source "" ;
  return <xml><head>
    <title>Current Server Time</title>
    </head>

    <body onload={spawn (timeLoop s)}>
      <dyn signal={t <- signal s; return <xml>{[t]}</xml>}/>
    </body>
  </xml>

This uses functional reactive programming on the client to continuously call the server asking for the time. We use Ur/Web's support for embedded XML to construct the result page:

return <xml><head>
  <title>Current Server Time</title>
  </head>

  <body onload={spawn (timeLoop s)}>
    <dyn signal={t <- signal s; return <xml>{[t]}</xml>}/>
  </body>
</xml>

The 'onload' handler for the body of the page has this syntax:

{spawn (timeLoop s)}

spawn is used to start a thread in the browser. Here we spawn a thread to run the 'timeLoop' function, passing it a source, s. Because this code runs on the client it is compiled into JavaScript by Ur/Web. The timeLoop function, also written in Ur/Web, is likewise compiled to JavaScript:

fun timeLoop s =
  t <- rpc (getTime ());
  set s t;
  sleep 1000;
  timeLoop s

timeLoop makes a remote procedure call to the server, executing the getTime function shown previously. The source s is set to the result and then the JavaScript thread sleeps for one second. When the second is up, a tail call loops back into timeLoop.

As this function is called from a JavaScript handler (onload), it's compiled into JavaScript to run on the client side. The RPC call is converted into an HTTP request to the server to call the getTime function and return the result.

In the body of the HTML we use the source s to display the time value so that it is automatically updated whenever it changes:

<dyn signal={t <- signal s; return <xml>{[t]}</xml>}/>

A source represents a dynamically changing value in the page. When the source changes, the page changes automatically. The part that changes is the dyn element. The signal attribute of that element contains Ur/Web code that calls signal on the source. This retrieves the changed value (the time in our example) and we return an XML fragment containing that value. The syntax {[t]} coerces the value of t to something safe to display in XML, taking into account cross-site scripting issues, HTML encoding, etc.

The remaining Ur/Web files are clock.urs, which contains the type signature for the main function exposed by our application, and clock.urp, which is the project file.
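
For reference, both of these files are tiny. A minimal clock.urs just exposes the signature of main:

val main : unit -> transaction page

And a clock.urp might look roughly like the following, assuming the uw-process checkout provides a project file that the library directive can find; the path here is illustrative and the actual files are in the repository linked below:

library uw-process

clock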

To build the example:

$ urweb clock

This produces a clock.exe (a strange name for the resulting file given we're on a Linux system...). Running this file will start a webserver on port 8080 that runs our application:

$ ./clock.exe

Visiting http://localhost:8080/main will display the server time, updated every second, in the browser.

This is obviously a pretty trivial example but it does show how easy it is to use threads in the browser and make RPC calls. The Ur/Web demos show much more, including database usage and various uses of the type system.

I've put this example on GitHub. It can be cloned, built and run with:

$ git clone git://github.com/doublec/urweb-clock
$ cd urweb-clock
$ git submodule init
$ git submodule update
$ make
$ ./clock.exe

I include uw-process as a git submodule and the makefile assumes Ur/Web is installed in /usr/local.

I haven't written anything about the type system or other features of Ur/Web that make it interesting yet but I hope to look at some of these in later posts.

Tags: urweb 

2010-11-24

Firefox video scaling during YCbCr to RGB conversion

When I changed to using the Chromium YCbCr color conversion code I didn't use the code that scales the video during the conversion, as there was no infrastructure in place to do this at the time.

What this meant was that we'd do the YCbCr conversion at the original image size and then later, during rendering, scale the result using the browser's scaling routines. This was slow in the case of large videos being scaled using the 'width' and 'height' attributes on video elements.

YouTube's HTML 5 video user interface often scales the video down. By default it plays a good quality video scaled down to fit the standard YouTube user interface. In Firefox this results in a performance hit due to the slow scaling.

Bug 577843 was raised to address this issue. The fix was to implement scaling at YCbCr conversion time using the scaling code that already existed in Chromium's YCbCr code that we were using. This has now landed and will be in Firefox 4 resulting in better video playback when scaled.

As part of this work I also updated to the latest Chromium code. This was done in bug 583138. This gives more performant conversion routines and better quality. This has also landed recently and will be in Firefox 4.

For performance comparisons to see if there were improvements in playback I tested video playback using WebM, Theora and a commonly used open source flash-based video player playing an MP4 video. I compared dropped frame statistics using stats available from the flash player and the experimental playback statistics I wrote about earlier.

For a 1080p sized video, scaled down, performance on my laptop improved by about 25% with these patches applied and was at least as good as the flash-based player. For the Theora video, performance was better - I suspect due to the lower CPU requirements for decoding Theora video.

[update from feedback: Removed comment about YouTube's fullscreen not being hardware accelerated. Apparently Firefox trunk now accelerates it. Clarified that by 'flash player' I meant an open source video player written in Flash, not an open source Flash implementation]

Tags: mozilla  firefox  video 

2010-11-23

More on type safety using C and ATS

I wrote previously about using the ATS programming language to safely use C libraries. I've recently been using ATS in a couple of projects and have had to access a few more C libraries. This post goes through a few more ways ATS can be used to make it harder to use C libraries incorrectly, by turning some classes of errors into typechecking failures.

For my project I needed to parse JSON requests. A good C library for this purpose is jansson, which has a solid implementation and good documentation.

JSON objects used in the jansson API are reference counted and the documentation is quite clear about which API calls take ownership of references and which don't. Even so there are some areas where seemingly correct code can cause problems. These edge cases are what I'd hope an ATS wrapper could detect at compile time.

Reference Counting

Mismanaging reference counts is an obvious source of error. The following C program fails to decrement the reference count of a JSON object, resulting in a memory leak:

#include <stdio.h>
#include <jansson.h>

int main() {
  json_t* a = json_string("this is a string");
  const char* s = json_string_value(a);
  printf("String value: %s\n", s);
  return 0;
}   

Running the resulting program under valgrind confirms the leak. The fix is to decrement the reference count for the JSON object:

json_t* a = json_string("this is a string");
const char* s = json_string_value(a);
printf("String value: %s\n", s);
json_decref(a); 

By modelling the JSON types as linear types in ATS we can ensure at compile time that the object is destroyed. This is done by declaring an 'absviewtype' for the 'json_t' equivalent, and in the function declarations we preserve or consume the linear type as needed. The ATS definitions to wrap the jansson code look like:

%{^
#include <jansson.h>
%}

absviewtype JSONptr (l:addr)

extern fun json_string
  (value: string)
  : [l:addr] JSONptr l = "#json_string"
extern fun json_string_value
  {l1:agz} (json: !JSONptr l1)
  : string = "#json_string_value"
extern fun json_decref
  {l:agz} (json: JSONptr l)
  : void = "#json_decref"
extern fun JSONptr_is_null
  {l:addr} (x: !JSONptr l)
  :<> bool (l == null) = "#atspre_ptr_is_null"
extern fun JSONptr_isnot_null
  {l:addr} (x: !JSONptr l)
  :<> bool (l > null) = "#atspre_ptr_isnot_null"

With these wrappers defined, the body of the 'main' function looks very similar to the C version:

implement main () = () where {
  val a = json_string("this is a string")
  val () = assert_errmsg(JSONptr_isnot_null a, "json_string failed")
  val s = json_string_value(a)
  val () = print(s)
  val () = print_newline()
  val () = json_decref(a)
}

The ATS version, however, provides a couple of extra safety guarantees. According to the jansson documentation, 'json_string' can return NULL. I've made this part of the type in the ATS definition of 'json_string'. I've declared 'json_string_value' as requiring a non-null pointer. ATS will not compile the program if we don't check 'a' for null.

I've declared 'json_decref' as consuming the linear type. This means if the call to 'json_decref' is excluded like the original C version then the program will fail to compile.

Borrowed Objects

In jansson some API functions return 'borrowed' objects. These are reference counted JSON objects where the reference count has not been incremented. The caller does not own the object. If the caller wants to keep the object around it needs to increment the reference count itself.

This is probably done for performance reasons, so the caller can optimize out reference counting if it's just retrieving the object to call a method on it and then discarding it.

The jansson documentation is very good at identifying which functions return borrowed objects and which ones return objects owned by the caller.

Here is an example C program that incorrectly uses a borrowed object after it has been destroyed:

#include <stdio.h>
#include <assert.h>
#include <jansson.h>

int main() {
  json_error_t e;
  json_t* a = json_loads("[\"a\", \"b\", \"c\"]", 0, &e);
  assert(a);
  json_t* a2 = json_array_get(a, 1);
  assert(a2);
  json_decref(a);
  const char* s = json_string_value(a2);
  printf("String value: %s\n", s);
  return 0;
}

Here the 'a2' object is used after 'a' is destroyed. But 'a2' is a borrowed object, owned by 'a'. When 'a' is destroyed, so too is 'a2'. This compiles and even runs (on some systems it may fail) but gives an incorrect result. Running under valgrind shows the memory errors.

The equivalent ATS program (note this won't compile):

implement main () = () where {
  var e: json_error_t?
  val a = json_loads("[\"a\", \"b\", \"c\"]", 0, e)
  val () = assert_errmsg(JSONptr_isnot_null a, "json_loads failed")
  val (pf_a2 | a2) = json_array_get(a, 1)
  val () = assert_errmsg(JSONptr_isnot_null a2, "json_array_get failed")
  val () = json_decref(a)
  val s = json_string_value(a2)
  val () = print(s)
  val () = print_newline()
}

This requires ATS definitions for the additional jansson calls:

abst@ype json_error_t = $extype "json_error_t"
extern fun json_array_get
  {n:nat} {l1:agz} (json: !JSONptr l1, index: int n)
  : [l2:addr] (minus(JSONptr l1, JSONptr l2) | JSONptr l2)
  = "#json_array_get"
extern fun json_loads
  (input: string, flags: int, error: &json_error_t?)
  : [l:addr] JSONptr l = "#json_loads" 

The 'json_array_get' function is defined to return a proof. 'minus' is an ATS standard library proof function. It basically states that the result of 'json_array_get' is owned by the object originally passed to it. This proof must be consumed when the result is no longer needed.

By requiring this proof, and requiring that it be consumed, we make the programmer delimit the lifetime of the borrowed object. If we don't consume the proof we get a compile error. If we call 'json_decref' on the owning object then we can't consume the proof later, as the owning object ('a' in this example) is consumed by the 'json_decref'. This forces the proof to be consumed within the lifetime of the owning object.

Another feature of the 'json_array_get' definition is that the 'index' parameter is defined as being a natural number. It is a compile error to pass a negative number as the index.

The correct ATS code becomes:

implement main () = () where {
  var e: json_error_t?
  val a = json_loads("[\"a\", \"b\", \"c\"]", 0, e)
  val () = assert_errmsg(JSONptr_isnot_null a, "json_loads failed")
  val (pf_a2 | a2) = json_array_get(a, 1)
  val () = assert_errmsg(JSONptr_isnot_null a2, "json_array_get failed")
  val s = json_string_value(a2)
  val () = print(s)
  val () = print_newline()
  prval () = minus_addback(pf_a2, a2 | a)
  val () = json_decref(a)
}

Note the call to the proof function 'minus_addback'. This is what returns the borrowed object to the owning object in the eyes of the proof system. Remember that these proof functions all occur at compile time. There is no runtime overhead.

We could control the lifetime of the borrowed object by incrementing its reference count and decrementing it when we're done, but this has runtime overhead. The point of 'borrowed' objects is efficiency, and here the type system lets us keep that efficiency while ensuring we're still correct. Of course, if we need 'a2' to outlive 'a' we can still manage its reference count explicitly.

Pointers to internal memory

Similar to the borrowed object issue are API calls that return pointers to memory internal to JSON objects. In the first example I used a call to 'json_string_value'. What's not documented is who owns this string memory. There is no API to free it, so the assumption is that it's owned by the JSON object somehow. Looking at the jansson source you can see that the string returned is actually a pointer to the internal string value. This means its lifetime is bounded by the lifetime of the JSON object.

There is also a 'json_string_set' call which lets you change the value of the string. This frees the old internal pointer and allocates a new one. Any pointer obtained from a prior call to 'json_string_value' will then point to freed memory if it is still held. Here's an example of a bad C program that does this:

#include <stdio.h>
#include <jansson.h>

int main() {
  json_t* a = json_string("this is a string");
  const char* s = json_string_value(a);
  json_string_set(a, "oops");
  printf("Old string value: %s\n", s);
  json_decref(a);
  return 0;
}

Calling 'json_string_set' before printing the string obtained from the earlier 'json_string_value' call results in the print using freed memory.

In ATS we can enforce the constraint that while any pointer returned by 'json_string_value' is live it is a compile error to call 'json_string_set', or to decrement the reference count of the owning JSON object. This is what the ATS code equivalent to the above looks like, and it won't compile:

implement main () = () where {
  val a = json_string("this is a string")
  val () = assert_errmsg(JSONptr_isnot_null a, "json_string failed")
  val (pf_s | s) = json_string_value(a)
  val _ = json_string_set(a, "oops")
  val () = print(s)
  val () = print_newline()
  prval () = JSONptr_minus_addback(pf_s, s | a)
  val () = json_decref(a)
} 

Removing the 'json_string_set' call, or moving it to after the 'JSONptr_minus_addback' call, allows it to compile. To get this to work I modified the wrappers for the jansson types so that they track how many strings obtained via 'json_string_value' are still live. The linear type for the 'json_t' pointer becomes:

absviewtype JSONptr (l:addr,n:int)

Previously it was parameterized over the pointer address. Now it includes an integer count. This count starts at zero when an object is created:

extern fun json_string
  (value: string)
  : [l:addr] JSONptr (l, 0) = "#json_string"

A call to 'json_string_value' increments the count, so the type of the JSON object after this call has a non-zero count:

extern fun json_string_value
  {l1:agz} {n:int} (
    json: !JSONptr (l1, n) >> JSONptr (l1, n+1)
  ) : [l2:addr] (JSONptr_minus (l1, strptr l2) | strptr l2)
  = "#json_string_value"

The ATS syntax 'X >> Y' means 'current_type_of_object >> type_of_object_after_call'. The return value has a 'JSONptr_minus' proof which is a modified version of the 'minus' proof discussed previously. It serves the same purpose but is simplified since it doesn't need to care about the 'n' value. There is an equivalent 'JSONptr_minus_addback' as well:

absview JSONptr_minus (addr, view)
extern prfun JSONptr_minus_addback {l1:agz} {v:view} {n:int}
  (pf1: JSONptr_minus (l1, v), pf2: v |
   obj: !JSONptr (l1, n) >> JSONptr (l1, n-1)): void

Notice that the 'JSONptr_minus_addback' call reduces the value of the 'n' parameter in the object's type. The trick to preventing 'json_string_set' from being called while any 'json_string_value' results are live is to declare 'json_string_set' as accepting only a JSON object whose count of live strings is zero. 'json_decref' is changed in a similar manner so that it can only be called on objects that don't have internal strings active:

extern fun json_string_set
  {l:agz} (
    json: !JSONptr (l, 0), value: string
  ) : int = "#json_string_set"
extern fun json_decref
  {l:agz} (json: JSONptr (l, 0))
  : void = "#json_decref"

With these changes the type system will prevent the user from making the mistake of keeping pointers to freed memory around.

Again, all this wrapper code is used for typechecking only. The runtime code generated by ATS has no counts of live strings and looks almost exactly like the original C code.

Conclusion

Wrapping a C library from ATS gives an appreciation of how difficult manual resource management can be when there is no help from the type system. Coming up with ways to use the type system to make usage safer also makes you more aware, when writing plain C, of exactly when you should be asking the question "Who owns this object/memory?".

Some of the methods in this post come from a discussion I started on the ATS mailing list. Read that thread for more on the ideas presented here.

There are other things that can be done to improve the use of jansson from ATS. Some of the jansson API functions return an integer representing success or failure. Checking of these values can be enforced. I describe a way to do this in my earlier article. The jansson API allows iteration over JSON objects. These should be wrapped to enforce safety guarantees for the lifetime of the iterator, modification of the object during iteration, etc.

Another possible improvement might be to ensure that API calls can't be made on JSON objects of the wrong type, for example calling 'json_string_value' on an array. This would force the object's JSON type to be checked before calling the type-specific functions.
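
A rough sketch of one way this could look, ignoring the string count index from the previous section. The sort and constructor names here are my own, not part of the existing wrapper:

datasort jtype = jstring | jarray | jobject

(* the pointer now also records which kind of JSON value it refers to *)
absviewtype JSONptr (l:addr, t:jtype)

(* only a string object can have its string value extracted *)
extern fun json_string_value
  {l:agz} (json: !JSONptr (l, jstring))
  : string = "#json_string_value"

(* only an array can be asked for its size; jansson actually returns a
   size_t, int is used here to keep the sketch simple *)
extern fun json_array_size
  {l:agz} (json: !JSONptr (l, jarray))
  : int = "#json_array_size"

Passing an array where a string object is expected then becomes a typechecking failure rather than a runtime surprise.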

I've put my wrapper for the jansson library on github at http://github.com/doublec/ats-jansson. It's still a work in progress but can be used to try out some of the ideas I've written about here.

Tags: ats 

2010-10-10

A Short Introduction to Bitcoin - A Peer to Peer Cryptocurrency

I recently added the ability to donate to TinyVid using bitcoins. The bitcoin website describes bitcoin as 'a peer-to-peer network based digital currency'.

Bitcoins are a 'virtual' currency. There are no physical coins involved. It's very similar to virtual currency in online games where people outside the game trade it for physical money.

The endpoints of a bitcoin transaction are known as 'addresses'. When you want to send bitcoins to another person you ask them for an address and instruct your bitcoin client to send a number of coins there. When you want to receive bitcoins you generate an address using the client and pass the address on to the other party.

The process by which transactions are handled via the peer-to-peer network is described at the bitcoin website. There is also a good high level overview in this Bitcoin Electronic Currency article.

To get started with bitcoins, download the client or build it from source, available at the bitcoin website. Once running it will connect to the bitcoin network and proceed to download the existing blocks. This can take a few minutes as there are about 84,000 of them. Once that's done you can send and receive bitcoins.

You can receive some free bitcoins for testing from the bitcoin faucet. You'll need to create an address to receive the coins. From the bitcoin client choose 'New' to create a new receiving address. You can optionally give it a label so you can refer to it by name. Copy that new address to the clipboard and paste it into the faucet website. When you click 'Get Some' you will receive a small number of coins.

The received coins will appear in the bitcoin client GUI almost immediately. The initial state will display as 'unconfirmed/0'. As more bitcoin nodes pick up the transaction the number will increase until eventually it says 'confirmed'. At this point you can use the coins with confidence. You can actually transfer and spend the coins while they are 'unconfirmed' but all transactions using them will also stay unconfirmed until the original one is confirmed.

If you want to send those coins elsewhere all you need to do is click 'Send Coins' in the client. Paste in a receiving address and an amount and click 'Send'.

How do you get coins in the first place? You need to find a merchant that's willing to trade you coins for something. Here are some merchants that will trade coins for cash or credit card payments:

  • bitcoinexchange.com (service no longer active). Buy and sell bitcoins using Euros via SMS/Phonepayment or bank transfer.
  • bitcoin4cash.com (service no longer active). This service offers bitcoins for cash sent through the mail. They also offer a service to pay for bitcoins via bank wire (service no longer active).
  • bitcoin2cash.com (service no longer active). Another service offering bitcoins for cash sent through the mail.

Online markets exist for trading bitcoins to and from various currencies. The most popular of these is probably mtgox.com (service no longer active). You deposit bitcoins or USD and can then buy and sell coins. You can withdraw USD or bitcoins. Currently bitcoins are trading at about 11 bitcoins per USD.

Bitcoin Watch has a list of the markets and the current exchange rates. It also has information on the state of the bitcoin network.

Bitcoins can be used in a number of places. Some I know about are listed below and the bitcoin trade (service no longer active) page lists more:

  • Cashcow Online Roulette (service no longer active). A roulette game written in Flash that uses bitcoins. Sign on with any credentials, deposit some coins, and once they're confirmed you can play roulette with them. This is a good example of how the sending and receiving of bitcoins can work.
  • Bidding Pond (service no longer active). An online auction site that uses bitcoins.
  • Minesweeper (service no longer active). A facebook app that uses bitcoins.
  • Bitcoin Sportsbook (service no longer active). Bet on sports events.
  • Qugelmatic Bookstore (service no longer active). Buy books through the 13th largest bookstore on eBay using bitcoins.
  • Auckland Bitcoin ATM - a bitcoin ATM available in Auckland, New Zealand.

Adding the ability for a website to send and receive payments using bitcoin is easy. The bitcoin client can run a JSON-RPC server that provides a full featured API. There are bindings for a number of different languages. I wrote and contributed a Factor bitcoin API recently. This is what I use on the TinyVid donation page to display the payment address and the current donation balance.
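
As an example of the JSON-RPC interface, with the client started with the -server switch (or running as bitcoind) and an rpcuser/rpcpassword set in bitcoin.conf, a new receiving address can be requested with a raw HTTP call. The credentials and label below are placeholders:

$ curl --user rpcuser:rpcpassword \
    --data-binary '{"id": "1", "method": "getnewaddress", "params": ["donations"]}' \
    http://127.0.0.1:8332/

The JSON response carries the new address in its 'result' field, and methods like 'getbalance' and 'sendtoaddress' are called the same way.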

To receive via bitcoin you don't need to have the client running all the time. You can run the client, generate an address, publish that address to receive money and then close the client. At a later date you can run the client again and it will catch up with the current state of the network, and any money sent to your address will then be available.

You can even avoid running a bitcoin client completely. MyBitcoin (service no longer active) is a service that essentially runs one for you. You can create receiving addresses, send money, etc., all from the website. They also provide a shopping cart interface to integrate payments into a website.

I'm interested in the approach of using bitcoin as a way to do micro-transactions and ease online payments and donations. I'll see how it works out on TinyVid and follow up at a future time with results.

Tags: bitcoin  tinyvid 

2010-09-15

Changes to Firefox Video Implementation

I mentioned in my previous post about the Firefox video implementation that some changes were going to be made to make our implementation more spec compliant. These have now landed and will be in Firefox 4. The relevant bugs are:

  • Bug 571822 - Fire timeupdate less frequently than once per frame
  • Bug 584615 - Make media progress events be simple events

Prior to these landing we would try to fire a 'timeupdate' event every time a video frame changed. This allowed JavaScript applications to listen for this event and do things (like display subtitles) at fairly accurate time intervals. Unfortunately other browsers fire this event at different rates, usually with about 200ms between events. This made applications non-portable across browsers if they relied on per-frame events. As a result we've moved to firing 'timeupdate' no more frequently than every 250ms. If you need to process time changes more often than this then it's best to use 'setTimeout' or 'setInterval' to query 'currentTime' at the rate you need, as in the sketch below. This will work across all browsers that support the media API.
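
For example, instead of relying on 'timeupdate' a page can poll 'currentTime' itself. A minimal sketch, where updateSubtitles is a hypothetical callback and 40ms an arbitrary polling rate:

var video = document.querySelector("video");
setInterval(function () {
  // 'timeupdate' now fires no more often than every 250ms,
  // so poll the playback position at the rate we need.
  updateSubtitles(video.currentTime);
}, 40);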

The other change was to remove the download progress information from the 'progress' events. This was driven by a spec change. The 'progress' events used to contain additional information about the byte position and total number of bytes for the download. This additional information has been removed from the spec and we were the only browser that implemented that part. To prevent cross-browser issues we've removed the information too. The alternative is to use the 'buffered' attribute on the media element and work with time data. 'buffered' support also landed recently, which is why we can now remove the non-compliant 'progress' event data.
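
For example, a download progress indicator that used the old byte information can derive a similar figure from the 'buffered' time ranges. A rough sketch:

var video = document.querySelector("video");
video.addEventListener("progress", function () {
  var ranges = video.buffered;  // a TimeRanges object
  if (ranges.length > 0 && video.duration > 0) {
    // End time, in seconds, of the last buffered range.
    var bufferedEnd = ranges.end(ranges.length - 1);
    var percent = Math.round(100 * bufferedEnd / video.duration);
    // percent approximates how much of the media has been buffered
    // and could drive a progress bar in the page
  }
}, false);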

Tags: mozilla  firefox  video 


This site is accessible over Tor as hidden service 6vp5u25g4izec5c37wv52skvecikld6kysvsivnl6sdg6q7wy25lixad.onion, or Freenet using key:
USK@1ORdIvjL2H1bZblJcP8hu2LjjKtVB-rVzp8mLty~5N4,8hL85otZBbq0geDsSKkBK4sKESL2SrNVecFZz9NxGVQ,AQACAAE/bluishcoder/-61/

