Bluish Coder

Programming Languages, Martial Arts and Computers. The Weblog of Chris Double.


2006-06-02

Server side Javascript

An update to the code demonstrating E4X support is here.

Ajaxian is reporting that Sun is releasing Phobos, a Javascript application server. This is a web server that runs Javascript for implementing the server side code. The source has not yet been released, according to Ajaxian, but will be.

A while back I worked on doing server side Javascripting using Rhino as the interpreter and Jetty 6 as the server. I got it to work but never tidied it up for release. Sun's Phobos has motivated me to make it available so I've tidied it up and it can be downloaded from javascript-server.tar.gz. A darcs repository with the code is here:

darcs get http://www.bluishcoder.co.nz/repos/javascript-server

The repository contains the Jetty 6 JAR files and the Rhino interpreter JAR, along with an example Javascript file showing how it works and a readme.

Once the 'example.js' is loaded into the Rhino interpreter you can start a Jetty 6 web server on port 8080 with the following command:

var s = startServer(8080);

A simple 'HelloWorld' style servlet looks like this in Javascript:

HelloWorldServlet = makeServlet({
  ProcessGet: function (req, resp) {
    var text = "Hello World!";
    resp.setContentType("text/plain");
    resp.setContentLength(text.length);
    resp.getOutputStream().print(text);
    resp.flushBuffer();
  }
});

It has a 'ProcessGet' function which is called when an HTTP GET is made. The 'req' and 'resp' objects are the standard Java HttpServletRequest and HttpServletResponse respectively.

The magic of deriving from HttpServlet is done by Rhino. But it can't actually override HttpServlet methods - it can only override abstract methods. To fix this I created a JavascriptServlet Java class which forwards doGet(...) to an abstract 'ProcessGet', which is the method you see overridden above.
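The glue could be sketched like this; a hypothetical version of 'makeServlet' built on Rhino's JavaAdapter (the JavascriptServlet class name matches the one described above, but this only runs inside Rhino with the servlet JARs on the classpath, not in a standalone Javascript engine):

```javascript
// Hypothetical sketch of makeServlet using Rhino's JavaAdapter.
// JavascriptServlet is assumed to be a Java class extending
// HttpServlet whose doGet(...) forwards to an abstract ProcessGet.
function makeServlet(impl) {
  return new JavaAdapter(Packages.JavascriptServlet, impl);
}
```

JavaAdapter returns a Java object whose abstract methods are implemented by the properties of the Javascript object passed in, which is why ProcessGet in the example above gets called.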

With the Jetty server we started running from the interpreter earlier we can add this servlet dynamically:

addServlet(s, "/", HelloWorldServlet);

Now requests to http://localhost:8080/ will run HelloWorldServlet. We can dynamically add other servlets too:

addServlet(s, "/bye", GoodbyeWorldServlet);

Requests to http://localhost:8080/bye/ will run the GoodbyeWorldServlet.

Using the Jetty API you can work out how to add and remove servlets and do other interesting things, like playing with Jetty 6's Ajax continuation support.

The server can be stopped with:

s.stop();

Running Javascript on both the client and server gives some interesting possibilities: sharing code, for example, or writing validation rules in Javascript which run on both the client and the server.
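As a made-up example of the shared validation idea, a rule written as plain Javascript can be loaded unchanged into both the browser and the Rhino interpreter:

```javascript
// A made-up validation rule that could run unchanged on both the
// client (browser) and the server (Rhino): check a quantity field.
function validateQuantity(value) {
  var n = parseInt(value, 10);
  if (isNaN(n)) return "Quantity must be a number";
  if (n < 1 || n > 99) return "Quantity must be between 1 and 99";
  return null; // null means the value is valid
}
```

The browser uses it for immediate feedback; the server re-runs the same function so the rule can't be bypassed.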

The Dojo javascript toolkit can run client side or server side. This means you could use Dojo's packaging and other nice features for server side development in Javascript.

Rhino has continuations which are serialisable. A continuation based web server could be implemented which serialises continuations, like SISCWeb.

Download javascript-server, play with it, and let me know what you do with it. I hope to follow up on some of the above ideas too.

Tags: javascript 

2006-06-01

Amazon S3

I've been playing around with Amazon S3 for a project I'm working on and I'm quite impressed with it. Amazon S3 provides unlimited storage for $0.15 per gigabyte storage and $0.20 per gigabyte of bandwidth.

Once you sign up and get an account you can create 'buckets', which are named collections of stored objects. Each bucket acts as a billing point, providing a way to separate data stored for different applications so you can track bandwidth, storage utilisation, cost, etc.

Within a bucket you can store and retrieve objects by key. There is no means to update or append an object so it's either create a new one or replace an existing one completely. You can use 'partial get' in HTTP to retrieve partial contents of an object though.

The S3 datastore can be managed via a REST based HTTP API or via SOAP. Amazon has provided a number of sample client libraries in different languages to get you going quickly, including Java, Ruby, Perl and Python.

The objects stored can be made publicly accessible via an HTTP URL, or private, allowing only the owner to access them. Every object stored has automatic BitTorrent support. By appending a torrent suffix onto the URL for the object you get a BitTorrent file, allowing you to use the capabilities of the BitTorrent protocol to share the bandwidth of downloading large files.

People have already started building interesting libraries on top of it. For example:

  • S3Ajax. This is a Javascript library that lets you call the S3 API from within the web browser. By hosting the Javascript files on your S3 storage and making them publicly accessible you get a simple web site. As it is hosted on the S3 domain, the Javascript functions can call the S3 API, providing read/write access to the storage. S3Wiki uses this, for example.
  • This thread in the S3 forums discusses a filesystem built on top of S3. It works under Linux and can be mounted like a normal drive. Each block in the filesystem is stored as an object in the S3 datastore. It can only be mounted on one system at a time, but once unmounted it can be mounted safely on another machine, providing unlimited storage that can be accessed from any Linux system.
  • JungleDisk has a similar idea but uses a local WebDAV server and transparently copies and encrypts data to the S3 datastore. It can be used on Windows, Linux or Mac OS X. The main difference between JungleDisk and the S3 filesystem above is that JungleDisk is cross platform. It also stores files into S3 objects on a one-to-one mapping I believe, whereas the S3 filesystem stores the blocks that make up the files into objects. The latter approach makes it easier to support partial access to files and streaming without having to download the entire object first, though at the risk of making the file harder to reconstruct if things go wrong.

I'm using S3 for holding uploads of media provided by users for later analysis and classification. I don't know how much data the system will eventually collect, so having an unlimited (except by cost) storage capability is useful. It also means I don't have to pay for storage up front; I can pay as I go.

So how is S3 different from something like Openomy? My understanding is that the main intent of Openomy is to provide a place for users to store the data from web applications, with the user owning that data. An Openomy compliant web application would be granted access to an area of the user's Openomy storage and can read/write to it. Should the web application go out of business or become inaccessible, the user still has the data in their control.

With S3 the usage model is that the web application uses its own S3 store for storing data. The user does not get S3 storage and provide it to the web application; to do that would require giving up their 'secret key', which can't be revoked, and the web application could then access all of the user's data. This would be bad. With Openomy you can authorise or prevent an application from accessing the data. Both usage models are useful and I think they are complementary services.

Currently I'm using S3 from Javascript on the server. It's very easy to call the S3 Java API from Rhino. To import the basic Java classes into Rhino and create an authenticated connection:

importClass(Packages.com.amazon.s3.AWSAuthConnection);
importClass(Packages.com.amazon.s3.S3Object);

var conn = new AWSAuthConnection(accessKeyId, secretAccessKey);

The 'accessKeyId' and 'secretAccessKey' are the keys supplied by Amazon once you've subscribed to the S3 web service. Once you have a connection you can create buckets and store and retrieve objects:

js> conn.createBucket('mybucket1', null).connection.getResponseMessage()
OK
js> conn.listBucket("mybucket1", null, null, null, null).entries
[]
js> var obj1 = new S3Object(new java.lang.String("Hello!").getBytes(), null);
js> conn.put('mybucket1', 'key1', obj1, null).connection.getResponseMessage();
OK
js> conn.listBucket("mybucket1", null, null, null, null).entries
[key1]
js> new java.lang.String(conn.get('mybucket1', 'key1', null).object.data);
Hello!

Wrapping this in a nicer server side Javascript API would probably be a good idea. For example, I can't call the 'delete' method of AWSAuthConnection directly, as 'delete' is a reserved word in Javascript. As Rhino allows serialising any Javascript object, you could even store continuations in S3.
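One workaround for the reserved word problem is bracket notation, which stops the parser from seeing 'delete' as a keyword. A mock object stands in for the real AWSAuthConnection here:

```javascript
// 'delete' is a reserved word, so conn.delete(...) is a syntax error
// in engines like Rhino. Bracket notation sidesteps the parser.
// A mock object stands in for the real AWSAuthConnection:
var conn = {
  "delete": function (bucket, key) {
    return "deleted " + bucket + "/" + key;
  }
};

var result = conn["delete"]("mybucket1", "key1");
```

A wrapper API could expose the same operation under an unreserved name like 'remove'.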

Tags: javascript 

2006-05-27

Seaside for Dolphin Smalltalk

A preliminary port of Seaside, the continuation based web server framework, for Dolphin Smalltalk has been released.

Tags: seaside 

2006-05-27

Python to Javascript

Simon posted a comment on my post about the C# to Javascript compiler pointing out that there is a Python to Javascript system.

So far I've come across converters to Javascript from the following languages:

Even the JSScheme system might count, as the JIT essentially compiles Scheme into Javascript. I've been able to run this from Rhino to compile the Scheme code to Javascript and run it in the browser. All it would need is some nice libraries to wrap the DOM and use AJAX.

Tags: javascript 

2006-05-26

Continuation based Web Servers

There's a lot of discussion going around at the moment about continuation based web servers. Some of the comments I've seen seem to be based on a misunderstanding of what exactly this type of server provides.

Ian Griffiths has a post about why he thinks continuation based web frameworks are a bad idea.

Ian writes about 'Abandoned Sessions':

This is very much not analogous to the function returning or throwing an exception. In the world of our chosen abstraction - that of sequential execution of a method - it looks like our thread has hung.

The problem with this is that a lot of the techniques we have learned for resource management stop working. Resource cleanup code may never execute because the function is abandoned mid-flow.

The problem of a user exiting an interaction in 'mid-flow' is not unique to continuation based web servers. If the user is stepping through a multi-form shopping cart checkout process then the data they have entered must be stored somewhere.

If it's in the database or web session then this data will live for a specified time (usually the session timeout) and if the user doesn't continue the flow then it is deleted.

The same logic occurs in continuation based servers. The continuation, if stored on the server, holds the data that the user has entered during the flow. After a set timeout the continuation is removed. In some systems the continuation data is stored in the web session so this is done automatically.
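A minimal sketch of such a continuation table with a timeout, with made-up names and the clock passed in so expiry is explicit (a real framework would store actual captured continuations rather than plain functions):

```javascript
// Hypothetical continuation table with expiry. Keys are handed to the
// client (e.g. embedded in a URL); entries older than timeoutMs are
// treated as abandoned, just like a session timeout. 'now' is a
// function returning the current time, injected for clarity.
function makeContinuationStore(timeoutMs, now) {
  var table = {};
  var nextId = 0;
  return {
    save: function (k) {
      var id = "k" + (nextId++);
      table[id] = { k: k, saved: now() };
      return id;
    },
    resume: function (id) {
      var entry = table[id];
      if (!entry || now() - entry.saved > timeoutMs) {
        delete table[id]; // expired or unknown: flow abandoned
        return null;
      }
      return entry.k;
    }
  };
}
```

Note that resuming does not remove the entry, so the same continuation can be resumed more than once, which is what makes the back button work.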

Like any other web framework the continuation data can be stored in a form field on the client's browser (assuming the framework allows serialisable continuations) so there is no need for this 'garbage collection' to occur.

Ian is right that it is important not to acquire a resource before a continuation capture boundary and release it after. If the user never returns to continue the flow then it may never be cleaned up... unless your language with continuations also has a 'dynamic-wind' construct. This is like try/finally in Java, except you can have code run whenever a block is entered or exited for any reason. So when a block is exited due to a continuation escape the resource is automatically released, and when it is entered again it is automatically re-acquired. This approach removes the worry about hanging resources.
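The shape of dynamic-wind can be sketched in Javascript, though without first-class continuations only the try/finally half is expressible; the re-entry behaviour on a continuation resume is exactly what dynamic-wind adds over this:

```javascript
// A sketch of the dynamic-wind shape: 'before' runs on entry and
// 'after' runs on exit, whether the body returns normally or throws.
// Real dynamic-wind also re-runs before/after when a continuation
// jumps into or out of the body, which plain Javascript cannot do.
function dynamicWind(before, body, after) {
  before();
  try {
    return body();
  } finally {
    after();
  }
}
```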

It's important to remember that the 'thread' running the web request doesn't hang when the continuation is captured. After capturing it the thread is gracefully exited in the normal manner while the web page data is returned to the user.

Ian continues with Thread Affinity:

With an ordinary sequentially executing function, I can safely assume one thread will run the function from start to finish. But if I'm using continuations to provide the illusion that I've got sequential execution spanning multiple user interactions, then I might get a thread switch every time I generate a web page.

You'll get a thread switch every time the user makes a request in a standard framework that supplies threads from a thread pool. I've not had a situation where switching threads has been a problem. Given the 'dynamic-wind' feature mentioned before, anything that requires thread affinity can be released and re-acquired in the new thread when the continuation is resumed.

The same goes for the 'Web Farms' issue. In a system where the continuation can be serialised or sent to another machine then another machine on a web farm can deserialise the continuation and continue. If this is not the case then session affinity can be used to ensure that a request in the same session is processed by the same machine.

Ian has a number of issues with back button and branching. The nice thing about continuation based frameworks is that this is all handled for you.

If the user hits the back button then they go back to a previous continuation. A continuation is a snapshot of a stack frame so has access to a copy of all local variables at the time that it was captured. This means that the back button 'just works'. Ian writes:

Normal functions don't do that - they only jump back to earlier points if you use flow control constructs such as loops. Giving the user the option to inject goto statements at will is an unusual design choice, but anything that models user journeys as sequential execution of code will have to cope with this kind of rewinding, or it'll break the back button. And I don't know about you, but I hate sites that break the back button. (Yes, Windows Live Search, I'm looking at you.)

I completely agree that breaking the back button is a bad thing. With a continuation based framework you get a working back button for free because the execution stack is wound back to the point of the continuation for the page the user went back to.

Most systems let you choose not to have state wound back, by storing data in the database or keeping it globally in memory. Sometimes you don't want the user to go back. For example, if they've just processed a credit card you don't want them to go back and resubmit the form.

The way to do this is to have a way of marking a block as 'run-once'. If a continuation captured within that block is resumed a second time then an error occurs, either displaying a message (don't submit twice) or going back to a known safe point. The framework handles this for you. This is easy to implement because it's as simple as checking a flag at the beginning of the continuation entry to see if we've been there before.
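The flag check could be sketched like this (the names are made up):

```javascript
// Hypothetical 'run-once' guard: the first resumption runs the
// block; any later attempt (back button, double submit) errors out.
function runOnce(fn) {
  var done = false;
  return function () {
    if (done) {
      throw new Error("This step has already been submitted");
    }
    done = true;
    return fn.apply(this, arguments);
  };
}
```

A framework would catch the error and redirect to a known safe page instead of showing it to the user.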

Session cloning, or 'bifurcating' as Ian calls it, is handled just as easily. By opening a tab or new window on the existing page you get a resumption of the continuation the user is already looking at. Since these are copies of the execution stack the user effectively gets a copy of all the local variables. Modifying these has no effect on the information on the original page. So there is no need for the programmer to guard against concurrency here as Ian seems to think:

This means I have to write my function in such a way that it can cope not only being rewound, but also to being split so that multiple threads execute the function simultaneously, each taking different paths. But of course because I'm using continuations, each of these threads gets to use the same set of local variables. The fact that I enabled users to inject gotos into my code at will is now looking like a walk in the park - now they can add arbitrary concurrency!

You can choose to have state that does not get cloned of course. Just store it in the database (for example, the shopping cart).

Some things you want to be 'global' and unaffected by the user using the back button, bookmarks and cloning. These can be stored in the database or some other persistent store. The shopping cart is the obvious example: you wouldn't want the user hitting 'back' and losing the last item put in the shopping cart, or cloning the session to result in items not being added. Other things you may want wound back (some types of data entered in a form, etc.). These can be local variables. Continuation based servers give you the choice.

I highly recommend reading some of the papers mentioned in a previous posting of mine. They describe in much more detail some of the advantages (and disadvantages) of using continuations to model web flow.

Tags: continuations 


This site is accessible over tor as hidden service 6vp5u25g4izec5c37wv52skvecikld6kysvsivnl6sdg6q7wy25lixad.onion, or Freenet using key:
USK@1ORdIvjL2H1bZblJcP8hu2LjjKtVB-rVzp8mLty~5N4,8hL85otZBbq0geDsSKkBK4sKESL2SrNVecFZz9NxGVQ,AQACAAE/bluishcoder/-61/

