2008-05-20
Implementing Native Methods in Tamarin Tracing
2008-05-20: Minor update to get things working with latest Tamarin Tracing code, and updated times for test runs.
Tamarin Tracing can be extended by creating native methods. These are methods of a class where the implementation is in C rather than JavaScript.
For this example I'll use a native implementation of the Fibonacci function and compare it to the JavaScript version from my previous post.
A JavaScript function that is implemented in C is declared using the 'native' modifier in the JavaScript source. For example, a natively implemented 'fib' function would be declared in JavaScript as:
public native function fib(n:int):int;
Notice that this includes the type of the arguments and the return type. This is so the compiler can produce the correct C types in the C stub code it generates.
The native method must be implemented in C and linked into the final executable. The name of the function is in the following form:
[class]_[visibility]_[name]
In this fib example there is no class, so 'null' is used for that part of the name, and the visibility is public, so that part is left out. The end result is that a native C function called null_fib needs to be implemented.
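To make the naming rule concrete, here is a minimal sketch of it as a small C++ helper. The function native_symbol_name is hypothetical and not part of Tamarin; it only encodes the two cases this post relies on (no class becomes 'null', public visibility is omitted). The non-public branch is my assumption about how the pattern reads and is not checked against the real generator.

```cpp
#include <string>

// Hypothetical helper, not part of Tamarin: builds the C symbol name from
// the [class]_[visibility]_[name] rule described above.
std::string native_symbol_name(const std::string& cls,
                               const std::string& visibility,
                               const std::string& name) {
    // No enclosing class: the example in this post uses "null".
    std::string result = cls.empty() ? "null" : cls;
    // Public visibility is left out of the name entirely.
    // Assumption: any other visibility is inserted verbatim.
    if (!visibility.empty() && visibility != "public") {
        result += "_" + visibility;
    }
    result += "_" + name;
    return result;
}
```

Calling native_symbol_name("", "public", "fib") gives "null_fib", the name used in the rest of this post.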
As part of the compilation process the compiler generates a C structure that will be accessed by the native implementation to extract the arguments passed to it. This structure looks like:
struct null_fib_args
{
    public: ScriptObjectp /*global1*/ self; private: int32_t self_pad;
    public: int32_t n; private: int32_t n_pad;
    public: StatusOut* status_out;
};
The 'n' field of the structure is the argument passed from JavaScript callers. The native implementation, which we need to write, looks like this:
int32_t native_fib(int32_t n) {
    if(n <= 1)
        return 1;
    else
        return native_fib(n-1)+native_fib(n-2);
}
AVMPLUS_NATIVE_METHOD(int32_t, null_fib)
{
    return native_fib(args->n);
}
First there is the native_fib C function that we want to call from JavaScript. The AVMPLUS_NATIVE_METHOD macro is used to declare the wrapper function that implements the 'native function fib' we declared in the JavaScript file. This receives an 'args' object that is an instance of the null_fib_args C structure mentioned previously. This is used in our example to extract the passed integer value and call the native C function and return the result.
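If you want to try the wrapper pattern without building Tamarin, the following is a rough standalone sketch. Everything Tamarin-specific is replaced by stand-ins: the structure is cut down to the one field used here, and SKETCH_NATIVE_METHOD is a hypothetical macro I made up for illustration. The real AVMPLUS_NATIVE_METHOD lives in the Tamarin Tracing headers and its expansion may well differ.

```cpp
#include <cstdint>

// Simplified stand-in for the generated structure: only the field used here.
struct null_fib_args {
    int32_t n;
};

// Hypothetical stand-in for AVMPLUS_NATIVE_METHOD. It declares a function
// taking a pointer to the matching <name>_args structure.
#define SKETCH_NATIVE_METHOD(ret, name) ret name(const name##_args* args)

// The plain C++ function we want to expose; same definition as in the post.
int32_t native_fib(int32_t n) {
    if (n <= 1)
        return 1;
    return native_fib(n - 1) + native_fib(n - 2);
}

// The wrapper pulls the argument out of the args structure and forwards it.
SKETCH_NATIVE_METHOD(int32_t, null_fib)
{
    return native_fib(args->n);
}
```

Filling in a null_fib_args with n set to 30 and passing its address to null_fib returns the same value the test runs later in this post print for fib(30).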
Native function implementations must be linked into the Tamarin Tracing executable. It's not possible to compile a JavaScript file containing a native declaration and run it with the stock 'avmshell' program, because the C implementation has to be part of the executable. To integrate the fib code into 'avmshell' I modify the shell code to compile and link in the native implementation. We can then write JavaScript code that calls it and run it with 'avmshell'.
The first thing to do is write the JavaScript side of the 'fib' code. In a 'fib.as' file in the 'shell' directory of tamarin tracing I have the following code:
package testing {
    public function fib(n) {
        if(n <= 1)
            return 1;
        else
            return fib(n-1) + fib(n-2);
    }
    public native function fib2(n:int):int;
}
This provides a JavaScript implementation of Fibonacci called 'fib' and a second function called 'fib2', which is declared native and will be implemented in C so I can compare the speed.
This file needs to be compiled to abc bytecode and have the args structure generated in a C header file. There is a script, shell.py, in the 'shell' subdirectory that does this for the other avmshell classes. Changing the line following the comment 'compile builtins' so it includes the 'fib.as' file just created will result in it being included in the build.
This line in shell.py compiles the JavaScript files using the Flex SDK compiler (see the end of this post for where to get it and where to put it). The command it runs is something like:
java -jar asc.jar -import builtin_full.abc ... fib.as
This produces the abc bytecode for our fibonacci code, as outlined in my previous post.
The next command run by 'shell.py' is the Flex Global Optimizer. This takes all the abc bytecode files for the shell, optimizes them, and produces a C header and implementation file. It is these C files that contain the generated arguments structure, and the implementation file actually contains a C array of the optimized bytecode. The output of this step will be compiled by a C compiler and linked into the 'avmshell' executable.
The native C implementation of the 'fib2' function should be placed in a file in the 'shell' subdirectory and that file added to the 'manifest.mk' makefile. The contents of this file for this example are:
#include "avmshell.h"
#include <stdlib.h>
namespace avmplus
{
    int32_t native_fib(int32_t n) {
        if(n <= 1)
            return 1;
        else
            return native_fib(n-1)+native_fib(n-2);
    }
    AVMPLUS_NATIVE_METHOD(int32_t, null_fib2)
    {
        return native_fib(args->n);
    }
}
I called this 'fibimpl.cpp' and added it to manifest.mk. You'll see in the 'shell' subdirectory various implementations of native methods in [foo]Class.cpp files, where [foo] is the JavaScript class being implemented. There are also [foo].as files which have the JavaScript side of the implementation.
To build our new 'avmshell' which is able to call our native fibonacci implementation, run 'shell.py', and do the configure and make steps as outlined previously:
$ mkdir mybuild
$ cd mybuild
$ ../tamarin-tracing/configure --enable-shell
$ make
I wrote two simple test files to test the 'fib' and 'fib2' functions:
$ cat fib.as
import testing.*;
print("fib 30 = " + fib(30));
$ cat fib2.as
import testing.*;
print("fib 30 = " + fib2(30));
Here are some simple timings on my machine with the tracing jit enabled and disabled:
$ time ./shell/avmshell fib.abc
fib 30 = 1346269
real 0m0.417s
user 0m0.384s
sys 0m0.020s
$ time ./shell/avmshell fib2.abc
fib 30 = 1346269
real 0m0.092s
user 0m0.060s
sys 0m0.020s
$ time ./shell/avmshell -interp fib.abc
fib 30 = 1346269
real 0m7.496s
user 0m7.448s
sys 0m0.004s
$ time ./shell/avmshell -interp fib2.abc
fib 30 = 1346269
real 0m0.070s
user 0m0.060s
sys 0m0.004s
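To put those numbers side by side, here is a small sketch that just divides the 'real' times from the runs above. The constant and function names are mine. These are single runs on one machine, so treat the ratios as rough: the JIT runs the JavaScript fib about 18 times faster than the interpreter, the native version is about 4.5 times faster than the JIT-compiled JavaScript, and about 107 times faster than the interpreted JavaScript.

```cpp
// Wall-clock ("real") seconds copied from the four runs above.
const double jit_js        = 0.417;  // avmshell fib.abc
const double jit_native    = 0.092;  // avmshell fib2.abc
const double interp_js     = 7.496;  // avmshell -interp fib.abc
const double interp_native = 0.070;  // avmshell -interp fib2.abc

// How many times longer the slower run took compared to the faster one.
double speedup(double slower_seconds, double faster_seconds) {
    return slower_seconds / faster_seconds;
}
```

For example, speedup(interp_js, jit_js) compares the interpreter against the tracing JIT on the pure JavaScript version.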
Another way of extending Tamarin Tracing is via Forth. I'll cover that in a later post.
Earlier I mentioned that the build needs the Flex ActionScript compiler and global optimizer, both of which come from the asc.jar file. Unfortunately Tamarin Tracing needs a bleeding edge version of this to generate the correct C code. A recent version can be obtained from Mozilla public ftp. It should be placed in the 'utils' subdirectory to be picked up by the scripts. Even more unfortunately, this version is out of date for the latest code in the Mercurial repository. Hopefully this situation will be rectified soon, but in the meantime you can go back to changeset 302 in the Mercurial repository, which is what I tested the current asc.jar against.
There are some interesting things from the Tamarin summit about the generated arguments structure. You'll notice it has some padding fields in it. When the native implementation function is called from Forth, the layout of the Forth stack looks like (in Forth stack format):
( obj arg1 ... argn status -- )
Each value on the Forth stack is a 64 bit value. The generated structure type exactly matches the Forth stack layout.
This means that when the Forth stack is ready for the native call, the argument object is actually a pointer to a location on the stack. There is no intermediate argument object actually allocated. The padding fields are to enable exactly matching up with the items on the stack.
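Here is a minimal sketch of that layout idea in standalone C++. The structure fib_args_sketch is a hypothetical stand-in for the generated one: I use 32-bit handles instead of real pointers so the layout is the same on 32-bit and 64-bit hosts. I also assume each 32-bit value sits at the start of its 64-bit slot, which I have not checked against the VM. The real code can point the args structure directly at the stack; the sketch copies with memcpy only to stay within standard C++.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the generated null_fib_args structure.
struct fib_args_sketch {
    uint32_t self;       int32_t self_pad;   // slot 0: obj
    int32_t  n;          int32_t n_pad;      // slot 1: arg1
    uint32_t status_out;                     // slot 2: status
};

// The padding makes each field start on an 8-byte boundary, one per slot.
static_assert(offsetof(fib_args_sketch, self) == 0, "obj is slot 0");
static_assert(offsetof(fib_args_sketch, n) == 8, "n is slot 1");
static_assert(offsetof(fib_args_sketch, status_out) == 16, "status is slot 2");

// Store a 32-bit value at the start of a 64-bit slot (assumed placement).
void store_slot(uint64_t* slot, int32_t value) {
    *slot = 0;
    std::memcpy(slot, &value, sizeof value);
}

// Read 'n' back through the structure view of the stack slots.
// stack_slots must point at at least three 64-bit slots.
int32_t read_n(const uint64_t* stack_slots) {
    fib_args_sketch args;
    std::memcpy(&args, stack_slots, sizeof args);
    return args.n;
}
```

With a three-slot array standing in for ( obj arg1 status ), storing 30 in the middle slot and calling read_n on the array gives back 30, with no separate argument object built field by field.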
Interestingly, if I recall correctly from the Tamarin summit, calling native methods from the tracing jit is actually less efficient than calling them from the interpreter. This is because the interpreter uses the stack layout trick described above for the arguments object. In the tracing jit, however, the argument values are often held in registers or other memory locations, so they must first be copied into an arguments object before the native function is called. This adds a slight overhead.
Please feel free to leave a comment or email me if you have any questions or corrections to the above. It represents my understanding from attending the summit and playing with the code and may not necessarily be the best way of doing things, or may be incorrect in places.