2010-06-13
Functions in ATS
Using higher order functions in ATS was a bit of a struggle for me at first due to having to deal with getting types correct. I'm coming from a background of dynamically typed languages where it is easy to create and pass around anonymous functions. Even other typed languages like Haskell proved easy to pick up, primarily because of the excellent type inference available. I delved deeper into the different types of functions in ATS to get more familiar with it. This post covers the things that were important for me to start using them successfully in larger ATS programs.
ATS distiguishes between two types of functions:
- 'functions' which are exactly like functions in C. That is they are a memory address pointing to the machine code for the function body. Functions cannot capture the environment at the point they are created so do not act as 'closures' in other languages. By default top level ATS function definitions are 'functions'. They can be passed directly to external C routines that require function pointers.
- 'closures' which are functions combined with an environment mapping names to values. These are exactly like closures in other dynamic langauges that enable you to capture variables in the enclosing scope. To create a closure you must explicitly mark it as being a closure. A closure requires memory to be allocated to store the environment. This means you will either need to allocate the closure on the stack (so it is freed automatically when the scope exits), free the memory for the closure manually, or link against the garbage collector.
As well as these two types of functions, ATS can track the effects that occur during a function call. The types of effects that can be tracked are:
- exn - the function can raise an exception
- ntm - the function is possibly non-terminating
- ref - the function may potentially update the content of some shared memory
This leads to a number of combinations that can result in a function having a different type. These effects and whether it is a 'function' or 'closure' are annotated in the function definition as 'tags' in between '<' and '>' before the return type. A function definition without tags looks like:
fun add5 (n: int): int = n+5
Adding tags looks like:
fun add5 (n: int):<tag1,tag2,tag...> int = n+5
The valid values for the tags are:
- !exn - the function possibly raises exceptions
- !ntm - the function possibly is non-terminating
- !ref - the function possibly updates shared memory
- 0 - the function is pure (has no effects)
- 1 - the function can have all effects
- fun - the function is an ordinary, non-closure, function
- cloptr - the function is a linear closure that must be explicitly freed
- cloref - the function is a peristent closure that requires the garbage collector to be freed.
Note that the '0' and '1' tags can be suffixed to 'fun', 'cloptr' and 'cloref' for simplicity:
- fun0 - a pure function
- fun1 - a function that can have all effects
- cloptr0 - a linear closure that is pure
- cloptr1 - a linear closure that can have all effects
- cloref0 - a persistent closure that is pure
- cloref1 - a persistent closure that can have all effects
Some examples of tag usage follow. Where more than one usage example is given the definitions are equivalent.
A function that takes an int, returns an int, is pure, doesn't have side effects, doesn't raise exceptions and will terminate:
fun add5 (n: int):<fun0> int = n+5
fun add5 (n: int):<fun> int = n+5
fun add5 (n: int):<> int = n+5
A function that takes an int, returns an int, possibly has side effects, may raise an exception, might not terminate:
fun add5 (n: int):<fun1> int = n+5
fun add5 (n: int):> int = n+5
A function that takes an int, returns an int, is pure except it possibly raises exceptions
fun add5 (n: int):<!exn> int = n+5
An example of usage which causes a compile time error due to effects not matching. The 'outer' function is defined as pure but it is calling an impure function:
fn inner ():<fun1> int = 5
fn outer ():<fun0> int = inner()
implement main() = print_int(inner())
Notice that I use 'fn', not 'fun' here. 'fn' is used to explicitly define a function that cannot be recursive. 'fun' allows a function to call itself and can be recursive. The latter has no termination metrics annotated to the function so you get a compile error if 'fun' is used. I use 'fn' to automatically add the termination metric.
Another compile time error example due to effects not matching is having 'outer' not throw exceptions but calling a function that does:
fn inner ():<!exn> int = 5
fn outer ():<fun0> int = inner()
implement main() = print_int(inner())
The previous example modified so 'outer' can raise exceptions so it will compile:
fn inner ():<!exn> int = 5
fn outer ():<fun0,!exn> int = inner()
implement main() = print_int(inner())
Higher order functions
Functions can take other functions as arguments, or return functions. The types of these functions are defined using a syntax similar to that of defining top level functions except for the following:
- 'fun' or 'fn' is excluded
- no function name is given
- ':' is replaced with '-' before the return value and tags
For example, the following shows ways that function argument and returns types are defined:
typedef foo_t = int -<fun1> int
fun bar (f: foo_t): int = f(42);
fun add5 (n: int): int = n+5
fun baz (): foo_t = add5;
fun bar2 (f: int -<fun1> int) = f(42);
fun baz2 (): int -<fun1> int = add5;
Note that the the effects and types of the functions have to match. Here is an error due to pure vs impure usage:
fun add5 (n: int):<fun1> int = n+5
fun bar (f: int -<fun0> int) = f(42)
implement main() = print_int(bar(add5));
The 'bar' function expects to be passed a pure function as an argument, but 'add5' is impure.
ATS supports creation of anonymous functions using the 'lam' keyword. Anonymous functions using 'lam' are defined similar to the syntax of top level functions except for:
- 'fun' or 'fun' is replaced with 'lam'
- no function name is given
For example:
fun bar (f: int -<fun1> int) = f(42)
implement main() = print_int(bar(lam (n) => n+10));
Tags can be used with 'lam' too:
fun bar (f: int -<fun1> int) = f(42)
implement main() = print_int(bar(lam (n) =<fun1> n+10));
Functions of tag 'fun' can be passed directly to C functions that expect 'pointer to function' arguments. For example:
%{^
typedef int (*func)(int);
void adder(func f) {
int n = f(10);
printf("from C: %d\n", n);
}
%}
extern fun adder(f: (int) -<fun1> int):<fun1> void = "#adder"
fn add5 (n: int):<fun0> int = n+5
implement main() = adder(add5);
In a followup post I'll go through persistent and linear closures and usage of higher order functions in the ATS library.
Resources that helped with this post:
- Function or Closure in the ATS tutorial.
- ML Programmers Guide to ATS
- ATS/Anairats User's Guide
- Mailing list post: Strange compiler error message
- Mailing list post: cloptr1 vs cloref1