2017-02-22

## Borrowing Internal Pointers in ATS

Recently Robert O'Callahan wrote a post on what Rust can do that other languages can't. It involves getting a pointer to an internal field of a stack allocated object and having the type system ensure that the lifetime of that pointer doesn't outlive the owning object.

As soon as I saw this I thought "Aha, ATS can do that!" but I suspected it would be quite a bit more verbose. It turns out it's possible but tedious to do and while ATS 1 can do it in safe code, ATS 2 requires some help. This isn't a "Robert's post is wrong" post or a tear down of Rust, it's a thought experiment in how to do things in ATS that are easier in Rust. Plus ATS is in no way a "remotely mainstream language".

The following is a representation of the scenario that works in ATS 1. First we declare the type of an object that contains two fields:

```
viewtypedef X = @{
a= int,
b= int
}
```

A function that safely returns a pointer to one of the fields looks like this:

```
fun get_b_ref {l:addr} (pf: X @ l | x: ptr l):
[l2:addr] (int @ l2, int @ l2 -<lin> X @ l | ptr l2) =
let
val p = &(x->b)
prval pf = view@ (x->b)
in
#[.. | (pf, llam pf => let
prval () = view@ (x->b) := pf
in
view@ (!x)
end
| p)]
end
```

This can look imposing if you're not familiar with ATS. What does this all mean? Once you're familiar with ATS you can read this as `get_b_ref`

takes a pointer to an `X`

and a proof that you can use it. It consumes this proof meaning we can't access the `X`

object after calling. It returns a pointer to an `int`

- one of the internal fields of `X`

- and provides a proof to use it as an `int`

. It also returns a proof function that consumes that proof of an `int`

and returns the original proof that there is an `X`

at `l`

- allowing you to use the original `x`

as an `X`

again and preventing using the returned pointer to an `int`

due to the proof allowing you to use that getting consumed.

In more detail, it defines a function, `get_b_ref`

, that takes a pointer to an `X`

object as an argument: `(pf: X @ l | x: ptr l)`

. The items to the left of the `|`

are proof arguments and only exist at type checking time. This says that `pf`

contains a proof that an object of type `X`

exists at address `l`

. The `l`

is defined in the `{l:addr}`

section of the function declaration meaning `l`

is a memory address. The items to the right of the `|`

are normal values. In this case, `x`

is a pointer with address `l`

. Because we have a pointer at address `l`

and a proof that an object of type `X`

exists at `l`

we can treat the pointer safely as being a pointer to an `X`

. The function consumes the proof when called. That means that the proof cannot be used again so we can no longer treat `x`

as a pointer to an `X`

. It's a pointer to something we can't do anything with after calling.

The result of the function comes after the `:`

symbol. The `(int @ l2, int @ l2 -<lin> X @ l | ptr l2)`

is the meat of it. Again, proofs are to the left of the `|`

and normal values to the right. In this case the right is `ptr l2`

which means a pointer of some address, `l2`

, is returned. The intent of the function is to return an internal pointer to one of the `int`

fields so we'd expect to have it return a proof that `l2`

points to an `int`

. The proof returned is a tuple: `(int @ l2, int @ l2 -<lin> X @ l)`

.

The first field of that tuple is `int @ l2`

which is the proof we expected. So a caller of this function will have a pointer to an `int`

and the proof to use it as such. The second field of the tuple is `int @ l2 -<lin> X @ l`

. This is the syntax for declaring an anonymous function type. The items to the left of the `-<...>`

are the arguments to the function and the items to the right are the return value. The second field of the tuple is therefore a function that takes a proof that an `int`

exists at address `l2`

and returns a proof that an `X`

exists at `l`

. The `lin`

means this function is linear - it can only be used once. Because this tuple appears in the proof section of the result, this anonymous function is a proof function - it is called at type checking time.

The implementation of the function gets a pointer to the `b`

field of the object and a proof that we can use it:

```
val p = &(x->b)
prval pf = view@ (x->b)
```

The `prval`

line is proof code that runs at type checking time. `view@`

is a ATS keyword that gets the proof for an object if we can access it. Once we get a proof using `view@`

we need to return it later else the proof checking fails. This is a form of 'borrowing' - we are borrowing the proof to access the internal object and we have to give it back later otherwise we can't access that field again and the compiler complains if we don't give it back.

The main part of the return value is:

```
(pf, llam pf => let
prval () = view@ (x->b) := pf
in
view@ (!x)
end
| p)]
```

It is a tuple, with two proof arguments and a value. The first proof argument is the proof of `int @ someaddress`

to access the internal field we got earlier using `view@`

. The second is a linear lambda function. That function is what returns the proof back to the `view@`

for the internal field in this line:

```
prval () = view@ (x->b) := pf
```

This is the returning of the proof we borrowed earlier. The linear lambda function returns the `view@`

of the original `X`

object. The value being returned `p`

is the pointer to the internal field.

The `[.. | ]`

syntax is a way of telling the compiler that the return value is of a particular existential type and that it should compute it from the context of the expression such that it matches the `[l2:addr]`

existential result type we declared in the function declaration. Ideally the compiler would infer this without our help but unfortunately it does not. It will tell us if we get it wrong though.

Usage of the function looks like:

```
implement main() = let
var x: X = @{ a=1, b=2 }
val (pf, pff | p) = get_b_ref(view@ x | &x)
val _ = !p := 3
prval () = view@ x := pff(pf)
in
println!("x.b = ", x.b)
end
```

A variable `x`

is declared on the stack. We get a pointer to the `b`

field using the function above and set that field via the pointer. We're allowed to do this because we have a proof that `p`

is a pointer to an `int`

via the `pf`

proof variable. Calling `get_b_ref`

makes `x`

no longer usable as `get_b_ref`

consumed the proof to access it. The proof function, `pff`

is used to return the proof to access `X`

so we can then use `x`

again. After that we can't use the internal pointer to `b`

anymore. The compiler prevents the escape of the proof to access the internal pointer from the lifetime of the stack variable `x`

.

Unfortunately this code doesn't work in ATS 2. The equivalent code, which fails to compile, is:

```
fun get_b_ref {l:addr} .<>. (pf: X @ l | x: ptr l):<>
[l2:addr] (int @ l2, int @ l2 -<lin> X @ l | ptr l2) =
let
val p = addr@(x->b)
prval pf = view@ (x->b)
in
(pf, llam (pf) => let
prval () = view@ (x->b) := pf
in
view@ (!x)
end
| p)
end
```

ATS 2 does not allow the use of `view@ (!x)`

inside the lambda if the `x`

comes from outside the lambda as it does in this case. Maybe this will change in future iterations of ATS 2 but currently it does not. One workaround is to resort to embedded C code:

```
#include "share/atspre_define.hats"
#include "share/atspre_staload.hats"
viewtypedef X = @{
a= int,
b= int
}
extern viewtypedef "X" = X
%{
int* get_b_ref(X* x) {
return &x->atslab__b;
}
%}
extern fun get_b_ref {l:addr} (pf: X @ l | x: ptr l):<>
[l2:addr] (int @ l2, int @ l2 -<lin> X @ l | ptr l2) = "mac#get_b_ref"
implement main0() = let
var x: X = @{ a=1, b=2 }
val (pf, pff | p) = get_b_ref(view@ x | addr@ x)
val _ = !p := 3
prval () = view@ x := pff(pf)
in
println!("x.b = ", x.b)
end
```

Here the `X`

object is given a name "X" that can be accessed from `C`

using the line:

```
extern viewtypedef "X" = X
```

The `get_b_ref`

function is implemented in `C`

and given an `extern`

definition that matches the previous one. Although the C implementation is effectively 'unsafe' code, any calls of `get_b_ref`

from ATS is safe as the type defines exactly what it does. This is an example of implementing a typesafe kernel around potentially unsafe code. In this case there is no runtime overhead as the typesafe kernel is proof code.

Which approach to use depends on the project. Often the overhead of implementing safe pointer code in ATS in terms of satisfying the proofs of the implementation is too much at the time. Embedded C with a safe external definition is a faster approach to get things done. Later the C code could be replaced with proof-correct ATS as desired.

Hopefully this gives some insight in what ATS can do, and gives some appreciation of the simplicity of the approach Rust has taken. ATS tends towards providing the low level tools to build the level of access you want but this has cognitive and syntactic overhead. Once you're famliar with the system it becomes understandable but it's probably not "new user friendly" like Rust is.