2019-10-03
Defining Types in Shen
The Shen programming language has an extensible type system. Types are defined using sequent calculus and the system is powerful enough to create a variety of exotic types but it can be difficult when first starting with Shen to know how to use that power. In this post I hope to go through some basic examples of defining types in Shen without needing to know too much sequent calculus details.
For an overview of Shen there is Shen in 15 minutes and the Shen OS Kernel Manual. An interactive JavaScript REPL exists to try examples in the browser or pick on one of the existing Shen language ports. For these examples I'm using my Wasp Lisp port of Shen.
Shen is optionally typed. The type checker can be turned off and on. By default it is off and this can be seen in the Shen prompt by the presence of a '-
' character:
(0-) ...shen code...
Turning type checking on is done by using (tc +)
. The '-
' in the prompt changes to a '+
' to show type checking is active. It can be turned off again with (tc -)
:
(0-) (tc +)
(1+) (tc -)
(2-) ...
Types in Shen are defined using datatype
. The body of the datatype
definition contains a series of sequent calculus rules. These rules define how an object in Shen can be proved to belong to a particular type. Rather than go through a detailed description of sequent calculus, I'm going to present common examples of types in Shen to learn by example and dive into details as needed. There's the Shen Language Book for much more detail if needed.
Records
One way of storing collections of data in Shen is to use lists or vectors. For example, given an the concept of a 'person' that has a name and age, this can be stored in a list with functions to get the relevent data:
(tc -)
(define make-person
Name Age -> [Name Age])
(define get-name
[Name Age] -> Name)
(define get-age
[Name Age] -> Age)
(get-age (make-person "Person1" 42))
=> 42
In the typed subset of Shen we can define a type for this person object using datatype
:
(datatype person
N : string; A : number;
_______________________
[N A] : person;)
This defines one sequent calculus rule. The way to read it is starting with the code below the underscore line, followed by the code above it. In this case the rule states that if an expression matching the pattern [N A]
is encountered, where N
is a string
and A
is a number, then type that expression as person
. With that rule defined, we can ask Shen if lists are of the type person
:
(0+) ["Person" 42] : person
["Person1" 42] : person
(1+) ["Person1" "Person1"] : person
[error shen "type error"]
(2+) ["Person 42"]
["Person1" 42] : person
Given this person
type, we might write a get-age
function that is typed such that it only works on person
objects as follows (The { ...}
syntax in function definitions provides the expected type of the function):
(define get-age
{ person --> number }
[N A] -> A)
[error shen "type error in rule 1 of get-age"]
Shen rejects this definition as not being type safe. The reason for this is because our datatype
definition only states that [N A]
is a person
if N
is a string
and A
is a number
. It does not state that a person
object is constructed only of a string
and number
. For example, we could have an additional definition as follows:
(datatype person2
N : string; A : string;
_______________________
[N A] : person;)
Now we can create different types of person
objects:
(0+) ["Person" 42 ] : person
["Person" 42] : person
(1+) ["Person" "young"] : person
["Person" "young"] : person
get-age
is obviously badly typed in the presence of this additional type of person
which is why Shen rejected it originally. To resolve this we need to tell Shen that an [N A]
is a person
if and only if N
is a string
and A
is a person
. This is done with what is called a 'left rule'. Such a rule defines how a person
object can be deconstructed. It looks like this:
(datatype person3
N : string, A: number >> P;
___________________________
[N A] : person >> P;)
The way to read this type of rule is that, if [N A]
is a person
then N
is a string
and A
is a number. With that loaded into Shen, get-age
type checks:
(define get-age
{ person --> number }
[N A] -> A)
get-age : (person --> number)
(0+) (get-age ["Person" 42])
42 : number
The need to create a left rule, dual to the right rule, is common enough that Shen has a short method of defining both in one definition. It looks like this - note the use of '=' instead of '_' in the separator line:
(datatype person
N : string; A : number;
=======================
[N A] : person;)
(define get-age
{ person --> number }
[N A] -> A)
get-age : (person --> number)
(0+) (get-age ["Person" 42])
42 : number
The above datatype
is equivalent to declaring the two rules:
(datatype person
N : string; A : number;
_______________________
[N A] : person;
N : string, A: number >> P;
___________________________
[N A] : person >> P;)
Controlling type checking
When progamming at the REPL of Shen it's common to create datatype
definitions that are no longer needed, or part of a line of thought you don't want to pursue. Shen provides ways of excluding or including rules in the typechecker as needed. When defining a set of rules in a datatype
, that datatype
is given a name:
(datatype this-is-the-name
...
)
The rules within that definition can be removed from selection by the typechecker using preclude
, which takes a list of datatype
names to ignore during type checking:
(preclude [this-is-the-name])
To re-add a dataype
, use include
:
(include [this-is-the-name])
There is also include-all-but
and preclude-all-but
to include or remove all but the listed names. These commands are useful for removing definitions you no longer want to use at the REPL, but also for speeding up type checking in a given file if you know the file only uses a particular set of datatypes.
Enumerations
An example of an enumeration type would be days of the week. In an ML style language this can be done like:
datatype days = monday | tuesday | wednesday
| thursday | friday | saturday | sunday
In Shen this would be done using multiple sequent calculus rules.
(datatype days
____________
monday : day;
____________
tuesday : day;
____________
wednesday : day;
____________
thursday : day;
____________
friday : day;
____________
saturday : day;
____________
sunday : day;)
Here there are no rules above the dashed underscore line, meaning that the given symbol is of the type day
. A function that uses this type would look like:
(define day-number
{ day --> number }
monday -> 0
tuesday -> 1
wednesday -> 2
thursday -> 3
friday -> 4
saturday -> 5
sunday -> 6)
It's quite verbose to define a number of enumeration types like this. It's possible to add a test above the dashed underline which allows being more concise. The test is introduced using if
:
(datatype days
if (element? Day [monday tuesday wednesday thursday friday saturday sunday])
____________________________________________________________________________
Day : day;)
(0+) monday : day
monday : day
Any Shen code can be used in these test conditions. Multiple tests can be combined:
(datatype more-tests
if (number? X)
if (>= X 5)
if (<= X 10)
___________
X : between-5-and-10;)
(2+) 5 : between-5-and-10
5 : between-5-and-10
(3+) 4 : between-5-and-10
[error shen "type error\n"]
Polymorphic types
To create types that are polymorphic (ie. generic), like the built-in list
type, include a free variable representing the type. For example, something like the built in list where the list elements are stored as pairs can be approximated with:
(datatype my-list
_____________________
my-nil : (my-list A);
X : A; Y : (my-list A);
========================
(@p X Y) : (my-list A);)
(define my-cons
{ A --> (my-list A) --> (my-list A) }
X Y -> (@p X Y))
(0+) (my-cons 1 my-nil)
(@p 1 my-nil) : (my-list number)
(1+) (my-cons 1 (my-cons 2 my-nil))
(@p 1 (@p 2 my-nil)) : (my-list number)
(2+) (my-cons "a" (my-cons "b" my-nil))
(@p "a" (@p "b" my-nil)) : (my-list string)
Notice the use of the '=====' rule to combine left and right rules. This is required to enable writing something like my-car
which requires proving that the type of the car
of the list is of type A
:
(define my-car
{ (my-list A) --> A }
(@p X Y) -> X)
List encoded with size
Using peano numbers we can create a list where the length of the list is part of the type:
(datatype list-n
______
[] : (list-n zero A);
X : A; Y : (list-n N A);
================================
[ X | Y ] : (list-n (succ N) A);)
(define my-tail
{ (list-n (succ N) A) --> (list-n N A) }
[Hd | Tl] -> Tl)
(define my-head
{ (list-n (succ N) A) --> A }
[Hd | Tl] -> Hd)
This gives a typesafe head
and tail
operation whereby they can't be called on an empty list:
(0+) [] : (list-n zero number)
[] : (list-n zero number)
(1+) [1] : (list-n (succ zero) number)
[1] : (list-n (succ zero) number)
(2+) (my-head [])
[error shen "type error\n"]
(3+) (my-head [1])
1 : number
(4+) (my-tail [1 2 3])
[2 3] : (list-n (succ (succ zero)) number)
(5+) (my-tail [])
[error shen "type error\n"]
Power and Responsibility
Shen gives a lot of power in creating types, but trusts you to make those types consistent. For example, the following creates an inconsistent type:
(datatype person
N : string; A : number;
_______________________
[N A] : person;
N : string; A : string;
_______________________
[N A] : person;
N : string, A: number >> P;
___________________________
[N A] : person >> P;)
Here we are telling Shen that a string
and number
in a list
is a person
, and so too is a string
and another string
. But the third rule states that given a person
, that is is composed of a string
and a number
only. This leads to:
(0+) (get-age ["Person" "Person"])
...
This will hang for a long time as Shen attempts to resolve the error we've created.
Conclusion
Shen provides a programmable type system, but the responsibility lies on the programmer for making sure the types are consisitent. The examples given here provide a brief overview. For much more see The Book of Shen. The Shen OS Kernel Manual also gives some examples. There are posts on the Shen Mailing List that have more advanced examples of Shen types. Mark Tarver has a case study showing converting a lisp interpreter in Shen to use types.