December 26, 2016
I’ve been working on implementing a Lisp interpreter for my own
amusement, and the next feature on my todo list is first-class
continuations. call/cc
always seemed mysterious to me, and
since this is PL, the best way to get rid of that air of mystery is to
actually implement it in code. Before diving in to implement this for
Lisp, I thought it would be a good exercise to implement an interpreter
for a toy language. The idea is that once I’ve implemented
call/cc
there then I can have a better idea of how to
implement it for the Lisp interpreter.1
A quick recap on what continuations are. There’s many ways to describe them, but essentially they define the “context” of a computation. For example, let’s say you’re interpreting the expression:
2 + 3) + (4 + 5) (
To interpret this expression, you need to interpret its
subexpressions (2 + 3)
and (4 + 5)
. Say your
interpreter evaluates summands left to right so you interpret
(2 + 3)
first. The context of this computation is
+ (4 + 5) [ ]
Think of it as an expression with a hole.2 In
other words, [ ] + (4 + 5)
is the current
continuation while interpreting (2 + 3)
. It’s the
“rest of the computation.”
So that’s the current continuation. What then does
call/cc
or call-with-current-continuation
do?
call/cc
is a function that basically allows the programmer
to manipulate the current continuation or context of computation
within the language itself! It takes in another function
f
as an argument and passes in to f
a
“reified” version of the current continuation. The reified continuation
is basically a function (a lambda abstraction), except that whenever it
is called we erase the current continuation, which allows the programmer
to replace the current continuation with the reified one.
That might be confusing so here’s an example. Consider the addition
expression above, but with the addition of call/cc
as a
subexpression:
/cc (\k -> k (2 + 3))) + (4 + 5) (call
This looks weird, but actually it is equivalent to
(2 + 3) + (4 + 5)
. How? Notice that when we interpret the
call/cc
expression, the current continuation is
[ ] + (4 + 5)
. This continuation becomes reified as a
function
-> hole + (4 + 5)) (\hole
Basically, you replace the hole with a variable and create a (special
kind of) lambda abstraction so that this variable is bound. Recall that
call/cc
passes this function to its argument, and when it
is called it replaces the current continuation with an empty context, an
“identity continuation”. So the expression above would evaluate as
follows:
/cc (\k -> k (2 + 3))) + (4 + 5) =>
(call-> k (2 + 3)) (\hole -> hole + (4 + 5)) + (4 + 5) =>
((\k -> hole + (4 + 5)) (2 + 3) =>
(\hole 2 + 3) + (4 + 5) =>
(-- and so on...
The expression with call/cc
is equivalent to the one
without, as promised! This doesn’t really display the power of
call/cc
though. Because the continuation is reified as a
function, you can do whatever you want with it! For example,
you can “short-circuit” evaluation by not calling the reified
continuation at all:
/cc (\k -> 2 + 3)) + (4 + 5) =>
(call-> 2 + 3) (\hole -> hole + (4 + 5)) + (4 + 5) =>
((\k 2 + 3 =>
5
This is just a simple example. There’s a bewildering number of ways
to use call/cc
, from implementing
coroutines to much more. But now let’s think about how to implement
this. We want to keep track of the current continuation, and when we
call call/cc
we want to reify the current continuation into
a function. How do we do this?
Programs can usually be represented as an abstract syntax tree (AST). You can then think of interpreting a program as a traversal through an AST. What this traversal entails depends on the semantics of the language. In the case of arithmetic expressions, traversing the AST means to compute the number that the expression denotes. Continuations are the context of traversing the AST; they mark the position on the tree where interpretation is currently taking place. We want a good representation of this traversal, so that we can easily move the mark through the tree.
“Representing a traversal” might ring some bells: continuations can
be represented by zippers! When we
traverse a subexpression / subtree during interpretation, we push a new
expression with a hole into a stack, where the hole represents the
subexpression being traversed. This expression-with-a-hole is
part of the context of computation, and can be represented as a
function Expr -> Expr
in the metalanguage (i.e. the
implementation language). When the interpretation “bottoms out,” meaning
the current expression has no subexpressions left to be further
interpreted, we compute the value of the current expression, pop the
head of the stack, place the value into the hole of the head expression
(i.e., we apply the value as an argument to the function), and then
continue interpretation.
Let’s do an example to make this concrete. Here’s a trace of interpreting an expression:
CURRENT EXPR: (1 + (2 + 3)) + (4 + 5)
STACK: []
CURRENT CONTINUATION: (\hole -> hole)
1 + (2 + 3)) + (4 + 5)] => (traverse down)
[(
CURRENT EXPR: (1 + (2 + 3))
STACK: [(\h1 -> h1 + (4 + 5))]
CURRENT CONTINUATION: (\hole -> hole + (4 + 5))
1 + (2 + 3))] + (4 + 5) => (traverse down)
[(
CURRENT EXPR: (2 + 3)
STACK: [(\h2 -> 1 + h2), (\h1 -> h1 + (4 + 5))]
CURRENT CONTINUATION: (\hole -> (1 + hole) + (4 + 5))
1 + [(2 + 3)]) + (4 + 5) => (traverse up)
(
CURRENT EXPR: (1 + 5)
STACK: [(\h1 -> h1 + (4 + 5))]
CURRENT CONTINUATION: (\hole -> (1 + hole) + (4 + 5))
1 + 5)] + (4 + 5) => (traverse up)
[(
CURRENT EXPR: 6 + (4 + 5)
STACK: []
CURRENT CONTINUATION: (\hole -> hole)
6 + (4 + 5)] => (traverse down)
[
CURRENT EXPR: 4 + 5
STACK: [(\h1 -> 6 + h1)]
CURRENT CONTINUATION: (\hole -> 6 + hole)
6 + [(4+ 5)] => (traverse up)
CURRENT EXPR: 6 + 9
STACK: []
CURRENT CONTINUATION: (\hole -> hole)
6 + 9 => (traverse up)
CURRENT EXPR: 15
STACK: []
CURRENT CONTINUATION: (\hole -> hole)
15
Each interpretation step is annotated with the current expression,
the current stack, and the current continuation. Each transition
(=>)
is labeled as either “traverse down,” meaning next
we interpret a subexpression of the current expression, or “traverse
up,” meaning we interpret the current expression directly and place its
interpretation (value) in the context of the stack head.
Does this representation of continuations allow us to implement
call/cc
easily? Yes! To reify the current continuation, we
just glue the together holed expressions (functions) in the stack by
function composition. Try it with the example above. You’ll see that you
can compute the current continuation at any step just with the contents
of the stack: plug the first element into the second element, plug the
second element into the third, and so on, until you only have one
function remaining, which should be equivalent to the current
continuation.
Now that we have the idea, let’s write actual code. This toy language will be very simple, but to make it more interesting we’ll add commands. Here is the AST:
data Expr = ILit Int
| Var Int
| Plus Expr Expr
| Lam Int Expr
| App Expr Expr
| Unit
| Print Expr
| Seq [Expr]
| Cont Int Expr
| CallCC Int Expr
deriving (Show)
Hopefully this is straightforward. The language has integer literals
(ILit
), variables (Var
, indexed by integers
instead of the usual strings), lambda abstraction (Lam
),
application (App
), unit / zero-tuple (guess which
constructor??), a print command (Print
), a sequence of
commands (Seq
), continuations (Cont
), which
are basically lambdas with an extra side effect as we’ll see below, and
finally call/cc
itself. For simplicity I didn’t bother
having separate datatypes for commands and expressions; everything is
just an Expr
.
It is important that the interpreter we write for this toy language corresponds to the small step semantics of the language. The small step semantics defines exactly the order in which interpretation happens. The interpreter trace in the example above follows the small step semantics of an arithmetic language where summands are interpreted left to right. Contrast this with large step semantics, which defines only the value of expressions in a language. Interpreters written to reflect this style of semantics cannot capture the exact context of the ongoing computation because it leaves the order of interpretation up to the semantics of the implementation language. For example, a large step interpreter might have the following rule to interpret addition expressions:
interp e| Plus e1 e2 <- e = (interp e1) + (interp e2)
Which gets interpreted first, e1
or e2
? It
is left implicit, and depends on Haskell’s semantics.
Let’s see some code! First off, some type declarations and convenience functions:
import qualified Data.Map.Strict as M
type Env = M.Map Int Expr
type DCont = Expr -> Expr
type InterpState = [DCont]
type InterpM a = ExceptT String (ReaderT Env (StateT InterpState IO)) a
isValue :: Expr -> Bool
= case expr of
isValue expr ILit _ -> True
Var _ -> True
Lam _ _ -> True
App _ _ -> False
Unit -> True
Print _ -> False
Seq [] -> True
Seq _ -> False
Cont _ _ -> True
CallCC _ _ -> False
Notice that continuations (DCont
) are functions in the
metalanguage (i.e. the implementation language), not the object
language (i.e. the toy language). They are called DCont
s
because they are technically delimited continuations, which
represent not the entire context of a computation but only part of it.
We reify the (undelimited) continuation used by call/cc
by
composing these together. Reifying a continuation is the process of
turning this metalanguage function into an object language function
represented by the constructor Cont
. The control stack
(InterpState
) consists of a stack of
DCont
s.
We’ll be running the interpreter in the InterpM
monad,
which has StateT
for keeping track of the stack / current
continuation, a ReaderT
component for keeping track of
bound variables, and an ExceptT
for handling errors.
Lastly, we define an isValue
function to define certain
expression forms as values. We’ll use this information to determine
whether to “traverse up” during interpretation.
Now to the interpreter proper.
ret :: Expr -> InterpM Expr
= do
ret expr <- get
st case st of
-> return expr
[] :cs) -> do
(c
put cs eval (c expr)
The ret
function corresponds to the (initial part of
the) “traverse up” transition in the example trace above. During
interpretation, once we have a value expression (i.e., an expression
that cannot be interpreted further) we pass this to ret
,
which will either place it in the head expression of the stack and
continue interpretation (“traverse down”), or, if the stack is empty,
return the expression. Note that if the stack is empty then the current
continuation is the “identity continuation,”
(\hole -> hole)
.
pushToStack :: DCont -> InterpM ()
= do
pushToStack cont <- get
st :st) put (cont
pushToStack
is an aptly named helper function that
pushes a new delimited continuation on top of the stack. Alternatively,
if you’re fine with applicative notation or want to play some Haskell
code golf, you can write it like this:
pushToStack :: Cont -> InterpM ()
= (cont:) <$> get >>= put pushToStack cont
Here part the eval
function:
eval :: Expr -> InterpM Expr
eval expr| ILit n <- expr = ret expr
| Var v <- expr = do
<- ask
env case M.lookup v env of
Nothing -> throwError $ "unexpected free variable: v" ++ (show v)
Just val -> ret val
| Lam _ _ <- expr = ret expr
| Cont _ _ <- expr = do
ret expr
| Unit <- expr = ret expr
This is the boring part, since we’re just matching on value
expressions and immediately calling ret
. Note that
eval
and ret
are mutually recursive.
-- eval continued...
| App (Lam var body) arg <- expr, isValue arg = do
local (M.insert var arg) (eval body)
| App (Lam var body) arg <- expr, not (isValue arg) = do
$ App (Lam var body)
pushToStack
eval arg
| App func arg <- expr = do
<- ask
env let cont f = App f arg
pushToStack cont
eval func
| Plus (ILit x) (ILit y) <- expr = ret (ILit $ x + y)
| Plus (Var x) (ILit y) <- expr = do
<- ask
env case M.lookup x env of
Nothing -> throwError $ "unexpected free variable: v" ++ (show x)
Just (ILit xval) -> ret (ILit $ xval + y)
otherwise -> throwError "expected integer argument to Plus"
| Plus (ILit x) (Var y) <- expr = eval (Plus (Var y) (ILit x))
| Plus (Var x) (Var y) <- expr = do
<- ask
env case M.lookup x env of
Nothing -> throwError $ "unexpected free variable: v" ++ (show x)
Just (ILit xval) -> eval (Plus (ILit xval) (Var y))
otherwise -> throwError "expected integer argument to Plus"
Now it gets a bit interesting. Notice that when we are “traversing
down” during interpretation and need to interpret a subexpression, we
push a new delimited continuation to the stack and continue interpreting
that subexpression. This happens, for example, when addition expressions
have variables as summands, or when a function’s argument needs to be
interpreted further.3 When no subexpressions need to be
further interpreted, such as when we are adding two literals together,
we call ret
instead of eval
and “traverse
up.”
-- eval continued ...
| Print pexpr <- expr, isValue pexpr = do
$ print pexpr
liftIO Unit
ret
| Print pexpr <- expr, not (isValue pexpr) = do
Print
pushToStack
eval pexpr
| Seq (cmd:tlcmds) <- expr = do
$ const (Seq tlcmds)
pushToStack
eval cmd
| Seq [] <- expr = do
Unit ret
Here are the commands. They are interpreted similarly to the
expressions above, except of course they have side effects, as seen in
the first pattern match on Print
. The only interesting
thing to notice is that when interpreting sequences, the delimited
continuation for interpreting the next command of the sequence just
discards its argument. This makes sense since the context for
interpreting (executing) a command is just the rest of the commands to
be executed; no information from the current command is needed. Hence
the unused argument.
-- eval continued ...
| App (Cont var body) arg <- expr, isValue arg = do
put []
local (M.insert var arg) (eval body)
| App (Cont var body) arg <- expr, not (isValue arg) = do
$ App (Cont var body)
pushToStack eval arg
As promised, reified continuations are exactly like lambdas, except that when calling them we erase the current context of computation by erasing the stack. This essentially allows the reified continuation to replace the current continuation whenever it is called.
Finally, call/cc
:
-- eval continued ...
| CallCC k body <- expr = do
-- reify the current continuation into a function
<- get
st let contf = foldr (.) id $ reverse st
let cont = Cont (-1) (contf (Var (-1)))
-- evaluate the body
local (M.insert k cont) (eval body)
Again, we create the current continuation by composing the delimited
continuations in the stack together (contf
). We then reify
this continuation, which is a function in the metalanguage /
implementation language, into an object language continuation by
applying a variable expression to the function, so that it returns an
expression with a free variable Var (-1)
. By free
I mean free in the object language; also, we’re going to punt on issues
with capture for now. We then wrap this expression into a
Cont
to bind the variable, and we’re done!4
We can now evaluate the body, in whose environment the variable
k
will be mapped to the reified continuation. Note that
syntactically call/cc
takes in two arguments: the name of
the variable and the body expression to be evaluated whose environment
contains the variable so named mapped to the reified continuation. This
is equivalent to having call/cc
take a function as a single
argument.
That’s the interpreter. Let’s test it out!
>> let subexpr = Seq [ILit 98, App (Var 0) Unit]
>> let expr = Seq [CallCC 0 subexpr, Print (ILit 99)]
>> res <- evalStateT (runReaderT (runExceptT (eval expr)) M.empty) initState
>> putStrLn $ either id show res
>>
ILit 98
ILit 99
We have a sequence of commands where call/cc
is executed
first. So the current continuation when call/cc
is executed
is the rest of the sequence,
\hole -> Seq [Print (ILit 99)]
. call/cc
then passes this continuation to a function whose body is also a
sequence. 98
gets printed out first, and then we call the
continuation, which prints 99
.
This use of call/cc
is a bit pointless, since
99
would’ve been printed after 98
anyway.
Here’s a slightly less pointless use of call/cc
:
>> let subexpr = Seq [App (Var 0) Unit, Print (ILit 98)]
>> let expr = Seq [CallCC 0 subexpr, Print (ILit 99)]
>> res <- evalStateT (runReaderT (runExceptT (eval expr)) M.empty) initState
>> putStrLn $ either id show res
>>
ILit 99
So instead of printing 98
, we skip that command and
instead print 99
right away. This happens because calling
the reified continuation erases the stack, which destroys the current
continuation that would’ve printed 98
next. As you can see,
we can use call/cc
to implement jumps!
That’s it for now. In retrospect it was a good call to implement
call/cc
for a toy language first, since now I have a pretty
clear idea of how to do it. It certainly was more productive than my
futile initial attempts at implementing first-class continuations in my
Lisp interpreter, where I was bogged down with trying to fit the
implementation with the rest of the interpreter, which made the endeavor
a lot more complicated.
You can find the full source for the interpreter here. For further reading, Matt Might has a good blog post that describes the relationship between small step interpreters and continuations. He mentions the zipper approach that I followed here but he instead uses class methods to manipulate contexts of a computation.
Edit: Fixed some typos and added some clarifications.
While I’ve got a decent idea of how to implement this feature now, the interpreter is still a bit of a mess. It turns out first-class continuations are a big feature that touches pretty much every part of an interpreter!↩︎
You could also think of this expression-with-a-hole as a function from expressions to expressions: this fact will be important later.↩︎
Notice that this language is eagerly evaluated.↩︎
You might think, as I originally did, that the
interpreter should clear the stack when call/cc
is called.
A quick test in Racket disabused me of that misconception:
(+ 2 (call/cc (lambda (k) 2)))
evaluates to 4, not 2.
Explanation for why this answers our question is left as an exercise for
the reader!↩︎