Rolph Recto – Implementing call/cc

I’ve been working on implementing a Lisp interpreter for my own amusement, and the next feature on my todo list is first-class continuations. call/cc always seemed mysterious to me, and since this is PL, the best way to get rid of that air of mystery is to actually implement it in code. Before diving in to implement this for Lisp, I thought it would be a good exercise to implement an interpreter for a toy language. The idea is that once I’ve implemented call/cc there then I can have a better idea of how to implement it for the Lisp interpreter.¹

Continuations Recap

A quick recap on what continuations are. There’s many ways to describe them, but essentially they define the “context” of a computation. For example, let’s say you’re interpreting the expression:

To interpret this expression, you need to interpret its subexpressions (2 + 3) and (4 + 5). Say your interpreter evaluates summands left to right so you interpret (2 + 3) first. The context of this computation is

Think of it as an expression with a hole.² In other words, [ ] + (4 + 5) is the current continuation while interpreting (2 + 3). It’s the “rest of the computation.”

So that’s the current continuation. What then does call/cc or call-with-current-continuation do? call/cc is a function that basically allows the programmer to manipulate the current continuation or context of computation within the language itself! It takes in another function f as an argument and passes in to f a “reified” version of the current continuation. The reified continuation is basically a function (a lambda abstraction), except that whenever it is called we erase the current continuation, which allows the programmer to replace the current continuation with the reified one.

That might be confusing so here’s an example. Consider the addition expression above, but with the addition of call/cc as a subexpression:

(call/cc (\k -> k (2 + 3))) + (4 + 5)

This looks weird, but actually it is equivalent to (2 + 3) + (4 + 5). How? Notice that when we interpret the call/cc expression, the current continuation is [ ] + (4 + 5). This continuation becomes reified as a function

(\hole -> hole + (4 + 5))

Basically, you replace the hole with a variable and create a (special kind of) lambda abstraction so that this variable is bound. Recall that call/cc passes this function to its argument, and when it is called it replaces the current continuation with an empty context, an “identity continuation”. So the expression above would evaluate as follows:

(call/cc (\k -> k (2 + 3))) + (4 + 5) =>
((\k -> k (2 + 3)) (\hole -> hole + (4 + 5)) + (4 + 5) =>
(\hole -> hole + (4 + 5)) (2 + 3) =>
(2 + 3) + (4 + 5) =>
-- and so on...

The expression with call/cc is equivalent to the one without, as promised! This doesn’t really display the power of call/cc though. Because the continuation is reified as a function, you can do whatever you want with it! For example, you can “short-circuit” evaluation by not calling the reified continuation at all:

(call/cc (\k -> 2 + 3)) + (4 + 5) =>
((\k -> 2 + 3) (\hole -> hole + (4 + 5)) + (4 + 5) =>
2 + 3 =>
5

This is just a simple example. There’s a bewildering number of ways to use call/cc, from implementing coroutines to much more. But now let’s think about how to implement this. We want to keep track of the current continuation, and when we call call/cc we want to reify the current continuation into a function. How do we do this?

Programs can usually be represented as an abstract syntax tree (AST). You can then think of interpreting a program as a traversal through an AST. What this traversal entails depends on the semantics of the language. In the case of arithmetic expressions, traversing the AST means to compute the number that the expression denotes. Continuations are the context of traversing the AST; they mark the position on the tree where interpretation is currently taking place. We want a good representation of this traversal, so that we can easily move the mark through the tree.

“Representing a traversal” might ring some bells: continuations can be represented by zippers! When we traverse a subexpression / subtree during interpretation, we push a new expression with a hole into a stack, where the hole represents the subexpression being traversed. This expression-with-a-hole is part of the context of computation, and can be represented as a function Expr -> Expr in the metalanguage (i.e. the implementation language). When the interpretation “bottoms out,” meaning the current expression has no subexpressions left to be further interpreted, we compute the value of the current expression, pop the head of the stack, place the value into the hole of the head expression (i.e., we apply the value as an argument to the function), and then continue interpretation.

Let’s do an example to make this concrete. Here’s a trace of interpreting an expression:

CURRENT EXPR: (1 + (2 + 3)) + (4 + 5)
STACK: []
CURRENT CONTINUATION: (\hole -> hole)
[(1 + (2 + 3)) + (4 + 5)] => (traverse down)

CURRENT EXPR: (1 + (2 + 3))
STACK: [(\h1 -> h1 + (4 + 5))]
CURRENT CONTINUATION: (\hole -> hole + (4 + 5))
[(1 + (2 + 3))] + (4 + 5) => (traverse down)

CURRENT EXPR: (2 + 3)
STACK: [(\h2 -> 1 + h2), (\h1 -> h1 + (4 + 5))]
CURRENT CONTINUATION: (\hole -> (1 + hole) + (4 + 5))
(1 + [(2 + 3)]) + (4 + 5) => (traverse up)

CURRENT EXPR: (1 + 5)
STACK: [(\h1 -> h1 + (4 + 5))]
CURRENT CONTINUATION: (\hole -> (1 + hole) + (4 + 5))
[(1 + 5)] + (4 + 5) => (traverse up)

CURRENT EXPR: 6 + (4 + 5)
STACK: []
CURRENT CONTINUATION: (\hole -> hole)
[6 + (4 + 5)] => (traverse down)

CURRENT EXPR: 4 + 5
STACK: [(\h1 -> 6 + h1)]
CURRENT CONTINUATION: (\hole -> 6 + hole)
6 + [(4+ 5)] => (traverse up)

CURRENT EXPR: 6 + 9
STACK: []
CURRENT CONTINUATION: (\hole -> hole)
6 + 9  => (traverse up)

CURRENT EXPR: 15
STACK: []
CURRENT CONTINUATION: (\hole -> hole)
15

Each interpretation step is annotated with the current expression, the current stack, and the current continuation. Each transition (=>) is labeled as either “traverse down,” meaning next we interpret a subexpression of the current expression, or “traverse up,” meaning we interpret the current expression directly and place its interpretation (value) in the context of the stack head.

Does this representation of continuations allow us to implement call/cc easily? Yes! To reify the current continuation, we just glue the together holed expressions (functions) in the stack by function composition. Try it with the example above. You’ll see that you can compute the current continuation at any step just with the contents of the stack: plug the first element into the second element, plug the second element into the third, and so on, until you only have one function remaining, which should be equivalent to the current continuation.

Implementing The Toy Language

Now that we have the idea, let’s write actual code. This toy language will be very simple, but to make it more interesting we’ll add commands. Here is the AST:

data Expr   = ILit Int
            | Var Int
            | Plus Expr Expr
            | Lam Int Expr
            | App Expr Expr
            | Unit
            | Print Expr
            | Seq [Expr]
            | Cont Int Expr
            | CallCC Int Expr
            deriving (Show)

Hopefully this is straightforward. The language has integer literals (ILit), variables (Var, indexed by integers instead of the usual strings), lambda abstraction (Lam), application (App), unit / zero-tuple (guess which constructor??), a print command (Print), a sequence of commands (Seq), continuations (Cont), which are basically lambdas with an extra side effect as we’ll see below, and finally call/cc itself. For simplicity I didn’t bother having separate datatypes for commands and expressions; everything is just an Expr.

It is important that the interpreter we write for this toy language corresponds to the small step semantics of the language. The small step semantics defines exactly the order in which interpretation happens. The interpreter trace in the example above follows the small step semantics of an arithmetic language where summands are interpreted left to right. Contrast this with large step semantics, which defines only the value of expressions in a language. Interpreters written to reflect this style of semantics cannot capture the exact context of the ongoing computation because it leaves the order of interpretation up to the semantics of the implementation language. For example, a large step interpreter might have the following rule to interpret addition expressions:

interp e
  | Plus e1 e2 <- e = (interp e1) + (interp e2)

Which gets interpreted first, e1 or e2? It is left implicit, and depends on Haskell’s semantics.

Let’s see some code! First off, some type declarations and convenience functions:

import qualified Data.Map.Strict as M

type Env         = M.Map Int Expr
type DCont       = Expr -> Expr
type InterpState = [DCont]
type InterpM a   = ExceptT String (ReaderT Env (StateT InterpState IO)) a

isValue :: Expr -> Bool
isValue expr = case expr of
  ILit _ -> True
  Var _ -> True
  Lam _ _ -> True
  App _ _ -> False
  Unit -> True
  Print _ -> False
  Seq [] -> True
  Seq _ -> False
  Cont _ _ -> True
  CallCC _ _ -> False

Notice that continuations (DCont) are functions in the metalanguage (i.e. the implementation language), not the object language (i.e. the toy language). They are called DConts because they are technically delimited continuations, which represent not the entire context of a computation but only part of it. We reify the (undelimited) continuation used by call/cc by composing these together. Reifying a continuation is the process of turning this metalanguage function into an object language function represented by the constructor Cont. The control stack (InterpState) consists of a stack of DConts.

We’ll be running the interpreter in the InterpM monad, which has StateT for keeping track of the stack / current continuation, a ReaderT component for keeping track of bound variables, and an ExceptT for handling errors.

Lastly, we define an isValue function to define certain expression forms as values. We’ll use this information to determine whether to “traverse up” during interpretation.

ret :: Expr -> InterpM Expr
ret expr = do
  st <- get
  case st of
    [] -> return expr
    (c:cs) -> do
      put cs
      eval (c expr)

The ret function corresponds to the (initial part of the) “traverse up” transition in the example trace above. During interpretation, once we have a value expression (i.e., an expression that cannot be interpreted further) we pass this to ret, which will either place it in the head expression of the stack and continue interpretation (“traverse down”), or, if the stack is empty, return the expression. Note that if the stack is empty then the current continuation is the “identity continuation,” (\hole -> hole).

pushToStack :: DCont -> InterpM ()
pushToStack cont = do
  st <- get
  put (cont:st)

pushToStack is an aptly named helper function that pushes a new delimited continuation on top of the stack. Alternatively, if you’re fine with applicative notation or want to play some Haskell code golf, you can write it like this:

pushToStack :: Cont -> InterpM ()
pushToStack cont = (cont:) <$> get >>= put

eval :: Expr -> InterpM Expr
eval expr
  | ILit n <- expr = ret expr

  | Var v <- expr = do
    env <- ask
    case M.lookup v env of
      Nothing -> throwError $ "unexpected free variable: v" ++ (show v)
      Just val -> ret val

  | Lam _ _ <- expr = ret expr

  | Cont _ _ <- expr = do
    ret expr

  | Unit <- expr = ret expr

This is the boring part, since we’re just matching on value expressions and immediately calling ret. Note that eval and ret are mutually recursive.

  -- eval continued...

  | App (Lam var body) arg <- expr, isValue arg = do
    local (M.insert var arg) (eval body)

  | App (Lam var body) arg <- expr, not (isValue arg) = do
    pushToStack $ App (Lam var body)
    eval arg

  | App func arg <- expr = do
    env <- ask
    let cont f = App f arg
    pushToStack cont
    eval func

  | Plus (ILit x) (ILit y) <- expr = ret (ILit $ x + y)

  | Plus (Var x) (ILit y) <- expr = do
    env <- ask
    case M.lookup x env of
      Nothing -> throwError $ "unexpected free variable: v" ++ (show x)
      Just (ILit xval) -> ret (ILit $ xval + y)
      otherwise -> throwError "expected integer argument to Plus"

  | Plus (ILit x) (Var y) <- expr = eval (Plus (Var y) (ILit x))
  
  | Plus (Var x) (Var y) <- expr = do
    env <- ask
    case M.lookup x env of
      Nothing -> throwError $ "unexpected free variable: v" ++ (show x)
      Just (ILit xval) -> eval (Plus (ILit xval) (Var y))
      otherwise -> throwError "expected integer argument to Plus"

Now it gets a bit interesting. Notice that when we are “traversing down” during interpretation and need to interpret a subexpression, we push a new delimited continuation to the stack and continue interpreting that subexpression. This happens, for example, when addition expressions have variables as summands, or when a function’s argument needs to be interpreted further.³ When no subexpressions need to be further interpreted, such as when we are adding two literals together, we call ret instead of eval and “traverse up.”

  -- eval continued ...

  | Print pexpr <- expr, isValue pexpr = do
      liftIO $ print pexpr
      ret Unit

  | Print pexpr <- expr, not (isValue pexpr) = do
      pushToStack Print
      eval pexpr

  | Seq (cmd:tlcmds) <- expr = do
      pushToStack $ const (Seq tlcmds)
      eval cmd

  | Seq [] <- expr = do
      ret Unit

Here are the commands. They are interpreted similarly to the expressions above, except of course they have side effects, as seen in the first pattern match on Print. The only interesting thing to notice is that when interpreting sequences, the delimited continuation for interpreting the next command of the sequence just discards its argument. This makes sense since the context for interpreting (executing) a command is just the rest of the commands to be executed; no information from the current command is needed. Hence the unused argument.

  -- eval continued ...

  | App (Cont var body) arg <- expr, isValue arg = do
    put []
    local (M.insert var arg) (eval body)

  | App (Cont var body) arg <- expr, not (isValue arg) = do
    pushToStack $ App (Cont var body)
    eval arg

As promised, reified continuations are exactly like lambdas, except that when calling them we erase the current context of computation by erasing the stack. This essentially allows the reified continuation to replace the current continuation whenever it is called.

  -- eval continued ...

  | CallCC k body <- expr = do
    -- reify the current continuation into a function
    st <- get 
    let contf = foldr (.) id $ reverse st
    let cont = Cont (-1) (contf (Var (-1)))

    -- evaluate the body
    local (M.insert k cont) (eval body)

Again, we create the current continuation by composing the delimited continuations in the stack together (contf). We then reify this continuation, which is a function in the metalanguage / implementation language, into an object language continuation by applying a variable expression to the function, so that it returns an expression with a free variable Var (-1). By free I mean free in the object language; also, we’re going to punt on issues with capture for now. We then wrap this expression into a Cont to bind the variable, and we’re done!⁴

We can now evaluate the body, in whose environment the variable k will be mapped to the reified continuation. Note that syntactically call/cc takes in two arguments: the name of the variable and the body expression to be evaluated whose environment contains the variable so named mapped to the reified continuation. This is equivalent to having call/cc take a function as a single argument.

>> let subexpr = Seq [ILit 98, App (Var 0) Unit]
>> let expr = Seq [CallCC 0 subexpr, Print (ILit 99)]
>> res <- evalStateT (runReaderT (runExceptT (eval expr)) M.empty) initState
>> putStrLn $ either id show res
>>
ILit 98
ILit 99

We have a sequence of commands where call/cc is executed first. So the current continuation when call/cc is executed is the rest of the sequence, \hole -> Seq [Print (ILit 99)]. call/cc then passes this continuation to a function whose body is also a sequence. 98 gets printed out first, and then we call the continuation, which prints 99.

This use of call/cc is a bit pointless, since 99 would’ve been printed after 98 anyway. Here’s a slightly less pointless use of call/cc:

>> let subexpr = Seq [App (Var 0) Unit, Print (ILit 98)]
>> let expr = Seq [CallCC 0 subexpr, Print (ILit 99)]
>> res <- evalStateT (runReaderT (runExceptT (eval expr)) M.empty) initState
>> putStrLn $ either id show res
>>
ILit 99

So instead of printing 98, we skip that command and instead print 99 right away. This happens because calling the reified continuation erases the stack, which destroys the current continuation that would’ve printed 98 next. As you can see, we can use call/cc to implement jumps!

That’s it for now. In retrospect it was a good call to implement call/cc for a toy language first, since now I have a pretty clear idea of how to do it. It certainly was more productive than my futile initial attempts at implementing first-class continuations in my Lisp interpreter, where I was bogged down with trying to fit the implementation with the rest of the interpreter, which made the endeavor a lot more complicated.

You can find the full source for the interpreter here. For further reading, Matt Might has a good blog post that describes the relationship between small step interpreters and continuations. He mentions the zipper approach that I followed here but he instead uses class methods to manipulate contexts of a computation.

While I’ve got a decent idea of how to implement this feature now, the interpreter is still a bit of a mess. It turns out first-class continuations are a big feature that touches pretty much every part of an interpreter!↩︎
You could also think of this expression-with-a-hole as a function from expressions to expressions: this fact will be important later.↩︎
Notice that this language is eagerly evaluated.↩︎
You might think, as I originally did, that the interpreter should clear the stack when call/cc is called. A quick test in Racket disabused me of that misconception: (+ 2 (call/cc (lambda (k) 2))) evaluates to 4, not 2. Explanation for why this answers our question is left as an exercise for the reader!↩︎