Author: Eiko

Tags: template haskell, haskell, code generation, ghc, meta program

Time: 2024-09-11 11:28:49 - 2024-09-11 15:48:01 (UTC)

Template Haskell: Introduction to Dark Magic

Template Haskell is a magic for generating magic, i.e. writing programs to write programs, at compile time. This allows

  • Generating Haskell code at compile time.

  • Generating other language codes, helps with the use of embeded domain specific languages.

It’s pretty dark because you are generating magic (code) using magic, comparing to other more systemmatic meta programming approaches like Generics, it is less systemmatic and type-safe. Although it is dark in nature, it is very strong and powerful, as it can bring other languages into our magic power, and generating boilplate code ar compile time! Compared to generics, the code generated by Template Haskell is usually faster since the generics can have intermediate representations.

Remark : turn on {-# LANGUAGE TemplateHaskell #-}.

Code Generation

Example: n-variable curry function

This example is taken from haskell wiki, I added more explanations and comments to it. We want to generate functions that curries any \(n\)-tuples, like curry3, curry5.

We first write a meta function taking the number \(n\) as an argument, producing an Exp expression in a strange monad Q,

import Control.Monad (replicateM)
import Language.Haskell.TH

curryN :: Int -> Q Exp
curryN n = do
  f <- newName "f"                 -- producing a new symbol for f
  xs <- replicateM n (newName "x") -- producing n symbols for x
  let args = map VarP (f:xs)       -- turn symbols into variables
      ntup = TupE (map VarE xs)    -- (x1, ..., xn)
  return $ LamE args (AppE (VarE f) ntup)
  -- return $ \ f x1 .. xn -> f (x1, ..., xn)

Here

  • LamE is used to create lambda abstractions \.. -> .. or \(\lambda x. y\) if you wish

  • VarE maps a name to its corresponding ‘variable’ (actually, constant reference XD)

  • AppE is the application in lambda calculus, well, it’s just function application ($), applying function with argument.

  • So we have really returned a function \f x1 ... xn -> f (x1, ..., xn).

Calling template functions

to call the template function curryN we have written, you will need to write a ‘splice operator’ like $(curryN 5), this will evaluate to a Q Exp and turn into actual function.

Generating more functions at once

However, we can also create all these curry1, …, curry100 at once. For this we will generate a list of function declarations, producing a function of type Int -> Q [Dec]:

genCurries :: Int -> Q [Dec]
genCurries n = forM [1..n] mkCurryDec -- mkCurryDec has type Int -> Q Dec
  where mkCurryDec i = do
          cur <- curryN i  -- cur :: Q Exp
          let name = mkName $ "curry" ++ show i
          return $ FunD name [Clause [] (NormalB cur) []]

Here,

  • mkName :: String -> Name which we used to name functions is different from newName :: String -> Q Name, mkName generates precise name.

  • cur :: Q Exp is our function body.

Structure of Template Haskell

Algebraic Data Types For Code Generation

  • Exp for expressions, its constructors include

    • VarE Name for variables
    • ConE Name for constructors data T1 = C1 t1 t2; p = {C1} e1 e2
    • LitE Lit for literals
    • AppE Exp Exp for function application
    • AppTypeE Exp Type for type application
    • InfixE (Maybe Exp) Exp (Maybe Exp) for infix expressions (and partially applied infix)
    • UInfixE Exp Exp Exp for infix expressions
    • ParensE Exp for parenthesized expressions (e)
    • LamE [Pat] Exp for lambda abstractions \ pat1 pat2 -> e
    • LamCaseE [Match] for lambda case expressions \case { ... }
    • LamCasesE [Clause] for lambda case expressions \case { ... }
    • TupE [Maybe Exp] for tuples (e1, e2, e3), the Maybe is for tuple sections like (,e2,)
    • UnboxedTupE [Maybe Exp] for unboxed tuples
    • CondE Exp Exp Exp for conditional expressions if e1 then e2 else e3
    • MultiIfE [(Guard, Exp)] for multi-branch if expressions if | g1 -> e1 | g2 -> e2 | g3 -> e3
    • LetE [Dec] Exp for let expressions let { d1; d2; d3 } in e
    • CaseE Exp [Match] for case expressions case e of { ... }
    • DoE (Maybe ModName) [Stmt] for do expressions do { stmt1; stmt2; stmt3 }
    • MDoeE (Maybe ModName) [Stmt] for monad comprehensions mdo { stmt1; stmt2; stmt3 }
  • Pat for patterns

  • Dec for declarations

  • Type for types

  • Name for names (identifiers)

  • Clause [Pat] Body [Dec] for function clauses

For a complete reference, go to Hackage template-haskell. It’s really dark and crazy: it supports everything you can do in Haskell, all in abstract syntax trees. I actually learned a lot about Haskell syntax (probably a great way to learn?) from this package.

Monadic Nature of Building Code

The algebraic data structures does not reveal the monadic nature of building code, for example creating new identifiers creates side effect and it influences the context. These are what the monad Q intend to account for.

There are also some lifted functions for Q monad,

  • FunD :: Name -> [Clause] -> Dec for function declarations

    lifted version funD :: Name -> [Q Clause] -> Q Dec

    The Name field is the only field not lifted because they often need to be reused and should not be lifted.

  • Clause :: [Pat] -> Body -> [Dec] -> Clause for function clauses like f p1 p2 = body where dec1; dec2

    lifted version clause :: [Q Pat] -> Q Body -> [Q Dec] -> Q Clause

  • NormalB :: Exp -> Body for normal body like f p = { e }

    lifted version normalB :: Q Exp -> Q Body

These lifts are not deep magic, they are just a combination of liftMn and sequence, so they are widely defined, the above constructors like VarP, LamE all have their lifted versions.

So you could write our original curryN function as

curryN :: Int -> Q Exp
curryN n = do
    f <- newName "f"
    xs <- replicateM n (newName "x")
    lamE (varP <$> (f:xs)) (varE f `appE` tupE (varE <$> xs))

Remember

  • P are patterns, for matching

  • E are expression, for evaluating

Quotation Brackets

While in principle all programs could be done with these abstract syntax trees, it is possible to use quotation brackets to directly provide some raw haskell code by enclosing them in side the oxford brackets [| ... |].

genId :: Q Exp
genId = do
  x <- newName "x"
  lamE [varP x] (varE x)

can be written as

genId :: Q Exp
genId = [|\x -> x|]

This can be the shortcut when it applies, but for more complex code generation, we will still need to use the abstract syntax trees. Consider the following example taken from the original paper, creating a function that generalizes fst and snd, so that we can write $(sel 1 3) x instead of case x of (a,_,_) -> a.

This cannot be done via quotation since the number of components depend on \(n\). We can construct more directly, like

sel :: Int -> Int -> Q Exp
sel i n = lamE