Template Haskell (TH) is a powerful tool for specializing programs and allows shifting some work from runtime to compile time. It can be a bit intimidating to use for beginners. So I thought I would write up how to use TH to turn certain kind runtime computations into compile time computations.

In particular we will turn the initialization of a fully static data structure into a compile time operation. This pattern works for many data structures but we will look at IntSet in particular.

A working example

As an example, consider a function such as:

isStaticId :: Int -> Bool
isStaticId x =
    x `elem` staticIds
    staticIds = [1,2,3,5,7 :: Int]

We have a set of known things here, represented by a list named staticIds.

We use Int as it makes the example easier. But these could be Strings or all kinds of things. In particular, I was inspired by GHC’s list of known builtin functions.

Upsides of the list representation

The advantage of the code as written above is that the list is statically known. As a result the list will be built into the final object code as static data, and accessing it will not require any allocation/computation.

You can check this by looking at the core dump (-ddump-simpl). Don’t forget to enable optimizations or this might not work as expected. In the core there should be a number of definitions like the one below.

-- RHS size: {terms: 3, types: 1, coercions: 0, joins: 0/0}
isStaticId3 = : isStaticId8 isStaticId4

Note that the above is Core syntax for isStaticId3 = isStaticId8 : isStaticId4, i.e., this is just denoting a part of the list, and each element gets its own identifier. All these definitions will be compiled to static data, and will eventually be represented as just a number of words encoding the constructor and its fields.

We can confirm this by looking at the Cmm output where the corresponding fragment will look like this:

[section ""data" . isStaticId3_closure" {
         const :_con_info;
         const isStaticId8_closure+1;
         const isStaticId4_closure+2;
         const 3;

I won’t go into the details of how to read the Cmm, but it shows us that the binding will end up in the data section. The constant :_con_info; tells us that we are dealing with a Cons cell, and then we have the actual data stored in the cell.

What is important here is that this is static data. The GC won’t have to traverse it so having the data around does not affect GC performance. We also don’t need to compute it at runtime as it’s present in the object file in its fully evaluated form.

Switching to IntSet

What if we aggregate more data? If we blow up the list to a hundred, a thousand or more elements, it’s likely that performing a linear search will become a bottleneck for performance.

So we rewrite our function to use a set as follows:

isStaticIdSet :: Int -> Bool
isStaticIdSet x =
    x `S.member` staticIds
    staticIds = S.fromList [1,2,3,5,7 :: Int] :: IntSet

This looks perfectly fine on the surface. Instead of having O(n) lookups we should get O(log(n)) lookups, right?

Pitfalls of runtime initialization

However, what happens at runtime? In order to query the set we have to first convert the list into a set. This is where disaster strikes. We are no longer querying static data, as the list argument has to be converted into a set. The S.fromList function call will not be evaluated at compilation time.

In many cases, GHC may manage to at least share our created set staticIds across calls. But this is fragile, and depending on the exact code in question, it might not. Then we can end up paying the cost of set construction for each call to isStaticIdSet.

So while we reduced the lookup cost from O(n) to O(log(n)) the total cost is now O(n*min(n,W)+ log(n)), where n*min(n,W) is the cost of constructing the set from a list. We could optimize this somewhat by making sure the list is sorted and has no duplicates. But we would still end up worse than with the list-based code we started out with.

It’s a shame that GHC can’t evaluate S.fromList at compile time … or can it?

Template Haskell (TH) to the rescue

What we really want to do is to force GHC to fully evaluate our input data to an IntSet. Then ensure the IntSet is stored as static data just like it happens for the list in our initial example.

How can TH help?

Template Haskell allows us to specify parts of the program to compute at compile time.

So we “simply” tell GHC to compute the set at compile time and are done.

Like so:

{-# NOINLINE isStaticIdSet #-}
isStaticIdSet :: Int -> Bool
isStaticIdSet x =
    x `S.member` staticIds
    staticIds = $$( liftTyped (S.fromList [1,2,3,5,7] :: IntSet))

This results in Core that is even simpler as in the [Int] example above:

-- RHS size: {terms: 3, types: 0, coercions: 0, joins: 0/0}
isStaticIdSet1 = Tip 0# 174##

-- RHS size: {terms: 7, types: 3, coercions: 0, joins: 0/0}
  = \ x_a5ar ->
      case x_a5ar of { I# ww1_i5r2 -> $wmember ww1_i5r2 isStaticIdSet1 }

No longer will we allocate the set at compilation time; instead the whole set is encoded in isStaticIdSet1. We only get a single constructor because IntSet can encode small sets using a single constructor.

How it works

From the outside in:

$$( .. ) is TH syntax for a typed splice.1 Splicing is the process of inserting generated syntax into our program. The splice construct takes an expression denoting a syntax tree, evaluates it and inserts the resulting syntax at the place where the splice occurs.

The next piece of magic is liftTyped. It takes a regular Haskell expression, evaluates it at compile time to an abstract syntax tree that, when spliced, equals the evaluated value of the Haskell expression.

This leaves S.fromList [1,2,3,5,7] which is regular set creation.

Putting these together, during compilation GHC will:

  • Evaluate S.fromList [1,2,3,5,7].
  • Turn the resulting set into an abstract syntax tree using liftTyped.
  • Splice that abstract syntax tree into our program using $$(..), effectively inserting the fully evaluated set expression into our program.

The resulting code will be compiled like any other, in this case resulting in fully static data.

Full example

Now you might think this was too easy, and you are partially right. The main issue is that liftTyped requires a instance of the Lift typeclass.

But for the case of IntSet, we can have GHC derive one for us. So all it costs us is slightly more boiler plate.

Here is a full working example for you to play around with:

-- First module
{-# LANGUAGE TemplateHaskell #-} -- Enable TH
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE DeriveLift #-}

module TH_Lift  where

import Language.Haskell.TH.Syntax

import Data.IntSet.Internal

deriving instance Lift (IntSet)

-- Second module
{-# LANGUAGE TemplateHaskell #-}

module M (isStaticIdSet) where

import TH_Lift
import Data.IntSet as S
import Language.Haskell.TH
import Language.Haskell.TH.Syntax
type Id = Int

isStaticIdSet :: Int -> Bool
isStaticIdSet x =
    x `S.member` staticSet
    staticSet = $$(liftTyped (S.fromList [1,2,3,5,7] :: IntSet))

Why do we require two modules?

We translate liftTyped (S.fromList [1,2,3,5,7] :: IntSet) into a TH expression at compile time. For this, GHC will call the (already compiled) lift method of the Lift instance.

However if we define isStaticIdSet and the Lift instance in the same module, GHC can’t call liftTyped as it’s not yet compiled by the time we need it.

In practice most packages have companions which already offer Lift instances. For example, th-lift-instances offers instances for the containers package.2

Disclaimer: This won’t work for all data types!

For many data types the result of liftTyped will be an expression that can be compiled to static data as long as the contents are known.

This is in particular true for “simple” ADTs like the ones used by IntSet or Set.

However certain primitives like arrays can’t be allocated at compile time. This sadly means this trick won’t currently work for Arrays or Vectors. There is a ticket about removing this restriction on arrays on GHC’s issue tracker.. So hopefully, we will be able to lift arrays at some point in the future.

Note furthermore that lifting won’t work for infinite data structures, as it usually requires its argument to be evaluated completely if we want it to result in static data.

  1. We are using Typed Template Haskell here in order to advertise it a bit better. Typed Template Haskell ensures that we are building type-correct code. In an example, as simple as this, it hardly makes a difference, because even normal Template Haskell does type-check the generated code. We could equally well have written

    staticSet = $(lift (S.fromList [1,2,3,5,7] :: IntSet))
  2. Unfortunately, some of the instances defined in th-lift-instances up to version 0.1.16 are “wrong” for the purposes of this post. For example, the IntSet instance is based on a call to fromList, not statically building the internal representation. Make sure that you use th-lift-instances version 0.1.17 or later.↩︎