This is an announcement and explanation of the nothunks Haskell package that Edsko has been developing in the context of our work on Cardano for IOHK. It was originally published on the IOHK blog and is republished here with permission. There is also a video by Edsko available that was presented at MuniHac 2020.
Haskell is a lazy language. The importance of laziness has been widely discussed elsewhere: Why Functional Programming Matters is one of the classic papers on the topic, and A History of Haskell: Being Lazy with Class discusses it at length as well. For the purposes of this blog we will take it for granted that laziness is something we want. But laziness comes at a cost, and one of the disadvantages is that laziness can lead to memory leaks that are sometimes difficult to find. In this post we introduce a new library called nothunks aimed at discovering a large class of such leaks early, and helping to debug them. This library was developed for our work on the Cardano blockchain, but we believe it will be widely applicable in other projects too.
A motivating example
Consider the tiny application below, which processes incoming characters and reports how many characters there are in total, in addition to some per-character statistics:
import qualified Data.Map.Strict as Map
data AppState = AppState {
total :: !Int
indiv :: !(Map Char Stats)
,
}deriving (Show)
type Stats = Int
update :: AppState -> Char -> AppState
= st {
update st c = total st + 1
total = Map.alter (Just . aux) c (indiv st)
, indiv
}where
aux :: Maybe Stats -> Stats
Nothing = 1
aux Just n) = n + 1
aux (
initAppState :: AppState
= AppState {
initAppState = 0
total = Map.empty
, indiv
}
main :: IO ()
= interact $ show . foldl' update initAppState main
In this version of the code, the per-character statistics are simply how often we have seen each character. If we feed this code ‘aabbb’, it will tell us that it saw 5 characters, 2 of which were the letter ‘a’ and 3 of which were ‘b’:
$ echo -n aabbb | cabal run example1
AppState {
total = 5
, indiv = fromList [('a',2),('b',3)]
}
Moreover, if we feed the application a ton of data and construct a memory profile,
$ dd if=/dev/zero bs=1M count=10 | cabal run --enable-profiling example1 -- +RTS -hy
we see that the application runs in constant space:
So far so good. But now suppose we make an innocuous-looking change. Suppose, in addition to reporting how often every character occurs, we also want to know the offset of the last time that the character occurs in the file:
type Stats = (Int, Int)
update :: AppState -> Char -> AppState
= -- .. as before
update st c where
aux :: Maybe Stats -> Stats
Nothing = (1 , total st)
aux Just (n, _)) = (n + 1 , total st) aux (
The application works as expected:
$ echo -n aabbb | cabal run example2
AppState {
= 5
total = fromList [('a',(2,1)),('b',(3,4))]
, indiv }
and so the change is accepted in GitHub’s PR code review and gets merged. However, although the code still works, it is now a lot slower.
$ time (dd if=/dev/zero bs=1M count=100 | cabal run example1)
...
real 0m2,312s
$ time (dd if=/dev/zero bs=1M count=100 | cabal run example2)
...
real 0m15,692s
We have a slowdown of almost an order of magnitude, although we are barely doing more work. Clearly, something has gone wrong, and indeed, we have introduced a memory leak:
Unfortunately, tracing a profile like this to the actual problem in the code can be very difficult indeed. What’s worse, although our change introduced a regression, the application still worked fine and so the test suite probably wouldn’t have failed. Such memory leaks tend to be discovered only when they get so bad in production that things start to break (for example, servers running out of memory), at which point you have an emergency on your hands.
In the remainder of this post we will describe how nothunks
can help both with
spotting such problems much earlier, and debugging them.
Instrumenting the code
Let’s first see what usage of nothunks
looks like in our example. We modify
our code and derive a new class instance for our AppState
:
data AppState = AppState {
total :: !Int
indiv :: !(Map Char Stats)
,
}deriving (Show, Generic, NoThunks)
The NoThunks
class is defined in the nothunks library, as we will see in detail
later. Additionally, we will replace foldl'
with a new function:
repeatedly :: forall a b. (NoThunks b, HasCallStack)
=> (b -> a -> b) -> (b -> [a] -> b)
= ... repeatedly f
We will see how to define repeatedly
later, but, for now, think of it as
“foldl'
with some magic sprinkled on top”. If we run the code again, the
application will throw an exception almost immediately:
$ dd if=/dev/zero bs=1M count=100 | cabal run example3
(..)
example3: Unexpected thunk with context
["Int","(,)","Map","AppState"]
CallStack (from HasCallStack):
error, called at shared/Util.hs:22:38 in Util
repeatedly, called at app3/Main.hs:38:26 in main:Main
The essence of the nothunks
library is that we can check if a particular value
contains any thunks we weren’t expecting, and this is what repeatedly
is using
to make sure we’re not inadvertently introducing any thunks in the AppState
;
it’s this check that is failing and causing the exception. We get a
HasCallStack
backtrace telling us where we introduced that thunk, and – even
more importantly – the exception gives us a helpful clue about where the thunk
was:
"Int","(,)","Map","AppState"] [
This context tells us that we have an AppState
containing a Map
containing
tuples, all of which were in weak head normal form (not thunks), but the tuple
contained an Int
which was not in weak head normal form: a thunk.
From a context like this it is obvious what went wrong: although we are using a strict map, we have instantiated the map at a lazy pair type, and so although the map is forcing the pairs, it’s not forcing the elements of those pairs. Moreover, we get an exception the moment we introduce the thunk, which means that we can catch such regressions in our test suite. We can even construct minimal counter-examples that result in thunks, as we will see later.
Using nothunks
Before we look at how the library works, let’s first see how it’s used. In the
previous section we were using a magical function repeatedly
, but didn’t see
how we could define it. Let’s now look at this function:
repeatedly :: forall a b. (NoThunks b, HasCallStack)
=> (b -> a -> b) -> (b -> [a] -> b)
= go
repeatedly f where
go :: b -> [a] -> b
!b [] = b
go !b (a:as) =
go let !b' = f b a
in case unsafeNoThunks b' of
Nothing -> go b' as
Just thunk -> error . concat $ [
"Unexpected thunk with context "
show (thunkContext thunk)
, ]
The only difference between repeatedly
and foldl'
is the call to
unsafeNoThunks
, which is the function that checks if a given value contains any
unexpected thunks. The function is marked as “unsafe” because whether or not a
value is a thunk is not normally observable in Haskell; making it observable
breaks equational reasoning, and so this should only be used for debugging or
in assertions. Each time repeatedly
applies the provided function f
to update
the accumulator, it verifies that the resulting value doesn’t contain any
unexpected thunks; if it does, it errors out (in real code such a check would
only be enabled in test suites and not in production).
One point worth emphasizing is that repeatedly
reduces the value to weak head
normal form (WHNF) before calling unsafeNoThunks
. This is, of course, what
makes a strict fold-left strict
, and so repeatedly must do this to be a good
substitute for foldl'
. However, it is important to realize that if repeatedly
did not do that, the call to unsafeNoThunks
would trivially and immediately
report a thunk; after all, we have just created the f b a
thunk! Generally
speaking, it is not useful to call unsafeNoThunks
(or its IO
cousin noThunks
)
on values that aren’t already in WHNF.
In general, long-lived application state should never contain any unexpected
thunks, and so we can apply the same kind of pattern in other scenarios. For
example, suppose we have a server that is a thin IO layer on top of a mostly
pure code base, storing the application state in an IORef
. Here, too, we might
want to make sure that that IORef
never points to a value containing unexpected
thunks:
newtype StrictIORef a = StrictIORef (IORef a)
readIORef :: StrictIORef a -> IO a
StrictIORef v) = Lazy.readIORef v
readIORef (
writeIORef :: (NoThunks a, HasCallStack)
=> StrictIORef a -> a -> IO ()
StrictIORef v) !x = do
writeIORef (
check x
Lazy.writeIORef v x
check :: (NoThunks a, HasCallStack) => a -> IO ()
= do
check x <- noThunks [] x
mThunk case mThunk of
Nothing -> return ()
Just thunk ->
$ ThunkException
throw
(thunkContext thunk) callStack
Since check already lives in IO
, it can use noThunks
directly, instead of using
the unsafe pure wrapper; but otherwise this code follows a very similar
pattern: the moment we might introduce a thunk, we instead throw an exception.
One could imagine doing a very similar thing for, say, StateT
, checking for
thunks in put
:
newtype StrictStateT s m a = StrictStateT (StateT s m a)
deriving (Functor, Applicative, Monad)
instance (Monad m, NoThunks s)
=> MonadState s (StrictStateT s m) where
= StrictStateT $ get
get !s = StrictStateT $
put case unsafeNoThunks s of
Nothing -> put s
Just thunk -> error . concat $ [
"Unexpected thunk with context "
show (thunkContext thunk)
, ]
Minimal counter-examples
In some applications, there can be complicated interactions between the input
to the program and the thunks it may or may not create. We will study this
through a somewhat convoluted but, hopefully, easy-to-understand example.
Suppose we have a server that is processing two types of events, A
and B
:
data Event = A | B
deriving (Show)
type State = (Int, Int)
initState :: State
= (0, 0)
initState
update :: Event -> State -> State
A (a, b) = let !a' = a + 1 in (a', b)
update B (a, b)
update | a < 1 || b < 1 = let !b' = b + 1 in (a, b')
| otherwise = let b' = b + 2 in (a, b')
The server’s internal state consists of two counters, a
and b
. Each time we see
an A
event, we just increment the first counter. When we see a B
event,
however, we increment b
by 1
only if a
and b
haven’t reached 1
yet, and by 2
otherwise. Unfortunately, the code contains a bug: in one of these cases, part
of the server’s state is not forced and we introduce a thunk. (Disclaimer: the
code snippets in this blog post are not intended to be good examples of coding,
but to make it obvious where memory leaks are introduced. Typically, memory
leaks should be avoided by using appropriate data types, not by modifying
code.)
A minimal counter-example that will demonstrate the bug would therefore involve
two events A
and B
, in any order, followed by another B
event. Since we get an
exception the moment we introduce an exception, we can then use a framework
such as quickcheck-state-machine
to find bugs like this and construct such
minimal counter-examples.
Here’s how we might set up our test. Explaining how quickcheck-state-machine
(QSM) works is well outside the scope of this blog post; if you’re interested,
a good starting point might be An in-depth look at quickcheck-state-machine.
For this post, it is enough to know that in QSM we are comparing a real
implementation against some kind of model, firing off “commands” against both,
and then checking that the responses match. Here, both the server and the model
will use the update function, but the “real” implementation will use the
StrictIORef
type we introduced above, and the mock implementation will just use
the pure code, with no thunks check. Thus, when we compare the real
implementation against the model, the responses will diverge whenever the real
implementation throws an exception (caused by a thunk):
data T
type instance MockState T = State
type instance RealMonad T = IO
type instance RealHandles T = '[]
data instance Cmd T f hs where
Cmd :: Event -> Cmd T f '[]
data instance Resp T f hs where
-- We record any exceptions that occurred
Resp :: Maybe String -> Resp T f '[]
deriving instance Eq (Resp T f hs)
deriving instance Show (Resp T f hs)
deriving instance Show (Cmd T f hs)
instance NTraversable (Resp T) where
Resp ok) = pure (Resp ok)
nctraverse _ _ (
instance NTraversable (Cmd T) where
Cmd e) = pure (Cmd e)
nctraverse _ _ (
sm :: StrictIORef State -> StateMachineTest T
= StateMachineTest {
sm state = \(Cmd e) mock ->
runMock Resp Nothing, update e mock)
(= \(Cmd e) -> do
, runReal <- readIORef state
real <- try $ writeIORef state (update e real)
ex return $ Resp (checkOK ex)
= initState
, initMock = \_ -> Nil
, newHandles = \_ -> Just $
, generator At (Cmd A), At (Cmd B)]
elements [= \_ _ -> []
, shrinker = \_ -> writeIORef state initState
, cleanup
}where
checkOK :: Either SomeException () -> Maybe String
Left err) = Just (show err)
checkOK (Right ()) = Nothing checkOK (
(This uses the new Lockstep machinery in QSM that we introduced in the Munihac 2019 hackathon.)
If we run this test, we get the minimal counter-example we expect, along with
the HasCallStack
backtrace and the context telling us precisely that we have a
thunk inside a lazy pair:
*** Failed! Falsified (after 6 tests and 2 shrinks):
Commands
=
{ unCommands Command At { unAt = Cmd B } At { unAt = Resp Nothing } []
[ Command At { unAt = Cmd A } At { unAt = Resp Nothing } []
, Command At { unAt = Cmd B } At { unAt = Resp Nothing } []
,
]
}
...
Resp (Just "Thunk exception in context [Int,(,)]
called at shared/StrictIORef.hs:26:5 in StrictIORef
writeIORef, called at app5/Main.hs:71:37 in Main")
:/= Resp Nothing
The combination of a minimal counter-example, a clear context, and the backtrace, makes finding most such memory leaks almost trivial.
Under the hood
The core of the nothunks
library is the NoThunks
class:
-- | Check a value for unexpected thunks
class NoThunks a where
noThunks :: [String] -> a -> IO (Maybe ThunkInfo)
wNoThunks :: [String] -> a -> IO (Maybe ThunkInfo)
showTypeOf :: Proxy a -> String
data ThunkInfo = ThunkInfo {
thunkContext :: Context
}deriving (Show)
type Context = [String]
All of the NoThunks
class methods have defaults, so instances can be, and very
often are, entirely empty, or – equivalently – derived using DeriveAnyClass
.
The noThunks
function is the main entry point for application code,
and we have already seen it in use. Instances of NoThunks
, however,
almost never need to redefine noThunks
and can use the default
implementation, which we will take a look at shortly. Conversely,
wNoThunks
is almost never useful for application code but it’s where
most of the datatype-specific logic lives, and is used by the default
implementation of noThunks
; we will see a number of examples of it
below. Finally, showTypeOf
is used to construct a string
representation of a type when constructing the thunk contexts; it has a
default in terms of Generic
.
noThunks
Suppose we are checking if a pair contains any thunks. We should first
check if the pair itself is a thunk, before we pattern match on it.
After all, pattern matching on the pair would force it, and so if it had
been a thunk, we wouldn’t be able to see this any more. Therefore,
noThunks
first checks if a value itself is a thunk, and if it isn’t,
it calls wNoThunks
; the w
stands for WHNF: wNoThunks
is allowed to
assume (has as precondition) that its argument is not itself a thunk and
so can be pattern-matched on.
noThunks :: [String] -> a -> IO (Maybe ThunkInfo)
= do
noThunks ctxt x <- checkIsThunk x
isThunk if isThunk
then return $ Just ThunkInfo { thunkContext = ctxt' }
else wNoThunks ctxt' x
where
ctxt' :: [String]
= showTypeOf (Proxy @a) : ctxt ctxt'
Note that when wNoThunks
is called, the (string representation of)
type a
has already been added to the context.
wNoThunks
Most of the datatype-specific work happens in wNoThunks
; after all, we
can now pattern match. Let’s start with a simple example, a manual
instance for a type of strict pairs:
data StrictPair a b = StrictPair !a !b
instance (NoThunks a, NoThunks b)
=> NoThunks (StrictPair a b) where
= "StrictPair"
showTypeOf _ StrictPair x y) = allNoThunks [
wNoThunks ctxt (
noThunks ctxt x
, noThunks ctxt y ]
Because we have verified that the pair itself is in WHNF, we can just
extract both components, and recursively call noThunks
on both of
them. Function allNoThunks
is a helper defined in the library that
runs a bunch of thunk checks, stopping at the first one that reports a
thunk.
Occasionally we do want to allow for selected thunks. For example,
suppose we have a set of integers with a cached total
field, but we
only want to compute that total if it’s actually used:
data IntSet = IntSet {
toSet :: !(Set Int)
-- | Total
--
-- Intentionally /not/ strict:
-- Computed when needed (and then cached)
total :: Int
,
}deriving (Generic)
Since total
must be allowed to be a thunk, we skip it in wNoThunks
:
instance NoThunks IntSet where
IntSet xs _total) = noThunks ctxt xs wNoThunks ctxt (
Such constructions should probably only be used sparingly; if the
various operations on the set are not carefully defined, the set might
hold on to all kinds of data through that total
thunk. Code like that
needs careful thought and careful review.
Generic instance
If no implementation is given for wNoThunks
, it uses a default based
on GHC generics. This means that for types that implement Generic
,
deriving a NoThunks
instance is often as easy as in the AppState
example above, simply saying:
data AppState = AppState {
total :: !Int
indiv :: !(Map Char Stats)
,
}deriving (Show, Generic, NoThunks)
Many instances in the library itself are also defined using the generic instance; for example, the instance for (default, lazy) pairs is just:
instance (NoThunks a, NoThunks b) => NoThunks (a, b)
Deriving-via wrappers
Sometimes, we don’t want the default behavior implemented by the generic
instance, but defining an instance by hand can be cumbersome. The
library therefore provides a few newtype
wrappers that can be used to
conveniently derive custom instances. We will discuss three such
wrappers here; the library comes with a few more.
Only check for WHNF
If all you want to do is check if a value is in weak head normal form
(ie, check that it is not a thunk itself, although it could contain
thunks), you can use OnlyCheckIsWhnf
. For example, the library defines
the instance for Bool
as:
deriving via OnlyCheckWhnf Bool
instance NoThunks Bool
For Bool
, this is sufficient: when a boolean is in weak head normal
form, it won’t contain any thunks. The library also uses this for
functions:
deriving via OnlyCheckWhnfNamed "->" (a -> b)
instance NoThunks (a -> b)
(Here, the Named
version allows you to explicitly define the string
representation of the type to be included in the thunk contexts.) Using
OnlyCheckWhnf
for functions means that any values in the function
closure will not be checked for thunks. This is intentional and a
subtle design decision; we will come back to this in the section on
permissible thunks below.
Skipping some fields
For types such as IntSet
where most fields should be checked for
thunks, but some fields should be skipped, we can use AllowThunksIn
:
deriving via AllowThunksIn '["total"] IntSet
instance NoThunks IntSet
This can be handy for large record types, where giving the instance by
hand is cumbersome and, moreover, can easily get out of sync when
changes to the type (for example, a new field) are not reflected in the
definition of wNoThunks
.
Inspecting the heap directly
Instead of going through the class system and the NoThunks
instances,
we can also inspect the GHC heap directly. The library makes this
available through the InspectHeap
newtype, which has an instance:
instance Typeable a => NoThunks (InspectHeap a) where
...
Note that this does not depend on a NoThunks
instance for a
. We can
use this like any other deriving-via wrappers, for example:
deriving via InspectHeap TimeOfDay
instance NoThunks TimeOfDay
The advantage of such an instance is that we do not require instances
for any nested types; for example, although TimeOfDay
has a field of
type Pico
, we don’t need a NoThunks
instance for it.
The disadvantage is that we lose all compositionality. If there are
any types nested inside for which we want to allow for thunks, we have
no way of overriding the behaviour of the no-thunks check for those
types. Since we are inspecting the heap directly, and the runtime system
does not record any type information, any NoThunks
instances for those
types are irrelevant and we will report any thunks that it finds.
Moreover, when we do find such a thunk, we cannot report a useful
context, because – again – we have no type information. If noThunks
finds a thunk deeply nested inside some T
(whose NoThunks
instance
was derived using InspectHeap
), it will merely report "...", "T"
as
the context (plus perhaps any context leading to T
itself).
Permissible thunks
Some data types inherently depend on the presence of thunks. For
example, the Seq
type defined in
Data.Sequence
internally uses a finger tree. Finger trees are a specialized data type
introduced by Ralf Hinze and Ross
Paterson; for
our purposes, all you need to know is that finger trees make essential
use of thunks in their spines to achieve their asymptotic complexity
bounds. This means that the NoThunks
instance for Seq
must allow
for thunks in the spine of the data type, although it should still
verify that there are no thunks in any of the elements in the
sequence. This is easy enough to do; the instance in the library is:
instance NoThunks a => NoThunks (Seq a) where
= "Seq"
showTypeOf _ = noThunksInValues ctxt . toList wNoThunks ctxt
Here, noThunksInValues
is a helper function that checks a list of
values for thunks, without checking the list itself.
However, the existence of types such as Seq
means that the
non-compositionality of InspectHeap
can be a big problem. It is also
the reason that for functions we merely check if the function is in weak
head normal form. Although the function could have thunks in its
closure, we don’t know what their types are. We could check the
function closure for thunks (using InspectHeap
), but if we did, and
that closure contained, say, a Seq
among its values, we might
incorrectly report an unexpected thunk. Because it is more problematic
if the test reports a bug when there is none than when an actual bug is
not reported, the library opts to check only functions for WHNF. If in
your application you store functions, and it is important that these
functions are checked for thunks, then you can define a custom newtype
around a -> b
with a NoThunks
instance defined using InspectHeap
(but only if you are sure that your functions don’t refer to types that
must be allowed to have thunks).
Comparison with the heap/stack limit size method
In 2016, Neil Mitchell gave a very nice talk at
HaskellX,
where he presented a method for finding memory leaks (he has also
written a blog
post
on the topic). The essence of the method is to run your test suite with
much reduced stack and heap limits, so that if there is a memory leak in
your code, you will notice it before it hits production. He then
advocates the use of the -xc
runtime flag to get a stack trace when
such a “stack limit exhausted” exception is thrown.
The technique advocated in this post has a number of advantages. We get
an exception the moment a thunk is created, so the stack trace we get
is often much more useful. Together with the context reported by
noThunks
, finding the problem is usually trivial. Interpreting the
stack reported by -xc
can be more difficult, because this exception is
thrown when the limit is exhausted, which may or may not be related to
the code that introduced the leak in the first place. Moreover, since
the problem only becomes known when the limit is exhausted, minimal
counter-examples are out of the question. It can also be difficult to
pick a suitable value for the limit; how much memory does the test site
actually need, and what would constitute a leak? Finally, -xc
requires your program to be compiled with profiling enabled, which means
you’re debugging something different to what you’d run in production,
which is occasionally problematic.
Having said all that, the nothunks
method does not replace the
heap/stack limit method, but complements it. The nothunks
approach is
primarily useful for finding space leaks in pieces of data where it’s
clear that we don’t want any thunk build-up, typically long-lived
application state. It is less useful for finding more ‘local’ space
leaks, such as a function accumulator not being updated strictly. For
finding such leaks, setting stack/heap limits is still a useful
technique.
Conclusions
Long-lived application data should, typically, not have any thunk
build-up. The nothunks
library can verify this through the
noThunks
and unsafeNoThunks
function calls, which check if the
supplied argument contains any unexpected thunks. These checks can then
be used in assertions to check that no thunks are created. This means
that if we do introduce a thunk by mistake, we get an immediate test
failure, along with a callstack to the place where the thunk was created
as well as a context providing a helpful hint on where the thunk is.
Together with a testing framework, this makes memory leaks much easier
to debug and avoid. Indeed, they have mostly been a thing of the past in
our work on Cardano since we started using this approach.