In this blog post we will introduce a new open source Haskell library called
debuggable
, which provides various utilities designed to
make it easier to debug your applications. Some of these are intended for use
during actual debugging, others are designed to be a regular part of your
application, ready to be used if and when necessary.
Non-interleaved output
Ever see output like this when debugging concurrent applications?
ATnhdi st hiiss ai sm eas smaegses afgreo mf rtohme tfhier sste ctohnrde atdh
read
AndT htihsi si si sa am emsessasgaeg ef rformo mt hteh ef isrescto ntdh rteharde
ad
TAhnids tihsi sa imse sas amgees sfargoem ftrhoem ftihres ts etchorneda dt
hread
The problem is that concurrent calls to putStrLn
can result in interleaved
output. To solve this problem, debuggable
offers
Debug.NonInterleavedIO
, which provides variants of
putStrLn
and friends, as well as trace
and its variants, all of which can
safely be called concurrently without ever resulting in interleaved output.
For example:
import Debug.NonInterleavedIO qualified as NIIO
useDebuggable :: IO ()
= do
useDebuggable
concurrently_10 $ do
( replicateM_ "This is a message from the first thread"
NIIO.putStrLn 100_000
threadDelay
)10 $ do
( replicateM_ "And this is a message from the second thread"
NIIO.putStrLn 100_000
threadDelay )
If we run this as-is, we will only see
niio output to /tmp/niio2418318-0
on the terminal; inspecting /tmp/niio2418318-0
will show
And this is a message from the second thread
This is a message from the first thread
And this is a message from the second thread
This is a message from the first thread
...
If you want to send the output to a specific file (or /dev/stdout
for output
to the terminal), you can set the NIIO_OUTPUT
environment variable.
Provenance
Provenance is about tracking of what was called when and where.
Call-sites
Consider the following example:
f1 :: IO ()
= f2
f1
f2 :: HasCallStack => IO ()
= f3
f2
f3 :: HasCallStack => IO ()
= putStrLn $ prettyCallStack callStack f3
The callstack we get from this example looks something like this:
CallStack (from HasCallStack):
f3, called at Demo/Provenance.hs:15:6 in ..
f2, called at Demo/Provenance.hs:12:6 in ..
Callstacks are awesome, and a huge help during debugging, but there are some minor issues with this example:
- Personally, this has always felt a bit “off by one” to me: the first entry
tells us that we are in
f3
, but we were called from line 15, which isf2
; likewise, the second entry in the stack tells us that we are inf2
, but we were called from line 12, which isf1
. Not a huge deal, but arguably a bit confusing. (See also GHC ticket #25546: MakeHasCallStack
include the caller.) - Somewhat relatedly, when we are in
f3
, and ask for aCallStack
, being told that we are inf3
is not particularly helpful (we knew that already). - Finally, it is sometimes useful to have just the “first” entry in the callstack; “we were called from line such and such, which is function so and so”.
For this reason, Debug.Provenance provides a CallSite
abstraction
g1 :: IO ()
= g2
g1
g2 :: HasCallStack => IO ()
= g3
g2
g3 :: HasCallStack => IO ()
= print callSite g3
This outputs:
g2 -> g3 (Demo/CallSite.hs:31:6)
where line 31 is the call to g3
in g2
. Due to the (alleged) “off-by-one”,
both g2
and g3
must be given a HasCallStack
constraint, otherwise we get
{unknown} -> g3 (Demo/CallSite.hs:31:6)
when g2
lacks the constraint, or
{unknown} -> {unknown} ()
when g3
does. There is also a variant callSiteWithLabel
, which results in
output such as
g2 -> g3 (Demo/CallSite.hs:31:6, "foo")
Invocations
Sometimes we are not so much interested in where we are called from, but how often a certain line in the source is called. Debug.Provenance offers “invocations” to track this:
g1 :: IO ()
= replicateM_ 2 g2
g1
g2 :: HasCallStack => IO ()
= do
g2 print =<< newInvocation
2 g3
replicateM_
g3 :: HasCallStack => IO ()
= print =<< newInvocation g3
This results in output such as
g2 (Demo/Invocation.hs:30:15) #1
g3 (Demo/Invocation.hs:34:16) #1
g3 (Demo/Invocation.hs:34:16) #2
g2 (Demo/Invocation.hs:30:15) #2
g3 (Demo/Invocation.hs:34:16) #3
g3 (Demo/Invocation.hs:34:16) #4
We see the first call to g2
, then the first and second call to g3
, then
the second call to g2
, and finally the third and fourth call to h3
.
When debugging problems such as deadlocks, it is often useful to insert
putStrLn
statements like this:
f4 :: IO ()
= do
f4 putStrLn "f4:1"
-- f4 does something ..
putStrLn "f4:2"
-- f4 does something else ..
putStrLn "f4:3"
This pattern too can be made a bit simpler by using invocations:
g4 :: HasCallStack => IO ()
= do
g4 print =<< newInvocation
-- f4 does something ..
print =<< newInvocation
-- f4 does something else ..
print =<< newInvocation
Resulting in output such as
g4 (Demo/Invocation.hs:48:15) #1
g4 (Demo/Invocation.hs:50:15) #1
g4 (Demo/Invocation.hs:52:15) #1
Scope
The definition of g4
above is still a little clunky, especially if we also
want to include other output than just the invocation itself. We can do better:
import Debug.NonInterleavedIO.Scoped qualified as Scoped
g4 :: HasCallStack => IO ()
= do
g4 "start"
Scoped.putStrLn -- f4 does something ..
"middle"
Scoped.putStrLn -- f4 does something else ..
"end" Scoped.putStrLn
outputs
[g4 (Demo/Scope.hs:21:5) #1] start
[g4 (Demo/Scope.hs:23:5) #1] middle
[g4 (Demo/Scope.hs:25:5) #1] end
As the name suggests, though, there is more going on here than simply a more
convenient API: Debug.Provenance.Scope
offers a combinator
called scoped
for scoping invocations:
g1 :: IO ()
= g2
g1
g2 :: HasCallStack => IO ()
= scoped g3
g2
g3 :: HasCallStack => IO ()
= scoped g4 g3
This results in
[g4 (Demo/Scope.hs:29:5) #1, g3 (Demo/Scope.hs:25:6) #1, g2 (Demo/Scope.hs:22:6) #1] start
[g4 (Demo/Scope.hs:31:5) #1, g3 (Demo/Scope.hs:25:6) #1, g2 (Demo/Scope.hs:22:6) #1] middle
[g4 (Demo/Scope.hs:33:5) #1, g3 (Demo/Scope.hs:25:6) #1, g2 (Demo/Scope.hs:22:6) #1] end
Threads
The counters that are part of an Invocation
can be very useful to
cross-reference output messages from multiple threads. Continuing with the g4
example we introduced in the section on Scope, suppose we have
concurrent :: IO ()
= concurrently_ g4 g4 concurrent
we might get output like this:
[g4 (Demo/Scope.hs:32:5) #1] start
[g4 (Demo/Scope.hs:32:5) #2] start
[g4 (Demo/Scope.hs:34:5) #1] middle
[g4 (Demo/Scope.hs:34:5) #2] middle
[g4 (Demo/Scope.hs:36:5) #1] end
[g4 (Demo/Scope.hs:36:5) #2] end
(where the scheduling between the two thread might be different, of course).
Scope is always thread local, but debuggable
provides a way to explicitly
inherit the scope of a parent thread in a child thread:
h1 :: IO ()
= h2
h1
h2 :: HasCallStack => IO ()
= scoped h3
h2
h3 :: HasCallStack => IO ()
= scoped $ do
h3 <- myThreadId
tid
concurrently_>> g4)
(inheritScope tid >> g4) (inheritScope tid
results in
[g4 (Demo/Scope.hs:34:5) #1, h3 (Demo/Scope.hs:50:6) #1, h2 (Demo/Scope.hs:47:6) #1] start
[g4 (Demo/Scope.hs:34:5) #2, h3 (Demo/Scope.hs:50:6) #1, h2 (Demo/Scope.hs:47:6) #1] start
[g4 (Demo/Scope.hs:36:5) #1, h3 (Demo/Scope.hs:50:6) #1, h2 (Demo/Scope.hs:47:6) #1] middle
[g4 (Demo/Scope.hs:36:5) #2, h3 (Demo/Scope.hs:50:6) #1, h2 (Demo/Scope.hs:47:6) #1] middle
[g4 (Demo/Scope.hs:38:5) #1, h3 (Demo/Scope.hs:50:6) #1, h2 (Demo/Scope.hs:47:6) #1] end
[g4 (Demo/Scope.hs:38:5) #2, h3 (Demo/Scope.hs:50:6) #1, h2 (Demo/Scope.hs:47:6) #1] end
Callbacks
Suppose we have some functions which take another function, a callback, as argument, and invoke that callback at some point:
f1 :: HasCallStack => (Int -> IO ()) -> IO ()
= f2 k
f1 k
f2 :: HasCallStack => (Int -> IO ()) -> IO ()
= scoped $ k 1 f2 k
Let’s use this example callback function:
g1 :: HasCallStack => Int -> IO ()
= g2 n
g1 n
g2 :: HasCallStack => Int -> IO ()
= Scoped.putStrLn $ "n = " ++ show n ++ " at " ++ prettyCallStack callStack g2 n
and invoke f1
as follows:
withoutDebuggable :: HasCallStack => IO ()
= f1 g1 withoutDebuggable
This outputs:
[g2 (Demo/Callback.hs:26:8) #1, f2 (Demo/Callback.hs:20:8) #1]
n = 1 at CallStack (from HasCallStack):
g2, called at Demo/Callback.hs:23:8 in ..
g1, called at Demo/Callback.hs:29:24 in ..
withoutDebuggable, called at Demo.hs:25:36 in ..
Confusingly, this callstack does not include any calls to f1
or f2
. This
happens because the call to k
in f2
does not pass the current CallStack
;
instead we see the CallStack
as it was when we defined g1
.
For callbacks like this it is often useful to have two pieces of information:
the CallStack
that shows how the callback is actually invoked, and the
CallSite
where the callback was defined.
Debug.Provenance.Callback
provides a Callback
abstraction that does exactly this. A Callback m a b
is essentially a function
a -> m b
, modulo treatment of the CallStack
. Let’s change f1
and f2
to
take a CallBack
instead:
h1 :: HasCallStack => Callback IO Int () -> IO ()
= h2 k
h1 k
h2 :: HasCallStack => Callback IO Int () -> IO ()
= scoped $ invokeCallback k 1 h2 k
If we now use this top-level function
useDebuggable :: HasCallStack => IO ()
= h1 (callback g1) useDebuggable
we get a much more useful CallStack
:
[g2 (Demo/Callback.hs:26:8) #1, h2 (Demo/Callback.hs:39:8) #1]
n = 1 at CallStack (from HasCallStack):
g2, called at Demo/Callback.hs:23:8 in ..
g1, called at Demo/Callback.hs:42:30 in ..
callbackFn, called at src/Debug/Provenance/Callback.hs:57:48 in ..
invoking callback defined at useDebuggable (Demo/Callback.hs:42:21), called at ..
h2, called at Demo/Callback.hs:36:8 in ..
h1, called at Demo/Callback.hs:42:17 in ..
useDebuggable, called at Demo.hs:26:36 in ..
Alternative: profiling backtraces
in addition to HasCallStack
-style backtraces, there may also be other types of
backtraces available, depending on how we build and how we run the code (we
discuss some of these in the context of exception handling in episode 29 of the
Haskell Unfolder). The most important of these is probably the
profiling (cost centre) backtrace.
We can request the “current” callstack with currentCallstack
, and the
callstack attached to an object (“where was this created”) using whoCreated
.
This allows us to make similar distinctions that we made in Callback
, for
example:
f1 :: (Int -> IO ()) -> IO ()
= do
f1 k <- whoCreated k
cs putStrLn $ "f1: invoking callback defined at " ++ show (cs)
f2 k
f2 :: (Int -> IO ()) -> IO ()
= k 1
f2 k
g1 :: Int -> IO ()
= g2 n
g1 n
g2 :: Int -> IO ()
= do
g2 n <- currentCallStack
cs putStrLn $ "n = " ++ show n ++ " at " ++ show cs
This does require the code to be compiled with profiling enabled. The profiling
callstacks are sometimes more useful than HasCallstack
callstacks, and sometimes
worse; for example, in
demo :: Maybe Int -> IO ()
Nothing = f1 (\x -> g1 x)
demo Just i) = f1 (\x -> g1 (x + i)) demo (
the function defined in the Just
case will have a useful profiling callstack,
but since the function defined in the Nothing
case is entirely static (does
not depend on any runtime info), its callstack is reported as
["MAIN.DONT_CARE (<built-in>)"]
It would be useful to extend debuggable
with support for both types of
backtraces in a future release.
Performance considerations
Adding permanent HasCallStack
constraints to functions does come at a slight
cost, since they correspond to additional arguments that must be passed at
runtime. For most functions this is not a huge deal; personally, I consider some
well-placed HasCallStack
constraints part of designing with debugging in mind.
That said, you will probably want to avoid adding HasCallStack
constraints to
functions that get called repeatedly in tight inner loops; similar
considerations also apply to the use of the Callback
abstraction.
Conclusions
Although debuggable
is a small library, it offers some functionality that has
proven quite useful in debugging applications, especially concurrent ones.
We can probably extend it over time to cover more use cases; “design for
debuggability” is an important principle, and is made easier with proper
library support. Contributions and comments are welcome!
As a side note, the tracing infrastructure of debuggable
can also be combined
with the recover-rtti package, which implements some dark magic
to recover runtime type information by looking at the heap; in particular, it
offers
anythingToString :: forall a. a -> String
which can be used to print objects without having a Show a
instance available
(though this is not the only use of recover-rtti
). The only reason that
debuggable
doesn’t provide explicit support for this is that the dependency
footprint of recover-rtti
is a bit larger.