Recently IOG and QuviQ released a
new library for testing stateful systems called
quickcheck-dynamic
.
In this blog post we will take a look at this library, and how it relates to
quickcheck-state-machine
.
We will focus on the state machine testing aspect; quickcheck-dynamic
also has
support for dynamic logic, but we will not discuss that here.
Specifically, we will consider how we might do lockstep-style testing with
quickcheck-dynamic
. This is a particular approach to testing that we described
in great detail in an earlier blog post, An in-depth look at
quickcheck-state-machine. We will recap the
general philosophy in this new blog post, but we will focus here on the hows,
not necessarily the whys; it might be helpful to be familiar with the previous
blog post to understand the larger context of what we’re trying to achieve.
We have developed a library called
quickcheck-lockstep
which
builds on top of quickcheck-dynamic
to provide an abstraction called
InLockstep
which provides support for lockstep-style testing. In this blog post
we will describe this library in two parts:
- In the first half we will show a test author’s perspective of how to use the abstraction.
- In the second half we show how we can implement the abstraction on top
of
quickcheck-dynamic
.
Part one will suffice for users who simply want to use quickcheck-lockstep
.
Part two serves two purposes:
- It will give an illustrated example of how to use
quickcheck-dynamic
for state based testing. We will use most of the core features of the library to implement our abstraction on top of it. - Since the goal is to provide the end user with a very similar style of testing
that we previously provided for
quickcheck-state-machine
(see specifically Test.StateMachine.Lockstep.NAry), the implementation will serve as a good test testbed for comparing the two libraries.
NOTE: quickcheck-lockstep
currently depends on an as-yet unreleased version
of quickcheck-dynamic
. Once this is released, we will also make a Hackage
release of quickcheck-lockstep
; at the moment, please refer to the
GitHub repository instead.
The example that we discuss in part 1 is also available in that repository,
as an example use case.
Part 1: Lockstep-style testing
In this section we will show how we can do lockstep-style testing using a new
abstraction called InLockstep
. In Part 2 we will see how we can implement
this new abstraction.
Testing philosophy
Lockstep-style testing of stateful systems is quite simple:
- We have a stateful API that we want to test; this could be a database, a file system, etc.
- We will reify that stateful API as a datatype with constructors for each of the API calls.
- We then write two interpreters for this API: one against the system we want to test, and one against a model.
- We regard the system as a black box: we cannot see the internal state of the database, the contents of the file system, etc. The only thing we can see is the results of the API calls.
- Here is why we call this lockstep testing: to test the system, we generate an arbitrary sequence of commands, then execute those against the system under test and against the model. The only thing we check at each point is that both systems return the same results, modulo observability.
- We cannot insist on exactly the same results: for example, opening a file might result in a file handle, which the model cannot reproduce. The model must be allowed to have its own type for “model handles” that is different from real handles, and we do not want to try and compare those to real handles. If the system somehow returns the “wrong” handle, then this will become evident later in the test when we use that handle.
Running example
Our running example will be a file system: it will be precisely the same
example we used previously when discussing
quickcheck-state-machine
: same API, same model,
same properties we want to test, same considerations regarding labelling tests
and shrinking them—but a different testing framework. If you want to follow
along, the code is available on
GitHub.
The model is a simple model for a file system. It consists of the following functions:
mMkDir :: Dir -> Mock -> (Either Err () , Mock)
mOpen :: File -> Mock -> (Either Err MHandle , Mock)
mWrite :: MHandle -> String -> Mock -> (Either Err () , Mock)
mClose :: MHandle -> Mock -> (Either Err () , Mock)
mRead :: File -> Mock -> (Either Err String , Mock)
StateModel
implementation
StateModel
is the central class in quickcheck-dynamic
for stateful testing.
Instances of StateModel
define the datatype that describes the API, how to
generate values of that datatype, how to interpret it, etc. When using the
InLockstep
infrastructure however, we only define the API datatype; everything
else is delegated to InLockstep
.
We will define the type of our model as
data FsState = FsState Mock Stats
initState :: FsState
= FsState Mock.emptyMock initStats initState
Here, Mock
is the mock file system implementation, and Stats
keeps some
statistics about the running test. We will see why we need this statistics
when we discuss labelling.
Let’s now define two type synonyms. First, one of the type of actions:
type FsAct a = Action (Lockstep FsState) (Either Err a)
Here, Action
is the associated data type from StateModel
, and Lockstep
is an opaque datatype from the lockstep infrastructure. All our actions can
return errors, and we want to make sure that the model and the real system
agree on what those errors are. So, the result of an FsAct
is always
of the form Either Err a
, where Err
is also defined in the model.
Secondly, the type of variables:
type FsVar a = ModelVar FsState a
Variables are an essential part of stateful testing: a variable allows us to
refer back to the result of a previously executed command. For example, if we
want to write to a file, we need to generate an action that says “write this
string to the handle that you got when you opened that file a while ago”.
ModelVar
are a special kind of variables provided by the lockstep
infrastructure; we will discuss them in more detail later.
We can now give the StateModel
instance:
type FsVar a = ModelVar FsState a
type FsAct a = Action (Lockstep FsState) (Either Err a)
instance StateModel (Lockstep FsState) where
data Action (Lockstep FsState) a where
MkDir :: Dir -> FsAct ()
Open :: File -> FsAct (IO.Handle, File)
Write :: FsVar IO.Handle -> String -> FsAct ()
Close :: FsVar IO.Handle -> FsAct ()
Read :: Either (FsVar File) File -> FsAct String
= Lockstep.initialState initState
initialState = Lockstep.nextState
nextState = Lockstep.precondition
precondition = Lockstep.arbitraryAction
arbitraryAction = Lockstep.shrinkAction shrinkAction
Some comments:
Write
andClose
both take a variable to a handle, rather than an actual handle. This is what enables us to refer the handles that we got from previous commands.- In both cases, the type of that variable is
FsVar IO.Handle
, but the model implementation requires mock handles instead; we will see how that is resolved in the next section when we discuss relating results from the real system to model results. Open
returns the file path of the file it just opened along with the handle, andRead
takes either a concrete file path as an argument or a variable to such a file path (e.g., one that might have been returned byOpen
). This allows us to express “read from the same file that you opened previously in the test”; see the section on Dependencies between commands from the previous post why this can lead to better (more minimal) counter examples.- The lockstep infrastructure provides default implementation for the methods of
StateModel
. In many cases you can just them as-is, like we did here, but of course you don’t have to. For example, the default precondition isn’t always strong enough.
From real results to model results
When we open a file in the real file system, we get an IO.Handle
, or possibly
an exception. In the model however we have
mOpen :: File -> Mock -> (Either Err MHandle, Mock)
We can map the exception to an Err
, so that’s not a problem, but we cannot
map an IO.Handle
to an MHandle
or vice versa: we want to allow the model
to return something of a different type here.
The Action
datatype from quickcheck-dynamic
is a GADT, where the type index
describes the result of the action. For example, consider this method from
the StateModel
class:
postcondition :: (state, state) -> Action state a -> LookUp m -> Realized m a -> m Bool
This method is the check that quickcheck-dynamic
does every after action. It
has the following parameters:
- The before and after state of the model
- The action that was executed
- A way to look up the values of any variables in those actions
- The result of the action in the system under test
The type of the result is Realized m a
; this is an abstraction introduced in
quickcheck-dynamic
2.0 which allows to run the same tests with different test
execution backends; for example, we might run our tests in the real IO monad, or
in an IO monad simulator. This is
orthogonal to the abstractions provided by InLockstep
: no matter the test
execution backend, we will always run against the same model. For our purposes
(and this will be true for most lockstep-style tests1), we will exclusively run our tests
in ReaderT r IO
, where quickcheck-dynamic
already defines for us that
Realized (ReaderT r IO) a = a
So for the purposes of this blogpost, whenever you see Realized m a
, you can
translate that to simply a
in your head.
In lockstep-style testing, we want to compare that result of type a
to the
response from the model but, as we saw, the model might return something of a
slightly different type. The InLockstep
class therefore introduces an
associated data type called ModelValue
; the idea is that whenever the system
under test returns something of type a
(technically, Realized m a
), we
expect the model to return a result of type ModelValue a
.
As before, we will define a type synonym:
type FsVal a = ModelValue FsState a
Here’s the definition for FsState
:
instance InLockstep FsState where
data ModelValue FsState a where
MHandle :: Mock.MHandle -> FsVal IO.Handle
-- Rest is regular:
MErr :: Err -> FsVal Err
MFile :: File -> FsVal File
MString :: String -> FsVal String
MUnit :: () -> FsVal ()
MEither :: Either (FsVal a) (FsVal b) -> FsVal (Either a b)
MPair :: (FsVal a, FsVal b) -> FsVal (a, b)
-- .. other members of InLockstep elided
We see that an FsVal a
is just a wrapper around an a
, unless that
a
is an IO.Handle
in which case FsVal IO.Handle
instead wraps a
Mock.MHandle
.
Recall that we defined
type FsVar a = ModelVar FsState a
We can now be more precise: a ModelVar s a
is a variable to a
ModelValue s a
.
Comparing values
ModelValue
allows the model to return something of a different type than the
implementation, but when we compare the two, we need something of the same
type.2 InLockstep
therefore
defines a second associated type Observable
, which is the observable result.
The definition is similar but a bit simpler:
type FsObs a = Observable FsState a
instance InLockstep FsState where
data Observable FsState a where
OHandle :: FsObs IO.Handle
OId :: (..) => a -> FsObs a
OEither :: Either (FsObs a) (FsObs b) -> FsObs (Either a b)
OPair :: (FsObs a, FsObs b) -> FsObs (a, b)
-- .. other members of InLockstep elided
This follows a similar structure as ModelValue
, with two differences:
- In the case of a handle, we don’t observe anything. If the system (or the model) returns the wrong handle, we cannot notice this when the open a file; we will only notice it later when we try to read from that file.
- In the case of
ModelValue
, we need a guarantee that if we have a value ofFsVal IO.Handle
, that this is really anMock.MHandle
. We do not need that guarantee forObservable
, and so it suffices to define a single constructorOId
that can be used for any type at all where the model and the system have a result of the same type.
We also have to explain how to translate from mock results to observable results:
instance InLockstep FsState where
observeModel :: FsVal a -> FsObs a
= \case
observeModel MHandle _ -> OHandle
MErr x -> OId x
MString x -> OId x
MUnit x -> OId x
MFile x -> OId x
MEither x -> OEither $ bimap observeModel observeModel x
MPair x -> OPair $ bimap observeModel observeModel x
-- .. other members of InLockstep elided
We have to do the same for results from the system under test, but we will see that when we discuss actually running the tests. This is a bit of boilerplate, but not difficult to write.
Interpreter for the model
We can now write the interpreter for the model: a function that takes a
valid from our reified API, calls the corresponding functions from the model,
and then wraps the result in the appropriate constructors of ModelValue
:
runMock ::
ModelLookUp FsState
-> Action (Lockstep FsState) a
-> Mock -> (FsVal a, Mock)
= \case
runMock lookUp MkDir d -> wrap MUnit . Mock.mMkDir d
Open f -> wrap (mOpen f) . Mock.mOpen f
Write h s -> wrap MUnit . Mock.mWrite (getHandle $ lookUp h) s
Close h -> wrap MUnit . Mock.mClose (getHandle $ lookUp h)
Read f -> wrap MString . Mock.mRead (either (getFile . lookUp) id f)
where
wrap :: (a -> FsVal b) -> (Either Err a, Mock) -> (FsVal (Either Err b), Mock)
= first (MEither . bimap MErr f)
wrap f
mOpen :: File -> Mock.MHandle -> FsVal (IO.Handle, File)
= MPair (MHandle h, MFile f)
mOpen f h
getHandle :: ModelValue FsState IO.Handle -> Mock.MHandle
getFile :: ModelValue FsState File -> File
MHandle h) = h
getHandle (MFile f) = f getFile (
The only slightly non-trivial thing here is that when we encounter a command
with variables, we need to resolve those variables. InLockstep
gives us
a function of type ModelLookUp FsState
, which allows us to resolve any
variable we see (the default InLockstep
precondition guarantees that this
resolution must always succeed). The result of looking up a variable of type
a
will be a value of type FsVal a
; we then need to match on that to extract
the wrapped value. In getHandle
we see why it’s so important that
a FsVal IO.Handle
must contain a mock handle, rather than an IO.Handle
.
With the interpreter defined, we can complete the next method definition of
InLockstep
:
instance InLockstep FsState where
modelNextState :: forall a.
LockstepAction FsState a
-> ModelLookUp FsState
-> FsState -> (FsVal a, FsState)
FsState mock stats) =
modelNextState action lookUp ($ runMock lookUp action mock
auxStats where
auxStats :: (FsVal a, Mock) -> (FsVal a, FsState)
=
auxStats (result, state') FsState state' $ updateStats action result stats)
(result,
-- .. other members of InLockstep elided
All we do here is call the interpreter we just wrote, and then additionally update the statistics (discussed below).
Variables
As discussed above, variables allow us to refer back to the results of
previously executed commands. We have been glossing over an important detail,
however. Recall the types of Open
and Close
(with the FsAct
type synonym
expanded):
Open :: File -> Action .. (Either Err (IO.Handle, File))
Close :: FsVar IO.Handle -> Action .. (Either Err (IO.Handle, ()))
The result of opening a file is either an error, or else a pair of a handle and
a filepath. In quickcheck-dynamic
, we get a single variable for the
execution of each command, and this is (therefore) true also for the lockstep
infrastructure.3 So, after opening a file, we have a
variable of type Either Err (IO.Handle, File)
, but we don’t want a variable of
that type as the argument to Close
: instead, we want a variable to a
IO.Handle
. Most importantly, we want to rule out the possibility of trying to
a close a file that we never managed to open in the first place; such a test
would be nonsensical.4
One of the most important features that the lockstep infrastructure adds on top
of core quickcheck-dynamic
is a concept of variables with a Functor
-esque
structure: they support an operation that allows us to change the type of that
variable.5 The key
datatype is a “generalized variable” GVar
; the intuition is that a GVar
of
type y
is actually a Var
of some other type x
, bundled with a
function6
from x -> Maybe y
:
data GVar :: Type -> Type where -- not the real definition
GVar :: Typeable x => Var x -> (x -> Maybe y) -> GVar y
For technical reasons,7 this
doesn’t quite work. Instead of that function x -> Maybe y
, we instead have
essentially a small DSL for defining such functions:
data Op a b where
OpId :: Op a a
OpFst :: Op (a, b) a
OpSnd :: Op (b, a) a
OpLeft :: Op (Either a b) a
OpRight :: Op (Either b a) a
OpComp :: Op b c -> Op a b -> Op a c
This DSL can be used to extract the left or right coordinate of a pair, as well
as to pattern match on an Either
. This will suffice for many test cases but
not all, so GVar
generalizes over the exact choice of DSL:
data GVar op f where
GVar :: Typeable x => Var x -> op x y -> GVar op y
InLockstep
has an associated type family ModelOp
which records the choice of
DSL. It defaults to Op
, which is just fine for our running example. We do
have to specify how to execute this DSL against model values by giving an
instance of this class:
class Operation op => InterpretOp op f where
intOp :: op a b -> f a -> Maybe (f b)
The instance for our FsVal
is straightforward:
instance InterpretOp Op (ModelValue FsState) where
OpId = Just
intOp OpFst = \case MPair x -> Just (fst x)
intOp OpSnd = \case MPair x -> Just (snd x)
intOp OpLeft = \case MEither x -> either Just (const Nothing) x
intOp OpRight = \case MEither x -> either (const Nothing) Just x
intOp OpComp g f) = intOp g <=< intOp f intOp (
The other variable-related thing we need to do in our InLockstep
instead is
that we need to define which variables are used in all commands:
instance InLockstep FsState where
usedVars :: LockstepAction FsState a -> [AnyGVar (ModelOp FsState)]
= \case
usedVars MkDir{} -> []
Open{} -> []
Write h _ -> [SomeGVar h]
Close h -> [SomeGVar h]
Read (Left h) -> [SomeGVar h]
Read (Right _) -> []
-- .. other members of InLockstep elided
SomeGVar
here is just a way to hide the type of the variable, so that
we can have a list of variables of different types:
data AnyGVar op where
SomeGVar :: GVar op y -> AnyGVar op
Again, the definition of usedVars
involves some boilerplate, but not difficult
to write. It is important to get this function right; however. When a
counter-example is found, quickcheck-dynamic
will try to shrink the list of
actions, to throw out any irrelevant detail. But if, say, a call to Open
is
removed, then any calls to Close
which referenced that open should also be
removed. This is done through preconditions, and the default precondition from
InLockstep
ensures that this will happen by staying that all usedVars
must
be defined. However, if usedVars
misses some variables, the test will fail
during shrinking with a confusing error message about undefined variables.
Generating and shrinking actions
The type of the method for generating actions is
type ModelFindVariables state = forall a.
Typeable a
=> Proxy a -> [GVar (ModelOp state) a]
class (..) => InLockstep state where
arbitraryWithVars ::
ModelFindVariables state
-> state
-> Gen (Any (LockstepAction state))
-- .. other members of InLockstep elided
Thus, we need to generate an arbitrary action given the current state of the
model and a way to find all available variables of a specified type. For
example, if we previously executed an open command, thenModelFindVariables
will tell us that we have a variable of type Either Err (IO.Handle, File)
. If
we have a such a variable, we can turn it into a variable of the type we need
using:
handle :: GVar Op (Either Err (IO.Handle, File)) -> GVar Op IO.Handle
= mapGVar (\op -> OpFst `OpComp` OpRight `OpComp` op) handle
The situation for shrinking is very similar:
class (..) => InLockstep state where
shrinkWithVars ::
ModelFindVariables state
-> state
-> LockstepAction state a
-> [Any (LockstepAction state)]
-- .. other members of InLockstep elided
We will not show the full definition of the generator and the shrinker here.
Apart from generating variables, they follow precisely the same lines as the we
showed previously with quickcheck-state-machine
.
You can find the full definition in the
repository.
Labelling
When we are testing with randomly generated test data, it is important that we understand what kind of data we are testing with. For example, we might want to verify that certain edge cases are being tested. Labelling is one way to do this: we label specific kind of test inputs, and then check that we see tests being executed with those labels.
For our running the example, the labels, or tags, that we previously considered were
data Tag = OpenTwo | SuccessfulRead
The idea was that a test would be labelled with OpenTwo
if it opens at least
two different files, and with SuccessfulRead
if it manages to execute at least
one read successfully.
The abstraction that InLockstep
provides for tagging is
class (..) => InLockstep state where
tagStep ::
Show a
=> (state, state)
-> LockstepAction state a
-> ModelValue state a
-> [String]
-- .. other members of InLockstep elided
This enables us to take an action given the before and after state, the action,
and its result; we do not see any previously executed actions.8
This means that for our OpenTwo
tag we need to record in the state
how many
different files have been opened. This is the purpose of the Stats
:
type Stats = Set File
initStats :: Stats
= Set.empty initStats
Updating the statistics is easy (recall that we used this function in
modelNextState
above). We just look at the action and its result: if the
action is an Open
, and the result is a Right
value (indicating the Open
was success), we insert the filename into the set:
updateStats :: LockstepAction FsState a -> FsVal a -> Stats -> Stats
=
updateStats action result case (action, result) of
Open f, MEither (Right _)) -> Set.insert f
(-> id _otherwise
Tagging is now equally easy. If it’s a Read
, we check to see if the result
was successful, and if so we add the SuccessfulRead
tag. If it’s an Open
,
we look at the statistics to see if we have opened at least two files:
tagFsAction :: Stats -> LockstepAction FsState a -> FsVal a -> [Tag]
= \case
tagFsAction openedFiles Read _ -> \v -> [SuccessfulRead | MEither (Right _) <- [v]]
Open _ -> \_ -> [OpenTwo | Set.size openedFiles >= 2]
-> \_ -> [] _
Running the tests
Before we can run any quickcheck-dynamic
tests, we have to give an instance of
the RunModel
class. This class is somewhat confusingly named: it’s main method
perform
does not explain how to run the model, but rather how to run the
system under test. Name aside, instances of RunModel
are simple when using
quickcheck-lockstep
:
type RealMonad = ReaderT FilePath IO
instance RunModel (Lockstep FsState) RealMonad where
= \_st -> runIO
perform = Lockstep.postcondition
postcondition = Lockstep.monitoring (Proxy @RealMonad) monitoring
We have to choose a monad to run our system under test in; we choose ReaderT FilePath IO
, where the FilePath
is the root directory of the file system that we are simulating. The definitions of postcondition
and monitoring
come straight from quickcheck-lockstep
; we just have to provide an interpreter for actions for the system under test:
runIO :: LockstepAction FsState a -> LookUp RealMonad -> RealMonad (Realized RealMonad a)
Writing this interpreter is straight-forward and we will not show it here; the
only minor wrinkle is that we need to turn the lookup function for Var
that
quickcheck-dynamic
gives us into a lookup function for GVar
;
quickcheck-lockstep
provides this functionality:9
lookUpGVar :: (..) => Proxy m -> LookUp m -> GVar op a -> Realized m a
The final thing we have to do is provide an instance of RunLockstep
; this is a
subclass of InLockstep
with a single method observeReal
; it is a separate
class, because RunLockstep
itself is not aware of the monad used to run the
system under test:
instance RunLockstep FsState RealMonad where
= \case
observeReal _proxy MkDir{} -> OEither . bimap OId OId
Open{} -> OEither . bimap OId (OPair . bimap (const OHandle) OId)
Write{} -> OEither . bimap OId OId
Close{} -> OEither . bimap OId OId
Read{} -> OEither . bimap OId OId
To actually run our tests, we can make use of this function provided by
quickcheck-lockstep
:
runActionsBracket ::
RunLockstep state (ReaderT st IO)
=> Proxy state
-> IO st -- ^ Initialisation
-> (st -> IO ()) -- ^ Cleanup
-> Actions (Lockstep state) -> Property
For example, if we have a bug in our mock system such that get a “does not exist” error message instead of an “already exists” error when we create a directory that already exists, the test output might look something like this:
*** Failed! Assertion failed (after 7 tests and 4 shrinks):
Actions
[Var 4 := MkDir (Dir ["x"]),
Var 6 := MkDir (Dir ["x"])]
State: FsState {.. state1 elided ..}
State: FsState {.. state2 elided ..}
System under test returned: OEither (Left (OId AlreadyExists))
but model returned: OEither (Left (OId DoesNotExist))
(where we have elided some output) We see the state of the system after every action, as well as the final failed postcondition.
Generating labelled examples
For generating labelled examples, quickcheck-lockstep
provides
tagActions :: InLockstep state => Proxy state -> Actions (Lockstep state) -> Property
(This functionality is not provided by quickcheck-dynamic
.10) We can use this with
the standard QuickCheck labelledExamples
function.
As stated, this is very useful both for testing labelling, but also to test the
shrinker, because QuickCheck will give us minimal labelled examples. For
example, we might get the following minimal example for “successful read”
*** Found example of Tags: ["SuccessfulRead"]
Actions
Var 8 := Open (File {dir = Dir [], name = "t0"}),
[Var 9 := Close (GVar (Var 8) (fst . fromRight . id)),
Var 51 := Read (Left (GVar (Var 8) (snd . fromRight . id)))]
The syntax might be a little difficult to read here, but we (1) open a file, then (2) close the file we opened in step (1), and finally (3) read the file that we opened in step (1).
Part 2: Implementation
Now that we have seen all the ingredients, let’s see how the lockstep
abstraction is actually implemented. We will first describe which state we
track, and then discuss all of the default implementations for the methods
of StateModel
; this will serve both as an explanation of the implementation,
as well as an example of how to define StateModel
instances. Fortunately,
we have already seen most of the pieces; it’s just a matter of putting them
together now.
State
During test execution, quickcheck-dynamic
internally maintains a mapping
from variables to the values as returned by the system under test:
type Env m = [EnvEntry m]
data EnvEntry m where
(:==) :: Typeable a => Var a -> Realized m a -> EnvEntry m
Variables of different types are distinguished at runtime through dynamic typing; this is common for model testing libraries like this, and is not really visible to end users.
The state maintained by the lockstep infrastructure is the user defined model
state, along with an environment similar to Env
, but for the values returned
by the model:
data Lockstep state = Lockstep {
lockstepModel :: state
lockstepEnv :: EnvF (ModelValue state)
, }
The definition of EnvF
is similar to Env
, but maps variables of type a
to
values of type f a
:
data EnvEntry f where
EnvEntry :: Typeable a => Var a -> f a -> EnvEntry f
newtype EnvF f = EnvF [EnvEntry f]
Initialising and stepping the state
State initialisation is simple:
initialState :: state -> Lockstep state
= Lockstep {
initialState state = state
lockstepModel = EnvF.empty
, lockstepEnv }
Stepping the state (the implementation of nextState
) is one of the functions
at the heart of the abstraction, but we have actually already seen nearly all
the ingredients:
nextState :: forall state a.
InLockstep state, Typeable a)
(=> Lockstep state
-> LockstepAction state a
-> Var a
-> Lockstep state
Lockstep state env) action var =
nextState (Lockstep state' $ EnvF.insert var modelResp env
where
modelResp :: ModelValue state a
state' :: state
= modelNextState (GVar.lookUpEnvF env) action state (modelResp, state')
We are given the current state, an action to take, and a fresh variable to hold
the result of this action, and must compute the result according to the model
and new model state. The model result and the new model state come straight from
the modelNextState
method of InLockstep
; the only other thing left to do is
to add the variable binding to our environment.
Precondition and postcondition
The only precondition that we have by default is that all variables must be
well-defined. This means not only that they have a value, but also that the
evaluation of the embedded Op
will succeed too. This is verified by
definedInEnvF :: (..) => EnvF f -> GVar op a -> Bool
So the precondition is simply
precondition ::
InLockstep state
=> Lockstep state -> LockstepAction state a -> Bool
Lockstep _ env) =
precondition (all (\(SomeGVar var) -> GVar.definedInEnvF env var) . usedVars
The postcondition is also simple: quickcheck-dynamic
gives us the action
and the result from the system under test; we (re)compute the result from the
model and compare “up to observability”, as described above:
checkResponse :: forall m state a.
RunLockstep state m
=> Proxy m
-> Lockstep state -> LockstepAction state a -> Realized m a -> Maybe String
Lockstep state env) action a =
checkResponse p (
compareEquality (observeReal p action a) (observeModel modelResp)where
modelResp :: ModelValue state a
= fst $ modelNextState (GVar.lookUpEnvF env) action state
modelResp
compareEquality :: Observable state a -> Observable state a -> Maybe String
compareEquality real mock| real == mock = Nothing
| otherwise = Just $ concat [
"System under test returned: "
show real
, "\nbut model returned: "
, show mock
,
]
postcondition :: forall m state a.
RunLockstep state m
=> (Lockstep state, Lockstep state)
-> LockstepAction state a
-> LookUp m
-> Realized m a
-> m Bool
=
postcondition (before, _after) action _lookUp a pure $ isNothing $ checkResponse (Proxy @m) before action a
Unlike postcondition
, which can only return a boolean, checkResponse
actually gives a user-friendly error message in case the postcondition is not
satisfied. We will reuse this in monitoring
below to ensure that this error
message is included in the test output.
Generation, shrinking and monitoring
The definitions of arbitraryAction
and shrinkAction
are thin wrappers around
the corresponding methods from InLockstep
: we just need to pass them a way to
find out which variables are available:
varsOfType ::
InLockstep state
=> EnvF (ModelValue state) -> ModelFindVariables state
= map GVar.fromVar $ EnvF.keysOfType p env varsOfType env p
This depends on
keysOfType :: Typeable a => Proxy a -> EnvF f -> [Var a]
to find variables of the appropriate type. Action generation and shrinking are now trivial:
Lockstep state env) = arbitraryWithVars (varsOfType env) state
arbitraryAction (Lockstep state env) = shrinkWithVars (varsOfType env) state shrinkAction (
Finally, quickcheck-dynamic
allows us to “monitor” test execution: we can add
additional information to running tests. We will use this both to label tests
with inferred tags, as well as to add the state after every step and the result
of checkResponse
to the test-output in case there is a test failure:
monitoring :: forall m state a.
RunLockstep state m
=> Proxy m
-> (Lockstep state, Lockstep state)
-> LockstepAction state a
-> LookUp m
-> Realized m a
-> Property -> Property
=
monitoring p (before, after) action _lookUp realResp "State: " ++ show after)
QC.counterexample (. maybe id QC.counterexample (checkResponse p before action realResp)
. QC.tabulate "Tags" tags
where
tags :: [String]
= tagStep (lockstepModel before, lockstepModel after) action modelResp
tags
modelResp :: ModelValue state a
= fst $ modelNextState
modelResp $ lockstepEnv before)
(GVar.lookUpEnvF
action (lockstepModel before)
Conclusions
The interface for stateful testing provided by quickcheck-dynamic
is fairly
minimal. The key methods that a test must implement are:
- The initial state of the model, and a way to step that state given an action.
- A precondition which is checked during generation and (importantly) during shrinking to rule out nonsensical tests.
- A postcondition which is checked after every action and determines whether or not a test is considered successful.
- Generation and shrinking of actions.
- Optionally, a way to add additional information to a test.
Although it’s nice to have a minimal API, it leaves end users with a lot of
different ways in which they might structure their tests. Sometimes that is
useful, but for many situations a more streamlined approach is useful. In this
blog post we described the quickcheck-lockstep
library, which provides support
for “lockstep-style” model testing on top of quickcheck-dynamic
. The key
difference here is that the postcondition
is always the same: we insist that
the system under test and the model must return the same results, “up to
observability”. By default, the precondition is also always the same: we only
insist that all variables are defined.
We previously implemented the same kind of infrastructure for
quickcheck-state-machine
, so implementing it now for quickcheck-dynamic
provided a good comparison point between the two libraries.
- In terms of model based testing, the two libraries basically have feature parity: they provide the same core functionality.
- In addition to this core functionality
quickcheck-dynamic
additionally offers support for dynamic logic, which is absent fromquickcheck-state-machine
. Conversely,quickcheck-state-machine
offers support for parallel test execution, which is currently absent fromquickcheck-dynamic
. We have not talked about either topic in this blog post.
The differences between the two libraries are mostly technical in nature:
- Probably the most important downside of
quickcheck-dynamic
is there is precisely one variable that records the result of an action.11 This is not the case inquickcheck-state-machine
, where the number of variables bound by an action is determined at runtime. We can use this to return no bound variables if the action failed, or indeed multiple bound variables if the action returned multiple values (such as ourOpen
example). Inquickcheck-lockstep
we therefore provide theGVar
abstraction, which provides a way to “map” over the type of variables. It might be useful to lift this abstraction into the main library at some point. - At the moment,
quickcheck-dynamic
does not provide explicit support for generating labelled examples. As we saw, we can implement this functionality on top ofquickcheck-dynamic
(quickcheck-lockstep
offers it), but as withGVar
, it might be useful to move (a version of) this functionality into to the main library. - In
quickcheck-state-machine
the types of variables are type-level arguments to actions and responses. This means that some functionality such as getting the list of variables used (usedVars
) can be defined generically. Moreover, variables can be resolved by the framework, whereas inquickcheck-dynamic
test authors are responsible for manually calling theLookUp
function whenever necessary. However, we pay a price for this functionality inquickcheck-state-machine
; especially when dealing with multiple types of variables, the required type-level machinery gets pretty sophisticated. - In
quickcheck-dynamic
the argument to tests isActions
, which is a list of steps where each step consists of a variable for that step and the action to execute. The corresponding datatype inquickcheck-state-machine
isCommands
; this is very similar, but in addition to the action, it also records the result of the action. This makesCommands
a bit more useful forActions
for things like tagging commands, since we get the full history. Inquickcheck-dynamic
,tagActions
must effectively re-run the full set of actions to construct the right test label. - Unlike
quickcheck-state-machine
,quickcheck-dynamic
keeps the definition of the interpreter forAction
(RunModel
) separate from theStateModel
class. This separation is useful, because running the test against the real system often needs some additional state (a database handle, for example) which is not necessary for many other parts of the test framework. Inquickcheck-state-machine
this can often lead to uglyerror "state unused"
calls.
All in all, the libraries are quite similar in terms of the core state model
testing functionality. For lockstep-style testing, however,
quickcheck-lockstep
is probably more user-friendly than the corresponding
functionality in
quickcheck-state-machine
because there is less advanced type machinery required. The downside of the
single-variable-per-command of quickcheck-dynamic
is resolved by the GVar
abstraction in quickstep-lockstep
.
Footnotes
If we wanted to execute lockstep-style tests against multiple execution backends, we would have to introduce another abstraction to ensure that we can compare model responses to system responses for all of those backends.↩︎
InLockstep
could alternatively require a functioncompareResult :: a -> ModelValue s a -> Bool
, but writing such a function is often bit cumbersome, whereas equality forObservable s
can be derived.↩︎Core
quickcheck-dynamic
takes care of variables for the system under test, but not for the model.↩︎For an
Action .. a
, all we see inpostcondition
is a value of typeRealized m a
. We want to ensure not just that the model and the system under test have the same behaviour when no exceptions are present, but also that they return the same errors. We must therefore reflect the possibility for an error in the result type ofOpen
.↩︎Perhaps some of this functionality can be merged with the main library; it certainly seems useful beyond lockstep-style testing.↩︎
The
quickcheck-dynamic
infrastructure insists that actions haveEq
andShow
instances. Since variables occur in actions, the same must be true forGVar
. Secondly, a function fromx -> y
would not be enough; we would also need a second function of typeModelValue s x -> ModelValue s y
. The indirection through the DSL avoids both of these problems: operations haveEq
andShow
instances, and we can simply insist on two interpreters ofOp
: one forIdentity
and one forModelValue s
.↩︎We use
tagStep
not just inlabelledExamples
, but also in the standardStateModel
methodmonitoring
, to tag tests as they are executed. While the former would in principle allow us to tag an entire list of actions, the latter does not.↩︎The proxy argument is necessary because
Realized
is a non-injective type family;quickcheck-dynamic
relies onAllowAmbiguousTypes
instead.↩︎In
StateModel
we havemonitoring
, butmonitoring
cannot really be used withlabel
, as this would result in lots of calls tolabel
as the test executes (once per action) and each of those calls would result in a separate table in the test output. We must therefore usetabulate
instead, but this is not supported by QuickCheck’slabelledExamples
. Moreover, the only way to turn a list of actions into aPropertyM
inquickcheck-dynamic
isrunActions
, which requires theRunModel
argument; butRunModel
should not be needed for creating labelled examples. In the lockstep infrastructure we provide instead a functiontagActions :: Actions (Lockstep state) -> Property
, which basically executes all of the actions, collecting the tags as it goes, and then makes a single call tolabel
with the final list of tags. This then works well with the standardlabelledExamples
functionality from QuickCheck.↩︎The registry example from
quickcheck-dynamic
skirts around the problem: some actions fail, and some actions return newModelThreadId
, but there are no actions that can fail or return a newModelThreadId
.↩︎