Lockstep-style testing with `quickcheck-dynamic`

Edsko de Vries – Thursday, 08 September 2022

Recently IOG and QuviQ released a new library for testing stateful systems called quickcheck-dynamic. In this blog post we will take a look at this library, and how it relates to quickcheck-state-machine. We will focus on the state machine testing aspect; quickcheck-dynamic also has support for dynamic logic, but we will not discuss that here.

Specifically, we will consider how we might do lockstep-style testing with quickcheck-dynamic. This is a particular approach to testing that we described in great detail in an earlier blog post, An in-depth look at quickcheck-state-machine. We will recap the general philosophy in this new blog post, but we will focus here on the hows, not necessarily the whys; it might be helpful to be familiar with the previous blog post to understand the larger context of what we’re trying to achieve.

We have developed a library called quickcheck-lockstep which builds on top of quickcheck-dynamic to provide an abstraction called InLockstep which provides support for lockstep-style testing. In this blog post we will describe this library in two parts:

In the first half we will show a test author’s perspective of how to use the abstraction.
In the second half we show how we can implement the abstraction on top of quickcheck-dynamic.

Part one will suffice for users who simply want to use quickcheck-lockstep. Part two serves two purposes:

It will give an illustrated example of how to use quickcheck-dynamic for state based testing. We will use most of the core features of the library to implement our abstraction on top of it.
Since the goal is to provide the end user with a very similar style of testing that we previously provided for quickcheck-state-machine (see specifically Test.StateMachine.Lockstep.NAry), the implementation will serve as a good test testbed for comparing the two libraries.

NOTE: quickcheck-lockstep currently depends on an as-yet unreleased version of quickcheck-dynamic. Once this is released, we will also make a Hackage release of quickcheck-lockstep; at the moment, please refer to the GitHub repository instead. The example that we discuss in part 1 is also available in that repository, as an example use case.

Part 1: Lockstep-style testing

In this section we will show how we can do lockstep-style testing using a new abstraction called InLockstep. In Part 2 we will see how we can implement this new abstraction.

Testing philosophy

Lockstep-style testing of stateful systems is quite simple:

We have a stateful API that we want to test; this could be a database, a file system, etc.
We will reify that stateful API as a datatype with constructors for each of the API calls.
We then write two interpreters for this API: one against the system we want to test, and one against a model.
We regard the system as a black box: we cannot see the internal state of the database, the contents of the file system, etc. The only thing we can see is the results of the API calls.
Here is why we call this lockstep testing: to test the system, we generate an arbitrary sequence of commands, then execute those against the system under test and against the model. The only thing we check at each point is that both systems return the same results, modulo observability.
We cannot insist on exactly the same results: for example, opening a file might result in a file handle, which the model cannot reproduce. The model must be allowed to have its own type for “model handles” that is different from real handles, and we do not want to try and compare those to real handles. If the system somehow returns the “wrong” handle, then this will become evident later in the test when we use that handle.

Running example

Our running example will be a file system: it will be precisely the same example we used previously when discussing quickcheck-state-machine: same API, same model, same properties we want to test, same considerations regarding labelling tests and shrinking them—but a different testing framework. If you want to follow along, the code is available on GitHub.

The model is a simple model for a file system. It consists of the following functions:

mMkDir :: Dir               -> Mock -> (Either Err ()      , Mock)
mOpen  :: File              -> Mock -> (Either Err MHandle , Mock)
mWrite :: MHandle -> String -> Mock -> (Either Err ()      , Mock)
mClose :: MHandle           -> Mock -> (Either Err ()      , Mock)
mRead  :: File              -> Mock -> (Either Err String  , Mock)

`StateModel` implementation

StateModel is the central class in quickcheck-dynamic for stateful testing. Instances of StateModel define the datatype that describes the API, how to generate values of that datatype, how to interpret it, etc. When using the InLockstep infrastructure however, we only define the API datatype; everything else is delegated to InLockstep.

We will define the type of our model as

data FsState = FsState Mock Stats

initState :: FsState
initState = FsState Mock.emptyMock initStats

Here, Mock is the mock file system implementation, and Stats keeps some statistics about the running test. We will see why we need this statistics when we discuss labelling.

Let’s now define two type synonyms. First, one of the type of actions:

type FsAct a = Action (Lockstep FsState) (Either Err a)

Here, Action is the associated data type from StateModel, and Lockstep is an opaque datatype from the lockstep infrastructure. All our actions can return errors, and we want to make sure that the model and the real system agree on what those errors are. So, the result of an FsAct is always of the form Either Err a, where Err is also defined in the model.

Secondly, the type of variables:

type FsVar a = ModelVar FsState a

Variables are an essential part of stateful testing: a variable allows us to refer back to the result of a previously executed command. For example, if we want to write to a file, we need to generate an action that says “write this string to the handle that you got when you opened that file a while ago”. ModelVar are a special kind of variables provided by the lockstep infrastructure; we will discuss them in more detail later.

We can now give the StateModel instance:

type FsVar a = ModelVar FsState a
type FsAct a = Action (Lockstep FsState) (Either Err a)

instance StateModel (Lockstep FsState) where
  data Action (Lockstep FsState) a where
    MkDir :: Dir                        -> FsAct ()
    Open  :: File                       -> FsAct (IO.Handle, File)
    Write :: FsVar IO.Handle  -> String -> FsAct ()
    Close :: FsVar IO.Handle            -> FsAct ()
    Read  :: Either (FsVar File) File   -> FsAct String

  initialState    = Lockstep.initialState initState
  nextState       = Lockstep.nextState
  precondition    = Lockstep.precondition
  arbitraryAction = Lockstep.arbitraryAction
  shrinkAction    = Lockstep.shrinkAction

Some comments:

Write and Close both take a variable to a handle, rather than an actual handle. This is what enables us to refer the handles that we got from previous commands.
In both cases, the type of that variable is FsVar IO.Handle, but the model implementation requires mock handles instead; we will see how that is resolved in the next section when we discuss relating results from the real system to model results.
Open returns the file path of the file it just opened along with the handle, and Read takes either a concrete file path as an argument or a variable to such a file path (e.g., one that might have been returned by Open). This allows us to express “read from the same file that you opened previously in the test”; see the section on Dependencies between commands from the previous post why this can lead to better (more minimal) counter examples.
The lockstep infrastructure provides default implementation for the methods of StateModel. In many cases you can just them as-is, like we did here, but of course you don’t have to. For example, the default precondition isn’t always strong enough.

From real results to model results

When we open a file in the real file system, we get an IO.Handle, or possibly an exception. In the model however we have

mOpen :: File -> Mock -> (Either Err MHandle, Mock)

We can map the exception to an Err, so that’s not a problem, but we cannot map an IO.Handle to an MHandle or vice versa: we want to allow the model to return something of a different type here.

The Action datatype from quickcheck-dynamic is a GADT, where the type index describes the result of the action. For example, consider this method from the StateModel class:

postcondition :: (state, state) -> Action state a -> LookUp m -> Realized m a -> m Bool

This method is the check that quickcheck-dynamic does every after action. It has the following parameters:

The before and after state of the model
The action that was executed
A way to look up the values of any variables in those actions
The result of the action in the system under test

The type of the result is Realized m a; this is an abstraction introduced in quickcheck-dynamic 2.0 which allows to run the same tests with different test execution backends; for example, we might run our tests in the real IO monad, or in an IO monad simulator. This is orthogonal to the abstractions provided by InLockstep: no matter the test execution backend, we will always run against the same model. For our purposes (and this will be true for most lockstep-style tests¹), we will exclusively run our tests in ReaderT r IO, where quickcheck-dynamic already defines for us that

Realized (ReaderT r IO) a = a

So for the purposes of this blogpost, whenever you see Realized m a, you can translate that to simply a in your head.

In lockstep-style testing, we want to compare that result of type a to the response from the model but, as we saw, the model might return something of a slightly different type. The InLockstep class therefore introduces an associated data type called ModelValue; the idea is that whenever the system under test returns something of type a (technically, Realized m a), we expect the model to return a result of type ModelValue a.

As before, we will define a type synonym:

type FsVal a = ModelValue FsState a

Here’s the definition for FsState:

instance InLockstep FsState where
  data ModelValue FsState a where
    MHandle :: Mock.MHandle -> FsVal IO.Handle

    -- Rest is regular:

    MErr    :: Err    -> FsVal Err
    MFile   :: File   -> FsVal File
    MString :: String -> FsVal String
    MUnit   :: ()     -> FsVal ()

    MEither :: Either (FsVal a) (FsVal b) -> FsVal (Either a b)
    MPair   :: (FsVal a, FsVal b)         -> FsVal (a, b)

  -- .. other members of InLockstep elided

We see that an FsVal a is just a wrapper around an a, unless that a is an IO.Handle in which case FsVal IO.Handle instead wraps a Mock.MHandle.

Recall that we defined

type FsVar a = ModelVar FsState a

We can now be more precise: a ModelVar s a is a variable to a ModelValue s a.

Comparing values

ModelValue allows the model to return something of a different type than the implementation, but when we compare the two, we need something of the same type.² InLockstep therefore defines a second associated type Observable, which is the observable result. The definition is similar but a bit simpler:

type FsObs a = Observable FsState a

instance InLockstep FsState where
  data Observable FsState a where
    OHandle :: FsObs IO.Handle
    OId     :: (..) => a -> FsObs a
    OEither :: Either (FsObs a) (FsObs b) -> FsObs (Either a b)
    OPair   :: (FsObs a, FsObs b) -> FsObs (a, b)

  -- .. other members of InLockstep elided

This follows a similar structure as ModelValue, with two differences:

In the case of a handle, we don’t observe anything. If the system (or the model) returns the wrong handle, we cannot notice this when the open a file; we will only notice it later when we try to read from that file.
In the case of ModelValue, we need a guarantee that if we have a value of FsVal IO.Handle, that this is really an Mock.MHandle. We do not need that guarantee for Observable, and so it suffices to define a single constructor OId that can be used for any type at all where the model and the system have a result of the same type.

We also have to explain how to translate from mock results to observable results:

instance InLockstep FsState where
  observeModel :: FsVal a -> FsObs a
  observeModel = \case
      MHandle _ -> OHandle
      MErr    x -> OId x
      MString x -> OId x
      MUnit   x -> OId x
      MFile   x -> OId x
      MEither x -> OEither $ bimap observeModel observeModel x
      MPair   x -> OPair   $ bimap observeModel observeModel x

  -- .. other members of InLockstep elided

We have to do the same for results from the system under test, but we will see that when we discuss actually running the tests. This is a bit of boilerplate, but not difficult to write.

Interpreter for the model

We can now write the interpreter for the model: a function that takes a valid from our reified API, calls the corresponding functions from the model, and then wraps the result in the appropriate constructors of ModelValue:

runMock ::
     ModelLookUp FsState
  -> Action (Lockstep FsState) a
  -> Mock -> (FsVal a, Mock)
runMock lookUp = \case
    MkDir d   -> wrap MUnit     . Mock.mMkDir d
    Open f    -> wrap (mOpen f) . Mock.mOpen f
    Write h s -> wrap MUnit     . Mock.mWrite (getHandle $ lookUp h) s
    Close h   -> wrap MUnit     . Mock.mClose (getHandle $ lookUp h)
    Read f    -> wrap MString   . Mock.mRead (either (getFile . lookUp) id f)
  where
    wrap :: (a -> FsVal b) -> (Either Err a, Mock) -> (FsVal (Either Err b), Mock)
    wrap f = first (MEither . bimap MErr f)

    mOpen :: File -> Mock.MHandle -> FsVal (IO.Handle, File)
    mOpen f h = MPair (MHandle h, MFile f)

    getHandle :: ModelValue FsState IO.Handle -> Mock.MHandle
    getFile   :: ModelValue FsState File      -> File

    getHandle (MHandle h) = h
    getFile   (MFile   f) = f

The only slightly non-trivial thing here is that when we encounter a command with variables, we need to resolve those variables. InLockstep gives us a function of type ModelLookUp FsState, which allows us to resolve any variable we see (the default InLockstep precondition guarantees that this resolution must always succeed). The result of looking up a variable of type a will be a value of type FsVal a; we then need to match on that to extract the wrapped value. In getHandle we see why it’s so important that a FsVal IO.Handle must contain a mock handle, rather than an IO.Handle.

With the interpreter defined, we can complete the next method definition of InLockstep:

instance InLockstep FsState where
  modelNextState :: forall a.
       LockstepAction FsState a
    -> ModelLookUp FsState
    -> FsState -> (FsVal a, FsState)
  modelNextState action lookUp (FsState mock stats) =
      auxStats $ runMock lookUp action mock
    where
      auxStats :: (FsVal a, Mock) -> (FsVal a, FsState)
      auxStats (result, state') =
          (result, FsState state' $ updateStats action result stats)

  -- .. other members of InLockstep elided

All we do here is call the interpreter we just wrote, and then additionally update the statistics (discussed below).

Variables

As discussed above, variables allow us to refer back to the results of previously executed commands. We have been glossing over an important detail, however. Recall the types of Open and Close (with the FsAct type synonym expanded):

Open  :: File            -> Action .. (Either Err (IO.Handle, File))
Close :: FsVar IO.Handle -> Action .. (Either Err (IO.Handle, ()))

The result of opening a file is either an error, or else a pair of a handle and a filepath. In quickcheck-dynamic, we get a single variable for the execution of each command, and this is (therefore) true also for the lockstep infrastructure.³ So, after opening a file, we have a variable of type Either Err (IO.Handle, File), but we don’t want a variable of that type as the argument to Close: instead, we want a variable to a IO.Handle. Most importantly, we want to rule out the possibility of trying to a close a file that we never managed to open in the first place; such a test would be nonsensical.⁴

One of the most important features that the lockstep infrastructure adds on top of core quickcheck-dynamic is a concept of variables with a Functor-esque structure: they support an operation that allows us to change the type of that variable.⁵ The key datatype is a “generalized variable” GVar; the intuition is that a GVar of type y is actually a Var of some other type x, bundled with a function⁶ from x -> Maybe y:

data GVar :: Type -> Type where -- not the real definition
  GVar :: Typeable x => Var x -> (x -> Maybe y) -> GVar y

For technical reasons,⁷ this doesn’t quite work. Instead of that function x -> Maybe y, we instead have essentially a small DSL for defining such functions:

data Op a b where
  OpId    :: Op a a
  OpFst   :: Op (a, b) a
  OpSnd   :: Op (b, a) a
  OpLeft  :: Op (Either a b) a
  OpRight :: Op (Either b a) a
  OpComp  :: Op b c -> Op a b -> Op a c

This DSL can be used to extract the left or right coordinate of a pair, as well as to pattern match on an Either. This will suffice for many test cases but not all, so GVar generalizes over the exact choice of DSL:

data GVar op f where
  GVar :: Typeable x => Var x -> op x y -> GVar op y

InLockstep has an associated type family ModelOp which records the choice of DSL. It defaults to Op, which is just fine for our running example. We do have to specify how to execute this DSL against model values by giving an instance of this class:

class Operation op => InterpretOp op f where
  intOp :: op a b -> f a -> Maybe (f b)

The instance for our FsVal is straightforward:

instance InterpretOp Op (ModelValue FsState) where
  intOp OpId         = Just
  intOp OpFst        = \case MPair   x -> Just (fst x)
  intOp OpSnd        = \case MPair   x -> Just (snd x)
  intOp OpLeft       = \case MEither x -> either Just (const Nothing) x
  intOp OpRight      = \case MEither x -> either (const Nothing) Just x
  intOp (OpComp g f) = intOp g <=< intOp f

The other variable-related thing we need to do in our InLockstep instead is that we need to define which variables are used in all commands:

instance InLockstep FsState where
  usedVars :: LockstepAction FsState a -> [AnyGVar (ModelOp FsState)]
  usedVars = \case
      MkDir{}        -> []
      Open{}         -> []
      Write h _      -> [SomeGVar h]
      Close h        -> [SomeGVar h]
      Read (Left h)  -> [SomeGVar h]
      Read (Right _) -> []

  -- .. other members of InLockstep elided

SomeGVar here is just a way to hide the type of the variable, so that we can have a list of variables of different types:

data AnyGVar op where
  SomeGVar :: GVar op y -> AnyGVar op

Again, the definition of usedVars involves some boilerplate, but not difficult to write. It is important to get this function right; however. When a counter-example is found, quickcheck-dynamic will try to shrink the list of actions, to throw out any irrelevant detail. But if, say, a call to Open is removed, then any calls to Close which referenced that open should also be removed. This is done through preconditions, and the default precondition from InLockstep ensures that this will happen by staying that all usedVars must be defined. However, if usedVars misses some variables, the test will fail during shrinking with a confusing error message about undefined variables.

Generating and shrinking actions

The type of the method for generating actions is

type ModelFindVariables state = forall a.
          Typeable a
       => Proxy a -> [GVar (ModelOp state) a]

class (..) => InLockstep state where
  arbitraryWithVars ::
       ModelFindVariables state
    -> state
    -> Gen (Any (LockstepAction state))

  -- .. other members of InLockstep elided

Thus, we need to generate an arbitrary action given the current state of the model and a way to find all available variables of a specified type. For example, if we previously executed an open command, thenModelFindVariables will tell us that we have a variable of type Either Err (IO.Handle, File). If we have a such a variable, we can turn it into a variable of the type we need using:

handle :: GVar Op (Either Err (IO.Handle, File)) -> GVar Op IO.Handle
handle = mapGVar (\op -> OpFst `OpComp` OpRight `OpComp` op)

The situation for shrinking is very similar:

class (..) => InLockstep state where
  shrinkWithVars ::
       ModelFindVariables state
    -> state
    -> LockstepAction state a
    -> [Any (LockstepAction state)]

  -- .. other members of InLockstep elided

We will not show the full definition of the generator and the shrinker here. Apart from generating variables, they follow precisely the same lines as the we showed previously with quickcheck-state-machine. You can find the full definition in the repository.

Labelling

When we are testing with randomly generated test data, it is important that we understand what kind of data we are testing with. For example, we might want to verify that certain edge cases are being tested. Labelling is one way to do this: we label specific kind of test inputs, and then check that we see tests being executed with those labels.

For our running the example, the labels, or tags, that we previously considered were

data Tag = OpenTwo | SuccessfulRead

The idea was that a test would be labelled with OpenTwo if it opens at least two different files, and with SuccessfulRead if it manages to execute at least one read successfully.

The abstraction that InLockstep provides for tagging is

class (..) => InLockstep state where
  tagStep ::
       Show a
    => (state, state)
    -> LockstepAction state a
    -> ModelValue state a
    -> [String]

  -- .. other members of InLockstep elided

This enables us to take an action given the before and after state, the action, and its result; we do not see any previously executed actions.⁸ This means that for our OpenTwo tag we need to record in the state how many different files have been opened. This is the purpose of the Stats:

type Stats = Set File

initStats :: Stats
initStats = Set.empty

Updating the statistics is easy (recall that we used this function in modelNextState above). We just look at the action and its result: if the action is an Open, and the result is a Right value (indicating the Open was success), we insert the filename into the set:

updateStats :: LockstepAction FsState a -> FsVal a -> Stats -> Stats
updateStats action result  =
   case (action, result) of
     (Open f, MEither (Right _)) -> Set.insert f
     _otherwise                  -> id

Tagging is now equally easy. If it’s a Read, we check to see if the result was successful, and if so we add the SuccessfulRead tag. If it’s an Open, we look at the statistics to see if we have opened at least two files:

tagFsAction :: Stats -> LockstepAction FsState a -> FsVal a -> [Tag]
tagFsAction openedFiles = \case
    Read _ -> \v -> [SuccessfulRead | MEither (Right _) <- [v]]
    Open _ -> \_ -> [OpenTwo        | Set.size openedFiles >= 2]
    _      -> \_ -> []

Running the tests

Before we can run any quickcheck-dynamic tests, we have to give an instance of the RunModel class. This class is somewhat confusingly named: it’s main method perform does not explain how to run the model, but rather how to run the system under test. Name aside, instances of RunModel are simple when using quickcheck-lockstep:

type RealMonad = ReaderT FilePath IO

instance RunModel (Lockstep FsState) RealMonad where
  perform       = \_st -> runIO
  postcondition = Lockstep.postcondition
  monitoring    = Lockstep.monitoring (Proxy @RealMonad)

We have to choose a monad to run our system under test in; we choose ReaderT FilePath IO, where the FilePath is the root directory of the file system that we are simulating. The definitions of postcondition and monitoring come straight from quickcheck-lockstep; we just have to provide an interpreter for actions for the system under test:

runIO :: LockstepAction FsState a -> LookUp RealMonad -> RealMonad (Realized RealMonad a)

Writing this interpreter is straight-forward and we will not show it here; the only minor wrinkle is that we need to turn the lookup function for Var that quickcheck-dynamic gives us into a lookup function for GVar; quickcheck-lockstep provides this functionality:⁹

lookUpGVar :: (..) => Proxy m -> LookUp m -> GVar op a -> Realized m a

The final thing we have to do is provide an instance of RunLockstep; this is a subclass of InLockstep with a single method observeReal; it is a separate class, because RunLockstep itself is not aware of the monad used to run the system under test:

instance RunLockstep FsState RealMonad where
  observeReal _proxy = \case
      MkDir{} -> OEither . bimap OId OId
      Open{}  -> OEither . bimap OId (OPair . bimap (const OHandle) OId)
      Write{} -> OEither . bimap OId OId
      Close{} -> OEither . bimap OId OId
      Read{}  -> OEither . bimap OId OId

To actually run our tests, we can make use of this function provided by quickcheck-lockstep:

runActionsBracket ::
     RunLockstep state (ReaderT st IO)
  => Proxy state
  -> IO st         -- ^ Initialisation
  -> (st -> IO ()) -- ^ Cleanup
  -> Actions (Lockstep state) -> Property

For example, if we have a bug in our mock system such that get a “does not exist” error message instead of an “already exists” error when we create a directory that already exists, the test output might look something like this:

*** Failed! Assertion failed (after 7 tests and 4 shrinks):
Actions
 [Var 4 := MkDir (Dir ["x"]),
  Var 6 := MkDir (Dir ["x"])]
State: FsState {.. state1 elided ..}
State: FsState {.. state2 elided ..}
System under test returned: OEither (Left (OId AlreadyExists))
but model returned:         OEither (Left (OId DoesNotExist))

(where we have elided some output) We see the state of the system after every action, as well as the final failed postcondition.

Generating labelled examples

For generating labelled examples, quickcheck-lockstep provides

tagActions ::  InLockstep state => Proxy state -> Actions (Lockstep state) -> Property

(This functionality is not provided by quickcheck-dynamic.¹⁰) We can use this with the standard QuickCheck labelledExamples function. As stated, this is very useful both for testing labelling, but also to test the shrinker, because QuickCheck will give us minimal labelled examples. For example, we might get the following minimal example for “successful read”

*** Found example of Tags: ["SuccessfulRead"]
Actions
 [Var 8 := Open (File {dir = Dir [], name = "t0"}),
  Var 9 := Close (GVar (Var 8) (fst . fromRight . id)),
  Var 51 := Read (Left (GVar (Var 8) (snd . fromRight . id)))]

The syntax might be a little difficult to read here, but we (1) open a file, then (2) close the file we opened in step (1), and finally (3) read the file that we opened in step (1).

Part 2: Implementation

Now that we have seen all the ingredients, let’s see how the lockstep abstraction is actually implemented. We will first describe which state we track, and then discuss all of the default implementations for the methods of StateModel; this will serve both as an explanation of the implementation, as well as an example of how to define StateModel instances. Fortunately, we have already seen most of the pieces; it’s just a matter of putting them together now.

State

During test execution, quickcheck-dynamic internally maintains a mapping from variables to the values as returned by the system under test:

type Env m = [EnvEntry m]

data EnvEntry m where
  (:==) :: Typeable a => Var a -> Realized m a -> EnvEntry m

Variables of different types are distinguished at runtime through dynamic typing; this is common for model testing libraries like this, and is not really visible to end users.

The state maintained by the lockstep infrastructure is the user defined model state, along with an environment similar to Env, but for the values returned by the model:

data Lockstep state = Lockstep {
      lockstepModel :: state
    , lockstepEnv   :: EnvF (ModelValue state)
    }

The definition of EnvF is similar to Env, but maps variables of type a to values of type f a:

data EnvEntry f where
  EnvEntry :: Typeable a => Var a -> f a -> EnvEntry f

newtype EnvF f = EnvF [EnvEntry f]

Initialising and stepping the state

State initialisation is simple:

initialState :: state -> Lockstep state
initialState state = Lockstep {
      lockstepModel = state
    , lockstepEnv   = EnvF.empty
    }

Stepping the state (the implementation of nextState) is one of the functions at the heart of the abstraction, but we have actually already seen nearly all the ingredients:

nextState :: forall state a.
     (InLockstep state, Typeable a)
  => Lockstep state
  -> LockstepAction state a
  -> Var a
  -> Lockstep state
nextState (Lockstep state env) action var =
    Lockstep state' $ EnvF.insert var modelResp env
  where
    modelResp :: ModelValue state a
    state'    :: state
    (modelResp, state') = modelNextState (GVar.lookUpEnvF env) action state

We are given the current state, an action to take, and a fresh variable to hold the result of this action, and must compute the result according to the model and new model state. The model result and the new model state come straight from the modelNextState method of InLockstep; the only other thing left to do is to add the variable binding to our environment.

Precondition and postcondition

The only precondition that we have by default is that all variables must be well-defined. This means not only that they have a value, but also that the evaluation of the embedded Op will succeed too. This is verified by

definedInEnvF :: (..) => EnvF f -> GVar op a -> Bool

So the precondition is simply

precondition ::
     InLockstep state
  => Lockstep state -> LockstepAction state a -> Bool
precondition (Lockstep _ env) =
    all (\(SomeGVar var) -> GVar.definedInEnvF env var) . usedVars

The postcondition is also simple: quickcheck-dynamic gives us the action and the result from the system under test; we (re)compute the result from the model and compare “up to observability”, as described above:

checkResponse :: forall m state a.
     RunLockstep state m
  => Proxy m
  -> Lockstep state -> LockstepAction state a -> Realized m a -> Maybe String
checkResponse p (Lockstep state env) action a =
    compareEquality (observeReal p action a) (observeModel modelResp)
  where
    modelResp :: ModelValue state a
    modelResp = fst $ modelNextState (GVar.lookUpEnvF env) action state

    compareEquality ::  Observable state a -> Observable state a -> Maybe String
    compareEquality real mock
      | real == mock = Nothing
      | otherwise    = Just $ concat [
            "System under test returned: "
          , show real
          , "\nbut model returned:         "
          , show mock
          ]

postcondition :: forall m state a.
     RunLockstep state m
  => (Lockstep state, Lockstep state)
  -> LockstepAction state a
  -> LookUp m
  -> Realized m a
  -> m Bool
postcondition (before, _after) action _lookUp a =
    pure $ isNothing $ checkResponse (Proxy @m) before action a

Unlike postcondition, which can only return a boolean, checkResponse actually gives a user-friendly error message in case the postcondition is not satisfied. We will reuse this in monitoring below to ensure that this error message is included in the test output.

Generation, shrinking and monitoring

The definitions of arbitraryAction and shrinkAction are thin wrappers around the corresponding methods from InLockstep: we just need to pass them a way to find out which variables are available:

varsOfType ::
     InLockstep state
  => EnvF (ModelValue state) -> ModelFindVariables state
varsOfType env p = map GVar.fromVar $ EnvF.keysOfType p env

This depends on

keysOfType :: Typeable a => Proxy a -> EnvF f -> [Var a]

to find variables of the appropriate type. Action generation and shrinking are now trivial:

arbitraryAction (Lockstep state env) = arbitraryWithVars (varsOfType env) state
shrinkAction    (Lockstep state env) = shrinkWithVars    (varsOfType env) state

Finally, quickcheck-dynamic allows us to “monitor” test execution: we can add additional information to running tests. We will use this both to label tests with inferred tags, as well as to add the state after every step and the result of checkResponse to the test-output in case there is a test failure:

monitoring :: forall m state a.
     RunLockstep state m
  => Proxy m
  -> (Lockstep state, Lockstep state)
  -> LockstepAction state a
  -> LookUp m
  -> Realized m a
  -> Property -> Property
monitoring p (before, after) action _lookUp realResp =
      QC.counterexample ("State: " ++ show after)
    . maybe id QC.counterexample (checkResponse p before action realResp)
    . QC.tabulate "Tags" tags
  where
    tags :: [String]
    tags = tagStep (lockstepModel before, lockstepModel after) action modelResp

    modelResp :: ModelValue state a
    modelResp = fst $ modelNextState
                        (GVar.lookUpEnvF $ lockstepEnv before)
                        action
                        (lockstepModel before)

Conclusions

The interface for stateful testing provided by quickcheck-dynamic is fairly minimal. The key methods that a test must implement are:

The initial state of the model, and a way to step that state given an action.
A precondition which is checked during generation and (importantly) during shrinking to rule out nonsensical tests.
A postcondition which is checked after every action and determines whether or not a test is considered successful.
Generation and shrinking of actions.
Optionally, a way to add additional information to a test.

Although it’s nice to have a minimal API, it leaves end users with a lot of different ways in which they might structure their tests. Sometimes that is useful, but for many situations a more streamlined approach is useful. In this blog post we described the quickcheck-lockstep library, which provides support for “lockstep-style” model testing on top of quickcheck-dynamic. The key difference here is that the postcondition is always the same: we insist that the system under test and the model must return the same results, “up to observability”. By default, the precondition is also always the same: we only insist that all variables are defined.

We previously implemented the same kind of infrastructure for quickcheck-state-machine, so implementing it now for quickcheck-dynamic provided a good comparison point between the two libraries.

In terms of model based testing, the two libraries basically have feature parity: they provide the same core functionality.
In addition to this core functionality quickcheck-dynamic additionally offers support for dynamic logic, which is absent from quickcheck-state-machine. Conversely, quickcheck-state-machine offers support for parallel test execution, which is currently absent from quickcheck-dynamic. We have not talked about either topic in this blog post.

The differences between the two libraries are mostly technical in nature:

Probably the most important downside of quickcheck-dynamic is there is precisely one variable that records the result of an action.¹¹ This is not the case in quickcheck-state-machine, where the number of variables bound by an action is determined at runtime. We can use this to return no bound variables if the action failed, or indeed multiple bound variables if the action returned multiple values (such as our Open example). In quickcheck-lockstep we therefore provide the GVar abstraction, which provides a way to “map” over the type of variables. It might be useful to lift this abstraction into the main library at some point.
At the moment, quickcheck-dynamic does not provide explicit support for generating labelled examples. As we saw, we can implement this functionality on top of quickcheck-dynamic (quickcheck-lockstep offers it), but as with GVar, it might be useful to move (a version of) this functionality into to the main library.
In quickcheck-state-machine the types of variables are type-level arguments to actions and responses. This means that some functionality such as getting the list of variables used (usedVars) can be defined generically. Moreover, variables can be resolved by the framework, whereas in quickcheck-dynamic test authors are responsible for manually calling the LookUp function whenever necessary. However, we pay a price for this functionality in quickcheck-state-machine; especially when dealing with multiple types of variables, the required type-level machinery gets pretty sophisticated.
In quickcheck-dynamic the argument to tests is Actions, which is a list of steps where each step consists of a variable for that step and the action to execute. The corresponding datatype in quickcheck-state-machine is Commands; this is very similar, but in addition to the action, it also records the result of the action. This makes Commands a bit more useful for Actions for things like tagging commands, since we get the full history. In quickcheck-dynamic, tagActions must effectively re-run the full set of actions to construct the right test label.
Unlike quickcheck-state-machine, quickcheck-dynamic keeps the definition of the interpreter for Action (RunModel) separate from the StateModel class. This separation is useful, because running the test against the real system often needs some additional state (a database handle, for example) which is not necessary for many other parts of the test framework. In quickcheck-state-machine this can often lead to ugly error "state unused" calls.

All in all, the libraries are quite similar in terms of the core state model testing functionality. For lockstep-style testing, however, quickcheck-lockstep is probably more user-friendly than the corresponding functionality in quickcheck-state-machine because there is less advanced type machinery required. The downside of the single-variable-per-command of quickcheck-dynamic is resolved by the GVar abstraction in quickstep-lockstep.

Footnotes

If we wanted to execute lockstep-style tests against multiple execution backends, we would have to introduce another abstraction to ensure that we can compare model responses to system responses for all of those backends.↩︎
InLockstep could alternatively require a function compareResult :: a -> ModelValue s a -> Bool, but writing such a function is often bit cumbersome, whereas equality for Observable s can be derived.↩︎
Core quickcheck-dynamic takes care of variables for the system under test, but not for the model.↩︎
For an Action .. a, all we see in postcondition is a value of type Realized m a. We want to ensure not just that the model and the system under test have the same behaviour when no exceptions are present, but also that they return the same errors. We must therefore reflect the possibility for an error in the result type of Open.↩︎
Perhaps some of this functionality can be merged with the main library; it certainly seems useful beyond lockstep-style testing.↩︎
This is closely related to Coyoneda.↩︎
The quickcheck-dynamic infrastructure insists that actions have Eq and Show instances. Since variables occur in actions, the same must be true for GVar. Secondly, a function from x -> y would not be enough; we would also need a second function of type ModelValue s x -> ModelValue s y. The indirection through the DSL avoids both of these problems: operations have Eq and Show instances, and we can simply insist on two interpreters of Op: one for Identity and one for ModelValue s.↩︎
We use tagStep not just in labelledExamples, but also in the standard StateModel method monitoring, to tag tests as they are executed. While the former would in principle allow us to tag an entire list of actions, the latter does not.↩︎
The proxy argument is necessary because Realized is a non-injective type family; quickcheck-dynamic relies on AllowAmbiguousTypes instead.↩︎
In StateModel we have monitoring, but monitoring cannot really be used with label, as this would result in lots of calls to label as the test executes (once per action) and each of those calls would result in a separate table in the test output. We must therefore use tabulate instead, but this is not supported by QuickCheck’s labelledExamples. Moreover, the only way to turn a list of actions into a PropertyM in quickcheck-dynamic is runActions, which requires the RunModel argument; but RunModel should not be needed for creating labelled examples. In the lockstep infrastructure we provide instead a function tagActions :: Actions (Lockstep state) -> Property, which basically executes all of the actions, collecting the tags as it goes, and then makes a single call to label with the final list of tags. This then works well with the standard labelledExamples functionality from QuickCheck.↩︎
The registry example from quickcheck-dynamic skirts around the problem: some actions fail, and some actions return new ModelThreadId, but there are no actions that can fail or return a new ModelThreadId.↩︎

Lockstep-style testing with quickcheck-dynamic