On a UTxO-style blockchain such as Cardano, transaction outputs are normally to the (hash) of a public key; in order to spend such an output, the spending transaction must be signed with the corresponding private key as evidence that the party creating the transaction has the right to spend the output.
The extended UTxO
model
introduces a second kind of output: an output to a smart contract: a piece of
code f
. Along with the contract itself, the output also contains an additional
argument d
, known as the datum. When a transaction spends such an output,
f(d)
is executed; when f
indicates that all is fine, the transaction is
approved; otherwise, it is rejected.
An immediate consequence of this model is that outputs are only verified when they are spent, not when they are created. This can lead to some tricky-to-verify initial conditions. We will explore this problem in this blog post, and suggest some ways in which we can solve it. Along the way, we will recap some often misunderstood Plutus concepts.
This work was done as part of the development of Be, a (smart) contract platform.
Stage restriction: datums vs parameters
Script outputs consist of three things: the value at the output, the hash of the script, and a datum or datum hash. We will ignore the value in the rest of this section.
We can think of a script hash and a datum as a pair of a function f
and an
argument datum d
, which we might write as f(d)
. Scripts often have arguments
other than the datum too; we might write this as
fx1, .., xN(d)
.
These additional arguments are commonly referred to as
parameters.
The difference between parameters and the datum is one of stage: parameters are applied at the stage of script compilation, whereas the datum is applied at the stage of script execution. Let’s consider this from a few different angles.
(Data) values1 that must be computed off-chain should be parameters; values that must be computed on-chain should be in the datum.
For example, stateful scripts often have to compute their next state. Since this computation happens on-chain, this state must therefore live in the script’s datum.
From the on-chain code’s perspective,
fx
andfx’
are different scripts with different hashes, whereasf(d)
andf(d')
look like the same script, applied to different datums.Sometimes there are reasons to prefer that a particular value is a parameter rather than a datum: after all, datums are just values and cannot a priori be trusted (we will come back to this in detail later). In principle it’s not impossible to compute a parameter on-chain, but it’s difficult: we must have sufficient information on-chain to be able to compute, for a given
x
, the hash offx
(that is, the hash of the serialised source code offx
). In practice, on Cardano this is currently impossible, because the appropriate hashing algorithm is not available on-chain.2Conversely, sometimes we might prefer that a value is stored in the datum rather than is passed as a parameter. Off-chain code can easily recognize outputs to
f(d)
in the current UTxO set without needing to be aware ofd
; however, outputs tofx
cannot be recognized as such withoutx
, since the hash offx
is different for everyx
.
Self-reference: direct and indirect recursion
Scripts often need to be aware of their own hash. For example, a script f
may need to check that any transaction spending an output to f
also contains
an output back to f
:
Clearly, the source of f
cannot literally contain the hash of that very
same source code: the hash would be uncomputable. Scripts must therefore
be told their own hash when they run. This has some important
consequences.
For example, suppose we have a parameterized minting policy
πf
, such that πf
only allows
minting of tokens if those tokens are output to f
. Suppose furthermore that
f
needs to check if particular inputs contain πf
tokens. In principle this should be fine: since f
is told its own hash, it
should be able to compute the hash of πf
. However, we
run into the stage restriction discussed above: the script’s own hash is a
run-time value, whereas the parameter to π
must be provided at compile time.
Again, it’s not in principle impossible to solve this problem3, but in practice it’s
difficult, and as we saw, in fact currently impossible on Cardano for technical
reasons.
We must therefore tell script f
what the hash of πf
is. This cannot be a parameter, because this would lead to uncomputable
self-referential hashes again. Indeed, even our notation would break down here:
We can put (the hash of) πf
in the datum d
instead;
(f, (πf, d), V)
is unproblematic for hash computations, but now we have a different problem: f
has no way of verifying that the datum contains the right hash. We will
come back to back this point below.
It may also be possible to break the mutual dependency between f
and
πf
in different ways. For example, perhaps f
has an
associated NFTo
, so that we could parameterize both f
and π
by NFTo
instead. As we shall see, this is also
not without problems.
Stateful scripts
In the EUTxO model, state is never actually updated; instead, old state is consumed and new state is created. Typically, this state resides in the datum of a script:
Inductive reasoning
When Tx
is validated, f
can verify the evolution of
dold
to dnew
, but it cannot
verify dold
; instead, we want to reason inductively, and
say that f
must at some point have verified that previous state as well:
Unfortunately, this is an induction without a base case, because the
first output to f
is not verified by f
:
The critical transaction here is Tx1
: ideally it should
not only verify that d1
is the correct next state after
d0
, but also the base case: d0
should be the correct initial state.
Unfortunately, the fact that d0
should be the base case
is deducible from Tx0
, but on Cardano that information
is not reflected in Tx1
, and Tx1
is the only context available to f
when it validates
d0
. This means that contracts do not know when the base
case should apply, and hence cannot verify their own base case.
There are two ways to solve this problem.
Declare that it is the responsibility of the off-chain code to verify the base case. In rare cases it may be possible to verify this by inspecting the current datum, but in most cases it will mean looking through the chain history to find the original output (
Tx0
in the diagram above).We can solve the problem entirely on-chain through the clever use of NFTs. We will discuss this in detail below.
State tokens
Often stateful scripts have an associated NFT, anchored at a random output o
,
which is included in the value of each script output:
Such NFT is useful to be able to uniquely identify the current script
output.4 However, the presence of such an NFT does not
change the narrative above in any meaningful way. While it is true that if the
NFT is minted in Tx0
(and f
checks for the presence of
the NFT as part of its checks), then d0
must indeed be
the base case, but this is once again not visible from
Tx1
.
Conversely, the presence of this kind of NFT in an input does not mean that that
input spends a script output to f
; this will only be true once the NFT is
locked in the script (at which point f
will validate that it will remain
locked in the script). The base case should be that the NFT is locked in the
contract immediately when it’s minted, but verifying this again involves
checking the chain history.
Parameterized NFTs
The NFT minting policy NFTo
simply checks that the
enclosing transaction spends output o
. Since outputs can only be spent once,
this guarantees that the resulting token is indeed a singleton. If we
additionally want the guarantee that the NFT is locked in some script f
immediately upon minting, we can define an NFT minting policy which is
parameterized by an output and a script it’s intended for: NFTo,
f
would verify that the enclosing transaction spends o
, and
that the resulting token is locked in a script output to f
.
Unfortunately, this leads to precisely the kind of mutual dependency between f
and NFTo, f
that we discussed above: f
needs to know
the hash of NFTo, f
so that it can verify that the NFT
remains locked in the script, and NFTo, f
needs
to know the hash of f
so that it can verify the NFT is locked in the script
when it is first minted.
We must therefore store the hash of the NFT in the datum associated with f
,
but now f
has no way of verifying that datum. It can verify that the hash
never changes, but it has no way of verifying that the hash is correct. It’s
another example of the base case problem discussed above, but in a sense even
more severe: even if the script could tell when the base case should apply, it
would still not know what the base case should be.
Fortunately, if we use an NFT to identify script outputs, we can solve
the base case problem once and for all, by additionally parameterizing the NFT
with the expected initial state. NFTo, f, d
will allow
minting if the enclosing transaction
- spends
o
- locks the resulting token in an output to script
f
- the associated datum5 is
(NFTo, f, d, d)
This means that if we interact with a script through NFTo, f,
d
, the NFT guarantees the inductive base case. Put another way: we
have encoded the base case right in the hash of the NFT.6
Stateful minting policies
A minting policy π
is a script that determines whether a particular token is
allowed to be minted or burned. Minting policies are stateless: they do not have
associated datums (indeed, there are no outputs to a minting policy), nor can we
somehow encode state in the hash of the minting policy itself, because we have
no way of “evolving” that hash.
Many minting policies however do need some state. When this happens, the minting
policy needs to be paired with a stateful script f
. The minting policy π
could then check the transaction for an input that spends a script output (f, d)
, and then use the associated datum d
as its state. This is unproblematic
as long as the minting policy only needs to read the state, but does not need to
modify it. In this case, f
is still responsible for checking the state in
the continuation output, as normal.
If however the minting policy does need to change the state, we have a problem.
We could attempt to set things up such that the minting policy
π
informs the regular scriptf
thatπ
will take over duties for verifying the continuation output, by having the minting policy verify that the redeemer7 forf
tells it not to check the continuation output. This is however not sound: a malicious user could create a transaction that doesn’t mint but does use that special redeemer value, and would then be able to change the state off
at will.f
could check the transaction to see if it it burns or mints anyπ
tokens, and delegate verification of the output datum toπ
when this is the case. This is sound but results in the same kind of mutual dependency problem that we already encountered above:π
now needs the hash off
(to recognize outputs tof
), andf
needs the hash ofπ
(to check whether or not anyπ
tokens are minted).Most stateful scripts use some state token
NFTo
. It is tempting to think that we could break the cycle by having the minting policyπ
check for the input withNFTo
instead of the input spending a script output tof
: nowπ
no longer needs to know the hash off
. Unfortunately, as we saw, we are only guaranteed that the presence of the NFT implies that the input spends an output tof
if we verify through other means that the NFT is locked in the scriptf
immediately upon minting.8We must therefore store the hash of the minting policy in the datum of
f
. We can use the parameterized NFT we discussed above, and useNFTo, f, (πf, d)
to ensure that the initial value of the minting policy is correct9 (f
itself must ensure that the hash never changes).
Variable base case
When we discussed inductive reasoning, we saw that scripts cannot verify their own base case. We will now consider what happens when the base case for one script is another script:
In this case there is no fixed base case for f
; instead, the guarantee
required by f
is that g
has verified d
; as always, this guarantee must be
derivable from information present in Tx
only.
We can do this by defining a minting policy Linkg, f
,
which allows minting in a transaction Tx'
if
Tx'
spends an input tog
- The only
Linkg, f
tokens in the outputs ofTx'
are locked in outputs tof
.
Scripts g
and f
then have the following responsibilities:
g
must verify the datums in all outputs tof
inTx'
.f
must check for the presence of theLinkg, f
token in the input it is verifying. It must also check that the only outputs of theLinkg, f
token in transactionTx
are back tof
.
These tokens are not NFTs, and do not need to be. We just need to be make sure the tokens never ``leak’’ (which the rules above do ensure).
Once more we have to solve the problem of mutual dependencies between f
and
Linkg, f
and between g
and Linkg,
f
. First, we can assume without loss of generality that g
is
stateful; after all, if it wasn’t, then this setup provides no benefits over a
setup with a fixed base case. We can therefore store the hash of
Linkg, f
in the state (datum) associated with g
, and
then rely on the NFTo, g, (Linkg, ; f, d’)
to
ensure that the hash is recorded correctly. We can use g
to verify that the
datum for f
also records the correct hash of Linkg, f
.
This also avoids any dependencies between f
and g
directly: g
can identify
outputs to f
by looking for outputs that include a Linkg,
f
token, and f
verifies that its inputs include that token.
Conclusions
Stateful Plutus scripts can verify that the evolution of state happens according to the rules defined in the script, but cannot verify the initial state: they lack the context that would tell them when the base case applies.
Moreover, the lack of on-chain script hash computation (or, put another way, the strict stage separation between script parameters and the script datum) means that we must often include the hash of one script in the datum of another; these hashes cannot be verified at all, even if the script could know when a state should be initial.
Often the verification of these initial conditions are relegated to off-chain code instead, but this is unsatisfactory and dangerous. The same on-chain code could be interacted with by multiple off-chain applications; a single forgotten check in one of those off-chain applications could result in security vulnerabilities.
Stateful scripts often have an associated NFT, used to identify the current
script output in a UTxO set. As it turns out, such NFTs are subject to their own
unverified initial conditions. However, we showed that we can resolve all of
these problems by defining an NFTo, f, d
which verifies
that the NFT is locked in the script f
immediately upon minting and that the
initial datum is d
. In addition, we saw that if the base case for one script
f
is another script g
, we can use NFTo, g, d
to
guarantee the base case for g
, and a special token Linkg,
f
to verify the base case for f
.
Footnotes
Data values as opposed to currency values.↩︎
Script hashes use BLAKE-224; the available on-chain hashing algorithms are SHA2-256, SHA3-256, and BLAKE-256.↩︎
Specifically,
f
would need enough information to be able to computeblake224(serialise(πf))
.↩︎This can be useful off-chain, in order to figure out which output in the UTxO set corresponds to the current state of the contract we’re interacting with. It can also be useful on-chain, in case there could be multiple instances of
f
within a single transaction, and the script needs to figure out which output belongs to which input.↩︎This latter check is unproblematic becauses scripts are told their own hash when they run.↩︎
It is true that the hash is carrying a lot of weight here; we are depending on that fixed size hash to encode a lot of information. However, this is no different from representing script outputs in the first place; we are comfortable with hashes representing
fx1, .., xN
, in which case the hash is also encoding a lot of information: the definition off
as well as all parametersxi
.↩︎As we saw earlier, a script output contains (the hash of) a function
fx1, .., xN
, and a datumd
. When a script output is spent, the spending input additionally provides a redeemer valuer
; the actual code that is run is thenfx1, .., xN(d, r)
. In Plutus V1 it in fact not possible for the minting policy to check the redeemer value for other scripts, as this information is not present in theScriptContext
; this is resolved in V2.↩︎We cannot use
NFTo, f, d
: this is guaranteed to be locked in the script immediately but, unlikeNFTo
, still depends onf
.↩︎When
π
tokens are minted, we could rely onπ
instead to verify that the right hash is recorded. However, this check will not happen until the firstπ
-token is minted, and we wouldn’t know what happened before then.↩︎