GHC 9.2 introduces two new language extensions called OverloadedRecordDot and
OverloadedRecordUpdate, which overload record field access (myRecord.x.y)
and record field update (myRecord{x.y = a}), respectively. In this short blog
post we will discuss these two extensions in the context of two libraries,
large-anon and large-records. Of course, OverloadedRecordDot and
OverloadedRecordUpdate can be useful in many other contexts too; the
principles we explain in this blog post should apply universally.
We will start by discussing large-anon, which is a library
that provides support for scalable anonymous records (records that do not have
to be defined before they can be used), along with a rich API for manipulating
such records. We will see how we can take advantage of OverloadedRecordDot and
OverloadedRecordUpdate to access and update fields of such records.
We then briefly discuss large-records, which provides
“regular” records but with a linear compilation time, as opposed to the
quadratic compilation time for standard records; this can really matter for
large records (records with more than say 20 or 30 fields). The usage of
OverloadedRecordDot and OverloadedRecordUpdate is pretty similar to their
usage in large-anon, but as we will see, there is a small technical
complication.
OverloadedRecordDot
When the OverloadedRecordDot language extension
is enabled, then r.b for some record r and field b is interpreted as
getField @"b" r, where getField is a method of the HasField class defined
in GHC.Records:
class HasField x r a | x r -> a where
getField :: r -> aFor example, the following code1, which uses the
large-anon library for anonymous records, will print
True:
{-# LANGUAGE OverloadedRecordDot #-}
{-# OPTIONS_GHC -fplugin=Data.Record.Anon.Plugin #-}
import Data.Record.Anon
import Data.Record.Anon.Simple qualified as S
test :: IO ()
test = print r.b
where
r :: S.Record [ "a" := Int, "b" := Bool ]
r = ANON { a = 5, b = True }The syntax also extends to nested fields; for example this code will print
Just 'x':
import Data.Record.Anon.Advanced qualified as A
test :: IO ()
test = print r.b.d
where
r :: S.Record [ "a" := Int, "b" := A.Record Maybe [ "c" := Bool, "d" := Char ] ]
r = ANON { a = 5, b = ANON_F { c = Nothing, d = Just 'x' } }(The difference between “simple” anonymous records and “advanced” anonymous
records is that the latter support an additional functor argument; S.Record ["a" := Int, "b" := Bool ] is a record with one field or type Int and one
field of type Bool, whereas A.Record Maybe [ "c" := Bool, "d" := Char ] is a
record with one field of type Maybe Bool and one field of type Maybe Char.
For an in-depth introduction, see large-anon: Practical scalable anonymous
records for Haskell.)
OverloadedRecordUpdate
Unfortunately, the situation with
OverloadedRecordUpdate is not as clean as
with OverloadedRecordDot. The current situation is that the syntax r{a = 6}
translates to setField @"a" r 6, without specifying what setField is,
precisely. The design and implementation of the corresponding class is not yet
finalized; the GHC wiki entry on overloaded
record fields contains a bunch of proposals.
In an attempt to guarantee forwards compatibility, therefore,
OverloadedRecordUpdate currently requires
RebindableSyntax. This language extension changes
GHC so certain pieces of syntax no longer refer to their
standard definition, but rather to whatever definition is in scope; for example,
it makes it possible to change what
OverloadedStrings does (by using a different
definition of fromString) or to change how do
notation is desugared (by using a different definition of
>>=). For records, it will use whatever definition of
getField is in scope instead of using the HasField class; and it will also
whatever definition of setField is in scope, except that in this case there
is no standard definition yet.
The large-anon library offers a module Data.Record.Anon.Overloading which
can be used in combination with RebindableSyntax: it restores all definitions
to their standard definition again, and it defines a setField function which
depends on the HasField class from the
record-hasfield package:
class HasField x r a | x r -> a where
hasField :: r -> (a -> r, a)For example, the following code prints ANON { a = 6, b = True }:
{-# LANGUAGE OverloadedRecordUpdate #-}
{-# LANGUAGE RebindableSyntax #-}
import Data.Record.Anon.Overloading
test :: IO ()
test = print r { a = 6 }
where
r :: S.Record [ "a" := Int, "b" := Bool ]
r = ANON { a = 5, b = True }Like record field access, OverloadedRecordUpdate also works for nested fields
(in this case OverloadedRecordDot is also required):
test :: IO ()
test = print r { b.d = Just 'a' }
where
r :: S.Record [ "a" := Int, "b" := A.Record Maybe [ "c" := Bool, "d" := Char ] ]
r = ANON { a = 5, b = ANON_F { c = Just True, d = Nothing } }large-records
The above examples all used the large-anon library for
anoynmous records; the situation with the large-records
is similar, with one somewhat annoying technical difference. The
large-records library provides records whose compilation time scales linearly
with the size of the record, as opposed to the quadratic compilation time of
standard records. As discussed in Avoiding quadratic core code size with large records, it
needs to jump through quite a few hoops to achieve this. When a module
contains a record declaration such as
{-# OPTIONS_GHC -fplugin=Data.Record.Plugin #-}
{-# ANN type Person largeRecord #-}
data Person = Person { name :: String }then large-records turns it into something like this:
data Person = forall a. a ~ String => Person { name :: a }This unusual representation achieves two goals: the strange existential type parameter prevent GHC from generating field accessors (one of the many sources of quadratic code), but we can still construct and pattern match on these records in a normal way.
As discussed in the GHC manual, the existential type
parameter will also prevent GHC from automatically derive a HasField
instance for fields such as name. This wouldn’t be a problem (large-records
could generate the instance instead), if it wasn’t for this proviso in the
manual:
If a field has a higher-rank or existential type, the corresponding HasField constraint will not be solved automatically (..), but in the interests of simplicity we do not permit users to define their own instances either.
There is a GHC proposal to relax this
condition; until that proposal is implemented,
however, the use of OverloadedDot with large-records will depend on
RebindableSyntax (which then bypasses the use of the standard HasField class
entirely). Of course, as we saw in the previous section,
OverloadedRecordUpdate depends on RebindableSyntax anyway, so this isn’t
as much of a limitation as it might seem.
Conclusions
Record dot syntax has been available since GHC 8.6 through the use of the
record-dot-preprocessor. From 9.2 onwards
this syntax is now available without a preprocessor or plugin; for record field
access this is achieved through the use of the HasField class from
GHC.Records. For record field update the design is not
yet finalized, and we instead rely on RebindableSyntax.
As of their latest releases, the large-anon and
large-records libraries, as well as the supporting
libraries large-generics and
typelet, are now compatible with GHC 9.2 and 9.4 and make use
of these extensions. It is also still possible to use record-dot-preprocessor
(you will need at least version 0.2.16 for GHC 9.4). The only complication is
that GHC 9.4 changed the order in which plugins are
loaded. This matters because the
large-records plugin must be loaded before RecordDotPreprocessor; to avoid
users having to use CPP to distinguish between GHC versions, large-records now
offers Data.Record.Plugin.WithRDP which combines both plugins (for
large-anon the order does not matter).
Here and elsewhere we will only show the most important language pragmas and includes.↩︎