Over the last year, Well-Typed have carried out significant work in Cabal, Haskell’s build system, thanks to funding from the Sovereign Tech Fund. Our main goal was to re-think the Cabal architecture for building packages. This was historically tied to the Setup command-line interface, with each package technically capable of providing its own independent build system via the Custom build-type. In practice, the full generality of this interface is not useful; it has obstructed the development of new features and created a drag on maintenance, so there has long been an appetite to reimagine this interface within Cabal.1

With the release of Cabal-3.14.0.0 and cabal-install-3.14.1.1, the new Hooks build-type we have developed, together with the Cabal-hooks library, are now available to package authors. Over time, we hope to see packages that depend on the Custom build-type gradually migrate to use Hooks instead.

For more background on this work, check out:

In the remainder of this post, we will:

  • dive into the background details of how Cabal works,

  • provide an introduction to the new interfaces for package authors who may wish to adapt their packages.

This post is based on Sam’s talk at the Haskell Ecosystem Workshop 2024.

Background

The Cabal specification

The Cabal specification (2005) was designed to allow Haskell tool authors to package their code and share it with other developers.

The Haskell Package System (Cabal) has the following main goal:

  • to specify a standard way in which a Haskell tool can be packaged, so that it is easy for consumers to use it, or re-package it, regardless of the Haskell implementation or installation platform.

The Cabal concept of a package is a versioned unit of distribution in source format, with enough metadata to allow it to be built and packaged by downstream distributors (e.g. Linux distributions and other build tools).

A Cabal package consists of multiple components which map onto individual Haskell units (e.g. a single library or executable).

The Cabal package model

Each package must bundle some metadata, specified in a .cabal file. Chiefly:

  • the package name and version number,
  • its dependencies, including version bounds (e.g. base >= 4.17 && < 4.21, lens ^>= 5.3),
  • what the package provides (libraries and their exposed modules, executables…),
  • how to build the package (e.g. build-type: Simple).
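
For illustration, here is a minimal .cabal file for a hypothetical package (all names and version bounds here are invented):

```cabal
cabal-version:      3.0
name:               my-package
version:            0.1.0.0
build-type:         Simple

library
  exposed-modules:  MyPackage
  hs-source-dirs:   src
  build-depends:    base >= 4.17 && < 4.21
  default-language: Haskell2010
```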

The Cabal library then implements everything required to build individual packages, first parsing the .cabal file and then building and invoking the Setup script of the package.

The Setup interface

The key component of the original Cabal specification is that each package must provision an executable which is used to build it. As written in an early draft:

To help users install packages and their dependencies, we propose a system similar to Python’s Distutils, where each Haskell package is distributed with a script which has a standard command-line interface.

More precisely, to comply with the Cabal specification, the build system of a package need only implement the Setup command-line interface, i.e. provide a Setup executable that supports invocations of the form ./Setup <cmd>:

<cmd>                  description
configure              resolve compiler, tools and dependencies
build/haddock/repl     prepare sources and build / generate docs / open a session in the interpreter
test/bench             run test suites or benchmarks
copy/install/register  move files into an image dir or final location / register libraries with the compiler
sdist                  create an archive for distribution/packaging
clean                  clean local files (local package store, local build artifacts, …)

In practice, the ./Setup configure command takes a large number of parameters (as represented in the Cabal ConfigFlags datatype). This configuration is preserved for subsequent invocations, which usually only take a couple of parameters (e.g. ./Setup build -v2 --builddir=<dir>).

This interface can be used directly to build any package, by executing the following recipe:

  • build and install the dependencies in dependency order;
  • to build each individual unit:
    • ./Setup configure <componentName> <configurationArgs>
    • ./Setup build --builddir=<buildDir>
    • ./Setup haddock --builddir=<buildDir> <haddockArgs> (optional, to generate documentation)
  • to make a unit available to units that depend on it:
    • ./Setup copy --builddir=<buildDir> --destdir=<destDir> (this makes executables available, e.g. for build-tool-depends)
    • for libraries, registration (see § Library registration):
      • ./Setup register --builddir=<buildDir> --gen-pkg-config=<unitPkgRegFile>
      • hc-pkg register --package-db=<pkgDb> <unitPkgRegFile>
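
The recipe above can be sketched as a shell session (all paths, package names and databases here are placeholders):

```shell
# Compile the package's Setup script (the package brings its own build system)
ghc Setup.hs -o Setup

# Configure, build, and (optionally) generate documentation
./Setup configure lib:my-package --package-db=/path/to/package.db --builddir=dist
./Setup build   --builddir=dist
./Setup haddock --builddir=dist

# Make the unit available: copy files, then register the library with the compiler
./Setup copy     --builddir=dist --destdir=/image
./Setup register --builddir=dist --gen-pkg-config=my-package.conf
ghc-pkg register my-package.conf --package-db=/path/to/package.db
```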

Usually, these steps will be executed by a build tool such as cabal-install, which provides a more convenient user interface than invoking Setup commands directly. Some systems (such as nixpkgs) do directly use this interface, however.

The tricky parts in the above are:

  • passing appropriate arguments to ./Setup configure, in particular exactly specifying dependencies,2 and making sure the arguments are consistent with those expected by the cabal-version of the package,3
  • constructing the correct environment for invoking ./Setup, e.g. adding appropriate build-tool-depends executables in PATH and defining the corresponding <buildTool>_datadir environment variables.

Library registration

In the above recipe to build packages, there was a single step which wasn’t an invocation of the Setup script: a call to hc-pkg. To quote from the original Cabal specification:

  • Each Haskell compiler hc must provide an associated package-management program hc-pkg. A compiler user installs a package by placing the package’s supporting files somewhere, and then using hc-pkg to make the compiler aware of the new package. This step is called registering the package with the compiler.
  • To register a package, hc-pkg takes as input an installed package description (IPD), which describes the installed form of the package in detail.

This is the key interchange mechanism between Cabal and the Haskell compiler.

The installed package description format is laid out in the Cabal specification; in brief, it contains all the information the Haskell compiler needs to use a library, such as its exposed modules, its dependencies, and its installation path. This information can be seen by calling hc-pkg describe:

> ghc-pkg describe attoparsec --package-db=<cabal-store>/<ghc-ver>/package.db
name:            attoparsec
version:         0.14.4
visibility:      public
id:              attoparsec-0.14.4-b35cdbf2c0654f3ef00c00804c5e2b390700d4a0
abi:             d84b6b3e46222f7ab87b5a2d405e7f48
exposed:         True
exposed-modules:
    Data.Attoparsec Data.Attoparsec.ByteString
    [...]
hidden-modules:
    Data.Attoparsec.ByteString.Internal Data.Attoparsec.Text.Internal
depends:
    array-0.5.7.0-9340
    attoparsec-0.14.4-ab0b5b7d4498267368f35b0c9f521e31e33fe144
    base-4.20.0.0-30dc bytestring-0.12.1.0-b549 containers-0.7-2f81
    deepseq-1.5.0.0-30ad ghc-prim-0.11.0-d05e
    scientific-0.3.6.2-d4ceb07500a94c3c60cb88dff4bfb53d40348b25
    text-2.1.1-e169 transformers-0.6.1.1-6955

Note that, perhaps confusingly, the hc-pkg interface is not concerned with Cabal’s notion of “packages”. Rather, it deals only in “units”; these generally map to Cabal components, such as the package’s main library and its private and public sublibraries. For example, the internal attoparsec-internal sublibrary of the attoparsec package is registered separately:

> ghc-pkg describe z-attoparsec-z-attoparsec-internal
name:            z-attoparsec-z-attoparsec-internal
version:         0.14.4
package-name:    attoparsec
lib-name:        attoparsec-internal
id:              attoparsec-0.14.4-ab0b5b7d4498267368f35b0c9f521e31e33fe144
abi:             908ae57d09719bcdfb9cf85a27dab0e4
exposed-modules:
    Data.Attoparsec.ByteString.Buffer
    Data.Attoparsec.ByteString.FastSet Data.Attoparsec.Internal.Compat
    [...]
depends:
    array-0.5.7.0-9340 base-4.20.0.0-30dc bytestring-0.12.1.0-b549
    text-2.1.1-e169

How the Setup interface is used by packages

Centering the package build process around the Setup script provides a great deal of flexibility to package authors, as the Setup executable can be implemented in any way the package author chooses. In this way, each package brings its own build system.

However, in practice, this is more expressiveness than most library authors want or need. Consequently, almost all packages use one of the following two build systems:

  1. build-type: Simple (most packages). For such packages, the Setup.hs file is of the following form:

    module Main where
    import Distribution.Simple (defaultMain)
    main = defaultMain

    This means that the ./Setup command-line interface maps directly to the implementation provided by the Cabal library.

  2. build-type: Custom where the Setup.hs file uses the Cabal library to perform most of the build, but brackets some of its logic with package-specific code using the Cabal UserHooks mechanism, e.g. so that it runs custom configuration code after Cabal configure, or generates module sources before running Cabal build.

For an example of case (2), the custom Setup.hs code for hooking into the configure phase might look like the following:

main =
  defaultMainWithHooks
    simpleUserHooks
      { confHook = \ info cfgFlags -> do
          info' <- customPreConfHook info cfgFlags
          confHook simpleUserHooks info' cfgFlags
      }

In this example, simpleUserHooks means “no hooks” (or more accurately “exactly the hooks that build-type: Simple uses”). So the above snippet shows how we can include custom logic in customPreConfHook in order to update the Cabal GenericPackageDescription, before calling the Cabal library configure function (via confHook simpleUserHooks). Here, GenericPackageDescription is the representation of a .cabal file used by Cabal (the Generic part means “before attempting to resolve any conditionals”).

The fact that Setup executables may (in principle) be arbitrary when using build-type: Custom fundamentally limits what build tools such as cabal-install or the Haskell Language Server can do in multi-package projects. The tool has to treat the build system of each package as an opaque black box, merely invoking functionality defined by the specific version of the Setup interface supported by the package.

The main observation is that, in practice, custom Setup.hs scripts only insert benign modifications to the build process: they still fundamentally rely on the Cabal library to do the bulk of the work building the package.

A replacement for Custom setup scripts

The limitations of the Setup interface discussed above motivate the need for a new mechanism to customise the build system of a package:

  • The bulk of the work should be carried out by the Cabal library, which exposes functions such as configure and build, but these need to be augmented with hooks so that individual packages can customise certain phases.
  • The hooks provided by this mechanism should be kept to a minimum (to give more flexibility to build tools) while still accommodating the needs of package authors in practice.
  • Customisation should be declared by a Haskell library interface (as opposed to the black-box command-line interface of Setup.hs), in order to enable as much introspection by build systems as possible.

This will enable a gradual restructuring of build tools such as cabal-install away from the Setup command-line interface, which has grown unwieldy due to the difficulty of evolving it to meet requirements that could not be foreseen when it was created.

Building on this understanding, as well as a survey of existing use cases of build-type: Custom, we have introduced an alternative mechanism for customizing how a package is built: build-type: Hooks. This mechanism does not allow arbitrary replacement of the usual Cabal build logic, but rather merely exposes a set of well-defined hooks which bracket a subset of Cabal’s existing build steps.

We arrived at this design through collaboration with Cabal developers, users, and packagers as part of an RFC process in Haskell Foundation Tech Proposal #60.

Introducing build-type: Hooks

The main documentation for usage of the hooks API is provided in the Haddocks for the Cabal-hooks package. The Cabal Hooks overlay contains patched packages using build-type: Hooks. Like head.hackage, it can be used as an overlay to construct build plans without any build-type: Custom packages, and it can also serve as a reference for usage of the API.

At a high-level, a package with build-type: Hooks:

  • declares in its .cabal file:
    • a cabal-version of at least 3.14,
    • build-type: Hooks,
    • a custom-setup stanza with a dependency on Cabal-hooks (the latter is a library bundled with Cabal that provides the API for writing hooks):
cabal-version: 3.14
...
build-type: Hooks
...

custom-setup
  setup-depends:
    base        >= 4.18 && < 5,
    Cabal-hooks >= 0.1  && < 0.2
  • contains a SetupHooks.hs Haskell module source file, next to the .cabal file, which specifies the hooks the package uses. This module exports a value setupHooks :: SetupHooks (in which the SetupHooks type is exported by Distribution.Simple.SetupHooks from the Cabal-hooks package).
module SetupHooks where

-- Cabal-hooks
import Distribution.Simple.SetupHooks

setupHooks :: SetupHooks
setupHooks =
  noSetupHooks
    { configureHooks = myConfigureHooks
    , buildHooks = myBuildHooks }

The new hooks fall into the following categories:

  • configure hooks allow customising how a package will be built,
  • pre-build rules allow generating source files to be built,
  • post-build hooks allow the package to customise the linking step,
  • install hooks allow the package to install additional files alongside the usual binary artifacts.

In the remainder of this blog post, we will focus on the two most important (and most commonly used) hooks: configure hooks and pre-build rules.

Configure hooks

The configure hooks allow package authors to make decisions about how to build their package, by modifying the Cabal package description (which is Cabal’s internal representation of the information in a .cabal file). Crucially, these modifications will persist to all subsequent phases.

Configuration happens at two levels:

  • global configuration covers the entire package,
  • local configuration covers a single component.

There are three hooks into the configure phase:

  1. Package-wide pre-configure. This can be used for custom logic in the style of traditional ./configure scripts, e.g. finding out information about the system and configuring dependencies, when those don’t easily fit into Cabal’s framework.

  2. Package-wide post-configure. This can be used to write custom package-wide information to disk, to be consumed by (3).

  3. Per-component pre-configure. This can be used to modify individual components, e.g. adding exposed modules or specifying flags to be used when building the component.

Per-package configuration

Suppose our package needs to use some external executable, e.g. a preprocessor. If we require custom logic to find this external executable on the system, or to parse its version number, we need to go beyond Cabal’s built-in support for build-tool-depends.

We can do this in a pre-configure hook:

myConfigureHooks :: ConfigureHooks
myConfigureHooks =
  noConfigureHooks
    { preConfigurePackageHook = Just configureCustomPreProc }

configureCustomPreProc :: PreConfPackageInputs -> IO PreConfPackageOutputs
configureCustomPreProc pcpi@( PreConfPackageInputs { configFlags = cfg, localBuildConfig = lbc } ) = do
  let verbosity = fromFlag $ configVerbosity cfg
      progDb = withPrograms lbc
  configuredPreProcProg <-
    configureUnconfiguredProgram verbosity customPreProcProg progDb
  return $
    ( noPreConfPackageOutputs pcpi )
      { extraConfiguredProgs =
        Map.fromList
          [ ( customPreProcName, configuredPreProcProg ) ]
      }

customPreProcName :: String
customPreProcName = "customPreProc"

customPreProcProg :: Program
customPreProcProg =
  ( simpleProgram customPreProcName )
    { programFindLocation =
        -- custom logic to find the installed location of myPreProc
        -- on the system used to build the package
        myPreProcProgFindLocation
    , programFindVersion =
        -- custom logic to find the program version
        myPreProcProgFindVersion
    }

Cabal will then add this program to its program database, allowing the program to be used to satisfy build-tool-depends requirements, as well as making it available in subsequent hooks (e.g. pre-build hooks).

Modifying individual components

Suppose we want to modify a component of a Cabal package, e.g. inserting configuration options determined by inspecting the system used to build the package (e.g. availability of certain processor capabilities). We can do this using hooks into the configure phase. For illustration, consider the following example, which includes:

  • a package-wide post-configure hook, which inspects the system to determine availability of AVX2 CPU features, and writes it out to a "system-info" file,
  • a per-component pre-configure hook which reads the "system-info" file, and uses that to pass appropriate compiler options (e.g. -mavx2) when compiling each component.
myConfigureHooks :: ConfigureHooks
myConfigureHooks =
  noConfigureHooks
    { postConfPackageHook  = Just writeSystemInfo
    , preConfComponentHook = Just confComps
    }

data SystemInfo = SystemInfo { supportsAVX2 :: !Bool }
  deriving stock ( Show, Read )
    -- Show/Read for a quick-and-dirty serialisation interface (illustration only)

systemInfoFlags :: SystemInfo -> [ String ]
systemInfoFlags ( SystemInfo { supportsAVX2 } ) =
  [ "-mavx2" | supportsAVX2 ]

writeSystemInfo :: PostConfPackageInputs -> IO ()
writeSystemInfo ( PostConfPackageInputs { packageBuildDescr = pbd } ) = do
  let cfg = LBC.configFlags pbd
      distPref = fromFlag $ configDistPref cfg
      mbWorkDir = flagToMaybe $ configWorkingDir cfg
  supportsAVX2 <- System.Cpuid.Basic.supportsAVX2
  -- + more system-wide checks, if desired
  writeFile ( interpretSymbolicPath mbWorkDir $ systemInfoFile distPref )
    ( show $ SystemInfo { supportsAVX2 } )

systemInfoFile :: SymbolicPath Pkg ( Dir Dist ) -> SymbolicPath Pkg File
systemInfoFile distPref = distPref </> makeRelativePathEx "system-info"

confComps :: PreConfComponentInputs -> IO PreConfComponentOutputs
confComps pcci@( PreConfComponentInputs { packageBuildDescr = pbd, component = comp } ) = do
  let cfg = LBC.configFlags pbd
      distPref = fromFlag $ configDistPref cfg
      mbWorkDir = flagToMaybe $ configWorkingDir cfg
  sysInfo <- read <$> readFile ( interpretSymbolicPath mbWorkDir $ systemInfoFile distPref )
  let opts = systemInfoFlags sysInfo
      bi' = emptyBuildInfo
              { ccOptions = opts
              , cxxOptions = opts
              , options = PerCompilerFlavor opts []
              }
  return $
    ( noPreConfComponentOutputs pcci )
      { componentDiff =
         buildInfoComponentDiff ( componentName comp ) bi'
      }
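
As an aside, the quick-and-dirty serialisation above works because the derived Read instance parses back exactly what the derived Show instance produces. Here is a minimal, base-only sketch of that roundtrip (illustration only, independent of the Cabal-hooks API):

```haskell
-- Sketch of the Show/Read serialisation used by the SystemInfo example
-- above (base only; plain 'deriving' instead of 'deriving stock').
data SystemInfo = SystemInfo { supportsAVX2 :: !Bool }
  deriving ( Show, Read, Eq )

main :: IO ()
main = do
  let si = SystemInfo { supportsAVX2 = True }
  -- 'show' writes the value out; 'read' parses it back.
  print ( read ( show si ) == si )
```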

Pre-build rules

Pre-build rules can be used to generate Haskell source files which can then be built as part of the compilation of a unit. Since we want to ensure that such generated modules don’t break recompilation avoidance (thereby crippling HLS and other interactive tools), these hooks comprise a simple build system. They are described in the Haddock documentation for Cabal-hooks.

The overall structure is that one declares a collection of Rules using the monadic API of the RulesM monad.

Each individual rule contains a Command, consisting of a statically specified action to run (e.g. a preprocessor such as alex, happy or c2hs) bundled with (possibly dynamic) arguments (such as the input and output filepaths). In the Hooks API, these are constructed using the mkCommand function. The actions are referenced using static pointers; this allows the static pointer table of the SetupHooks module to be used as a dispatch table for all the custom preprocessors provided by the hooks.

One registers rules using staticRule, declaring the inputs and outputs of each rule. In this way, we can think of each rule as corresponding to an individual invocation of a custom preprocessor. Rules are also allowed to have dynamic dependencies (using dynamicRule instead of staticRule); this supports use-cases such as C2Hs in which one needs to first process .chs module headers to discover the import structure.

Let’s start with a simple toy example to get used to the API: declare hooks that run alex on Lexer.alex and happy on Parser.happy (running alex/happy on *.x/*.y files is built into Cabal, but this is just for illustrative purposes).

{-# LANGUAGE StaticPointers #-}
-- [...]
myBuildHooks :: BuildHooks
myBuildHooks =
  noBuildHooks
    { preBuildComponentRules =
      Just $ rules ( static () ) myPreBuildRules
    }

myPreBuildRules :: PreBuildComponentInputs -> RulesM ()
myPreBuildRules pbci = do
  -- [...] (bindings of verbosity, mbWorkDir, alex, happy, srcDir and autogenDir elided)
  -- Define the alex/happy commands.
  let alexCmd  = mkCommand ( static Dict ) ( static runAlex )
      happyCmd = mkCommand ( static Dict ) ( static runHappy )
  -- Register a rule: run alex on Lexer.alex, producing Lexer.hs.
  let lexerInFile  = Location srcDir     ( makeRelativePathEx "Lexer.alex" )
      lexerOutFile = Location autogenDir ( makeRelativePathEx "Lexer.hs" )
  registerRule_ "alex:Lexer" $
    staticRule ( alexCmd ( verbosity, mbWorkDir, alex, lexerInFile, lexerOutFile ) )
      {- inputs  -} [ FileDependency lexerInFile ]
      {- outputs -} ( NE.singleton lexerOutFile )
  -- Register a rule: run happy on Parser.happy, producing Parser.hs.
  let parserInFile  = Location srcDir     ( makeRelativePathEx "Parser.happy" )
      parserOutFile = Location autogenDir ( makeRelativePathEx "Parser.hs" )
  registerRule_ "happy:Parser" $
    staticRule ( happyCmd ( verbosity, mbWorkDir, happy, parserInFile, parserOutFile ) )
      {- inputs  -} [ FileDependency parserInFile ]
      {- outputs -} ( NE.singleton parserOutFile )

runAlex, runHappy :: ( Verbosity, Maybe ( SymbolicPath CWD ( Dir Pkg ) ), ConfiguredProgram, Location, Location ) -> IO ()
runAlex  = runPp ( Suffix "x" )
runHappy = runPp ( Suffix "y" )

runPp :: Suffix
      -> ( Verbosity, Maybe ( SymbolicPath CWD ( Dir Pkg ) ), ConfiguredProgram, Location, Location )
      -> IO ()
runPp ( Suffix ppExt ) ( verbosity, mbWorkDir, ppProg, inLoc, outLoc ) = do
  -- Alex/Happy expect files with a specific extension,
  -- so we make a new temporary file and copy its contents,
  -- giving the file the expected file extension.
  tempDir <- makeSymbolicPath <$> getTemporaryDirectory
  withTempFileCwd mbWorkDir tempDir ( "." <> ppExt ) $ \ inputPpFile _ -> do
    copyFileVerbose verbosity
      ( interpretSymbolicPath mbWorkDir $ location inLoc )
      ( interpretSymbolicPath mbWorkDir inputPpFile )
    runProgramCwd verbosity mbWorkDir ppProg
      [ getSymbolicPath inputPpFile
      , "-o"
      , getSymbolicPath ( location outLoc )
      ]

The static Dict arguments to mkCommand provide evidence that the arguments passed to the preprocessor can be serialised and deserialised. While syntactically inconvenient for writers of Hooks, this crucially allows external build tools (such as cabal-install or HLS) to run and re-run individual build rules without re-building everything, as explained in the Haskell Foundation Tech Proposal #60.

Rules are allowed to depend on the output of other rules, as well as directly on files (using the Location datatype). If rule B depends on a file generated by rule A, then one must declare A as a rule dependency of B (and not use a file dependency).

To summarise, the general structure is that we use the monadic API to declare a collection of rules (usually, one rule per Haskell module we want to generate, but a rule can generate multiple outputs as well). Each rule stores a reference (via StaticPointers) to a command to run, as well as the (possibly dynamic) arguments to that command. We can think of the pre-build rules as a table of statically known custom pre-processors, together with a collection of invocations of these custom pre-processors with specific arguments.

A word of warning: authors of pre-build rules should use the static keyword at the top-level whenever possible in order to avoid GHC bug #16981. In the example above, this corresponds to defining runAlex and runHappy at the top-level, instead of defining them in-line in the body of myPreBuildRules.
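
To make the static-pointer mechanism more concrete, here is a base-only sketch (independent of the Cabal-hooks Command machinery) of referencing a top-level function via a static pointer and dereferencing it, in the way the hooks dispatch table does:

```haskell
{-# LANGUAGE StaticPointers #-}

import GHC.StaticPtr ( StaticPtr, deRefStaticPtr )

-- A top-level function, defined at the top level as recommended above.
greeting :: String -> String
greeting name = "Hello, " ++ name

-- 'static greeting' adds an entry to this module's static pointer table;
-- a build tool can use such entries as a dispatch table of known actions.
greetingPtr :: StaticPtr ( String -> String )
greetingPtr = static greeting

main :: IO ()
main = putStrLn ( deRefStaticPtr greetingPtr "hooks" )
```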

Custom pre-processors

To illustrate how to write pre-build rules, let’s suppose one wants to declare a custom preprocessor, say myPreProc, which generates Haskell modules from *.hs-mypp files. Any component of the package which requires such pre-processing would declare build-tool-depends: exe:myPreProc.

The pre-build rules can be structured as follows:

  1. Look up the pre-processor in the Cabal ProgramDb (program database).
  2. Define how, given input/output files, we should invoke this preprocessor, e.g. what arguments should we pass to it.
  3. Search for all *.hs-mypp files relevant to the project, monitoring the results of this search (for recompilation checking).
  4. For each file found by the search in (3), register a rule which invokes the processor as in (2).
{-# LANGUAGE StaticPointers #-}
myBuildHooks =
  noBuildHooks
    { preBuildComponentRules =
        Just $ rules ( static () ) myPreBuildRules
    }

myPreBuildRules :: PreBuildComponentInputs -> RulesM ()
myPreBuildRules
  PreBuildComponentInputs
    { buildingWhat   = what
    , localBuildInfo = lbi
    , targetInfo     = TargetInfo { targetComponent = comp, targetCLBI = clbi }
    } = do
  let verbosity = buildingWhatVerbosity what
      progDb = withPrograms lbi
      bi = componentBuildInfo comp
      mbWorkDir = mbWorkDirLBI lbi
  -- 1. Look up our custom pre-processor in the Cabal program database.
  for_ ( lookupProgramByName myPreProcName progDb ) $ \ myPreProc -> do
    -- 2. Define how to invoke our custom preprocessor.
    let myPpCmd :: Location -> Location -> Command MyPpArgs ( IO () )
        myPpCmd inputLoc outputLoc =
          mkCommand ( static Dict ) ( static ppModule )
            ( verbosity, mbWorkDir, myPreProc, inputLoc, outputLoc )

    -- 3. Search for "*.hs-mypp" files to pre-process in the source directories of the package.
    let glob = GlobDirRecursive [ WildCard, Literal "hs-mypp" ]
    myPpFiles <- liftIO $ for ( hsSourceDirs bi ) $ \ srcDir -> do
      let root = interpretSymbolicPath mbWorkDir srcDir
      matches <- runDirFileGlob verbosity Nothing root glob
      return
        [ Location srcDir ( makeRelativePathEx match )
        | match <- globMatches matches
        ]
    -- Monitor existence of file glob to handle new input files getting added.
    --   NB: we don't have to monitor the contents of the files, because the files
    --       are declared as inputs to rules, which means that their contents are
    --       automatically tracked.
    addRuleMonitors [ monitorFileGlobExistence $ RootedGlob FilePathRelative glob ]
      -- NB: monitoring a directory recursive glob isn't currently supported;
      -- but implementing support would be a nice newcomer-friendly task for cabal-install.
      -- See https://github.com/haskell/cabal/issues/10064.

    -- 4. Declare rules, one for each module to be preprocessed, with the
    --    corresponding preprocessor invocation.
    for_ ( concat myPpFiles ) $ \ inputLoc@( Location _ inputRelPath ) -> do
      let outputBaseLoc = autogenComponentModulesDir lbi clbi
          outputLoc =
            Location
              outputBaseLoc
              ( unsafeCoerceSymbolicPath $ replaceExtensionSymbolicPath inputRelPath "hs" )
      registerRule_ ( toShortText $ getSymbolicPath inputRelPath ) $
        staticRule ( myPpCmd inputLoc outputLoc ) [] ( outputLoc NE.:| [] )

type MyPpArgs = ( Verbosity, Maybe ( SymbolicPath CWD ( Dir Pkg ) ), ConfiguredProgram, Location, Location )
  -- NB: this could be a datatype instead, but it would need a 'Binary' instance.

ppModule :: MyPpArgs -> IO ()
ppModule ( verbosity, mbWorkDir, myPreProc, inputLoc, outputLoc ) = do
  let inputPath  = location inputLoc
      outputPath = location outputLoc
  createDirectoryIfMissingVerbose verbosity True $
    interpretSymbolicPath mbWorkDir $ takeDirectorySymbolicPath outputPath
  runProgramCwd verbosity mbWorkDir myPreProc
    [ getSymbolicPath inputPath, getSymbolicPath outputPath ]

This might all be a bit much on first reading, but the key principle is that we are declaring a preprocessor, and then registering one invocation of this preprocessor per *.hs-mypp file:

  • In myPpCmd, the occurrence of static ppModule can be thought of as declaring a new preprocessor,4 with ppModule being the function to run. This is accompanied by the neighbouring static Dict occurrence, which provides a way to serialise and deserialise the arguments passed to preprocessor invocations.

  • We register one rule for each module to pre-process, which means that external build tools can re-run the preprocessor on individual modules when the source *.hs-mypp file changes.

Conclusion

This post has introduced build-type: Hooks for the benefit of package authors who use build-type: Custom. We hope that this introduction will inspire and assist package authors to move away from build-type: Custom in the future.

We encourage package maintainers to explore build-type: Hooks and contribute their feedback on the Cabal issue tracker, helping refine the implementation and expand its adoption across the ecosystem. To assist such explorations, we also recall the existence of the Cabal Hooks overlay, an overlay repository like head.hackage which contains packages that have been patched to use build-type: Hooks instead of build-type: Custom.

In addition to the work described here, we have done extensive work in cabal-install to address technical debt and enable it to make use of the new interface as opposed to going through the Setup CLI. The changes needed in cabal-install and other build tools (such as HLS) will be the subject of a future post.

While there remains technical work needed in cabal-install and HLS to fully realize the potential of build-type: Hooks, it should eventually lead to:

  • decreases in build times,
  • improvements in recompilation checking,
  • more robust HLS support,
  • removal of most limitations of build-type: Custom, such as the lack of ability to use multiple sublibraries,
  • better long-term maintainability of the Cabal project.

Well-Typed are grateful to the Sovereign Tech Fund for funding this work. In order to continue our work on Cabal and the rest of the Haskell tooling ecosystem, we are offering Haskell Ecosystem Support Packages. If your company relies on Haskell, please encourage them to consider purchasing a package!


  1. See, for example, Cabal issue #3600.↩︎

  2. e.g. --package-db=<pkgDb>, --cid=<unitId> and --dependency=<depPkgNm>:<depCompNm>=<depUnitId> arguments↩︎

  3. The cabal-version field of a package description specifies the version of the Cabal specification it expects. As the Cabal specification evolves, so does the set of flags understood by the Setup CLI. This means that, when invoking the Setup script for a package, the build tool needs to be careful to pass arguments consistent with that version; see for instance how cabal-install handles this in Distribution.Client.Setup.filterConfigureFlags.↩︎

  4. In practice, this means adding an entry to the static pointer table.↩︎