This is the twenty-second edition of our GHC activities report, which describes the work on GHC, Cabal and related projects that we are doing at Well-Typed. The current edition covers roughly the months of December 2023 to February 2024. You can find the previous editions collected under the ghc-activities-report tag.
Many thanks to our sponsors who make this work possible: Anduril, Hasura and Juspay. In addition, we are grateful to Mercury for funding specific work on improved performance for developer tools on large codebases, and to the Sovereign Tech Fund for funding work on Cabal.
However, we need more sponsorship to sustain the team! If your company might be able to contribute funding to sustain this work, please read about how you can help or get in touch.
Of course, Haskell tooling is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC!
Team
The GHC team at Well-Typed currently consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal, Sam Derbyshire and Rodrigo Mesquita, with Hannes Siebenhandl joining the team in January and Finley McIlwaine moving to another client project. In addition, many others within Well-Typed are contributing to GHC more occasionally.
Releases
Zubin released GHC 9.6.4 in January and GHC 9.8.2 in February. We are now working towards the release of GHC 9.10 later in the year. Check out the GHC status page for more information on release plans.
Eras profiling
Matthew and Zubin recently implemented a new profiling mode, eras profiling, that can give insight into when particular objects are allocated. This can be a great boon in diagnosing memory leaks in long-running programs.
Check out our blog post introducing eras profiling for more information about this new feature, and an exploration of how we used this new profiling mode to diagnose a memory leak in GHCi. Matthew also used eras profiling to diagnose a space leak in GHC’s simplifier (!11914).
The combination of eras profiling and ghc-debug works particularly well for analysing memory leaks, so Zubin has been making various improvements to ghc-debug (MR 32), including improving how it handles profiled executables (MR 35, MR 36).
A new home for GHC’s internals
GHC’s base library has long served a dual purpose: on the one hand it is the user-facing standard library interface, but at the same time it contains many internal details used to implement the standard library. This dual purpose led to problems for implementors and users alike, as internal interfaces are freely interspersed with long-stable interfaces intended for general consumption. Even worse, the documentation of base often provided little guidance to users regarding which interfaces fell into which category.
Earlier this year, the Core Libraries Committee and GHC Team agreed a path to improve this situation by splitting base into three libraries: base, ghc-internal, and ghc-experimental. Our hope is that this approach will allow us to solve several problems at once:
base gives users a clearly-demarcated set of stable interfaces, overseen by the Core Libraries Committee.
ghc-experimental gives developers of new language and library features a dedicated place to iterate on their designs, while still allowing use by users willing to accept a slightly lower degree of stability.
ghc-internal provides a home for internal implementation details that are not intended for consumption by users, and may change from release to release.
Ben has been working on implementing this split by separating out definitions that belong in the ghc-internal package (!11400). This split has led to a number of improvements across the ecosystem, ranging from Haddock improvements (see Haddock issues 1629, 1630) to compiler bug-fixes (#24436) and implementation cleanups (#24472).
Exception backtraces
Ben has been working to land his long-running and long-awaited Exception Backtrace Proposal (!8869) following extensive discussions with the Core Libraries Committee. This is expected to form part of GHC 9.10 and will be a major step towards making exception diagnosis easier for users.
GHC Steering Committee and GHC2024
Adam has now taken on the role of Secretary to the GHC Steering Committee, following Joachim Breitner stepping down after many years of dedicated service in the role. His first major task as secretary has been seeking new volunteers to serve on the committee. If you would be interested, please read more and get in touch.
The committee has updated the collection of recommended language extensions by introducing GHC2024. GHC 9.10 will ship with GHC2024 available (!12084), but it is unclear when it will become the default (see ghc-proposals MR 632).
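As a rough illustration of what opting in might look like, the sketch below enables GHC2024 with a LANGUAGE pragma and uses \case syntax, which relies on LambdaCase, one of the extensions that GHC2024 adds on top of GHC2021. The function describe is made up for this example, and the code assumes GHC 9.10 or later.

```haskell
{-# LANGUAGE GHC2024 #-}
module Main where

-- Hypothetical helper, used only to exercise \case syntax.
-- LambdaCase is enabled here by the GHC2024 pragma above,
-- not by a separate LANGUAGE LambdaCase pragma.
describe :: Maybe Int -> String
describe = \case
  Nothing -> "nothing"
  Just n  -> "just " ++ show n

main :: IO ()
main = do
  putStrLn (describe Nothing)
  putStrLn (describe (Just 3))
```

With an older compiler the pragma is rejected as an unknown language edition, so a project supporting several GHC versions would still need to list the individual extensions.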
STM correctness and performance
Andreas has been diagnosing progress and performance issues with STM prompted by a user reporting STM starvation problems (#24142). In particular:
STM transaction performance scales badly with the number of TVars involved (#24410), because the current implementation uses a linked list to keep track of all TVars used by a transaction. Ben explored one approach for improving this situation, using a hashmap for these lookups (!12030).
Transactions with a large number of TVars may perform badly (#24427) due to a check performed by the RTS each time Haskell threads return to the scheduler. This check identifies potentially non-terminating STM transactions by validating the transaction’s view of the STM memory against the memory’s current state. While very useful, this check is somewhat costly to perform, and under the current implementation can also lead to false negatives when multiple validations happen in parallel. It is likely that the best solution for this issue is to perform validations less frequently, especially on long-running transactions.
In pathological cases, two transactions run in parallel may be unable to make progress (#24446), even if all transactions are read-only. This should be solvable with a rework of how TVars are locked during validation.
Unfortunately, fixing these issues will require further work.
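To make the first of these issues more concrete, here is a minimal sketch of the kind of transaction affected: a single atomically block that reads a large number of TVars. The function sumTVars and the count of 1000 are made up for illustration, and this is not the reproducer from the tickets. Since the transaction's record of accessed TVars is a linked list, each additional readTVar plausibly has to search the entries recorded so far, so the total cost can grow much faster than linearly in the number of TVars.

```haskell
import Control.Concurrent.STM (STM, TVar, atomically, newTVarIO, readTVar)

-- Hypothetical helper: read every TVar in the list inside one
-- transaction and sum the values. Each TVar read is recorded by
-- the transaction, so a longer list means more to track.
sumTVars :: [TVar Int] -> STM Int
sumTVars tvars = sum <$> mapM readTVar tvars

main :: IO ()
main = do
  -- 1000 is an arbitrary size; increase it to observe the scaling.
  tvars <- mapM newTVarIO [1 .. 1000]
  total <- atomically (sumTVars tvars)
  print total
```

Timing this program for increasing list sizes is one simple way to see how a given GHC version behaves.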
Specialisation and late plugins
Finley has been exploring techniques to make it easier to diagnose issues with specialisation in large applications, such as poor runtime performance due to overloaded calls not being specialised. One workaround for such problems is exposing all unfoldings and using aggressive specialisation, but this tends to lead to poor compile-time performance instead.
Motivated by these investigations, he added “late plugins”, which are plugins that run at the very end of the Core pipeline, after the addition of late cost centres (!11765). This allows plugins to analyse and modify the Core that is compiled down to STG, without the changes ending up in interface files.
Cabal
Matthew, Rodrigo and Sam have been working to address longstanding architectural and maintenance issues in the Cabal library and the cabal-install build tool. This work is being supported by the Sovereign Tech Fund as discussed in our previous blog post.
Some of the changes have included:
Designing and implementing a new build-type: Hooks feature to provide a path towards deprecating build-type: Custom. Based on community feedback, Sam iterated on the design, with a particular focus on pre-build rules, arriving at a design inspired by Cloud Haskell, using static pointers. See the detailed HF Tech Proposal for an in-depth explanation of the design and its benefits. The implementation is now being prepared for review (PR 9551).
Disentangling implicit global state from the Cabal library, allowing it to take a working directory as an argument instead of using the working directory of the current process (PR 9718). This is intended to allow directly calling the Cabal library to build packages in a concurrent setting.
Working on a design and prototype implementation for private dependencies (issue 4035), allowing packages to express the fact that they do not expose any types from a dependency in their API. This gives greater flexibility to construct build plans, potentially making library version upgrades easier, and allows tests and benchmarks to compare different versions of the same library.
Making the testsuite more robust, including refactoring it to run tests in a separate temporary directory so they are not influenced by the external configuration of the user’s system (PR 9717).
Allowing per-component builds with Haskell Program Coverage (HPC) information (PR 9464).
Refactoring to eliminate long-standing code duplication that was a regular source of bugs in the logic for building components (PR 9602) and in glob support (PR 9673).
Fixing several longstanding bugs with the install command often ignoring CLI flags (PR 9697).
Robustly handling the same GHC version having been compiled from source multiple times (PR 9618), as the GHC version number is not enough to ensure ABI-compatibility.
Many more bug fixes and refactorings to improve maintainability and robustness of the codebase (e.g. PR 9524, PR 9554).
GHC bug fixes
Ben investigated memory-ordering issues using ThreadSanitizer and fixed numerous data races (!9372, !11795, !11768).
Ben fixed a thread-safety issue due to GHC’s use of the C strerror utility (#24344).
Sam fixed a 9.8 regression in shadowing error messages involving record fields with no field selectors (!11981).
Hannes fixed a 9.8 regression in how Haddock resolves qualified references (!11920).
Zubin fixed a regression in which GHC reported a poor error message in the presence of module cycles including hs-boot files (!11718, !11792).
Zubin fixed cross-module breakpoints using incorrect cost centres (!11892).
Sam and Andreas fixed a variety of bugs in the handling of fused-multiply-add primops that were added in GHC 9.8.1 (!11587, !11893, !11902, !11987).
Ben fixed a subtle bug in the implementation of unique generation on 32-bit platforms (!11802).
Andreas fixed a bug in the C foreign-function interface that was introduced by using sub-word-sized arguments (!11989).
Zubin set -DPROFILING when compiling C++ sources with profiling (!11871).
Matthew fixed an off-by-one error when handling info-table provenance entries (!11873).
Zubin fixed a bug with ghcup-metadata generation (!11791).
Zubin updated the users’ guide to take into account the unrestricted overloaded labels GHC proposal, which landed in GHC 9.6 (!11774).
Hannes fixed a bug arising from GHC being installed at a filepath that includes spaces on Windows (!11938).
Build system, CI and distribution improvements
Ben carried out a number of submodule bumps in preparation for the GHC 9.10 release.
Rodrigo allowed the configure script to use autoconf 2.72 (!11942).
Matthew fixed a bug in the configuration of hsc2hs when building GHC, which was the source of linker errors (#24050, !11384).
Matthew updated the CI images, with a particular focus on improving the testing of the LLVM backend on CI (#24369, !11976).
Matthew ensured that documentation is built in more configurations in CI (e.g. on alpine, rocky8, Windows, Darwin) (!12134).
Ben adapted GHC to LLVM’s new pass manager CLI (!8999).