This is the twenty-third edition of our GHC activities report, which describes the work Well-Typed are doing on GHC, Cabal, HLS and other parts of the core Haskell toolchain. The current edition covers roughly the months of March to May 2024. You can find the previous editions collected under the ghc-activities-report tag.

Sponsorship

We are delighted to offer new Haskell Ecosystem Support Packages to provide commercial users with access to Well-Typed’s experts while investing in the Haskell community and its technical ecosystem. If your company is using Haskell, read more about our offer, or get in touch with us today, so we can help you get the most out of the toolchain, and continue our essential maintenance work.

Many thanks to our existing sponsors who make this work possible: Anduril and Juspay. In addition, we are grateful to Mercury for funding specific work on improved performance for developer tools on large codebases, to the Sovereign Tech Fund for funding work on Cabal, and to the HLS Open Collective for funding work on HLS. Of course, Haskell tooling is a large community effort, of which Well-Typed’s contributions are just one part. We are immensely grateful to everyone contributing to the Haskell ecosystem!

Team

The GHC/Cabal/HLS team at Well-Typed currently consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal, Sam Derbyshire, Rodrigo Mesquita, Hannes Siebenhandl and Mikolaj Konarski. In addition, many others within Well-Typed are contributing to GHC more occasionally: Adam Gundry is secretary to the GHC Steering Committee, and this month’s report includes contributions from Duncan Coutts and Finley McIlwaine.

GHC Releases

Ben released GHC 9.10.1 in May. This release includes some significant steps forward.

Zubin released GHC 9.6.5 in April, and published a blog post for the GHC developer blog on GHC release plans. Check out the GHC status page for up-to-date information on releases.

HLS

Thanks to ongoing support from the HLS Open Collective, Zubin released HLS 2.8.0.0 in May, and Well-Typed continue working on keeping HLS maintained and up to date with new GHC releases. In particular, Zubin and Hannes are working towards supporting GHC 9.10 and preparing to release HLS 2.9.0.0.

Cabal

Mikolaj is working as a maintainer of Cabal, supporting users and contributors. He coordinated the release of Cabal 3.12 as part of GHC 9.10 and assisted in releasing it as a standalone library, updating, documenting and streamlining the release process. He’s taking part in the release effort for version 3.12 of the cabal-install build tool.

Matthew, Rodrigo and Sam have been working to address longstanding architectural and maintenance issues in the Cabal library and the cabal-install build tool, thanks to support from the Sovereign Tech Fund. See our introductory blog post and the previous activities report for more details. This has included a wide range of bug fixes and code refactorings, as well as the development of specific new features.

A new home for GHC’s internals

Ben has been working for some time on creating the ghc-internal package to clearly distinguish user-facing APIs (in base) from compiler implementation details (in ghc-internal). This saw its first public release alongside GHC 9.10.1.

As far as possible, we want to make implementation details such as the existence of the ghc-internal package invisible to end users, but perhaps inevitably, the split exposed various issues where this was not the case, particularly in Haddock. In addition, compiler plugins that mistakenly hard-code references to identifiers in base may break due to internal identifiers moving to ghc-internal. Ben fixed several ghc-typelits-* plugins to resolve identifier locations correctly, thereby avoiding this problem (#24680).

More work is needed to gradually disentangle implementation details from user-facing APIs, and deprecate the parts of base that are not intended for direct use by users, in collaboration with the Core Libraries Committee.

Specialisation

Finley published a two-part series of blog posts on Choreographing a dance with the GHC specializer:

  • Part 1 acts as a reference manual documenting exactly how, why, and when specialization works in GHC.

  • Part 2 introduces new tools and techniques we’ve developed to help make more precise, evidence-based decisions regarding the specialization of our programs.

Andreas added a new -fexpose-overloaded-unfoldings flag to GHC (!9940), allowing specialisations to fire without the full overhead of -fexpose-all-unfoldings.
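
To make the trade-off concrete, here is a minimal sketch of the manual alternative that exists today: annotating an overloaded function so that its unfolding is exposed and a specialised copy can be generated. The function names are hypothetical and chosen only for illustration. As described above, the new flag is intended to make unfoldings of overloaded functions available without this kind of per-function annotation; see the GHC user guide for its exact behaviour.

```haskell
import Data.List (foldl')

-- Hypothetical example function, not taken from GHC or any library.
-- It is overloaded: without specialisation, every call passes a Num
-- dictionary at runtime and each (+) and (*) is an indirect call.
sumSquares :: Num a => [a] -> a
sumSquares = foldl' (\acc x -> acc + x * x) 0

-- INLINABLE keeps the unfolding in the interface file, so importing
-- modules can specialise sumSquares at the types they use.
{-# INLINABLE sumSquares #-}

-- SPECIALIZE asks for a dedicated Int copy in this module.
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}

-- A monomorphic call site that can use the specialised copy.
sumSquaresInt :: [Int] -> Int
sumSquaresInt = sumSquares
```

Whether a specialisation actually fires depends on optimisation settings, so it is worth checking the generated Core rather than assuming.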

Haddock merged into GHC tree!

A longstanding pain point for GHC development has been that Haddock is closely coupled to GHC, but was being developed in its own repository and included via a git submodule, which complicated making changes that span both GHC and Haddock. Ben recently assisted Hécate, the Haddock maintainer, in merging the submodule into the main GHC tree (#24834, !11058). This allows for subsequent simplifications to Haddock (!12743).

Profiled dynamic way

Matthew has been working on adding support for building dynamic libraries with profiling in GHC and Cabal (#15394, !12595, Cabal MR 9900).

Deterministic object code

Thanks to a lot of past work by dedicated GHC contributors, GHC produces deterministic interface files (#4012), so compiling the same source code with the same compiler will always produce the same ABI. However, GHC does not yet produce deterministic object files (#12935), so compiling identical source code may produce object files that are not bit-for-bit identical (in particular this arises when compiling multiple modules concurrently).

This is an issue for build systems that rely on hashing compilation outputs to improve performance or ensure reproducibility. Rodrigo has started work on a new effort towards deterministic object code, and has made some promising initial progress.
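
The following sketch illustrates why bit-for-bit determinism matters to such build systems. It is not how any particular build tool works: the hash function (a plain FNV-1a) and the names fnv1a64 and needsRebuild are assumptions chosen to keep the example self-contained.

```haskell
import Data.Bits (xor)
import Data.List (foldl')
import Data.Word (Word64, Word8)

-- 64-bit FNV-1a over the bytes of an artifact. Real build systems
-- typically use a cryptographic hash; this stands in for one here.
fnv1a64 :: [Word8] -> Word64
fnv1a64 = foldl' step 14695981039346656037
  where
    step h b = (h `xor` fromIntegral b) * 1099511628211

-- Hypothetical cache check: downstream steps are skipped only when
-- the new artifact hashes the same as the previously recorded one.
needsRebuild :: [Word8] -> [Word8] -> Bool
needsRebuild previous current = fnv1a64 previous /= fnv1a64 current
```

If two compilations of identical source produce object files that differ in even one byte, a check like needsRebuild reports a change and everything downstream is rebuilt unnecessarily.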

Cost centre profiling

Andreas modified GHC to avoid adding cost centres to static data (#24103, !12498), resulting in much smaller code sizes with -fprof-late. For a profiled build of GHC the size of build artifacts goes down by about 25% in total and we expect similar benefits for other projects.

This is a step towards making it feasible to distribute libraries compiled for profiling with late cost centres included (#21732, !10930), which will improve the profiling and debugging experience.

Segfaults / backend soundness

  • Andreas investigated and fixed a segfault due to a tag inference bug (#24870).
  • Andreas fixed a serious but thankfully hard to trigger soundness bug due to an unsound pattern match optimization (#24507, !12256).
  • Andreas fixed an issue with the FMA primop generating a wrong result on x86_64 (#24496).
  • Andreas investigated an Arm codegen issue with jumps being out of range (#24648) when linking large projects on Mac. This turned out to be a linker bug/deficiency on newer Mac linkers.

process library

Ben released two new versions of the core process library, to address several issues:

  • HSEC-2024-0003, a security advisory relating to potential command injection via argument lists on Windows.

  • The introduction of a new API System.Process.CommunicationHandle for platform-independent interprocess communication, the need for which came out of our work on Cabal.

  • Various other bug fixes and API improvements.

A new I/O manager based on io_uring

Duncan is gradually working on a long-term project to introduce a new RTS I/O manager based on the io_uring Linux kernel system call interface. This will allow asynchronous I/O for block devices such as SSDs to make significantly greater use of parallelism, improving performance for applications that make heavy use of disk I/O.

As a preparatory step, Duncan has been refactoring and improving the RTS code for I/O managers (!9676) with review support from Ben and other GHC developers.

Compiler performance and memory usage

Hannes, Zubin and Matthew have been working on reducing memory usage of GHC, GHCi and HLS and improving their performance on very large codebases, thanks to support from Mercury. This includes:

  • Using more efficient representations of interface files (!12263, !12346, !12371). This is particularly helpful when using the -fwrite-if-simplified-core option to include Core definitions in interface files for better performance. A new -fwrite-if-compression option makes it possible to select different space/time trade-offs.
  • Choosing appropriate memory-efficient data structures (!12140, !12142, !12170).
  • Making sure that -fwrite-if-simplified-core causes recompilation when appropriate (!12484).
  • Many other memory usage improvements (!12345, !12347, !12348, !12582, !12442, !12222, !12200, !12070).
  • Using a more efficient algorithm for checkHomeUnitsClosed (!12162).

Rodrigo significantly improved the performance of the dynamic linker on macOS (#23415), finishing off and landing a patch by Alexis King to reduce dependency-loading time by looking up symbols only in the relevant dynamic libraries (!12264). GHCi load time for a client project affected by this issue went down from 35 seconds to 2 seconds.

Runtime performance

  • Andreas made the magic inline function work in the presence of casts, so that it now looks through coerce to find a function it can inline (#24808).

  • Andreas fixed an issue where the bottomness of an unreachable branch was affecting performance (#24806).
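
As an illustration of the first item, the sketch below shows one plausible shape of a call site where inline is applied to a coerced function. The newtype Age and the functions double and doubleAge are hypothetical; only inline (from GHC.Exts) and coerce (from Data.Coerce) are real library functions. Semantically inline is the identity, so the example behaves the same whether or not the inlining happens.

```haskell
import Data.Coerce (coerce)
import GHC.Exts (inline)

-- Hypothetical newtype wrapper around Int.
newtype Age = Age Int
  deriving (Eq, Show)

-- INLINABLE keeps the unfolding available for inline to use.
{-# INLINABLE double #-}
double :: Int -> Int
double n = n * 2

-- coerce reuses double at the newtype; inline asks GHC to inline
-- the function found underneath the cast at this call site.
doubleAge :: Age -> Age
doubleAge = inline (coerce double)
```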

Foreign function interface

Software transactional memory

Andreas completed his deep dive into STM and identified various improvements, including making starvation less likely in some cases (#24142, #24446, !12194).
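
For readers less familiar with how starvation can arise in STM, here is a minimal sketch. It does not reproduce the specific issues above. The names bump and sumAll are hypothetical, and the primitives are imported from GHC.Conc in base to keep the example dependency-free; applications would normally use Control.Concurrent.STM from the stm package.

```haskell
import GHC.Conc (STM, TVar, atomically, newTVarIO, readTVar, writeTVar)

-- A short transaction: touches one TVar and commits quickly.
bump :: TVar Int -> STM ()
bump v = do
  n <- readTVar v
  writeTVar v (n + 1)

-- A long transaction: reads every TVar. If another thread commits a
-- write to any of them before this one commits, it is re-run from
-- the start, so many concurrent bumps can keep delaying it.
sumAll :: [TVar Int] -> STM Int
sumAll = fmap sum . mapM readTVar
```

Run from a single thread, both transactions commit on the first attempt; the risk of starvation only appears when many threads run bump while another repeatedly attempts sumAll.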

Continuous integration and testing

While producing alpha releases for 9.10, it became clear that more validation was needed to detect problems earlier.

  • Matthew improved the monitoring setup with a Grafana nightly pipeline dashboard with the ability to send alerts on nightly job failures.

  • Hannes picked up earlier work by Ben to collect CI performance metrics via perf (!7414), which will allow more precise performance analysis.

  • Matthew made various other improvements to the CI pipelines, including upgrading the runners to GHC 9.6, and extending GHC’s CI infrastructure for testing installation with ghcup to test a variety of explicit linker configurations (ghcup-ci MR 14).