This seventh edition of our GHC activities report marks one year since we started sending out these regular updates on the work on GHC and related projects that we are doing at Well-Typed. The current edition covers roughly the months of June and July 2021.

You can find the previous editions collected under the ghc-activities-report tag.

A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK, Facebook, and GitHub via the Haskell Foundation, are providing us with funding to do this work. We are also working with Hasura on better debugging tools. We are very grateful on behalf of the whole Haskell community for the support these companies provide.

If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch. If you are interested in working with us, we recently announced a hiring round.

Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)!

Team

Currently, Ben Gamari, Andreas Klebinger, Matthew Pickering and Zubin Duggal are working primarily on GHC-related tasks. In addition, Alfredo Di Napoli has been doing some work on GHC in the last two months, alongside other projects he is working on. Many others within Well-Typed, including Adam Gundry, Alp Mestanogullari, Douglas Wilson and Oleg Grenrus, contribute to GHC more occasionally.

Release management

  • Ben has been handling backports and release planning for the 9.2.1 and 9.0.2 releases.

  • Matt worked on the structure of the bindists produced by Hadrian, so that they are now much more like the ones produced by make. This also fixes some other issues with the Windows packaging from the 9.0.1 release. (!6133)

  • Matt worked on fixing some issues with the 8.10.5 Darwin packaging which caused several tickets to be reported about hangs (#19950, #19968, #20004, !5992, !6003).

  • Zubin fixed some bugs with LLVM version detection in the HEAD and 8.10.5 releases (#19973, #19828, #19959).

  • Zubin has a patch in progress (!5965) that will allow the GHC library to be re-installed, so that GHC API clients (e.g. Haskell Language Server) are not restricted to the boot library versions that shipped with the compiler. This paves the way for smaller binary distributions of GHC, since users would be able to recompile the GHC library to have access to things like profiling and debug information, instead of having to ship all these configurations in the binary distribution.

Compiler error messages

  • Alfredo continued his work on GHC’s new diagnostic API (#18516, #19905). After completing the foundational work, he started to port existing GHC errors and warnings to the new API as well as fine-tuning the design (!6087, !6249, !6165, !6129, !5924, !5872, !5719). He also created a few newcomer-friendly tickets to help with the conversion work: these tickets have a lot of context to guide first-time GHC contributors towards their first merged MR. See for example #20118 and #20119.

  • Alfredo is also finalising an introductory blog post to the new GHC diagnostic API which will be published in the next few weeks.

Frontend

  • Matt has started preliminary work on fixing a long-standing bug where mixing optimisation levels would lead to optimisations not firing in some cases (#12847, #13002, #20021, #8635, #9370). With the patch (!6080), the pragmas are always read from interface files, but we are careful not to consult them when optimisation is turned off. It turns out that using some information from interface files improves compiler performance because simpler code is produced.

  • Matt has continued on his crusade to refactor and modernise GHC’s driver code (!5987, !6178). This time the code that drives --make has been in his sights. Amongst other things, the patch tries to separate the specification of the build graph from its execution, so that it is possible to describe different execution strategies. The patch also simplifies (and specifies) how module cycles are compiled, which has long been a pain point for people modifying this area.

  • Matt fixed the -Wunused-packages warning to work correctly with reexported packages (!6130).

  • Zubin fixed a bug affecting Backpack users that resulted in a compiler panic instead of a type error in certain cases (#19244).

  • Ben introduced driver support for Clang’s --target flag, improving robustness of builds in multi-architecture environments (e.g. Darwin with Rosetta, #20162).
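
The separation between specifying a build graph and executing it, mentioned above for the --make refactoring, can be illustrated with a small sketch. This is not GHC's driver code: the names ModuleGraph, BuildStep, buildPlan, runSequentially and describePlan are hypothetical, and the sketch simplifies by treating any import cycle as a single unit, whereas GHC only permits cycles that are broken by hs-boot files. It uses stronglyConnComp from the containers package to derive a plan in which dependencies come before their dependents.

```haskell
import Data.Graph (SCC (..), stronglyConnComp)
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified stand-ins for the driver's data types.
type ModuleName = String

-- The "specification": each module mapped to the modules it imports.
type ModuleGraph = Map.Map ModuleName [ModuleName]

-- One unit of work: a single module, or a group of mutually
-- dependent modules that has to be handled together.
data BuildStep
  = CompileModule ModuleName
  | CompileCycle [ModuleName]
  deriving (Eq, Show)

-- Turn the specification into an ordered plan. stronglyConnComp
-- returns components in reverse topological order, so each step
-- appears after the steps it depends on.
buildPlan :: ModuleGraph -> [BuildStep]
buildPlan graph = map toStep (stronglyConnComp nodes)
  where
    nodes = [(m, m, deps) | (m, deps) <- Map.toList graph]
    toStep (AcyclicSCC m) = CompileModule m
    toStep (CyclicSCC ms) = CompileCycle ms

-- One execution strategy: run the steps one after another with a
-- caller-supplied action.
runSequentially :: Monad m => (BuildStep -> m ()) -> [BuildStep] -> m ()
runSequentially = mapM_

-- Another "strategy" that compiles nothing and only describes the plan.
describePlan :: [BuildStep] -> [String]
describePlan = map describe
  where
    describe (CompileModule m) = "compile " ++ m
    describe (CompileCycle ms) = "compile cycle " ++ unwords ms
```

Because buildPlan is a pure function of the graph, a parallel or dry-run strategy can consume the same plan without touching the code that computes it, for example runSequentially print (buildPlan graph) for a graph of your own.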

Haddock and documentation

  • Zubin rebased and improved the long-pending hi Haddock work, which should allow Haddock to generate documentation using only GHC interface (.hi) files (!6224). This greatly simplifies Haddock’s implementation and allows it to skip parsing, renaming and type-checking files if the appropriate information already exists in the interface files, speeding it up greatly in such cases. It also reduces Haddock’s peak memory consumption. Identifiers in Haddock comments will be renamed by GHC itself, and the results serialized into .hi files for tooling to make use of. A number of Haddock bugs were fixed along the way (#20034, haddock 30, haddock 665, haddock 921).

GHCi and developer experience

  • Zubin improved GHCi completion to better support Unicode characters and operators, fixing a bug in the 9.2 pre-release, which erased the entire line the user typed if completion was triggered on an operator name (#20101, !6160).

  • Matt has fixed a number of 9.2 regressions involving GHCi (!6032, !6090).

Profiling and debugging

  • Matt took ghc-debug for a test on a puzzling profile presented by a user (#20065) which seemed to have a large discrepancy between live bytes and the information reported in the profile. It turned out that the application had a severely fragmented heap, which was easy to diagnose and observe using ghc-debug.

  • Andreas is still working on ways to make perf and similar tools work well on Haskell code. He wrote a blog post with more details for the curious.

Compiler performance

  • Matt squashed a leak in the simplifier which should reduce maximum residency for all programs, and in particular reduced maximum residency in the test from 2GB to 1.3GB (!6202).
  • Matt found a very subtle space leak caused by a reference being retained on the stack longer than necessary (!6185).
  • Andreas improved register allocation performance under high register pressure (!6209).

Runtime performance

Compiler correctness

  • Ben wrote a blog post motivating the keepAlive# operation introduced in GHC 9.0, as well as several of the considerations relevant to its design.

  • Ben refactored GHC’s “adjustor” mechanism used by some foreign calls, fixing a bug manifesting with some newer libffi versions (#20051) as well as a few nearby libffi-related bugs (#19869).

  • Ben collected and characterised a number of issues manifesting on AArch64/Darwin which were ultimately found to be due to the rather peculiar ABI of that platform (#20079). He performed an audit (#20085) of Hackage packages looking for similar issues in common packages and wrote a blog post providing advice to users for writing portable, robust foreign library bindings.

  • Ben carried out a thorough refactoring of the internals of the process library, fixing a subtle correctness bug manifesting under Darwin (#19994) while reducing process spawn cost in many cases.

  • Matt started looking into an old static pointers correctness issue (#16981) which a few users had commented on recently. We know what the problem is but it seems that to fix the ticket robustly a more invasive change will be needed to how static pointers are compiled.

Runtime system

  • Andreas enabled the pthread-based RTS ticker implementation by default for the single-threaded RTS (!6158), improving compatibility with foreign libraries using signal-based alarms.

  • Ben diagnosed and fixed a bug in the GHC runtime’s threading abstraction leading to severe GC performance regressions in 9.2 and master (#20144).

  • Ben diagnosed and fixed a subtle bug in the non-moving garbage collector due to an inconsistency in size units in the array write barrier implementation (#19715).

CI and infrastructure

  • Matt has added support to head.hackage to run a test-suite of programs. This replaces tests in GHC’s testsuite which depended on external packages and hence were never executed during normal test runs. Now it will be straightforward to add tests with more complicated dependencies.

  • Matt worked on GHC’s performance dashboard infrastructure, using the data collected during head.hackage and validation builds to monitor GHC’s compilation performance.

  • Ben migrated GHC build artifacts and Docker images to local storage to improve service availability.

  • Ben refactored the GHC CI infrastructure on Darwin to make it uniform with other platforms and to reduce the potential for nix paths leaking into binary distributions (#20131).