One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies are providing us with funding to do this work, for which we are extremely grateful.
If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch.
Starting immediately, we will try to provide monthly updates on the work we have been doing. In this first edition, we will cover roughly two months, June and July 2020.
Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report is not aiming to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)!
WinIO
Historically, GHC has performed IO on Windows via the POSIX-compatible interfaces provided by libc. While simple, this implementation strategy has led to numerous bugs due to the impedance mismatch between the POSIX and Windows IO models. WinIO is an overhaul of GHC’s Windows IO subsystem providing GHC with a proper event manager implementation built around Windows’ I/O completion port mechanism.
This brings a number of benefits:
- Unicode support will be more consistent, particularly in console applications (see #18307, #16917, #12869, #10542).
- Server applications will be able to benefit from native asynchronous I/O support.
- Interruption of console applications will work more reliably (see #13516, #13396).
- I/O operations with timeouts (e.g. hWaitForInput) will work consistently across file handle types (see #12873).
- Many more, as summarized in #11394.
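To make the timeout item above concrete, here is a minimal sketch of the kind of code that relies on hWaitForInput from System.IO. The helper name readLineWithTimeout is our own invention for illustration and is not part of base; the sketch says nothing about how WinIO implements the wait internally.

```haskell
import System.IO

-- | Hypothetical helper (not part of base): wait up to the given number of
-- milliseconds for input on a handle, then read one line if any arrived.
-- Note that hWaitForInput raises an EOF error if the handle is at end of
-- file, so callers may want to catch that case.
readLineWithTimeout :: Handle -> Int -> IO (Maybe String)
readLineWithTimeout h timeoutMs = do
  ready <- hWaitForInput h timeoutMs  -- True if input became available in time
  if ready
    then Just <$> hGetLine h
    else pure Nothing
```

For example, readLineWithTimeout stdin 2000 waits up to two seconds for a line on standard input. Code of this shape is what should behave the same whether the handle is a console, a pipe or a file.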
The WinIO effort was started and implemented by Tamar Christina, and the vast majority of the work is due to his tireless efforts. Thanks primarily to our client IOHK, we had the funding to help finish this work and integrate it into the master branch.
These final steps have involved a great deal of debugging, a final code review, merging work in the Cabal, process and haskeline submodules upstream, rebasing and clean-up of the version control history, and merging to master (!1224, !3669).
With this work behind us, we can all look forward to more robust I/O support on Windows in GHC 9.0. However, there remains plenty of work to be done. In particular, taking advantage of the new I/O manager in Haskell’s foundational network library could bring significant performance benefits. This work may require some rearchitecting of network’s implementation. You can follow this work in the corresponding network issue #364.
Performance
We have been working on several performance-related changes.
Tag inference
One of the significant costs of GHC/Haskell’s lazy evaluation model arises from the need for checks of whether a value has been evaluated. In GHC this takes the form of a check of a pointer’s “tag bits” [Marlow2007], resulting in frequent conditional branches when scrutinizing lifted values. However, there are many cases where these checks are in principle redundant. For instance, given the program
data AType = AType { a_field :: !(Int, Char) }
f :: AType -> Int
f x = case a_field x of (n, _) -> n
the compiler should in principle be able to exploit the fact that a_field is strict and of a single-constructor type: the pair it holds is always already evaluated, so scrutinizing it needs no tag check.
Well-Typed GHC contributor Andreas Klebinger has recently been looking into tightening up GHC’s invariants surrounding code generation of strict fields and introducing an STG pass to exploit these invariants to elide tag checks.
This month we rebased this work and began a set of performance measurements in preparation for potentially merging for 9.2.
Strict dictionaries
Another performance improvement opportunity in GHC revolves around the compiler’s treatment of dictionary arguments. GHC has long had a flag, -fdicts-strict, which allows GHC’s demand analysis to assume that ad-hoc overloaded functions place strict demands on their dictionary arguments (see #17758). This allows more aggressive application of the worker-wrapper transformation, thereby reducing allocations and tag checks, as mentioned above.
Last month Andreas carried out a characterisation of the effect of enabling -fdicts-strict by default. These measurements confirmed that there is indeed a good improvement in runtime performance to be had by enabling strict dictionaries. However, they also showed that there can be non-negligible regressions in compiler performance for certain cases (e.g. Generics-heavy code, due to more aggressive simplification). This leads us to believe that this is best enabled with -O2 only.
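To illustrate what a “dictionary argument” is, the following sketch writes out by hand roughly what dictionary-passing translation does to an overloaded function. The names NumDict, intDict and sumSquaresD are hypothetical and exist only for this illustration; GHC’s real Num dictionaries carry more methods and live in Core rather than in source Haskell. The bang pattern stands in for the assumption that -fdicts-strict lets demand analysis make.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.List (foldl')

-- Overloaded source function: the `Num a` constraint becomes an extra,
-- ordinary argument holding the class methods (the "dictionary").
sumSquares :: Num a => [a] -> a
sumSquares = foldl' (\acc x -> acc + x * x) 0

-- Hand-written stand-in for such a dictionary (hypothetical and simplified).
data NumDict a = NumDict
  { plus    :: a -> a -> a
  , times   :: a -> a -> a
  , fromInt :: Integer -> a
  }

intDict :: NumDict Int
intDict = NumDict (+) (*) fromInteger

-- Roughly the shape of sumSquares after dictionary-passing translation.
-- The bang on the dictionary mimics treating it as strictly demanded,
-- which is what gives worker-wrapper the chance to unpack it.
sumSquaresD :: NumDict a -> [a] -> a
sumSquaresD !d = foldl' (\acc x -> plus d acc (times d x x)) (fromInt d 0)
```

Both sumSquares [1, 2, 3 :: Int] and sumSquaresD intDict [1, 2, 3] compute the same sum of squares. To experiment with the flag itself on a single module, one option is an {-# OPTIONS_GHC -O2 -fdicts-strict #-} pragma at the top of the file.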
Improving demand analysis for recursive products
We analyzed the source of a demand analysis issue for recursive products which caused #18304. Based on this analysis, Simon Peyton Jones committed a patch which fixed the issue but has the potential to pessimize nested data types.
We are looking into reducing the potential impact of this change for demand analysis for such types while avoiding the looping behavior noted in #18304.
Performance regression testing
We have been addressing inconsistencies in the performance testsuite driver prompted by a recent spate of seemingly-spurious CI failures due to performance tests. We have identified that these failures are in part due to the behavior of the logic which determines the baselines which serve as the basis for comparison of performance metrics. We are currently working on refactoring the driver to make this logic more predictable.
Enabling large address-space support on Windows
GHC has long used a two-step address-space allocation strategy on Linux. This allows the Haskell heap to be allocated into a contiguous block of memory, greatly improving garbage collection efficiency. In the past, we have been unable to enable this scheme on Windows due to platform limitations. This situation has changed in the past few years. For this reason we re-evaluated enabling large address-space support on Windows and found that not only did it not regress in the ways that it did in the past, but it gave quite significant performance improvements. We will be enabling large address-space support by default in GHC 9.0.
List fusion for elem
There was a buggy rewrite rule that prevented list fusion from working properly in the rather common case when elem is called with a constant list as its second argument (see !2580).
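The pattern in question looks like the sketch below. The function names are ours and purely illustrative. The point is only that the second argument of elem is a literal list, which gives the optimiser the opportunity to avoid building and traversing that list at runtime when fusion works as intended.

```haskell
-- elem called with a constant list as its second argument: the common
-- case described above.
isVowel :: Char -> Bool
isVowel c = c `elem` "aeiou"

-- A typical caller that applies the membership test many times.
countVowels :: String -> Int
countVowels = length . filter isVowel
```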
Improvements to the linear register allocator
The linear register allocator now remembers past assignments, with benchmarks indicating close to 1% improvement in both run and compile time.
Better code layout for loops
GHC is now smarter when producing code layout for loops. For the library that inspired this particular change, some benchmarks improved by 10% in runtime (#18053, !3094), although most code is unaffected.
Front-end
Rebindable syntax
We have been reworking the way GHC implements rebindable syntax (see #17582 for an overview of the existing and new approaches), starting with a patch (!2960) that makes GHC rebind if expressions with the new approach, including some general infrastructure that we will be able to reuse for other constructs. The aforementioned patch has just been merged. The next step is to move over the treatment of rebindable monad operations to the new approach.
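For readers unfamiliar with the feature, here is a minimal sketch of what rebinding if looks like from the user’s side. With the RebindableSyntax extension, if c then t else e is interpreted as ifThenElse c t e for whichever ifThenElse is in scope. The Int-based condition and the describe function are made up for this example, and the sketch shows the user-visible behaviour only, not how either implementation approach works inside GHC.

```haskell
{-# LANGUAGE RebindableSyntax #-}

-- RebindableSyntax turns off the implicit Prelude import, so bring it back.
import Prelude

-- Our own meaning for `if`: treat any non-zero Int as "true".
ifThenElse :: Int -> a -> a -> a
ifThenElse n t e
  | n == 0    = e
  | otherwise = t

-- This `if` is rebound to the ifThenElse defined above, so the condition
-- is an Int rather than a Bool.
describe :: Int -> String
describe n = if n then "non-zero" else "zero"
```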
Structured errors
We have been working on moving GHC’s error representation from textual documents to properly structured ADT values, as described in this wiki page, in order to make the life of tooling authors easier when it comes to extracting information out of errors (expressions, types, suggestions, …) when using the GHC API. The beginning of the plan described in the wiki page has been implemented and recently pushed as !3691.
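As a rough illustration of why structured values help tooling, consider the sketch below. Every type and constructor name in it is hypothetical and chosen for this example only; none of them are GHC’s actual types, and the design on the wiki page differs in detail. The idea is that the textual message becomes just one consumer of the error value, while tools can pattern match on it directly.

```haskell
-- All names here are hypothetical and for illustration only.
data Suggestion
  = DidYouMean String
  | EnableExtension String
  deriving (Show, Eq)

data Diagnostic
  = NotInScope String [Suggestion]  -- offending name, possible fixes
  | TypeMismatch String String      -- expected type, actual type
  deriving (Show, Eq)

-- Rendering to text is one consumer of the structured value.
render :: Diagnostic -> String
render (NotInScope name _)  = "Variable not in scope: " ++ name
render (TypeMismatch ex ac) =
  "Couldn't match expected type " ++ ex ++ " with actual type " ++ ac

-- A tool can extract suggestions without parsing the rendered message.
suggestions :: Diagnostic -> [Suggestion]
suggestions (NotInScope _ ss)  = ss
suggestions (TypeMismatch _ _) = []
```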
Release management
We currently have three releases in-flight:
- GHC 8.8.4, which brings a few important fixes, primarily on Windows, and was released on the 15th of July,
- GHC 8.10.2, which should bring fixes for Windows as well as the new non-moving garbage collector,
- the next major release of GHC, 9.0.1, has been branched and alpha releases will start shortly.
Other open-source work
In addition to our regular work on GHC, we are also performing some open-source work on other Haskell tools, such as cabal-install and Liquid Haskell. We may report on these in future blog posts. We are always interested in improving Haskell and its ecosystem. If you have a project for us, please let us know.