The following blog post originally appeared as a guest post on the Skills Matter blog. Well-Typed are regularly teaching both introductory or advanced Haskell courses at Skills Matter, and we will also be running a special course on Haskell’s type system.
Statically typed languages are often seen as a relic of the past – old and clunky. Looking at languages such as C and Java, we’re used to writing down a lot of information in a program that just declares certain variables to be of certain types. And what do we get in return? Not all that much. Yes, granted, some errors are caught at compile time. But the price is high: we’ve cluttered up the code with noisy declarations. Often, code has to be duplicated or written in a more complicated way, just to satisfy the type checker. And then, we still have a significant risk of run-time type errors, because type casting is common-place and can fail at unexpected moments.
So it isn’t a big surprise that dynamically typed languages are now very fashionable. They promise to achieve much more in less time, simply by getting rid of static type checking.
However, I want to argue that we shouldn’t be too keen on giving up the advantages of static types, and instead start using programming languages that get static typing right. Many functional languages such as Scala, F#, OCaml and in particular Haskell are examples of programming languages with strong static type systems that try not to get in the way, but instead guide and help the programmer.
In the rest of this post, I want to look at a few of the reasons why Haskell’s type system is so great. Note that some of the features I’m going to discuss are exclusive to Haskell, but most are not. I’m mainly using Haskell as a vehicle to advertise the virtues of good static type systems.
1. Type inference
Type inference makes the compiler apply common sense to your programs. You no longer have to declare the types of your variables, the compiler looks at how you use them and tries to determine what type they have. If any of the uses are inconsistent, then a type error is reported. This removes a lot of noise from your programs, and lets you focus on what’s important.
Of course, you are still allowed to provide explicit type signatures, and encouraged to do so in places where it makes sense, for example, when specifying the interface of your code.
2. Code reuse
Nothing is more annoying than having to duplicate code. In the ancient days of statically typed programming, you had to write the same function several times if you wanted it to work for several types. These days, most languages have “generics” that allow you to abstract over type parameters. In Haskell, you just write a piece of code that works for several types, and type inference will tell you that it does, by inferring a type that is “polymorphic”. For example, write code that reverses all the elements of a data structure, and type inference will tell you that your code is independent of the type of elements of the data structure, so it’ll just work regardless of what element type you use. If you write code that sorts a data structure, type inference will figure out that all you require to know about the elements is that they admit an ordering.
3. No run-time type information by default
Haskell erases all type information after type checking. You may think that this is mainly a performance issue, but it’s much more than that. The absence of run-time type information means that code that’s polymorphic (i.e., type-agnostic, see above) cannot access certain values. This can be a powerful safety net. For example, just the type signature of a function can tell you that the function could reorder, delete or duplicate elements in a data structure, but not otherwise touch them, modify them or operate on them in any other way. Whereas in the beginning of this post I complained that bad static type systems don’t allow you to do what you want because they’re not powerful enough, here we can deliberately introduce restrictions to save us (as well as colleagues) from accidental mistakes. So polymorphism turns out to be much more than just a way to reduce code duplication.
By the way, Haskell gives you various degrees of selective run-time typing. If you really need it in places, you can explicitly attach run-time type information to values and then make type-based decisions. But you say where and when, making a conscious choice that you gain flexibility at the cost of safety.
4. Introducing new datatypes made easy
In Haskell, it’s a one-liner to define a new datatype with a new name that has the same run-time representation as an existing type, yet is treated as distinct by the type system. (This may sound trivial, but surprisingly many statically typed languages get it wrong.) So for example it’s easy to define lots of different types that are all integers internally: counters, years, quantities, … In Haskell, this simple feature is often used to define safe boundaries: a specific type for URLs, a specific type for SQL queries, a specific type for HTML documents, and so on. Each of these types then comes with specific functions to operate on them. All such operations guarantee that whenever you have a value of this type, it’s well-formed, and whenever you render a value of this type, it’s syntactically correct and properly escaped.
5. Explicit effects
In virtually all programming languages, a function that performs some calculations on a few numbers and another function that performs the same calculations, but additionally sends a million spam emails to addresses all over the world, have exactly the same type, and therefore the same interface. Not so in Haskell. If a function writes to the screen, reads from the disk, sends messages over the network, accesses the system time, or makes use of any other so-called side effect, this is visible in its type. This has two advantages: first, it makes it much easier to rely on other people’s code. If you look at the interface and a function is effect-free, then you for example automatically know that it is also thread-safe. Second, the language facilitates a design where side effects are isolated into relatively small parts of the code base. This may seem difficult to achieve for highly stateful systems, but surprisingly, it usually is not: even interactive systems can usually be described as pure functions reacting to a series of requests with appropriate responses, and a separate driver that does the actual communication. Such separation makes it not only easier to test the program, but also facilitates the evolution of the program such, for example, to adapt it to run in a different environment. Haskell’s type system therefore encourages good design.
6. Types as a guide in program development
If you only ever see types as a source of errors, and therefore as enemies on your path of getting your program accepted, you’re doing them injustice. Types as provided by Haskell are an element of program design. If you give your program precise types and follow systematic design principles, your program almost writes itself. Programming with a strong type system is comparable to playing a puzzle game, where the type system removes many of the wrong choices and helpfully guides you to take the right path. This style of programming is supported by a new language extension called “Typed Holes” where you leave parts of your program unspecified during development, and obtain feedback from the development environment about what type has to go into the hole, and what program fragments you have available locally to construct a value of the desired type. Playing this kind of puzzle game is actually quite fun!
7. Programming on the type level
Haskell’s type system provides many advanced features that you don’t usually have to know about, but that can help you if you want to ensure that some complicated invariants hold in your program. Scarily named concepts such as “higher-ranked polymorphism”, “generalized algebraic datatypes” and “type families” essentially provide you with a way to write programs that compute with types. The possibilities are nearly endless. From playful things such as writing a C-printf-style function where the first argument determines the number of arguments that are expected afterwards as well as their types, you can go on to code that provides useful guarantees such as that mutable references that are available within one thread of control are guaranteed not to be accessed in a completely different context, arrays that can adapt to different internal representations depending on what type of values they contain, working with lists that are guaranteed to be of a specific length, or with trees that are guaranteed to be balanced, or with heterogeneous lists (where each element can be of a different type) in a type-safe way. The goal is always to make illegal inputs impossible to construct. If they’re impossible to construct by the type system, you can isolate sanity tests at the boundary of your code, rather than having to do them over and over again. The good thing is that these features are mostly optional, and often hardly affect the interface of libraries. So as a user, you can benefit from libraries employing such features and having extra safety guarantees internally. As a library writer, you can choose whether you’re happy with the normal level of Haskell type safety (which is already rather a lot), or if you want to spend some extra effort and get even more.
If my overview has tempted you and you now want to learn more about Haskell, you’re welcome follow one of my introductory or advanced Haskell courses that I (together with my colleagues at Well-Typed) regularly teach at Skills Matter. These courses do not just focus on the type system of Haskell (although that’s a significant part). They introduce the entire language in a hands-on way with lots of examples and exercises, as well as providing guidelines on how to write idiomatic Haskell and how to follow good development practices.
If you already know some Haskell and are particularly interested in the advanced type system features mentioned in point 7, we also offer a new one-day course on Haskell’s type system that specifically focuses on how far you can push it.