What You Need to Know About Lazy Evaluation in Nix
📆 Mon Jul 15 2024 by Jacek Galowicz
(10 min. reading time)

This article contains all the knowledge about laziness you need to understand 90% of more complicated Nix expressions. It also prepares you to understand Nix Overlays, which is the powerful mechanism behind Nix(OS)'s capability to extend the package collection and override dependencies of packages.

What Does Nix Have to Do With Laziness?

"Nix" is the name for both the C++ implementation of the Nix package management tool and the language in which all the packages are defined. Newcomers often find the Nix language confusing: It is declarative, pure, functional, dynamically typed, and, on top of that, lazy.

These attributes are the result of intentional design decisions. As always, design decisions bring tradeoffs between advantages and disadvantages. Most Nix beginners come from a non-functional programming background (or had to use just enough Haskell at university to dislike functional programming), so they are regularly surprised by the behavior of scripts and programs written in Nix. The disadvantages of a technological tradeoff become apparent all by themselves, while the advantages must be actively explained.

While writing pure, functional Nix programs without loops (and other constructs that imperative programmers are used to) requires some getting used to, the feedback mechanism when you do it wrong is very quick: Trying to mutate variables or to write sequences of statements is usually already caught at the syntax level. But lazy evaluation is something that only strikes at run time.

🤷 Lazy Evaluation, What Does It Mean?

Let's shed some light on lazy evaluation in the Nix REPL:

$ nix repl
Nix 2.23.1
Type :? for help.
nix-repl> x = 1 / 0

nix-repl> x
error:
       … while calling the 'div' builtin
         at «string»:1:4:
            1|  1 / 0
             |    ^

       error: division by zero

What do we see here? We create a new variable x with the value of 1 divided by 0, which cannot result in a useful value. However, this does not lead to an error, as we might expect. The error only hits us with the second command, when we reference x by printing it.

This looks different (and as we would expect) in a Python shell:

$ nix shell nixpkgs#python3
$ python3
Python 3.12.4 (main, Jun  6 2024, 18:26:44) [Clang 16.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> x = 1 / 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero

This means that the value of a variable is not computed when it is created, but when it is referenced. From a technical viewpoint, each new variable gets assigned a parameterless function that calculates its value. When the variable's value is needed, this parameterless function is executed. Such a parameterless function is also called a thunk.

This is how that would look in Python:

>>> x = lambda: 1 / 0
>>> x()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
ZeroDivisionError: division by zero

In purely functional, lazy languages, roughly the same thing happens; it is simply implicit.
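In Nix, we can make the forcing explicit with builtins.seq, which evaluates its first argument before returning the second. Here is a quick sketch in the REPL (error trace abbreviated):

nix-repl> x = 1 / 0

nix-repl> builtins.seq x "x was forced"
error: division by zero

As long as nothing ever forces x, the broken thunk just sits there without doing any harm.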

Let's look at a more complex example to illustrate this:

let
    # list with two impossible values
    list = [ (1 / 0) (2 / 0) ];
    # variable with a long-running computation
    # (a naive Fibonacci as a stand-in for something very expensive)
    fib = n: if n < 2 then n else fib (n - 1) + fib (n - 2);
    x = fib 42;
in
    if builtins.length list == 2
        then "it's all good!"
        else x

This short program snippet creates a variable list containing a list with two impossible values. The variable x carries another value that is not impossible to compute, but let's assume it takes very long, several hours even. After the simpler example from above, we know that these values are not computed just yet, because we didn't reference them anywhere in the first part of the let ... in ... block. In the second part, we ask for the list's length and then decide whether to return a string or the value of x. Think for a second about what result we will get, and why, before you continue reading.

The result is "it's all good!", because the length of the list is 2, as the program expects. Computing the length of the list does not require evaluating the list items' values! 🤯 The Nix evaluator looks at the list like this: list = [ <thunk> <thunk> ]. As long as we don't access the individual list items, we don't observe an error. Evaluating only the outermost layer of a data structure like this is also called reducing it to Weak Head Normal Form (WHNF). The snippet also evaluates in a very short time, because the computationally very expensive x is never referenced!
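We can observe the same thing directly in the REPL: asking for the length succeeds, while accessing an element blows up. A quick sketch (error trace abbreviated):

nix-repl> list = [ (1 / 0) (2 / 0) ]

nix-repl> builtins.length list
2

nix-repl> builtins.elemAt list 0
error: division by zero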

🤔 What is This Laziness Abstraction Good For?

In their paper "NixOS: A Purely Functional Linux Distribution", Eelco Dolstra, Andreas Löh, and Nicolas Pierron explain the choice of the laziness abstraction, among other things:

The choice for lazy evaluation allows us to write Nix expressions in a convenient and elegant style: Packages are described by Nix expressions and these Nix expressions can freely be passed around in a Nix program – as long as we do not access the contents of the package, no evaluation and thus no build will occur. […] At the call site of a function, we can supply all potential dependencies without having to worry that unneeded dependencies might be evaluated. For instance, the whole of the Nix packages collection is essentially one attribute set where each attribute maps to one package contained in the collection. It is very convenient that at this point, we do not have to deal with the administration of what exactly will be needed where. Another benefit is that we can easily store meta information with packages, such as name, version number, homepage, license, description and maintainer. Without any extra effort, we can access such meta-information without having to build the whole package.

This explanation is very package-centric, but the Nix package collection nixpkgs uses laziness to unlock a whole different degree of extensibility.

In his other paper, "Maximal Laziness", Eelco Dolstra also explains how laziness makes the Nix language interpreter faster and simpler.

šŸ” Enter Fixpoint Recursion and Overlays

Let's look at fixed-point combinators, which are particularly easy to implement in the lazy Nix language. (Fixed-point combinators are quite a big vehicle for what we will observe in this section, but we will see.) This might look like a thematic jump, but we are introducing them now to prepare you for understanding how Nixpkgs Overlays work. While this article is more of an explanation of the underlying mechanism with simple examples, our next article will be about best practices with bigger real-life examples.

You can follow the next steps in your Nix REPL and play with them to solidify your understanding.

Let's first implement a function called fix:

fix = f:
  let 
    x = f x;
  in x

This is the actual implementation of pkgs.lib.fix!
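If you do not want to type it in yourself, you can also load the original from nixpkgs, assuming that <nixpkgs> resolves to a channel or flake registry entry on your machine:

nix-repl> fix = (import <nixpkgs> {}).lib.fix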

While it is short and simple at first glance, it is also very confusing at second glance: We accept a function f, which is itself a function that accepts an input set and returns an output set. fix calls f with the variable x, which is also defined… as the result of the same calculation?? 🤯

Of course, laziness is part of the answer, but it's still hard to digest on the first attempt. Let's play with a few minimal examples to illustrate how this works.

nix-repl> f = final: { x = 1; }
nix-repl> fix f
{ x = 1; }

This function f accepts a set called final and completely ignores it. We could call it with anything and it would still return { x = 1; }.
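We can quickly verify that the argument really does not matter:

nix-repl> f null
{ x = 1; }

nix-repl> f { some = "thing"; }
{ x = 1; }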

That means that the magic line x = f x; works because the latter x is never referenced, and Nix only computes what is referenced. So this is the first part of this magic that we need to understand.

Let's make it one step more complicated again:

nix-repl> f = final: { x = 1; y = final.x + 10; }
nix-repl> fix f
{
  x = 1;
  y = 11;
}

It still feels like magic, but we know that for calculating the value of x, we need no recursive call of f inside the x = f x expression. Only for calculating y do we need the value of x beforehand, and that value is available after the first call of f. This means that although x = f x looks like an infinite recursion, we only need to recurse into it once to have access to all the values! (On the CPU, no recursion happens at all, as we will see, but let's ignore this part for a moment.)
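By the way, for this f, fix produces the same attribute set that we would get from Nix's built-in recursive attribute set syntax (rec), where attributes can reference each other directly. This is only a rough equivalence to build intuition, not how fix is implemented:

nix-repl> rec { x = 1; y = x + 10; }
{
  x = 1;
  y = 11;
}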

To settle the beginning of our understanding, let's make it a bit more complicated again:

nix-repl> f = final: { x = 1; y = final.x + 10; z = final.y + 100; }
nix-repl> fix f
{
  x = 1;
  y = 11;
  z = 111;
}

How often does f eventually need to be called? This illustration helps us find out:

Fixpoint Recursion Evaluation Order

In total, it looks like we have called the function f three times in our recursion. But it only looks like that. Why?

The "it looks like we called f three times" explanation is simplified to make it easier to understand how it works. To be technically correct, the function f is called only once. What we observe as a recursion is the graph of attribute set values referencing each other. This call graph was set up once by the fix f call. Observe the dashed lines in the diagram: these are all the same thunks.

We can verify in the Nix REPL that the function f is only called once:

nix-repl> f = final: 
  builtins.trace 
    "called f" 
    { x = 1; y = final.x + 10; z = final.y + 100; }

nix-repl> fix f
trace: called f
{
  x = 1;
  y = 11;
  z = 111;
}

Thanks to Robert Hensing for the more detailed description at this level.
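We can also make the sharing of thunks visible by putting a trace on the value of x and referencing it from both y and z. Because the forced thunk is memoized, the trace should fire only once. A small sketch, reusing our fix from above (the exact output format may differ between Nix versions):

nix-repl> f = final: { x = builtins.trace "forcing x" 1; y = final.x + 10; z = final.x + 100; }

nix-repl> let r = fix f; in r.y + r.z
trace: forcing x
112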

Of course, we can still trigger infinite recursions:

nix-repl> fix (final: { x = final.x + 1; })                          
{
  x = «error: infinite recursion encountered»;
}

One could naively expect that an infinite recursion has similar effects to an infinite loop: The computer runs hot. But this is not the case. Using a technique called blackholing, such infinite recursions are detected and aborted quickly.

Note that Nix is still strict in the attribute names of a set: if we make the attribute names (like x, y, and z in the examples) dependent on recursive calls, we will also get an infinite recursion.
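Here is a hedged sketch of such a case: the attribute name itself depends on the fixpoint, so Nix cannot even determine the set's keys without evaluating the set (the exact error output may differ):

nix-repl> fix (final: { ${final.name} = 1; name = "x"; })
error: infinite recursion encountered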

Summary

Laziness in Nix feels pretty alien to anyone who is used to imperative programming languages. But once you get to know its tradeoffs better, it becomes a useful tool for making complex code simpler.

It is also an abstraction that lets you pay only the computational cost of the code that is actually used. But this only becomes apparent at a larger scale, with bigger chunks of code, just like in nixpkgs.

The minimal examples in this article are more like toys to explain the mechanism. In the next article, we will therefore learn how laziness and fixed-point recursion are used in nixpkgs with Overlays, and how you can profit from them.

In our Nixcademy classes, we typically explain such technical background instead of just providing quick-start tutorials. The reason for this didactic choice is that we want our customers to succeed when things don't go well and to be able to debug problems independently by understanding how the important parts work under the hood! 🏋💪✅
