Bring all packaging layers into a single fixpoint #273815

@roberth

Note

This proposal is scoped more or less to mkDerivation, that is, to individual packages. It does not affect package sets in any significant way.

Describe the problem

This issue proposes a solution to a number of problems.

  • Interface for package overriding is inconsistent
  • RFC 92 dynamic derivations-based packages will not be representable by mkDerivation
  • Packages leak their implementation details, without any sign that they are implementation details
  • It is unclear whether an attribute is meant to be used by the builder (script) of the derivation
  • Isolating the package derivation from documentation tools is not feasible despite multi-output derivations
  • It is almost impossible to write composable, reusable pieces of logic that affect multiple arguments of mkDerivation
    • mkDerivation has significant pressure to grow, become huge, incomprehensible, slow, and mass-rebuild inducing

When packaging, we have to keep six+ layers of attribute sets in mind for various purposes.
That is quite a lot, and it leads to issues when one needs to access information from a particular layer in another layer.
For instance, overriding functions are generally only available for ~3 of the layers, and using some may revert the effect of previous overrides.

As an example, the layers of a Python package are:

  • package function args (i.e. callPackage, .override)
  • mkPythonModule args
  • mkDerivation args
  • a stdenv adapter, maybe
  • derivation args (technically inaccessible, but may leak to next layer)
  • package attributes
  • package attributes for a different output (pkg.dev != pkg, but is very similar)
  • cross splicing

Proposed solution

The layers themselves are mostly valid, but the way they are composed, by "ad hoc" functions, is the cause of the aforementioned problems. Instead, we may compose them in a manner that may be somewhat familiar from the module system.

Use of the module system has been explored extensively and successfully by @DavHau and dream2nix. However, Nixpkgs operates at such a scale that we need to care far more about even a constant-factor overhead, such as would be imposed by the module system. Experimentally, we have seen that this pretty much rules out wide use of such a feature-rich system.

However, that does not mean that we need to reject what I would consider the core features of the module system: a fixpoint of a monoid. What does that mean?

Fixpoint: we declare things using functions, where the argument is the "final" result. This recursion allows access to "variables" or "options" in a way that might feel similar to fields in an object-oriented language.
Monoid: we don't use a single function, but multiple, and their results are merged in some way.
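
To make the two terms concrete, here is a minimal sketch in plain Nix. It is not the proposed implementation: the names fix, layers and combine are hypothetical, and the merge is the simplest possible one (//). Each layer is a function from the final result to a partial attrset, the layers are folded into a single function, and that function is tied into a fixpoint.

```nix
let
  # Fixpoint: pass the final result back in as the function's own argument.
  fix = f: let final = f final; in final;

  # Hypothetical layers; each maps the final result to a partial attrset.
  layers = [
    (final: { name = "hello"; version = "1.0"; })
    # This layer reads attributes that are defined by another layer.
    (final: { fullName = "${final.name}-${final.version}"; })
  ];

  # Monoid: combine many layers into one; the identity is a layer returning { }.
  combine = builtins.foldl' (acc: layer: final: acc final // layer final) (final: { });
in
  fix (combine layers)
```

Evaluating this, for example with nix-instantiate --eval --strict, gives an attrset in which fullName is computed from attributes contributed by a different layer. Laziness is what makes the self-reference work.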

In the module system, this merging operation is very elaborate, and therefore a source of evaluation overhead. This must be avoided.
A possible, lighter weight version of this is the merge operation in minimod. An even lighter alternative is that of overlays: little more than //, also known as attrsets.merge, which is non-recursive and minimally helpful.
The exact merging semantics is to be decided. It will be easier to do so when we have a prototype that we can benchmark and play around with.

And that brings us to the final crucial element, which is overriding. The module system uses numeric priority markers to specify which definitions win, whereas with overlays, the last overlay composition operation wins. The latter is more efficient, but harder to use, as any merging needs to be specified by hand, e.g. buildInputs = o.buildInputs or [] ++ ... etc.
Finally, we may consider leaving all merging behavior up to the user, and let them pick between such methods. This would be most flexible, but imposes more complexity on package authors.
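
As an illustration of the last-wins option, the following sketch composes one overlay onto a base layer. The extends helper has the same shape as lib.extends in Nixpkgs but is redefined here so the snippet is self-contained; base, overlay and the string stand-ins for packages are hypothetical.

```nix
let
  fix = f: let final = f final; in final;

  # Apply an overlay (final: prev: attrs) on top of a base layer; later definitions win.
  extends = overlay: base: final:
    let prev = base final; in prev // overlay final prev;

  # Hypothetical base layer; strings stand in for real packages.
  base = final: { buildInputs = [ "zlib" ]; doCheck = false; };

  overlay = final: prev: {
    # Plain replacement: the last definition wins.
    doCheck = true;
    # Lists are not merged automatically, so the merge is written by hand.
    buildInputs = (prev.buildInputs or [ ]) ++ [ "openssl" ];
  };
in
  fix (extends overlay base)
```

The result has doCheck = true and both inputs in buildInputs. Had the overlay written buildInputs = [ "openssl" ]; instead, the base value would have been silently dropped, which is the usability cost mentioned above.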

What might this look like? Assuming we go with a limited amount of merging, and a last-wins overriding system as described, a package might look as follows. Some required attributes, such as setup.name, are omitted.

mkDerivationPackage ({ pythonAttrs, setup, drvAttrs, public, ... }: {
  # Don't need to get stuff from four levels deep
  # setup: the generic shell script from "stdenv"
  setup.configureFlags = ... (optional drvAttrs.doCheck "--enable-tests") ...;
  setup.doCheck = true; # like mkDerivation { doCheck } argument
  # No more passthru. It's symmetric now.
  public.tests = callPackage ./tests { mypkg = public; };
})

An override may look like:

pkg.override (self: super@{ setup, ... }: {
  setup.buildInputs = super.setup.buildInputs ++ [ somePkg ];
  derivation.disallowedReferences = [ ];
  # Add a marker that isn't propagated to the derivation.
  public.hasSomePkg = true;
})

The "replacement" of callPackage may look like

# top-level.nix (or RFC 140 impl)
pkgs.mkDerivationPackage ../foo.nix;
# ../foo.nix
{ deps, ... }:

{
  deps = { pkgs, ... }: {
    # Some defaults may be obvious
    hello = pkgs.hello;
    # But here we introduce `boost` as a stable identifier to allow overriding without knowledge of the current default attribute.
    boost = pkgs.boost_180;
  };
  setup.buildInputs = [ deps.boost ];
  setup.nativeBuildInputs = [ deps.hello ];
}

Implementation

In order to implement this functionality in a sustainable manner, we need to disentangle the setup -> derivation transformation that is currently implemented in make-derivation.nix, mixed with parts of the implementation of overrideAttrs.
Taking this apart will greatly benefit readability, but more importantly allow us to improve the overriding mechanisms without unsustainable code duplication with mkDerivation.

The first layer to implement is the package attrs layer. This layer is fairly simple. It is responsible for taking the public attribute from the fixpoint and returning it.

The next layer to implement is the derivation layer. It writes the result of builtins.derivationStrict to public, perhaps with minor tweaks. This is more efficient than builtins.derivation and it implements a separation between a package's public, supported interface, and attributes that are implementation details. See #217243.

Then the disentangled mkDerivation logic can be applied, taking arguments from setup (or some other name; TBD), and returning into the derivation attribute of the fixpoint. This should not include redundant features such as passthru or overrideAttrs, whose implementations stay behind in mkDerivation.
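
The three layers described so far could be wired together roughly as follows. This is only a sketch under stated assumptions: fakeDerivationStrict is a hypothetical stand-in for builtins.derivationStrict so that the expression evaluates without a store (the paths are placeholders, not real store paths), and the setup -> derivation step is reduced to a single attribute conversion.

```nix
let
  fix = f: let final = f final; in final;

  # Hypothetical stand-in for builtins.derivationStrict; returns placeholder paths.
  fakeDerivationStrict = args: {
    drvPath = "/nix/store/<hash>-${args.name}.drv";
    out = "/nix/store/<hash>-${args.name}";
  };

  package = fix (final: {
    # setup layer: mkDerivation-style arguments.
    setup = { name = "demo"; configureFlags = [ "--enable-tests" ]; };

    # derivation layer: low-level derivation arguments computed from setup.
    derivation = {
      inherit (final.setup) name;
      configureFlags = builtins.concatStringsSep " " final.setup.configureFlags;
    };

    # public: the supported interface, built from the derivation result.
    public = fakeDerivationStrict final.derivation // { inherit (final.setup) name; };
  });
in
  # package attrs layer: only the public attribute leaves the fixpoint.
  package.public
```

Only name, drvPath and out are visible to consumers here; setup and derivation remain inside the fixpoint, where overrides can still reach them.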

For the multi-outputs wrapper functionality that presents alternate outputs as full-blown package attrsets, I would suggest providing an alternative attribute outputs = { bin = "<store path string>"; dev = "<store path string>"; }, which is sufficient for almost all usages, and more efficient to evaluate. Legacy output attributes may still be provided, but should be avoided by library code and builder functions if outputs is an attrset.

I am confident that we can combine all layers except perhaps cross splicing into a single fixed point. Cross splicing does not represent a 1:1 function invocation, so may need to remain similar to today, although I wouldn't exclude the possibility of making improvements in that area.

For the details of the deps pattern, refer to dream2nix.

At this point, the mkDerivation functionality has been replicated in a better way that solves most of the problems. Porting the language infrastructures to the new style will take further effort. A similar approach to that taken with mkDerivation can be applied.

Additional context

Notify maintainers

@infinisil @DavHau may already be somewhat familiar with the concepts.


Labels

  • 0.kind: enhancement (Add something new or improve an existing system.)
  • 1.severity: significant (Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc.)
  • 6.topic: stdenv (Standard environment)
