I was wondering if anyone had done any analysis of nixpkgs as a giant dependency graph, and of course it turns out the folks at Tweag have already been there. If nothing else, they have fantastic graphics! It also demonstrates a little of what we’re letting ourselves in for.
From Construction and analysis of the build and runtime dependency graph of nixpkgs
[…] some derivations have buildInputs or propagatedBuildInputs that contain themselves
How is this possible? This blows my mind.
From Mapping a Universe of Open Source Software
What looks remotely like the section of a mouse brain actually represents around 46000 software packages.
The post was published in 2019, so in the span of 5 years, we have tripled the count of packages, that’s insane :o
I didn’t know about those two links! And I should have, because I’ve spent a ton of time trying to recurse across packages.
For determining whether or not to recurse, we can rely on recurseForDerivations and recurseForRelease attributes. There are important sets of derivations that are not derivations while their both recurse attributes are false, such as python3Packages
It’s weird that he says we can rely on it, and then immediately gives an example of how it can’t be relied on. I know from experience that it is unreliable, and I don’t think his whitelist is exhaustive, because some packages appear for the first time 4 attributes deep. He did a good job, but nixpkgs, even in 2022, was messier than what he presented.
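As a rough illustration of why the flag-plus-whitelist rule misses things, here is a toy model in Python (not Nix). Attribute sets are plain dicts, a derivation is any dict with "type": "derivation", and ALLOWLIST, should_recurse and list_packages are hypothetical names made up for this sketch. It is not the code from the Tweag post.

```python
# Toy model of the recurse decision. Everything here is illustrative:
# real nixpkgs attribute sets are lazy Nix values, not Python dicts.

ALLOWLIST = {"python3Packages"}  # hypothetical hand-maintained list


def is_derivation(value):
    return isinstance(value, dict) and value.get("type") == "derivation"


def should_recurse(name, value):
    """Recurse only into flagged sets or sets named in the allowlist."""
    if not isinstance(value, dict) or is_derivation(value):
        return False
    flagged = value.get("recurseForDerivations", False) or value.get(
        "recurseForRelease", False
    )
    return bool(flagged) or name in ALLOWLIST


def list_packages(attrs, prefix=()):
    """Collect dotted attribute paths of every derivation we can reach."""
    found = []
    for name, value in attrs.items():
        path = prefix + (name,)
        if is_derivation(value):
            found.append(".".join(path))
        elif should_recurse(name, value):
            found.extend(list_packages(value, path))
    return found


toy = {
    "hello": {"type": "derivation"},
    "xorg": {"recurseForDerivations": True, "libX11": {"type": "derivation"}},
    # Flag is false, so only the allowlist makes this set visible.
    "python3Packages": {
        "recurseForDerivations": False,
        "numpy": {"type": "derivation"},
    },
    # No flag and not allowlisted: the nested package is never listed.
    "hidden": {"deep": {"pkg": {"type": "derivation"}}},
}

print(list_packages(toy))
```

In this toy tree, hidden.deep.pkg is silently skipped, which is the same kind of gap as a package that first appears several attributes deep under an unflagged set.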
Something he didn’t mention is that it’s not always possible to detect cycles in nixpkgs, thanks to Nix’s decision that a == a is false when a is a function. Because of that, it is impossible to do a generic deep comparison of attributes, and so it’s impossible to check whether a loop occurs in the tree without falling back on something like pname and version. However, not all packages have pname and version, and pname and version could also be the same between two different packages (rare, if it happens at all, but nothing enforces that it can’t).
There are also many errors that tryEval cannot catch.
Since exploring the tree exhaustively is an ongoing issue (sadly), I had to programmatically control a nix repl instance to get a truly reliable try-eval. And instead of pname and version as a node ID, we can now use the unsafe get path (i.e. where the code is defined) as a means of detecting loops more generically (but still not perfectly).
Doing that along with an iterative deepening algorithm, a heuristic to reduce the cost of undetectable tree loops, and a concurrency enhancement, I was able to reach 4 attrs deep but not fully explore it, on a machine with 256 GB of RAM running for about 18 hours (after 18 hours it ran out of RAM).
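To make the idea concrete, here is a minimal Python sketch of iterative deepening with a definition-position node ID for loop detection. It is a toy model and not the actual tooling: attribute sets are dicts, and the "pos" key is a hypothetical stand-in for whatever position information Nix exposes (presumably something like builtins.unsafeGetAttrPos, which is an assumption on my part). The heuristic and the concurrency parts are left out.

```python
# Toy iterative deepening over a dict-based "attribute tree".
# "pos" is a made-up key standing in for a definition position.


def node_id(value):
    return value.get("pos")


def explore(root, max_depth):
    """Depth-limited walk; skip sets whose id is already on the current path."""
    found = set()

    def visit(attrs, path, ids_on_path, depth):
        for name, value in attrs.items():
            if not isinstance(value, dict):
                continue
            here = path + (name,)
            if value.get("type") == "derivation":
                found.add(".".join(here))
                continue
            ident = node_id(value)
            if ident is not None and ident in ids_on_path:
                continue  # same definition seen higher up: treat as a loop
            if depth < max_depth:
                visit(value, here, ids_on_path | {ident}, depth + 1)

    visit(root, (), frozenset({node_id(root)}), 1)
    return found


def iterative_deepening(root, limit):
    """Yield (depth, newly found packages) for depth = 1..limit."""
    seen = set()
    for depth in range(1, limit + 1):
        new = explore(root, depth) - seen
        seen |= new
        yield depth, sorted(new)


root = {"pos": ("top.nix", 1), "hello": {"type": "derivation"}}
root["pkgs"] = root  # self-reference, like a package set containing itself
root["a"] = {
    "pos": ("a.nix", 1),
    "b": {"pos": ("b.nix", 1), "c": {"type": "derivation"}},
}

for depth, new in iterative_deepening(root, 3):
    print(depth, new)
```

The position-based ID is what stops the walk from descending into pkgs.pkgs.pkgs… forever. It is also why the approach is imperfect: two genuinely different sets defined at the same source position would be conflated, and a set with no position gets no loop protection at all.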
It’s pretty bad IMO that it takes a dedicated research project just to … list not quite all of the packages in a package repo. Also, btw, nix-env -qa --json does not list all of the packages either, which is again pretty sad.
All of this is actually pretty relevant to the repo structure discussion.
By coincidence I happened to write about this earlier today
Check nix-visualize! It does not offer as many options as the guix graph you mentioned, but it can generate nice dependency graphs.
To get a dot graph, you can also do the following:
$ nix shell nixpkgs#graphviz
$ nix build nixpkgs#hello
$ nix-store -q result --graph | dot -Tpng > output.png