SIG Repos: How should they work?

getchoo · May 5, 2024, 5:18am

i think we should have some standard across each repo in order to ease onboarding contributors and helping with tree-wide code review

this is the most logical jumping off point to me. it has been proven extremely successful in nixpkgs, doesn’t arbitrarily place many apps in overly broad categories like “misc”, is easy for new contributors to understand, aids in navigation, and still keeps things very organized. some wiggle room here (and other standards we set) should be available, though – after all some SIGs will probably have their own specific needs.

we already have a tool for this in core with lib.packagesFromDirectoryRecursive (thanks to @aemogie for introducing this to me a while ago!), which is fairly customizable and could provide that aforementioned wiggle room. if anyone wants to see a basic example of this in the wild, you can here
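
for anyone who hasn’t used it, here is a minimal sketch of how a repo could wire this up. it assumes core’s lib exposes the function with the same arguments as the nixpkgs version (callPackage and directory); the file name, the ./pkgs path, and the layout in the comments are made up for illustration.

```nix
# pkgs.nix (hypothetical file name)
#
# assumed directory layout and the attributes it would produce:
#   pkgs/hello/package.nix  -> hello
#   pkgs/tools/foo.nix      -> tools.foo
{ lib, callPackage }:

lib.packagesFromDirectoryRecursive {
  # used to call every package file that is found
  inherit callPackage;
  # root of the tree to scan
  directory = ./pkgs;
}
```

a directory containing a package.nix becomes a single package, while other directories become nested attribute sets, so a SIG could still pick whatever grouping suits it.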

getchoo · May 5, 2024, 6:01am

just to get this out of the way, we should not be using overlays in anything to do with how end users will consume our flakes. this has many footguns which are explained in full detail here

the beginning of this example is also flawed. quoting myself from the aforementioned PR where i made my proposal:

this is assuming that a package re-used elsewhere is exclusive to a SIG/language-specific repo…which i don’t think should happen. core is currently described as “a common set of packages and tools used across Auxolotl,” which to me implies that packages used across multiple repositories will be there (like node-gyp), not individual repos. if common packages are in a SIG/language-specific repo, something has gone wrong. i don’t believe there should be any situation where a javascript repo should depend on the entire python repo

for the record i would amend this now to say core and top-level would be used for projects that are (reverse) dependencies of multiple SIGs, but my point still stands: python should not depend on javascript and vice versa. (reverse) dependencies on anything besides core should be relegated to higher (lower?) tiers in this tree like gnome/kde and top-level, or just core itself in order to block circular dependencies from happening in the first place – since as i think we’ve all seen by now, circular dependencies introduce massive complication and maintenance burden

srxl · May 5, 2024, 6:07am

I think I have a solution that will work:

Let’s say we have 4 repos: Core, A, B, and top-level. Their dependency graph looks like this, where an arrow pointing to a repo means “this repo is an input of the one it’s pointing to”:

  Core
  /  \
 v    v
 A<-->B
 \    /
  v  v
top-level

We make a change in Core, and need to propagate it to A, B, and top-level. We can safely do this with the following input definitions:

Repo A:

inputs = {
  core.url = "github:auxolotl/core";
  b = {
    url = "github:auxolotl/b";
    inputs.core.follows = "core";
  };
};

Repo B:

inputs = {
  core.url = "github:auxolotl/core";
  a = {
    url = "github:auxolotl/a";
    inputs.core.follows = "core";
  };
};

top-level:

inputs = {
  core.url = "github:auxolotl/core";
  a = {
    url = "github:auxolotl/a";
    inputs = {
      core.follows = "core";
      b.follows = "b";
    };
  };
  b = {
    url = "github:auxolotl/b";
    inputs = {
      core.follows = "core";
      a.follows = "a";
    };
  };
};

With this, updates to core proceed as follows:

1. Make a change to Core
2. Someone/something (manual, CI, etc.) creates update PRs for only Core in A, B, and top-level
3. PRs are merged in any order

Now, Core is updated everywhere, and the updated version of Core is included in the dependencies of each repo, no matter which individual repo is being pulled - the follows definitions ensure Core remains in sync across all inputs.

Similarly, to update A:

1. Make a change to A
2. Someone/something creates update PRs for only A in both B and top-level
3. PRs are merged in any order

Again, the follows definitions will ensure that the new version of A is propagated everywhere.

What about the case in which A is updated, B has the update merged, but top-level does not? In this case, pulling top-level will still include the old version of A everywhere, because top-level propagates A to all dependencies.

In short - for every input A in a given repo, A must have follows specified for every Auxolotl repo. In doing so, each repo is responsible for locking all other Auxolotl dependencies, and no one repo depends on another to keep shared dependencies in sync with each other.

One pitfall I see is repo B introducing a breaking change that breaks A, and top-level updating B - this would break A in top-level. I am currently rotating this situation in my head - I will report back if I find a solution.

imadnyc · May 5, 2024, 6:16am

In my example, it would create an overlay of the derivation and copy it over, so you’re not importing a new version of the repo you’re copying from. In the example, python isn’t instantiating a new version of JS SIG, but overlaying the input to C with a local derivation that now resides in Python SIG. This would be then copied back over the JS SIG and deleted from python by the bot keeping track, so it wouldn’t have another input to JS SIG’s top level that wasn’t already imported in Python’s top level. Of course, this doesn’t solve the issue of flake import circles.

The issue with making sure that each SIG is “pure” is that it creates a ton of maintainer burden and slowly creeps towards a monorepo, no? For each combination we need to pass it off to a specific SIG (JS + Python SIG, JS + Python + Rust SIG), or we give it to top-level, which will increase in size as time goes on. And each SIG gets smaller and smaller as the dependency graph gets more hairy no? If not I’d appreciate more clarification, I admit I’m having a little trouble picturing the graph in my head.

imadnyc · May 5, 2024, 6:30am

Like I said in my other response I’m having a little trouble envisioning the graphs (it’s 2am and I really should get some sleep but it is what it is), so bear with me. Isn’t this slower than the monorepo approach? I’m imagining we have A-E repos under core, and say we have a package in E that relies on D which relies on C which relies on … and so on. We need to wait for the A input to the flakes to update and propagate through builds, then the B, then the C, then the D, and so on. It could be a lot of flake input refresh cycles before something is available to be updated (also, if there’s circles, it could be a>b>c>b>c>d>e). So if the maintainers of E want to update their package, they need to wait for a, then b, then c, before they can finally get to looking at it, even before any potential troubleshooting the updates might bring. It could be overcome by updating inputs often, but at that point isn’t that just getting closer and closer to a monorepo that updates its “inputs” instantaneously?

I’m still of the mind that AuxPkgs should be a monorepo to end user and separate repos just for maintainers.

getchoo · May 5, 2024, 6:31am

in general, this is reintroducing the problem of every package in A (let’s say python) depending on B (let’s say javascript) and vice versa. i’m not really understanding the focus on circular dependencies here, either. i genuinely do not see a case where it’s worth keeping repos independent, but then introducing a huge dependency like all javascript packages for what would be (relatively) few python packages

this is in contrast to the tiered approach which makes sure our repositories for shared content (core and top-level) are for shared content, and language specific repositories are for language specific content. it also limits the many troubles we’re currently trying to work around in regards to maintenance and dependencies outside of just core to lower/higher (i still haven’t decided on if core should be considered the highest tier or top-level…so bear with me :p) tiers, where SIGs (like gnome and plasma) will be much smaller in comparison to the broad python/js/etc repositories. i think this gives them a very distinct opportunity to handle situations like this, as in general they will be much more specialized and not need to account for anything that could fall under the umbrella of a whole language

besides that, i don’t really see much in this message that explains how it would be a better solution than the proposal i made that vlinkz linked above – especially when you remove the assumption of circular dependencies being a requirement. i’m really reading this as a continuation of the original proposal more than anything

srxl · May 5, 2024, 7:00am

Hmm. I’m playing around with some ideas locally and it’s starting to become obvious to me that it’s very, very difficult to anticipate every permutation of outcomes here. I’m starting to think it might just be safest to go with the “tiered approach” for now, and try to address any problems as they come up. I think we will see problems - but the solutions to them will be much easier to find when we actually run into them, instead of simulating endless hypotheticals. Ideally, most of the problems will crop up during our bootstrapping phase, where we still have flexibility to make changes, and by the time everyone starts to stabilize out, we’ll have ironed out all the kinks.

It’s a play that doesn’t come without risks, but I think it’s the play we have to make if we want to move forward at this point.

imadnyc · May 5, 2024, 7:02am

Would you mind sharing what you have so far? I’d like to test some stuff locally too and would appreciate what you have as a jumping off point.

srxl · May 5, 2024, 7:07am

Here’s a gzipped tarball containing the repos I’ve been working with. Note the flake inputs are all absolute paths (to my knowledge, you can’t have relative paths as flake inputs) - you’ll need to change them to get them working on your system.

getchoo · May 5, 2024, 7:10am

i think this proposal is a very good example of exactly why we should be avoiding circular dependencies. throwing automatically created overlays around and ensuring their creation and deletion across multiple repositories is not just a massive technical hurdle in both the context of nix and creating a bot to do this in the first place, but also a huge point of failure. if any step in this process were to go wrong, it would most likely require manual intervention that i’m betting very few people would even be able to perform given the extreme complication of this process. i don’t see why we should worsen our bus factor and complicate our inter-tree workflow just to support circular dependencies

i’m not sure how? the SIGs with the most work to go through (language specific ones) would only need to worry about PRs being sent to them, and occasionally updating core for packages deemed important enough to be shared at our lowest (highest? i still haven’t made up my mind!) tier at core. i’m pretty sure that regardless of any solution we choose, dealing with core and your own PRs will be a consistent burden; this isn’t introducing anything new.

as i said in my proposal, the only place this would possibly increase maintenance burden is in top-level. quoting myself from there:

the main cost here would be the overriding of inputs in top-level. this would require some additional testing in top-level, as we can no longer guarantee perfect reproducibility from the original repo. i think this could create a good place to introduce an equivalent to nixpkgs/nixos’ hydra jobsets though, as individual repos could act more as a master, nixpkgs-unstable, or nixos-unstable-small while top-level would be more akin to nixos-unstable.

i think top-level acting as nixos-unstable would make sense regardless of our solution here, as it is the “final stop” so to speak and what any nixos ~~do we have an actual alternative name for our fork yet? i haven’t been keeping up on some things~~ should be using

from my response to vlinkz voicing the same concern:

is this not already the case? top-level is quite literally all of our repos combined. for all intents and purposes, it is inherently a monorepo

this should never need to happen. i’ve been very explicit about language specific repositories not depending on each other. this should be reserved for SIGs larger than a single language like gnome and plasma

this will be happening regardless, given it’s a combination of all of our repos. unlike nixpkgs though, it has a practical cap on how large its individual package count can get, as any language specific package (like a vast majority of nodePackages, pythonPackages, haskellPackages in nixpkgs) will not be allowed in it. higher (lower? will i ever decide?) tiers preceding top-level like gnome and plasma will also take even more of the burden off top-level (something i didn’t fully realize the potential of until vlinkz mentioned it, so thanks!)

i’m a bit confused by this point, but i think it’s continuing with the assumption of python depending on javascript, etc? i can’t really provide anything here without more clarification

i would highly recommend not just you, but anyone who wants a bit more insight into what exactly i’m suggesting to read my original proposal here along with the follow-up to vlinkz’s concerns here. i feel some points may have been missed with these conversations happening in parallel across two platforms, as a few of these concerns have already been brought up and addressed there

srxl · May 5, 2024, 7:28am

I’m not yet convinced it’s possible to avoid this - it would be nice if each individual language was its own little silo, kept away from every other programming language, but it never plays out that way. At this point though, I think we approach that when it happens - trying to anticipate it when it’s this theoretical is a cognitive rabbit hole that never ends.

I think for now, @getchoo has the right idea - let’s keep things acyclical and simple, until we find a reason we need to change that. We have the luxury of being able to break stuff while we’re in the early bootstrapping phase - let’s use that to our advantage and cross these hurdles when we get to them.

liketechnik · May 5, 2024, 7:45am

Let me try to summarize the current discussion and extract the main points still in discussion:

Goals (from user perspective):
- SIG repos can be used individually, reducing size and eval time
- top-level still provides what ‘Nixpkgs’ previously was - a one-stop location for all packages
Goals (from SIG/maintainer perspective):
- reduce maintainer burden
  - smaller scoped repos, with less noise
  - fewer moving parts in each SIG repo
Technical problems:
- circular dependencies between SIG repos
  - proposal: (multi-tiered) top-level repos
    - removes circular dependencies, by moving them one level up
    - possibly increases top-level maintenance burden
    - this might feel like going back to a (quasi) mono-repo though - but one could argue this is inherently the case anyway
  - proposal: using follows specifications to lock all other Auxolotl dependencies
    - this loses the independence property of the SIG repos

(Did I miss something? Certainly not intentionally - I’ll be happy to make amendments here)

As far as I can tell, at no point in the discussion so far were these goals up for debate, so I’ll focus on the technical problem(s).

I come to the same conclusion as @srxl: A top-level approach without cyclical dependencies keeps things simpler without compromising on our goals (re: independence of SIG repos).

(And I also agree: let’s put our current possibility to experiment with stuff to good use, when we need it.)

Jeff · May 8, 2024, 10:47pm

First, I’m so glad we’re all on board with reducing recursive dependencies.

I think experimentation will be needed.

And towards that, I have automated nix code refactoring before (I created a nix bundler similar to a JS bundler). Let me say, there’s some real easy stuff we can do from the beginning, even at the formatter level, that will allow us to make massive refactors overnight if we realize we need a completely different structure. Stuff like, always accessing packages through attributes (ex: auxpkg.name instead of using with auxpkg; name (which is also horribly slow at runtime)), being consistent with how top-level attributes are set rather than sometimes dynamically generating top level attr names (which again is also slow), having a folder name match the top level attribute name, etc.
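
To make the first point concrete, here is a tiny sketch of the two styles side by side. The auxpkg argument is a placeholder package set, and zlib/openssl are just illustrative attribute names I picked.

```nix
# `auxpkg` is a placeholder package set; zlib and openssl are illustrative names
{ auxpkg }:
{
  # explicit access: every use of a package is visible as the text `auxpkg.<name>`,
  # so a refactoring tool can find and rewrite it without evaluating anything
  explicit = [ auxpkg.zlib auxpkg.openssl ];

  # same value, but here `zlib` and `openssl` could come from `auxpkg` or from
  # any enclosing scope, which a purely textual tool cannot tell apart
  implicit = with auxpkg; [ zlib openssl ];
}
```

Both attributes evaluate to the same list; the difference is only in how easy the source is to analyze and rewrite.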

I can confirm this is not possible. Python’s R module has the entire R language (and R modules) as a dependency. Other languages have python as a dependency.

A Design Draft v1

I have spent 3 years trying to automatedly refactor nixpkgs. So basically I’ve been training for my whole (nix) life for this (not that I have everything figured out).

Here is a structure that I think could get put-to-work this week and later.

Note: in the list below dependencies go up
E.g. lib depends on nothing, core depends on lib, etc.

Repos:

lib - a pure nix lib, no system/stdenv (this already exists as a flake so we can use that)
core - a reorganized attrset of minimum-packages needed to build the nix cli (which I know how to create thanks to nixpkgs-skeleton)
sig_sources - manually edited/maintained repo. Who is this for? ecosystem/SIG maintainers
registry - automated. Who/what is this for? think of this like IR (intermediate representation). It exists to untangle the spaghetti of package dependencies/recursion, which helps cross-ecosystem maintainers, core maintainers, improves runtime performance, and makes package-indexing/security-notifications/testing and other stuff automatable
ecosystems - manually edited/designed. Who is it for? end users
aux - the aux namespace. Not backwards compatible with nixpkgs. Curated (starts off pretty empty). aux = { auxpkgs = registry; ecosystems = ecosystems; lib = lib; }
polypkgs - polyfilled pkgs, aka nixpkgs with overlay of auxpkgs (e.g. mostly backwards compatible, temporary)
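
As a rough sketch of how the last two entries could fit together (not a finished design): every argument below is a placeholder for the output of the corresponding repo, and the merge is deliberately simplified.

```nix
# every argument is a placeholder for the corresponding repo's output
{ lib, registry, ecosystems, nixpkgs }:
{
  # the curated namespace, as described in the list above
  aux = {
    auxpkgs = registry;
    inherit ecosystems lib;
  };

  # simplification: `//` is a shallow merge where registry entries win;
  # a real overlay would also rewire dependencies inside nixpkgs
  polypkgs = nixpkgs // registry;
}
```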

(I don’t care about the names, like polypkgs could definitely have a better name, that’s not the point)

Design of each

Lib
- Should probably be split into stdlib (aka just lib, always forward-compatible) and internalSupport (aka stuff the aux-monorepo(s) might need but the rest of the world might not need, and also might break forward compatibility)
Core
- to create core, copy all nixpkgs files into a new repo, disable the cache, nix build -vvvvv --substituters '' '.#nix', do it on Linux x86, Linux Armv7.1, Mac Apple Silicon, and Mac x86
- if a file path is logged on at least one of those^ builds, then keep it. Delete all other files
- using the log output and some brute force trial-and-error we should be able to detect which of the top-level attributes were evaluated
  - note which attributes existed on all systems, vs which were system-specific
- top-level.nix is going to contain a ton of junk still, with attributes to packages that don’t even have files anymore.
  - While we should eventually clean it, the 1-week fix is to not clean it
  - Make a minimal-legacy-core.nix that imports the top-level attr set
  - Make the attrset of minimal-legacy-core inherit (from top-level) only the attributes that we know are used
- create core.nix which imports minimal-legacy-core, but re-structures it:
  - if an attribute is not a derivation, then put it directly on core
  - if an attribute is a derivation, and builds regardless of OS put it under core.pkgs
  - if an attribute is a derivation, but only exists on a certain OS, put it under core.os.${osName}
- v1 done
  - Later we can work on cleaning this repo.
  - Updates can be semi-automated by looking at the nixpkgs git history and checking for changes to the relevant subset of files
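
A rough sketch of what the core.nix restructuring step could look like, using only builtins. The argument names (legacy, osName, osOnly) and the idea of a hand-maintained osOnly map are my assumptions for illustration, not something that exists yet.

```nix
# core.nix (sketch)
# legacy: the attrset from minimal-legacy-core.nix
# osName: the OS being evaluated, e.g. "linux"
# osOnly: hypothetical map from OS name to the attribute names that only
#         exist on that OS, e.g. { darwin = [ "someDarwinOnlyPkg" ]; }
{ legacy, osName, osOnly ? { } }:
let
  # same check lib.isDerivation performs, written out to stay self-contained
  isDrv = v: builtins.isAttrs v && (v.type or null) == "derivation";
  isOsOnly = name: builtins.elem name (osOnly.${osName} or [ ]);

  # keep only the attributes of `legacy` whose name satisfies `pred`
  pick = pred:
    builtins.listToAttrs (map
      (name: { inherit name; value = legacy.${name}; })
      (builtins.filter pred (builtins.attrNames legacy)));
in
# non-derivations go directly on core
pick (name: !(isDrv legacy.${name}))
// {
  # derivations that build regardless of OS
  pkgs = pick (name: isDrv legacy.${name} && !(isOsOnly name));
  # derivations that only exist on this OS
  os.${osName} = pick (name: isDrv legacy.${name} && isOsOnly name);
}
```

Note that a non-derivation attribute named pkgs or os would be shadowed here, and attributes that throw when evaluated would need extra handling, so treat this as a starting point only.
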
Registry (which will become auxpkgs)
- This is the key to what I call, THE GREAT UNTANGLING.
  - Which I think is the most important change to nixpkgs; the tangling is the cause of the inter-ecosystem trouble we are hitting right now.
- A flat repository
- The “this week” solution is to have registry start as an empty attrset, but I’ll continue to describe how it fits into the bigger picture.
- Two kinds of packages
  - base packages (normal nix func that produces 1 derivation as output, ex: python)
  - extension packages (these “modify” an existing derivation. For example, have numpy take python as an argument and return both a new python (that now has numpy), and also just a standalone numpy derivation (for stuff like numpy header files or venv))
  - Both extension-packages and base-packages are stored in the same flat attrset
- Note: we are forced to have both base and extensions in registry because some base packages (like VS Code) need base+extension packages (like nodePackages) to build themselves. So it’s not possible to fully separate all base packages from all extension packages.
- The great untangling/ordering
  - To fix the recursion issues we need the attrs in the top-level of registry to be in a particular order. This can be done, and scaled up without issue if we automate the generation of the top-level.nix file.
  - For an example of the attr order, if a package depends on merely core and/or lib then it is considered to have 0 dependencies. It goes at the top. However, something like npm would need to appear BELOW pythonMinimal because npm depends on pythonMinimal. You might be thinking “But Jeff, some packages–” I know, we’ll get there. Every built package has a dependency tree (specifically, a directed acyclic graph (DAG)). Conceptually, the order of attrs in the registry is the breadth-first search (BFS) iteration on the combined dependency tree of all packages. Conceptually. The main reason this post is so frickken long is because nixpkgs pretends the dependency tree has loops, even though, in reality, if packages are to ever be built in finite time, the dependency tree cannot have loops.
  - In practice we can achieve a total ordering of packages, with the following logic:
    - If [pkg] only uses core/lib, put in alphabetical order at the top
    - If all of pkg’s dependencies are already in the registry list; easy, just put the package as high as possible, while still being below all of its dependencies
    - Those two rules alone handle a massive amount of packages, but not everything. Let me introduce the “problem children”
    - 1. If a package has dynamic/optional dependencies we first try to assume that it uses all of them, even if that is somehow impossible (ex: for a package using gcc on linux and clang on mac, we pretend it uses both gcc and clang at the same time). If, with that assumption, all the pkg’s dependencies are on the list, then we’re good. If not, then using tree search and some assumptions we can detect the issue and fall back on the next approach.
    - 2. We will need to semi-automatedly/manually break up some packages. There are kinda three cases for this: definitelyMinimal+maybeFull, branching groups, and multi-base-case recursive dynamic dependencies.
      - definitelyMinimal+maybeFull: For dynamic non-recursive dependencies, such as pytorch maybe needing cuda, we can often break them up into a “minimal” package and a “full” package. The reason I say definitelyMinimal is that the minimal case cannot have any optional arguments. It needs to be the bare-bones and nothing else. On the flip side, some packages like ffmpeg and opencv have tons of options and some options are incompatible. We can’t actually make an ffmpegFull. So instead we have an ffmpegMaybeFull where every option is available, and we ensure ffmpegMaybeFull is below all dependencies for all options. This minimal+full technique also works for trivial recursion. Every trivially recursive package has one base case (by definition). That base case gets put in its own derivation as minimal, then the recursive case becomes the full version.
      - Branching groups: Not all dynamic dependencies work under the minimal+full method. For example, evaluating a package on MacOS might cause it to have a different tree-order – an order that is incompatible with the same package evaluated on Linux. Theoretically this can happen even without OS differences. Solving this is actually pretty straightforward: the package is broken up into branches (different groups of dependencies) such as package_linux and package_macos. Each of those will have their own spot in the ordered list. Then one-below the lowest one (aka the one with the most dependencies), we create a combined package. The combined package depends on all the earlier ones, and contains the “if … then package_linux else if … then package_macos” logic.
      - Dynamic recursive dependencies: Unfortunately I can confirm there are packages that are deeply, painfully, multi-base-case recursive with dynamic dependencies.
        
        Let’s start with the easiest example. Let’s say registry.autoconf depends on perl. Well registry.perl (ex: perl 6.x) might depend on perl & autoconf. And now we’ve got a multi-recursive problem; autoconf needs perl and perl needs autoconf (and perl!), it’s the dependency tree with loops.
        
        Except in reality we start with core.perl, then build autoconf::(built with core.perl), then build registry.perl::(built with core.perl and autoconf::(built with core.perl)), and then build autoconf::(built with registry.perl::(built with core.perl and autoconf::(built with core.perl))). It quickly becomes a lot to mentally process … and that’s the simple case!
        
        Nixpkgs does stuff exactly like that behind the scenes, at runtime. Thing is, we don’t have to do it at runtime. We can be way more clear about what is going on by adding stages.
        
        registry.autoconf_stage1, statically depends on core.perl.
        
        registry.perl_stage1, statically depends on registry.autoconf_stage1
        
        registry.autoconf_stage2 statically depends on registry.perl_stage1
        
        All other registry packages use registry.autoconf_stage2 instead of just “some version of autoconf”.
        
        While still complicated, making these stages explicit is, I think, the only way to make this stuff even barely manageable. Just imagine the difference between “Error: autoconf_stage2 failed to build” compared to “Error: autoconf (one of multiple generated at runtime) failed to build”.
        
        While this does require skilled manual labor, there’s not too many packages like this.
        
        Well … except for one category. Cross compilation.
        
        While I think we should have cross compilation in mind from the beginning, I don’t think we should immediately (or any time soon) jump into trying to handle cross compiled packages.
        
        The normal (not-cross-compiled) version of a package is going to have fewer dependencies, and be higher up on the dependency tree. We should focus on those first since they’re the foundation.
        
        That said, I want to recognize what will eventually need to be done for the true deepest most nasty hairball of spaghetti-code in all of nixpkgs; cross compiling of major tools like VS Code, using QEMU virtualization. Not only is it an explosion of dependencies, it’s possible to depend on the same version of the same package twice, once for the host architecture and again for the target architecture. If we can eventually tackle that, I don’t think it gets any worse.
        
        I know it might feel unclean (give me a chance to talk about SIG sources), but in order to detangle cross compilation, some registry packages will need to have system postfix names like gcc_linux_x86_64, just FYI.
- Last note on the registry, we can use a _ prefix to indicate when a package attr is “just a helper” rather than a derivation that we want to be user-facing. For example _autoconf_stage1, _autoconf_stage2, and then we would have autoconf (e.g. stage2 renamed and ready for public use)
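
To show what the explicit stages could look like as a registry fragment, here is a small sketch using the _ prefix convention. The build function is only a stand-in for a real derivation builder, and the exact dependency lists are my reading of the perl/autoconf example, not something taken from nixpkgs.

```nix
# registry fragment (sketch)
# `core` is the core package set; `build` is a stand-in for a real builder
{ core }:
let
  build = name: deps: { inherit name deps; };

  # stage 1 autoconf only needs the perl that core already provides
  _autoconf_stage1 = build "autoconf-stage1" [ core.perl ];

  # the registry perl is built with core.perl and the stage 1 autoconf
  _perl_stage1 = build "perl-stage1" [ core.perl _autoconf_stage1 ];

  # stage 2 autoconf is rebuilt against the registry perl
  _autoconf_stage2 = build "autoconf-stage2" [ _perl_stage1 ];
in
{
  # helpers stay available, but the prefix marks them as not user-facing
  inherit _autoconf_stage1 _perl_stage1 _autoconf_stage2;

  # stage 2 renamed and ready for public use
  autoconf = _autoconf_stage2;
}
```

Each binding only refers to names defined above it, which is the property the ordered registry is meant to enforce.
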
SIG sources
- While the registry can make detangling the recursion possible, it doesn’t necessarily make things perfectly easy to maintain. At a practical level, we can’t just have one package file for each registry package, because stuff like python (python2, pythonMinimal, CPython, Jython, Cython, pythonFull, etc) are going to have a bunch of overlap in terms of nix-code, even if they belong at different levels of the dependency tree.
- SIG sources can let us have our untangled cake and eat (maintain) it too, but there is a big risk!
- Each SIG could have a directory inside of the sig_sources repo. For example, let’s say there’s a maintenance group for python. Every sig directory would be designed in a way that a script in the registry-repo could scan the SIG folder, see exported packages, see a static list of dependencies for each exported package, then compute the correct order for all of them, and have each attr import code from the sig directory.
- The danger is that we accidentally recreate the same nixpkgs mess. For example, a giant python/default.nix file that handles every variant of python, packed to the brim with stuff like if isPythonFull then ... if isCython ... if isJython. In that case, we are right back to a recursive mess; because cython needs pythonMinimal, and both pythonMinimal and cython are generated by the same monolithic python/default.nix. We have only added indirection. The registry makes de-tangling possible, it doesn’t guarantee it.
- How can we solve this without subjective “code-feel” guidelines? Two rules.
  - 1. Evaluation at different points of the tree (e.g. pythonMinimal vs pythonFull) doesn’t always matter. For example, the aux lib functions wouldn’t care at all since they don’t use derivations. So when does it matter? Well let’s say we had a helper like escapePythonString. If that helper is implemented without the registry, then it’s like lib, it doesn’t really care “where” in the dependency tree it’s evaluated. However, if that same tool, escapePythonString, for some reason, needed registry.sed, then it becomes a risk of being tree-order dependent. Let’s say we have another helper, buildWheel, which depends on pythonMinimal but is used inside of pythonFull. While not too common, when helpers depend on registry packages, we can break them up into groups. For example, utils_pre_python.nix could contain escapePythonString, and indicate at the top of the file that there is a dependency on registry.sed. Because buildWheel has different registry dependencies, we would need to make a different utils file, like utils_post_python_minimal.nix to house the buildWheel function. While this handles the tree-ordering issues, it doesn’t necessarily fully stop spaghetti code.
  - 2. This one is hard to explain, but once it “clicks” it’s easy to have an intuition for. Going back to escapePythonString, let’s say it, and all of the helpers are pure-nix. We use escapePythonString across python2, python3Minimal, python38Full, etc. Everything is great. Then one day someone invents Wython (fictional) and the string-escaping of Wython is just a bit different than python. So we face a choice. Either
    - A. We create an independent escapeWythonString
    - or B. we make escapePythonString a bit more complicated by adding a { isWython ? false, ... } parameter
  - You might think “whatever, those options are merely personal preference” but that’s not entirely accurate. The runtime has a slight performance difference in terms of tree-shaking, and we can detect the difference objectively via code coverage. Additionally there’s an argument to be made that option B creates a spaghetti control flow. Quick disclaimer, I’m not a 100% coverage kinda guy – I don’t care if a project has 50% coverage – code-coverage is just a tool.
    - Let’s talk about tree-shaking, and look at option B. If we run python2, python3 or any individual build, the code coverage of escapePythonString will be more than 0 but not 100%. All of them miss the if isWython branch inside of escapePythonString. That means the engine is always wasting at least a bit of time handling code that will never be evaluated while building python3.
    - In contrast, under option A, building any individual package causes each helper function to either be 100% or 0% (e.g.100%=escapePythonString, 0%=escapeWythonString)
    - I’m not saying it needs to always be 100% or 0%, but rather:
      - If a single build calls both escapePythonString { option1 = true; }, and escapePythonString { option1 = false; } then there’s no issue, escapePythonString doesn’t need to be broken up (regardless of how other builds use it).
      - For example escapePythonString { singleQuote = true; }, and escapePythonString { singleQuote = false; }
      - But, if Wython only uses escapePythonString { option1 = true; } and all other builds ONLY use escapePythonString { option1 = false; } then there is a problem.
      - For example escapePythonString { isWython = true; }, and escapePythonString { isWython = false; }
  - For the “this week” implementation, these rules can just be eyeball-enforced.
    - It’ll be good enough to prevent the monolithic recursive dependency spaghetti problem.
    - With a tiny bit of practice it’s not that hard to follow the rules manually
    - If there is a debate it won’t become personal-preference war because there is an objective way of determining the answer
    - If a small case is missed, it’s not a big deal to find/fix it later
    - Later this can be automated by recording the code coverage of each registry-package in a SIG source. For all nix functions that were evaluated during the build, if the function was defined in a file within the SIG folder, and no individual build got 100% coverage of the function, then it’s flagged. If there is a different combination of arguments that causes a build to get 100% then it passes the flag, otherwise it needs to be broken up.
- There are other technical details of SIG sources to discuss, like having inter-SIG dependencies go through the registry instead of being direct imports, and having all SIG sources provide one file per registry-entry, and each registry dependency be a function argument rather than an import, but I’m trying to not turn this post into my dissertation (despite how it might look)
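
To make the second rule a bit more concrete, here is a rough sketch of the coverage check, modeled in Python rather than Nix. It is not an existing aux tool: the helper names mirror the ones above, the Wython escaping rule is invented, and branch coverage is recorded by hand instead of by a real coverage tracer.

```python
# Toy model of the "no single build gets 100% coverage -> flag it" rule.
# Everything here is hypothetical and only mirrors the names used in the post.

# Branches that exist inside each helper (stand-in for real coverage data).
HELPER_BRANCHES = {
    "escape_python_string_b": {"is_wython", "not_wython"},  # option B
    "escape_python_string": {"body"},                        # option A
    "escape_wython_string": {"body"},                        # option A
}


def escape_python_string_b(text, hits, is_wython=False):
    """Option B: one helper with an is_wython switch."""
    if is_wython:
        hits.add(("escape_python_string_b", "is_wython"))
        return text.replace("'", "''")  # invented Wython rule
    hits.add(("escape_python_string_b", "not_wython"))
    return text.replace("'", "\\'")


def escape_python_string(text, hits):
    """Option A: the plain Python helper."""
    hits.add(("escape_python_string", "body"))
    return text.replace("'", "\\'")


def escape_wython_string(text, hits):
    """Option A: an independent Wython helper."""
    hits.add(("escape_wython_string", "body"))
    return text.replace("'", "''")


def flagged_helpers(builds):
    """Return the helpers that no single build covers completely."""
    per_build = {}
    for name, build in builds.items():
        hits = set()
        build(hits)  # run the build, recording (helper, branch) pairs
        per_build[name] = hits

    touched = {helper for hits in per_build.values() for helper, _ in hits}
    flagged = set()
    for helper in touched:
        fully_covered = any(
            {branch for h, branch in hits if h == helper} == HELPER_BRANCHES[helper]
            for hits in per_build.values()
        )
        if not fully_covered:
            flagged.add(helper)
    return flagged


option_b_builds = {
    "python3": lambda hits: escape_python_string_b("it's", hits),
    "wython": lambda hits: escape_python_string_b("it's", hits, is_wython=True),
}
option_a_builds = {
    "python3": lambda hits: escape_python_string("it's", hits),
    "wython": lambda hits: escape_wython_string("it's", hits),
}

print(flagged_helpers(option_b_builds))  # the option B helper is flagged
print(flagged_helpers(option_a_builds))  # nothing is flagged
```

In this model a helper whose switch is exercised both ways inside one build (the singleQuote case) is not flagged, because that one build reaches every branch.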
Ecosystems
- Goal: be as ergonomic as possible for users
- Ecosystems shouldn’t depend on other ecosystems directly: either import derivations from the registry, or import nix-functions from a sig source
- SIG sources != ecosystem
  - For example, JavaScript might be a SIG group (someone who knows JS has relevant skill for maintaining both bun and nodejs), but in contrast nodejs might be an ecosystem, and bun might be a different ecosystem.
  - SIG sources might need to have messy tooling for bootstrapping like pythonMinimal_stage1. The ecosystem interface should hide all that and just present the final product.
  - If it helps generate packages in the registry, or if a registry needs a tool → then it goes in a SIG source
  - Else → Ecosystem
  - Home manager probably would live in the ecosystem space
- Enable stuff like the dev-shell mixin experience (ex: ecosystems.aux.tools.mkDev [ ecosystems.nodejs.envTooling.typescript ecosystems.python.envTooling.poetry ecosystems.latex.envTooling.basics ])
- While registry needs to be rigorously consistent in order to be automated, ecosystems only need to be consistent to help with ergonomics.
  - Like a common interface of
    - ecosystems.${name}.tools for nix functions
    - ecosystems.${name}.variant for minimal/full builds (ex: mruby, jruby, or jython or pythonMinimal)
    - ecosystems.${name}.main for the base tool (e.g. rustc/ruby/python/node)
    - ecosystems.${name}.pkgs. They can deviate on a per-ecosystem basis as needed.
    - ecosystems.${name}."v${version}".main
    - ecosystems.${name}."v${version}".pkgs
    - ecosystems.${name}.envTooling
    - etc
  - But they are allowed to be different when it makes sense, like ecosystems.${name}.tools.mkNodePackage, or ecosystems.${name}.tools.pythonWith
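
As a rough illustration of the common interface and the dev-shell mixin idea, here is a small Python model (not Nix). All names in it are assumptions: `make_ecosystem` and `mk_dev` are hypothetical stand-ins for the interface keys and for `ecosystems.aux.tools.mkDev`, packages are plain strings, and `DEMO_FLAG` is a made-up environment variable.

```python
# Hypothetical model of the ecosystems interface and a mkDev-style mixin.
# None of these functions exist in aux; packages are plain strings here.

def make_ecosystem(main, env_tooling, tools=None, variants=None):
    """Build one ecosystem entry exposing the common keys."""
    return {
        "main": main,
        "tools": tools or {},
        "variant": variants or {},
        "envTooling": env_tooling,
    }


def mk_dev(toolings):
    """Merge several envTooling entries into one dev-shell description."""
    packages, env = [], {}
    for tooling in toolings:
        for pkg in tooling.get("packages", []):
            if pkg not in packages:  # keep first occurrence, preserve order
                packages.append(pkg)
        env.update(tooling.get("env", {}))  # later entries win on conflicts
    return {"packages": packages, "env": env}


ecosystems = {
    "nodejs": make_ecosystem(
        main="nodejs",
        env_tooling={"typescript": {"packages": ["nodejs", "typescript"]}},
    ),
    "python": make_ecosystem(
        main="python",
        env_tooling={
            "poetry": {
                "packages": ["python", "poetry"],
                "env": {"DEMO_FLAG": "1"},  # made-up variable
            }
        },
    ),
}

shell = mk_dev([
    ecosystems["nodejs"]["envTooling"]["typescript"],
    ecosystems["python"]["envTooling"]["poetry"],
])
print(shell["packages"])
```

Because every ecosystem exposes the same `envTooling` key, the mixin function does not need to know anything ecosystem-specific, which is the ergonomic point of keeping the interface consistent.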
Aux
- Having one layer before getting into packages is important for future expansion, for example aux.pureAuxPkgs or aux.distributedPkgs, etc
Polypkgs
- Its own repo so that tarball URLs are easy drop-in replacements for nixpkgs tarball URLs
- If nixpkgs gets a commit, we generate a new flake.lock
- We have git tags equivalent to nixpkgs git tags
- Temporary
- Big special note: I know this goes against what I said at the top (“dependencies only go up”), but out of practicality, and because this repo is temporary, sig_sources can use/refer to polypkgs.
  - Yes, this is a recursion issue (sig_sources uses polypkgs, which gets overlaid by auxpkgs, which links back to sig_sources) but it is necessary. For example, let’s say python is NOT in the aux registry yet. Let’s also say nixpkgs.openssl is broken from a gcc update.
    - Cowsay can’t use nixpkgs.python (built with nixpkgs.openssl) because nixpkgs.openssl is broken
    - But cowsay can use polypkgs.python (built with the polypkgs.openssl which works because polypkgs is overlaid with registry.openssl)
    - I.e. cowsay doesn’t directly depend on registry.openssl.
    - The registry ordering script pretends cowsay has no dependencies (polypkgs is “invisible”)
    - BUT, as soon as we have a registry.python (which would end up as polypkgs.python), we need to “collapse” the difference and mark cowsay as depending on registry.python (instead of depending on nothing), in which case the registry generator will put cowsay below python instead of having it at the top level.
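
The “invisible polypkgs” ordering and the later “collapse” can be sketched roughly as follows. This is a Python model under assumptions, not the real registry ordering script: the dependency data just restates the cowsay example, and `registry_levels` is a hypothetical name.

```python
from graphlib import TopologicalSorter

# Declared dependencies as in the cowsay example (hypothetical data).
DECLARED_DEPS = {
    "openssl": set(),
    "python": {"openssl"},
    "cowsay": {"python"},
}


def registry_levels(registry):
    """Assign each registry package a level: 0 = top level, higher = further down.

    Dependencies that are not in the registry are assumed to come from
    polypkgs and are ignored ("invisible") when ordering.
    """
    visible = {
        pkg: {dep for dep in DECLARED_DEPS[pkg] if dep in registry}
        for pkg in registry
    }
    levels = {}
    # static_order() yields dependencies before the packages that need them.
    for pkg in TopologicalSorter(visible).static_order():
        levels[pkg] = 1 + max((levels[d] for d in visible[pkg]), default=-1)
    return levels


# Before python is in the registry: cowsay looks like it has no dependencies.
before = registry_levels({"openssl", "cowsay"})
# After registry.python exists: cowsay is placed below python.
after = registry_levels({"openssl", "cowsay", "python"})
print(before)
print(after)
```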

Jeff · May 9, 2024, 3:18am

Oh and last thing I forgot. To wrap things up with a bow: packages outside of SIGs and ecosystems, like which or ping or even openssl, could simply be added to flakehub, then imported from there and put directly into the registry. That would let us leverage individuals maintaining their own things on flakehub in a distributed way.

srtcd424 · May 9, 2024, 8:05am

@Jeff - I definitely don’t have the brain to understand all the details here, but my overall impression is very positive! I’m a big fan of “iteratively comb out the complexity” approaches - sounds like you’ve thought through the balance between that and up-front design. Maybe when my meds have worn off I will have another go at understanding the details

liketechnik · May 9, 2024, 9:19am

First off: that’s a very impressive write-up!

I’ll put my thoughts below quotes that are (hopefully) indicative of the exact place in your text (for lack of a better term) I’m talking about.

I can confirm this is not possible. Python’s R module has the entire R language (and R modules) as a dependency. Other languages have python as a dependency.

Hmm, I’m not yet convinced that this means we have to introduce circular dependencies (which I think this really is about)
For example, the node-gyp thing (Jupyter Notebook (python) ← node-gyp (node) ← python)
could be solved by having a python/js core (the naming is up for bikeshed, e.g. ‘bootstrap’ might better convey the purpose) repo
which only sets up these cross-language concerns,
and a pure python repo on top.
After fully reading your proposal: Nice, I believe this is what you already came up with, lol.

Here is a structure that I think could get put-to-work this week and later.

core - a reorganized attrset of minimum-packages needed to build the nix cli (which I know how to create thanks to nixpkgs-skeleton)

why focus on nix cli for core? I.e. what is the thing that differentiates nix-cli from other packages (e.g. compilers) (and with lix planning to depend on rust: this would pull in the SIG Rust stuff, depending on exact repo layout. How would we handle that? I.e. what is core and what is SIG <language/ecosystem>)

registry - automated.

I’m wary of automating such ‘core’ things as the registry. I’m scared this makes it much more difficult to fix/understand if stuff breaks (a similar concern was raised in SIG Repos: How should they work? - #30 by getchoo, albeit with a possibly more complex scheme behind it (very first paragraph))

Lib - Should probably be split into stdlib (aka just lib, always forward-compatible) and internalSupport (aka stuff the aux-monorepo(s) might need but the rest of the world might not need, and also might break forward compatibility)

I have a hard time imagining that we can break forward compatibility that easily, even if it’s just internally.
I might be wrong, but I think nixpkgs had problems with this in a monorepo - and we’re talking about multiple repos needing to synchronise on this here.

While still complicated, making these stages explicit is, I think, the only way to make this stuff even barely manageable. Just imagine the difference between “Error: autoconf_stage2 failed to build” compared to “Error: autoconf (one of multiple generated at runtime) failed to build”.

Someone who has more experience with how nixpkgs currently bootstraps needs to weigh in here, but to me this sounds extremely reasonable.

… Cross Compilation …

I don’t quite understand why a cross-compiled package has more dependencies. In my mind it just has different packages (different compiler mainly, since the target arch is different).

How can we solve this without subjective “code-feel” guidelines? Two rules.
2.

I have a hard time following here and don’t quite get what you’re arguing for in the end. It would be really kind if you (or somebody else) could try to reword this, in order to help me understand it. But it’s not that important; in the end the whole thing still kinda makes sense to me.

Home Manager probably would live in the ecosystem space

Food for thought: are we conflating things that should be kept apart here? I.e. package building/configuration on one side, and system configuration (host services, networking, boot, filesystems, etc) as well as user configuration (dotfiles, user services, i.e. home manager, etc) on the other?
I think we should keep these apart from each other, i.e. the configuration stuff (today’s nixos and home manager) should be kept separate from the packaging stuff (today’s bootstrapping (the different stages), pythonPackages, rPackages, luaPackages, vscode, neovim, etc).
And this discussion should focus solely on the packaging aspect (which is then consumed by the configuration stuff).

Enable stuff like the dev-shell mixin experience (ex: ecosystems.aux.tools.mkDev [ ecosystems.nodejs.envTooling.typescript ecosystems.python.envTooling.poetry ecosystems.latex.envTooling.basics ])

And in spirit with the last point: where do we see the dev env story?
Is it something that needs to be kept together with packaging (for technical/maintenance reasons)?
Or is it both plausible and sensible to separate it too?

While registry needs to be rigorously consistent in order to be automated, ecosystems only need to be consistent to help with ergonomics.

To make sure I understand this all correctly:

lib: generic pure nix ‘tooling’
core: bootstrapping and build of the minimum amount of required toolchain(s)
sig_sources: this is the bootstrapping of SIG/ecosystem toolchains AND ecosystem/SIG specific (pure) nix ‘tooling’?
ecosystem: (language (or finer grained)/)SIG specific package sets + nix tooling building upon the bootstrapping in sig_sources
registry: combines sig_sources + ecosystem together in a clever way to avoid the circular dependency issues
polypkgs: tie everything together, to have a one-stop shop for those who want it

Then it also makes sense to me why you list these functions under ecosystems, and not sig_sources.

Aux

Having one layer before getting into packages is important for future expansion

I don’t quite get the example, but I believe what you’re getting at is that we need an abstraction layer between the actual layout of things
and how we present them to users, so that when our layout changes, we can maintain backwards compatibility in this abstraction layer.
With that I fully agree. And I like the idea of having a specific place for this layer!

Polypkgs

Its own repo so that tarball URLs are easy drop-in replacements for nixpkgs tarball URLs

If nixpkgs gets a commit, we generate a new flake.lock

I think we ended up somewhere that it’s best to actually fully separate our public
package sets from nixpkgs,
since providing nixpkgs is increased maintenance complexity for us,
but not much benefit to our users (they can just pull in nixpkgs themselves).
(this was the discussion in On the future of our nixpkgs fork)
(Not fully sure this is the right conclusion, now that I think about it:
I’m not knowledgeable enough to know if this is also just as easy when not using flakes,
and us pulling in nixpkgs has the (implicit) guarantee that the nixpkgs
we provide is compatible with auxpkgs - which the user pulling it in doesn’t have)

All in all, I feel this really goes in the right direction; while I think there’s some discussion to have on certain details, the overall architecture/broader idea seems very solid to me, and I especially like the idea of making the bootstrapping and library parts of what today is all “nixpkgs” more distinguished in however we will call the different parts of Auxolotl packaging (i.e. going even further than just distinguishing nix the build tool, nixos the system configuration and nixpkgs the package set).

P.S.: I’m slowly starting to doubt that the complexity of it all can be appropriately handled in a discourse thread. Yesterday someone shared the CQ2 tool in the Matrix chat. But I really don’t wanna rush introducing yet another tool without proper discussion & consideration & approval of the current interim leadership, so this best stays here in the P.S. for now, and if we come to the point where we really feel productive discussion is simply not possible due to the complexity exceeding what discourse’s format can handle, we can revisit this.

PPS: For some details on the ecosystem/polypkgs package sets I think we should also take the thoughts on how to structure our new packages in On the future of our nixpkgs fork - #13 by getchoo into account (stuff like: do we want to namespace packages similar to e.g. gentoo or aur).

Jeff · May 9, 2024, 11:36am

It is quite the elephant to eat. Here’s a way more simple example of the problem.

I’ve got this std-lib-ish thing for JavaScript (my good-js repo, which is only halfway done fixing circular dependencies if you want a look).

I first grouped things by like “array_functions.js” and “string_functions.js” etc.
Then my array functions needed some of my string functions. So array.js imported string.js. No problemo.
But then one of my string functions needed one of the array functions. Big problemo.
I can’t have string.js import array.js because that would now be a circular dependency, which JavaScript doesn’t allow (at least with synchronous imports, at least on the web and in Deno)

So I was in quite the pickle. Even worse circular dependencies between groups started happening all over the place (iterator helpers, set helpers, async helpers). I eventually realized I needed a flat structure.

Every helper function gets one file. No grouping.
Each function imports exactly the helpers it needs from the flat structure. Nothing more nothing less.
Then the namespaces (string.js, array.js, etc) just import funcs from the flat structure and organize them into nice little bundles.
Voilà, no more circular dependencies, and all the namespaces still work
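
Here is a small sketch of why the flat layout helps, modeled in Python as an import graph rather than as actual JS files. The helper names (`capitalize`, `chunk`, `joinWords`, `toCamelCase`) and the `flat/` folder are made up for illustration and are not taken from the good-js repo.

```python
from graphlib import CycleError, TopologicalSorter


def has_cycle(imports):
    """True if the import graph (file -> files it imports) contains a cycle."""
    try:
        TopologicalSorter(imports).prepare()
    except CycleError:
        return True
    return False


# Grouped layout: each namespace file holds the implementations.
grouped = {
    "array.js": {"string.js"},  # an array helper needs a string helper
    "string.js": {"array.js"},  # ...and a string helper needs an array helper
}

# Flat layout: one file per helper, namespace files only re-export.
flat = {
    "flat/capitalize.js": set(),
    "flat/chunk.js": set(),
    "flat/joinWords.js": {"flat/capitalize.js"},  # array helper using a string helper
    "flat/toCamelCase.js": {"flat/chunk.js"},     # string helper using an array helper
    "array.js": {"flat/chunk.js", "flat/joinWords.js"},
    "string.js": {"flat/capitalize.js", "flat/toCamelCase.js"},
}

print(has_cycle(grouped))  # True
print(has_cycle(flat))     # False
```

The same cross-group needs exist in both layouts. The cycle disappears because the dependency is tracked per helper instead of per group.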

It’s kind of the same thing for nix, just a lot worse, since nix not only allowed circular dependencies/imports, but the nixpkgs team actually went all-in on circular deps for over a decade for thousands of packages.

(Also last note, the flat structure doesn’t work for JS when there are two functions that are mutually recursive. For a dumb example, isEven calling !isOdd and isOdd calling !isEven is mutually recursive. This “limitation” is probably a good thing because if functions are mutually recursive we probably want to define them in the same file anyways).

Jeff · May 9, 2024, 11:43am

This is a good point. The issue is core is arbitrary. And to your point my nixpkgs skeleton was built around “packages needed for cowsay” not the nix CLI.

The nix-cli choice is because, every system using nix at-minimum must’ve had nix-cli built for them. Otherwise they couldn’t be running the code lol. So why not say the packages for nix CLI are the core.

Otherwise I don’t think there’s a good way to differentiate core packages from non-core packages. And we might get this problem where core becomes more and more bloated over time with people arguing over what should and shouldn’t be core.

Jeff · May 9, 2024, 11:46am

Oh, I think there’s a misunderstanding. The entire post, with all its massive complexity, was the most simple structure I could come up with that PREVENTS circular dependencies (without losing functionality, and without becoming unmaintainable, and in a way that we can iterate on it). That was the whole goal: destroy circular dependencies. It’s just really really really hard, at least for me.

(Also I’ll have to reply to the rest of your post tomorrow! Thanks for the long response!)

(Edit: also also, like srxl said, I don’t think anyone can predict whether the structure is going to work or not until we try it.)

Nairou · May 10, 2024, 6:20pm

I love the idea of multi-repo, and forgive me if I missed something, but why can’t the multiple repos just be an implementation detail? Good for devs and maintainers, but irrelevant to end users.

As far as the user is concerned, they input auxpkgs and have access to whatever they need, like they currently do with nixpkgs. But auxpkgs updates are built from merging all of the multi-repos together. Doesn’t matter what repo a package is in, it ends up in the final package cache. Each repo has a single auxpkgs input, so cross-repo dependencies are automatically resolved when auxpkgs gets built, again as with the current nixpkgs.
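
A minimal sketch of that idea, written as a Python model rather than real flake code: each repo contributes recipes that take the final merged package set as their only input, and lookups are resolved lazily. The repo names, package names, and the `build_auxpkgs` function are assumptions for illustration only.

```python
def build_auxpkgs(repos):
    """Merge per-repo package recipes into one lazily evaluated package set."""
    recipes = {}
    for repo_name, repo in repos.items():
        for pkg_name, recipe in repo.items():
            if pkg_name in recipes:
                raise ValueError(
                    f"{pkg_name} is defined twice (second time in {repo_name})"
                )
            recipes[pkg_name] = recipe

    built = {}

    class AuxPkgs:
        def __getitem__(self, name):
            # Build on first access; every recipe receives the merged set.
            if name not in built:
                built[name] = recipes[name](self)
            return built[name]

    return AuxPkgs()


# Each recipe only sees "auxpkgs", never another repo directly.
repos = {
    "core": {"openssl": lambda pkgs: "openssl"},
    "python": {"python": lambda pkgs: f"python(with {pkgs['openssl']})"},
    "javascript": {"node-gyp": lambda pkgs: f"node-gyp(with {pkgs['python']})"},
}

auxpkgs = build_auxpkgs(repos)
print(auxpkgs["node-gyp"])
```

In this model the user never sees which repo a package came from. Note that two recipes that need each other would recurse until Python raises `RecursionError`, so merging alone does not remove the circular-dependency concern discussed earlier in the thread.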
