An attempt at using IPFS as a binary cache

Returning to BitTorrent: I think I’ve got my signatures sorted out (pubkey aux-cache.dev-1:ywKsFPXTku4fvm8Scw4gZztmJ+R6zkeKFZCe5k3zoLE=), so the runtime and build closures for “lix built with aux core” should now be served by my horrible hack :slight_smile:

(The next step is to get it properly served via HTTP/S3 as well.)

1 Like

How would you build a form of guarantee/commitment from a community perspective that is stronger than “hoping that someone will”?

I don’t see how it’s any harder than any other centralized infra, and since it’s a lot less data than the NAR contents themselves, it should be easier to host, distribute, etc.

I’m also still tempted to play with Radicle for a decentralized approach, but much as it’s an interesting shiny thing to play with, it’s probably not a priority.

Inspired by @srtcd424’s approach of using an intermediate KV store (dunno who else suggested it), I went ahead and wrote another binary that generates that store from an existing /nix/store. It’s the mapper in nix-subsubd.

Mapper

Right now nss-mapper works by grabbing the hash from /nix/store/$hash-$whatever, querying the binary cache (configurable, but https://cache.nixos.org by default) for the .narinfo, downloading and streaming the .nar.xz to a “hasher” (configurable, but IPFS at the moment), then writing the result as a data structure to dir/$shaFirstLetter/$sha.yaml.
My store has >30k paths, and it took 2 hours to get through 5k of them, so it’s not the fastest. That’s probably also because there are friggin .drv, .patch, and a bunch of other tiny files in the binary cache :open_mouth:
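
To make that flow a bit more concrete, here’s a rough sketch in Rust of the path juggling involved. The function names and exact details are my own illustration under the layout described above, not the actual nss-mapper code:

use std::path::{Path, PathBuf};

// Illustrative only: extract the store hash from /nix/store/$hash-$name,
// build the narinfo URL for it, and compute where the resulting record
// lands in the KV directory (dir/$shaFirstLetter/$sha.yaml).

/// "/nix/store/abcd...-hello-2.12" -> "abcd..."
fn store_hash(store_path: &Path) -> Option<&str> {
    let name = store_path.file_name()?.to_str()?;
    name.split_once('-').map(|(hash, _)| hash)
}

/// The narinfo for a store path lives at $cache/$hash.narinfo.
fn narinfo_url(cache: &str, hash: &str) -> String {
    format!("{cache}/{hash}.narinfo")
}

/// The record for a .nar.xz whose SHA256 is `sha` goes to dir/$firstLetter/$sha.yaml.
fn kv_path(dir: &Path, sha: &str) -> PathBuf {
    dir.join(&sha[..1]).join(format!("{sha}.yaml"))
}

fn main() {
    let p = Path::new("/nix/store/abcd1234abcd1234abcd1234abcd1234-hello-2.12");
    let hash = store_hash(p).expect("store path has a hash prefix");
    println!("{}", narinfo_url("https://cache.nixos.org", hash));
    println!("{}", kv_path(Path::new("kv"), "deadbeef").display());
}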

Proxy

Anyway, the proxy is now called nss (short for nix-subsubd) and can use the KV store generated by the “mapper”. It introduces two concepts:

A Mapper maps the SHA256 hash of a file to an entry in the KV store, which in turn contains the details the backend needs to retrieve the corresponding NAR. Both the mapper and the backend are traits/interfaces, so the sky is the limit for what implementations can come.
Right now there’s a FolderMapper which (and as I’m writing this I see how it can get confusing) uses the output of nss-mapper, but if somebody fancies it they could implement an SqliteMapper to query a DB, an HttpMapper to query a service, a SocketMapper to speak to another program over a socket, or whatever.
As for the backend, right now there’s an IpfsBackend to grab NARs from IPFS, but if somebody wants to they can implement a TorrentBackend to download torrents on the fly and stream the response, an HttpMirrorBackend to round-robin over a list of HTTP/HTTPS mirrors, a TahoeLafsBackend, an S3Backend, or whatever you fancy implementing.

And yes, it does fall back to the binary cache if something fails.
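
For the curious, a minimal sketch of what those two extension points could look like; the actual trait definitions in nss are almost certainly different (e.g. async and streaming rather than returning a Vec<u8>):

// Illustrative shapes only; not the real nss traits.

/// Resolves the SHA256 of a .nar.xz to whatever the backend needs
/// (an IPFS CID, a path to a torrent file, a mirror URL, ...).
trait Mapper {
    fn lookup(&self, nar_sha256: &str) -> Option<String>;
}

/// Fetches the NAR bytes given the value the mapper resolved.
trait Backend {
    fn fetch(&self, mapped: &str) -> std::io::Result<Vec<u8>>;
}

/// The proxy glues the two together; if either step fails,
/// it falls back to the upstream binary cache (not shown here).
fn try_p2p_fetch(
    mapper: &dyn Mapper,
    backend: &dyn Backend,
    nar_sha256: &str,
) -> Option<Vec<u8>> {
    let mapped = mapper.lookup(nar_sha256)?;
    backend.fetch(&mapped).ok()
}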

Future

There’s a bunch of stuff missing, most notably documentation :see_no_evil:. The next month might get pretty busy, so I’m just dropping this here before I disappear for a while. Maybe somebody will want to try it out, add a new mapper or a new backend, or whatever.

What else is missing:

  • automated tests (coverage is low af)
  • nss-mapper is currently sync (could be async, but async rust streams cooked my brain)
  • smaller features like filtering the /nix/store input to nss-mapper
  • a git repo with the output of nss-mapper
  • better logging throughout the project
  • better HTTP error responses from nss when turds hit the fan
  • probably a bunch more stuff

Anyway, feel free to test, fork, contribute, give constructive feedback, …

See y’all again in a month!

4 Likes

Using IPFS seems somewhat at odds with the Aux community’s values; not because of anything inherent to the technology, but because of its background. It is developed by a cryptocurrency company, Protocol Labs, who also develop Filecoin, which drives the development of IPFS. The Filecoin launch also has a dogwhistle (emphasis added):

The Filecoin mainnet will officially begin at epoch 148,888

They claim this means “prosperity for life” in Chinese, but that sounds like an excuse, and I’m not the only one who thinks so.

The IPFS website also boasts Lockheed Martin, a defence contractor, as a major user on the carousel (along with items for Web3, DAOs, and NFTs).

Surely there must be another option for a distributed content-addressed store we could try instead?

5 Likes

The nice thing about @OtherBookmarks’s approach is that the underlying tech/network would be fairly easily pluggable.

In terms of the ethics of what we might use, I suspect it depends on the ultimate goal. For fairly narrow distribution, e.g. between devs and early adopters, we’re obviously free to pick and choose. OTOH, if we’re looking at wider use, then I guess realistically using a bigger, established network would probably be an unavoidable compromise.

BTW, I’ve been meaning to post GitHub - n0-computer/iroh (“A toolkit for building distributed applications”) for a while. I haven’t looked into it deeply, but I think it started off as an IPFS implementation, then went in a slightly different direction. In particular, I think it tries to address object-name lookup, which might be useful.

3 Likes

Note that Veilid is supposed to have a block store soon.

2 Likes

BitTorrent is a much more mature technology compared to IPFS, imo. There are many more mature BitTorrent libraries and clients out there than for IPFS. Just my 2c. I don’t think I would personally adopt any tech coming out of Protocol Labs.

I reckon the best option would be to have useful generic hooks for p2p binary caches that end-users can hook up to their protocol of choice.

2 Likes

Right now the process is extensible in code. The main function for getting a NAR is here.

Basically, the process is:

  • map the SHA256 of the .nar.xz to a value (an IPFS CID, a path to a torrent file, a URL, whatever); this is done by a “mapper”
  • call a backend to retrieve the file from the mapped value

At the moment, there’s only one backend and one mapper implemented. It’s conceivable to create a backend and mapper that call external processes and pipe their output back into nss; it’s something I thought about but forgot to create a ticket for.
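
As a sketch of that idea (entirely hypothetical, not something in nss today): a backend that hands the mapped value to an external program and treats its stdout as the NAR.

use std::process::Command;

// Hypothetical ExecBackend: run an external program with the mapped value
// as its argument and use its stdout as the NAR bytes.
struct ExecBackend {
    program: String,
}

impl ExecBackend {
    fn fetch(&self, mapped: &str) -> std::io::Result<Vec<u8>> {
        let out = Command::new(&self.program).arg(mapped).output()?;
        if out.status.success() {
            Ok(out.stdout)
        } else {
            Err(std::io::Error::new(
                std::io::ErrorKind::Other,
                format!("{} exited with {}", self.program, out.status),
            ))
        }
    }
}

fn main() -> std::io::Result<()> {
    // "./fetch-nar.sh" and the CID are placeholders for illustration.
    let backend = ExecBackend { program: "./fetch-nar.sh".into() };
    let nar = backend.fetch("QmExampleCid")?;
    println!("fetched {} bytes", nar.len());
    Ok(())
}

The same trick would work in the other direction for a mapper that shells out to resolve the hash.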

Contributions are welcome.

2 Likes

I haven’t read everything here, but I have a few notes from my stints at trying to build a distributed, trustless Nix/Aux store:

  • if derivations are content-addressed, this becomes a trivial problem.
  • in my testing IPFS has worked pretty well, but…
  • we probably want something built specifically for Aux; perhaps built on top of libp2p (as IPFS is)?

I’m not sure we need to integrate this into the Aux CLI (auxcpp?) as much as just … provide a module.

{
  services.distributed-store = {
    enable = true;
    # presumably: add the distributed store to this machine's substituters
    addToSubstituters = true;
    # presumably: key used to sign the paths this node serves
    settings.privateKeyFile = ...;
  };
}

Regardless, I absolutely love this concept. Avoiding the huge hosting fees might be essential for the Nix forks to succeed. :slight_smile:

1 Like