[Nix-dev] Using Nix for deterministic and verifiable builds

Michael Raskin 7c6f434c at mail.ru
Mon Nov 17 07:06:39 CET 2014


>> Deterministic output means that something has to be done about
>> profile-guided optimisation. It is a subject of an old discussion in our
>> community as some people would like bit-perfect builds and others think
>> that PGO improves performance significantly and should be used.
>> In our case the main discussion is about letting GCC PGO itself
>> during bootstrap.
>
>Yes, PGO is a nuisance for Mozilla as well. We ship PGOd Firefox because 
>it matters for performance. Tor (which I forgot to mention is based on 
>Firefox) has disabled PGO, sacrificing raw speed for 
>privacy/security/trust. There is a debate of sorts on whether Mozilla 
>should offer a non-PGOd Firefox for the crowd that cares about these 
>things. Personally, I'm holding out hope that we can save and distribute 
>the PGO profile and use the same profile in distributed environments to 
>achieve the same binaries. AFAIK that hasn't been proved either way. 

Can you profile FF under a specific load and then give this to PGO? Can
you create two different PGO profiles and give them to the same build to
get FF instances with measurably different optimisation?

If both are possible, then there is probably no problem.

>Whether this is possible or whether tools exist to sufficiently audit 
>the PGO profile for malicious intent are very good questions that need 

If GCC were small enough to work closer to its own intentions, and had
no bugs like 
https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=view&target=regehr_gcc_summit_2010.pdf
I think even a malicious PGO profile would just worsen the performance.
Of course, in the real world there is a risk of an attack on GCC that
uses a weird PGO profile just to trigger an overflow and arbitrary code
generation.

>answered. Debian and others are on a big deterministic builds kick right 
>now, so I hold out hope that smart people will find a way to make PGO 
>and trust work together.
>
>>> I *think* you can facilitate determinism over time by providing your own
>>> channel or version control repository of Nix expressions. Then, you tell
>>> people "checkout version X and install the 'firefox-build-env' package."
>>> You attain trust via regular auditing of the Nix expressions. Is it
>>> really this simple? How do you achieve full isolation of a Nix
>>> expressions "database" from other installed channels/sources? Is having
>>> a version controlled repository of Nix expressions that can be used to
>>> derive the same (hopefully identical) packages over time something that
>>> people do? e.g. if I check out a copy of the Nix expressions from a year
>>> ago and realize the "firefox-build-env" package, I should compile the
>>> environment as it was a year ago, right?
>>
>> Yes, that's right.
>>
>> Let me summarize my impression of what Nix reliably provides.
>>
>> 1) If you check out an old version of the package instruction
>> repository, exact same build instructions will be generated. The same
>> tarballs will be fetched from the same lists of mirrors and verified
>> against the same checksums, then unpacked and built with the same
>> patches applied and the same configure flag order etc.
>
>Great to have confirmation of this!
>
>> Tarball availability is an issue here; I guess Mozilla can solve this
>> part of the problem for all the tarballs needed for Firefox bootstrap
>> build.
>
>Is there a way to insert your own mirror without changing the fetchurl 
>{} in the .nix file? Isn't there some magic where the source inputs get 
>realized and can be fetched from a binary cache, just like the outputs?

You can rewrite the fetchurl expression to add mirrors without changing
output hashes (any way to fetch a file with the correct sha256/sha1 is OK).

The most realistic ways are: NIX_HASHED_MIRRORS variable when starting 
the Nix daemon or non-daemon Nix build (to set your set of 
hash-addressed mirrors), or just adding your own mirror to mirrors.nix
as hash-addressed mirrors.
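
To illustrate why an extra mirror cannot change the result, here is a
toy Python sketch of hash-verified fetching. It is not the actual
fetchurl code: the function names and the callables standing in for
mirrors are hypothetical, and the <mirror>/sha256/<hash> URL layout is
my assumption about how a hash-addressed mirror is laid out.

```python
import hashlib
from typing import Callable, Iterable, Optional


def hashed_mirror_url(mirror: str, sha256_hex: str) -> str:
    # Assumed layout of a hash-addressed mirror: <mirror>/sha256/<hash>.
    return f"{mirror.rstrip('/')}/sha256/{sha256_hex}"


def fetch_verified(sources: Iterable[Callable[[], Optional[bytes]]],
                   expected_sha256: str) -> bytes:
    """Try each source in order; accept the first whose content matches."""
    for fetch in sources:
        data = fetch()
        if data is None:
            continue  # this source is unavailable, try the next one
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data  # where the bytes came from no longer matters
    raise ValueError("no source produced content with the expected hash")


tarball = b"pretend this is a source tarball"
want = hashlib.sha256(tarball).hexdigest()
sources = [
    lambda: None,                 # upstream mirror is down
    lambda: b"tampered content",  # wrong bytes are rejected
    lambda: tarball,              # a mirror you added yourself
]
assert fetch_verified(sources, want) == tarball
```

The point is only that the expected hash, not the list of mirrors,
decides what is accepted.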

>> Also there is a problem of «trusting trust» and bootstrap build; our
>> basic build environment is built from a binary set of bootstrap tools,
>> which is the checksummed build environment you want to avoid.
>>
>> To solve this, there is a procedure of building these bootstrap tools
>> and if you disable all non-determinism it should converge to the same
>> tarball when you take two acceptable bootstrap toolsets, build the
>> entire working environments from these, and use them to generate two new
>> sets of bootstrap tools.
>
>Yup. And it's turtles all the way down to silicon. "How do you know the 
>NSA didn't backdoor your compiler through adjustment to Intel's CPUs?"

Hey, that part is simpler! You build Qemu-x86 on different Chinese ARM
chips using differing starting compilers, bootstrapped by TCC and PCC…

You would still need perfect convergence from different GCC versions
first, but it is achievable (I hope).
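
A toy Python model of the convergence check, in case it helps. All of
it is hypothetical: a "compiler" is just a byte string, and toy_build
stands in for a fully deterministic build whose output depends only on
the source and on the behaviour of the compiler that built it.

```python
import hashlib
from typing import Callable

# (compiler, source) -> newly built compiler
Build = Callable[[bytes, bytes], bytes]


def digest(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()


def converges(seed_a: bytes, seed_b: bytes, source: bytes,
              build: Build, max_rounds: int = 3) -> bool:
    """Rebuild the toolchain from two seeds until the outputs are identical."""
    a, b = seed_a, seed_b
    for _ in range(max_rounds):
        a, b = build(a, source), build(b, source)
        if digest(a) == digest(b):
            return True
    return False


def toy_build(compiler: bytes, source: bytes) -> bytes:
    # The part before "|" is how the compiler behaves; the output records
    # which behaviour produced it, like codegen differences in a real binary.
    behaviour = compiler.split(b"|")[0]
    return source + b"|built-by:" + behaviour


# Stage 1 differs (built by tcc vs. pcc); stage 2 is built by two
# compilers that both behave like gcc, so the outputs coincide.
assert converges(b"tcc|built-by:vendor", b"pcc|built-by:vendor",
                 b"gcc", toy_build)
```

Any leftover non-determinism in the build (timestamps, build paths)
would keep the two chains from ever meeting, which is why it has to be
removed first.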

>Diverse double compilation and other tricks to help defeat poisoned 
>binary bootstrap tools is certainly on the TODO list. However, it's so
>far down the list of priorities for us right now that it barely 
>registers. Furthermore, I don't believe Tor/Gitian is doing anything
>much more creative than Nix and I don't believe people are sweating over 
>it. Perhaps they should be. We need to walk before we can run and 
>terrific toolchain trust can wait.

Here I just wanted to show that our trusting-trust situation is not much
worse than in any other situation, because it is very disappointing to 
invest in some technology to do 80% of the work and find out it stops 
you from doing the second 80%.

>> As for irrelevant-for-the-job tools:
>>
>> Binary caches say «if you perform build X you get output Y anyway, you
>> can save the time and trust me». You will probably want to only use
>> binary caches under your complete control. Hopefully some of these
>> would also serve the toolchain and library chain to the world…
>
>I should have mentioned that there are 2 consumers of deterministic 
>builds of Firefox: the privacy/security camp and developers. We want 
>developers to have access to the same build environment so they can get 
>local builds that behave just like the official ones. This also allows 

In both cases you will want to control the binary cache you recommend.

Also, you would probably be the first large binary cache to use
signatures consistently (there is some support, but it is mostly unused
so far).

>us to create a globally distributed "ccache" to speed up compilation. 
>(Quick note: Mozilla built an S3-backed version of ccache - 
>https://github.com/glandium/sccache).

For obvious reasons, ccache support is a complex topic in NixPkgs (there
is some, though). On the other hand, with Nix you could try to
reasonably split different parts of the Firefox build into different Nix
«derivations», so the same thing that lets you avoid rebuilding glibc
each time would allow you to rebuild only some of the subdirs.
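
Roughly how that works, as a Python toy rather than real Nix: each
component gets a key hashed from its own source plus the keys of its
inputs, and only components whose key changed need a rebuild. The
component names and the dependency graph are made up for illustration.

```python
import hashlib
from typing import Dict, List


def derivation_key(name: str, sources: Dict[str, bytes],
                   deps: Dict[str, List[str]], cache: Dict[str, str]) -> str:
    """Hash a component's own source together with the keys of its inputs."""
    if name not in cache:
        h = hashlib.sha256()
        h.update(name.encode() + b"\0" + sources[name] + b"\0")
        for dep in sorted(deps.get(name, [])):
            h.update(derivation_key(dep, sources, deps, cache).encode())
        cache[name] = h.hexdigest()
    return cache[name]


def all_keys(sources: Dict[str, bytes],
             deps: Dict[str, List[str]]) -> Dict[str, str]:
    cache: Dict[str, str] = {}
    return {n: derivation_key(n, sources, deps, cache) for n in sources}


def stale(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    # Components whose key changed and therefore need rebuilding.
    return sorted(n for n in before if before[n] != after[n])


deps = {"js": ["nspr"], "gfx": ["nspr"], "browser": ["js", "gfx"]}
old = {"nspr": b"v1", "js": b"v1", "gfx": b"v1", "browser": b"v1"}
new = dict(old, gfx=b"v2")  # only the gfx subdir was edited

print(stale(all_keys(old, deps), all_keys(new, deps)))
# prints ['browser', 'gfx']
```

Everything with an unchanged key (nspr and js here) can come straight
from the store or from a binary cache.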

>I mention this because the developer group likely only interfaces with 
>binary caches. We want the overhead to obtain the build environment to 
>be low. This likely means fetching pre-built Nix packages or packaging 
>up the full environment in a chroot/Docker container, etc. Trust 
>verifiers, however, are likely doing everything from source. I like Nix 
>because it placates both groups.

By the way, with Nix you can give people a closure: «we know you need
75% of the things here, so here is one static file for you that you can
nix-store --import without fetching thousands of small packages».
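
A closure here is just everything reachable through references from the
paths you start with. A small Python sketch of that computation; the
package names and the reference graph are invented, and real store
paths would carry hashes.

```python
from typing import Dict, List, Set


def closure(roots: List[str], references: Dict[str, List[str]]) -> Set[str]:
    """Return the roots plus everything they reference, transitively."""
    seen: Set[str] = set()
    stack = list(roots)
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        stack.extend(references.get(path, []))
    return seen


references = {
    "firefox-build-env": ["gcc", "python"],
    "gcc": ["glibc"],
    "python": ["glibc"],
    "glibc": [],
    "unrelated-editor": ["glibc"],
}

print(sorted(closure(["firefox-build-env"], references)))
# prints ['firefox-build-env', 'gcc', 'glibc', 'python']
```

On the Nix side, nix-store -qR lists such a closure and
nix-store --export writes the listed paths into one file, which is what
the recipient then feeds to nix-store --import.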

>> One of the problems is when some debug information gets leaked into the
>> final package and it contains build directory paths. We want them to
>> correspond to external paths for easier debugging, and so they may leak
>> information. With minimal care they turn out to be completely
>> deterministic, though.
>>
>> Many issues are discussed in https://github.com/NixOS/nixpkgs/pull/2281
>> which was already mentioned.
>
>In case you haven't seen it, https://wiki.debian.org/ReproducibleBuilds 
>has a great overview of all the problems and solutions Debian is running 
>into.

I think we used it as a reference at some point. I gave you a link that 
also shows the current Nix-side acceptance/non-acceptance of various
solutions.

>> 3) Deterministically producing a Firefox intended for non-NixPkgs
>> installation (i.e. not referring to /nix/ for libraries) is a separate
>> question. I would say that it is quite easy: we have a few chroot
>> generators, so Firefox can simply be built — or just get its ELF header
>> edited — inside a chroot mimicking the layout you want by bind-mounting
>> /nix/ and then simply symlinking the libraries into /usr/lib. Here you
>> won't get any new determinism problems (or any non-trivial problems at
>> all).
>
>Ooh, I didn't know about the different chroot generators. I'll have to 
>go source diving :)

Well, as I understand it, Firefox does most things via relative paths 
anyway? So even without the chroot you just need to rewrite the 
ld-linux*.so* path and allow /usr/lib/ usage? We have tools for that 
part already (we mostly use them in the other direction: to transform
a Mozilla Firefox binary release into a Nix-friendly binary).

>We already hack ELF headers and do custom debug symbol processing, so 
>that's well within the realm of possibility.




