[Nix-dev] Help needed: patching Hydra to retry failed builds after a while

Rob Vermaas rob.vermaas at gmail.com
Wed Jan 8 16:56:54 CET 2014


Hi,

Hydra doesn't really deal with this itself; it is just a Nix setting (see
'man nix.conf') that we use, which caches build failures:
   build-cache-failure = true

Perhaps you can create an issue in the nix issue tracker
(https://github.com/NixOS/nix/issues) requesting a feature that would
implement what you are suggesting.
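
To make the requested behaviour concrete, here is a minimal Python sketch
of a failure cache whose entries expire after a TTL. It is not how Nix
stores cached failures; the class name FailureCache, the ttl_seconds
parameter and the one-week default are assumptions, taken only from the
"retry at least once a week" idea in the quoted ticket.

```python
import time


class FailureCache:
    """Remembers failed derivations, but only for ttl_seconds (hypothetical)."""

    def __init__(self, ttl_seconds=7 * 24 * 3600, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock        # injectable so expiry can be tested
        self._failed_at = {}      # derivation path -> time of last failure

    def record_failure(self, drv_path):
        # Store when the build failed; a later failure refreshes the entry.
        self._failed_at[drv_path] = self.clock()

    def is_cached_failure(self, drv_path):
        failed_at = self._failed_at.get(drv_path)
        if failed_at is None:
            return False
        if self.clock() - failed_at >= self.ttl_seconds:
            # Entry expired: forget it so the build is attempted again.
            del self._failed_at[drv_path]
            return False
        return True
```

Expired entries are dropped lazily on lookup, so a sketch like this needs
no background job; whether that fits Nix's actual design is an open
question for the issue.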

In the meantime we can restart these builds.

Cheers,
Rob

On Wed, Jan 8, 2014 at 4:53 PM, Gergely Risko <gergely at risko.hu> wrote:
> Hi,
>
> (Hungarian version below.)
>
> Yes, you're right, at least in part.
>
> I actually checked
> http://hydra.nixos.org/job/nixpkgs/trunk/haskellPackages_ghc763_profiling.pipesParse.i686-linux/all,
> where you can see that a lot of the failures are cached failures.  But
> you're right, the hash of GHC does change sometimes.  Your GHC Hydra
> link actually shows this perfectly, because all of those are different
> attempts, not cached failures of the same attempt.
>
> If you check the dates carefully:
> Dependency failed       7819290         ghc-7.6.3-wrapper       i686-linux      2014-01-07 22:44:31
> Dependency failed       7634985         ghc-7.6.3-wrapper       i686-linux      2014-01-04 14:13:33
> Dependency failed       7395570         ghc-7.6.3-wrapper       i686-linux      2013-12-26 19:41:49
> Dependency failed       6662085         ghc-7.6.3-wrapper       i686-linux      2013-10-27 15:36:05
>
> 2013-10-27 -> 2013-12-26 is 2 months without a retry.  This seems
> unacceptable to me for an out-of-memory error.  In this particular case
> you're right that retrying didn't help and the Hydra admins have to fix
> the machine somehow.  (I will write a separate mail about that.)  But in
> general, retries currently don't happen and we have to wait for a change
> in the nixpkgs git repository to trigger a new build.  That's what I
> want to change.
>
> About building GHC for i686 locally: I haven't tried it.  I will try to
> make some time for it, but I'm pretty sure that it's a memory limitation
> on the machine, so let's ping the admins! :)
>
> Thanks,
> Gergely
>
> -=-
>
> Hello,
>
> Yes, you're right, at least in part.
>
> I was looking at
> http://hydra.nixos.org/job/nixpkgs/trunk/haskellPackages_ghc763_profiling.pipesParse.i686-linux/all
> where you can see that a good part of the failures are cached failures;
> at the same time you're right, the hash of ghc changes sometimes.  Your
> GHC Hydra link is very good in this respect, because it shows only the
> real attempts, not the cached failures.
>
> If you look at the dates closely:
> Dependency failed       7819290         ghc-7.6.3-wrapper       i686-linux      2014-01-07 22:44:31
> Dependency failed       7634985         ghc-7.6.3-wrapper       i686-linux      2014-01-04 14:13:33
> Dependency failed       7395570         ghc-7.6.3-wrapper       i686-linux      2013-12-26 19:41:49
> Dependency failed       6662085         ghc-7.6.3-wrapper       i686-linux      2013-10-27 15:36:05
>
> 2013-10-27 to 2013-12-26 is 2 months without a retry.  This is what I
> find unacceptable for a transient out-of-memory condition.  In this
> case you're right that a retry doesn't help either and the Hydra admins
> have to fix the machine.  (I'll write a separate mail in a moment.)  But
> in general, a retry currently happens only if we wait for someone to
> change the nixpkgs git.  This is what I would like to change.
>
> Regarding the local GHC i686 build: I haven't tried it, I will when I
> have a bit of time.  But I'm fairly sure that the error message is
> genuine and the machine simply ran out of memory, so I'll tell the
> admins!
>
> Regards,
> Gergő
>
> On Wed, 8 Jan 2014 10:34:50 -0500, Thomas Bereknyei <tomberek at gmail.com> writes:
>
>> Good day,
>>
>> I'm willing to help you with working on Hydra, it's something I want
>> to get involved with. I'm not sure who is in charge of Hydra
>> administration.
>>
>> From just a cursory look, it seems that the error is not transient, or
>> at least it consistently breaks [1] with:
>>
>>> ghc-stage1: out of memory (requested 1048576 bytes)
>>
>> [1] http://hydra.nixos.org/job/nixpkgs/trunk/haskellPackages.ghc.i686-linux
>>
>> Have you tried building it locally? I do not have an i686 machine at the moment.
>>
>> And I would like to practice writing Hungarian.
>>
>> -Tom
>>
>> On Wed, Jan 8, 2014 at 9:50 AM, Gergely Risko <gergely at risko.hu> wrote:
>>> Hi,
>>>
>>> Happy new year to all the Nixers around here!
>>>
>>> In https://github.com/NixOS/hydra/issues/139 I reported the following issue:
>>>
>>>> It happens quite frequently that some build breaks with a transient
>>>> failure on some Hydra machine. The most recent example is GHC on
>>>> i686. The only solution in these situations is to whine on the mailing
>>>> list and hope that some hydra admin will restart the failed build.
>>>>
>>>> It'd be much better to have a TTL for negative build caching and retry
>>>> failed builds e.g. every week at least once even if the derivation
>>>> didn't change. That would ensure that transient errors get fixed even
>>>> without manual intervention.
>>>
>>> Since I received no comments on the ticket, may I ask for opinions here?
>>> Is this a good idea to do?  If yes, can someone with actual coding and
>>> design experience with hydra help me please?
>>>
>>> Are there any design decisions to make?  Can someone point me to the
>>> relevant parts of the codebase and give a little bit of an overview what
>>> I have to do to achieve this goal?  I'd be happy to figure out the
>>> details and prepare a patch of course.
>>>
>>> Because of this I haven't been able to update my Haskell machines on
>>> i686 for the last 4 months, and constantly pinging the Hydra admins
>>> seems to be a waste of everybody's time if this can be automated.
>>>
>>> Thanks,
>>> Gergely
>>>
>>> _______________________________________________
>>> nix-dev mailing list
>>> nix-dev at lists.science.uu.nl
>>> http://lists.science.uu.nl/mailman/listinfo/nix-dev



-- 
Rob Vermaas

[email] rob.vermaas at gmail.com

