[Nix-dev] If remote build machine is not reachable build jobs hang. (zeroconf ?)

Marc Weber marco-oweber at gmx.de
Tue Apr 27 21:55:00 CEST 2010


This is very annoying, because as long as a configured build machine is
unreachable you can't even rebuild NixOS without patching, can you?

So what about introducing a new environment variable, NIX_REMOTE_PING_TIMEOUT,
which makes the build-remote.pl script ping a machine before sending it a
build request?

I've written this patch (without testing yet). What do you think about it?

diff --git a/scripts/build-remote.pl.in b/scripts/build-remote.pl.in
index 2afa3af..170d086 100755
--- a/scripts/build-remote.pl.in
+++ b/scripts/build-remote.pl.in
@@ -53,6 +53,10 @@ mkdir $currentLoad, 0777 or die unless -d $currentLoad;
 my $conf = $ENV{"NIX_REMOTE_SYSTEMS"};
 decline if !defined $conf || ! -e $conf;
 
+# If $ENV{"NIX_REMOTE_PING_TIMEOUT"} is set, ping hosts before asking them to
+# build a package. Set it to 0.5 for a timeout of 500ms.
+my $ping_timeout = $ENV{"NIX_REMOTE_PING_TIMEOUT"};
+
 my $canBuildLocally = $amWilling && ($localSystem eq $neededSystem);
 
 
@@ -60,19 +64,32 @@ my $canBuildLocally = $amWilling && ($localSystem eq $neededSystem);
 my @machines;
 open CONF, "< $conf" or die;
 
+# Only create a pinger when a timeout was requested.
+my $pinger;
+if ($ping_timeout) {
+    require Net::Ping;
+    $pinger = Net::Ping->new();
+}
+
 while (<CONF>) {
     chomp;
     s/\#.*$//g;
     next if /^\s*$/;
     /^\s*(\S+)\s+(\S+)\s+(\S+)\s+(\d+)(\s+([0-9\.]+))?\s*$/ or die;
+
+    # If the other machine can't be reached, don't add it to the list.
+    next if $pinger && !$pinger->ping($1, $ping_timeout);
+
     push @machines,
         { hostName => $1
         , systemTypes => [split(/,/, $2)]
         , sshKeys => $3
         , maxJobs => $4
         , speedFactor => 1.0 * ($6 || 1)
         , enabled => 1
         };
 }
 
+$pinger->close if $pinger;
+
 close CONF;
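

To make the idea easier to try outside of build-remote.pl, here is a small
standalone sketch of the same filtering step. It is not part of the patch and
is only lightly reasoned through, not run against a real build farm. The sub
names (parse_machine_line, ping_target, reachable_machines, net_ping_check)
and the example host names are made up for illustration. It also assumes that
entries in the machines file may be written as user@host, in which case only
the host part should be pinged; if your file contains bare host names the
stripping is a no-op.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one line of the machines file, using the same field layout as the
# patch above. Returns a hash ref, or nothing for blank/comment-only lines.
sub parse_machine_line {
    my ($line) = @_;
    chomp $line;
    $line =~ s/\#.*$//;
    return if $line =~ /^\s*$/;
    $line =~ /^\s*(\S+)\s+(\S+)\s+(\S+)\s+(\d+)(\s+([0-9\.]+))?\s*$/
        or die "cannot parse machine line: $line\n";
    # Copy the captures right away so later regexes cannot disturb them.
    my ($host, $systems, $keys, $jobs, $speed) = ($1, $2, $3, $4, $6);
    return {
        hostName    => $host,
        systemTypes => [split(/,/, $systems)],
        sshKeys     => $keys,
        maxJobs     => $jobs,
        speedFactor => 1.0 * ($speed || 1),
        enabled     => 1,
    };
}

# Assumption: hostName may be "user@host"; ping only the host part.
sub ping_target {
    my ($hostName) = @_;
    (my $host = $hostName) =~ s/^.*\@//;
    return $host;
}

# Keep only the machines for which $is_reachable->($host) returns true.
sub reachable_machines {
    my ($machines, $is_reachable) = @_;
    return [ grep { $is_reachable->(ping_target($_->{hostName})) } @$machines ];
}

# Reachability check backed by Net::Ping (default protocol is "tcp").
sub net_ping_check {
    my ($timeout) = @_;
    require Net::Ping;
    my $p = Net::Ping->new();
    return sub { my ($host) = @_; return $p->ping($host, $timeout) ? 1 : 0 };
}

# Demo with a stubbed check, so nothing touches the network here.
my @lines = (
    '# build machines',
    'nix@up.example i686-linux /root/.ssh/id_buildfarm 2 1.5',
    'nix@down.example x86_64-linux,i686-linux /root/.ssh/id_buildfarm 4',
);
my %up       = ('up.example' => 1);
my @machines = map { parse_machine_line($_) } @lines;
my $kept     = reachable_machines(\@machines, sub { $up{ $_[0] } });
print "using $_->{hostName}\n" for @$kept;    # prints "using nix@up.example"
```

For a real check, pass net_ping_check($ENV{"NIX_REMOTE_PING_TIMEOUT"}) instead
of the stub. Keeping the check as a code ref means the ping could later be
replaced by something else (an ssh connect test, for instance) without
touching the parsing.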


Maybe there is a much better way:
use some kind of zeroconf mechanism (Avahi?) to find available (trusted) build
machines automatically?

Has anyone thought about this?

Marc Weber


