[Nix-dev] hotswappable self managing services in nix

Mon Nov 28 11:49:55 CET 2016

For process-level graceful restarts see
https://github.com/zimbatm/socketmaster and https://github.com/pusher/crank
. Those could be integrated into the activation script.

On Mon, 28 Nov 2016 at 09:33 zimbatm <zimbatm at zimbatm.com> wrote:

> Hi Stewart,
>
> In a HA setup availability is generally achieved on a network level
> instead of system level. Typically you would have two hotswappable
> load-balancers that distribute the traffic to multiple instances of your
> service boxes. In that context is doesn't matter how processes are being
> restarted because the load-balancer will automatically detect unresponsive
> machines and route the traffic accordingly. It's also handy because it
> allows to restart the machines in the event where the kernel needs an
> upgrade. In that setup I suppose you can think of each machine as being one
> Erlang OTP "process" and the network the "message-passing".
>
> One responsibility of the service in that setup is to shutdown properly to
> avoid unnecessary disruption of service. Mainly when the process gets the
> SIGTERM signal it should close the listening socket (so the load-balancer
> can route new incoming connections to a different machine) and then drain
> the existing client connection gracefully. It shouldn't stop all at once
> but let the clients disconnect when they are done with their sessions (and
> optionally signal them to go away if the protocol supports it).
>
> A last thing regarding this approach: generally you need a way to control
> the deploys; if all the service boxes are being upgraded at the same time
> then the load-balancer doesn't have anywhere to route the traffic to. It's
> also something desirable to have to do blue/green deployments.
>
> I need to stop there for now but I also have a similar design answer on
> the system level where processes get replaced gracefully.
>
> Cheers,
> z
>
> On Sun, 27 Nov 2016 at 04:33 stewart mackenzie <setori88 at gmail.com> wrote:
>
> 9 9s not unheard of in these circles, Google uptimes are a joke not worthy
> of mention.
>
> There are systems that have been running for some 40 odd years in
> production that factor in changes to legal banking regulations, hardware,
> business logic etc. Erlang has a system called the Ericsson AXD301 which
> has achieved this time frame.
>
> Just because Nixos hasn't been around that long doesn't mean it can't have
> the primitives to allow for such feats. Its these primitives I'm enquiring
> about.
>
> So let's use a new, less controversial figure of 5 9s and keep on topic.
>
> The thing is, we're designing this system so that its governed by nix
> don't necessarily have to depend heavily on the runtime - I really don't
> want to go down the imperative route, by introducing imperative language
> concepts into our declarative language which is managed by another
> declarative language (nix). Besides just bringing in a single component
> with an OS Dependency demands we manage this change from nix level.
>
> We currently have a hack in place, that will resolve dependencies and give
> us a path to load a correctly compiled shared object into memory:
> https://github.com/fractalide/fractalide/blob/master/components/nucleus/find/component/src/lib.rs#L43
> nasty and cringe worthy I know.
>
> Thanks for your pointer, I'll take a look at these activation scripts.
>
> Maybe this hack is the answer, and confine the dynamism to an ssh login al
> a Erlang style...
> _______________________________________________
> nix-dev mailing list
> nix-dev at lists.science.uu.nl
> http://lists.science.uu.nl/mailman/listinfo/nix-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.science.uu.nl/pipermail/nix-dev/attachments/20161128/3b14160d/attachment.html>