[Nix-dev] hotswappable self managing services in nix

Mon Nov 28 10:33:02 CET 2016

Hi Stewart,

In an HA setup, availability is generally achieved at the network level
rather than the system level. Typically you would have two hotswappable
load-balancers that distribute the traffic to multiple instances of your
service boxes. In that context it doesn't matter how processes are being
restarted because the load-balancer will automatically detect unresponsive
machines and route the traffic around them. It's also handy because it
allows you to restart the machines when the kernel needs an upgrade. In
that setup I suppose you can think of each machine as being one Erlang OTP
"process" and the network as the "message-passing".
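
To make the "detect unresponsive machines and route around them" part
concrete, here is a minimal sketch in Python. The Balancer class, the
is_alive probe and the backend addresses are all hypothetical names made up
for illustration; a real load-balancer probes in the background rather than
on every request, so treat this as a toy model only.

```python
import itertools
import socket


def is_alive(host, port, timeout=1.0):
    """Assumed health check: a backend is alive if it accepts a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class Balancer:
    """Round-robin over backends, skipping any that fail the probe."""

    def __init__(self, backends, probe=is_alive):
        self.backends = list(backends)
        self.probe = probe
        self._next = itertools.cycle(self.backends)

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            host, port = next(self._next)
            if self.probe(host, port):
                return (host, port)
        return None  # every backend is down
```

If a box is restarted (process or kernel), pick() simply stops returning
it until the probe succeeds again.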

One responsibility of the service in that setup is to shut down properly to
avoid unnecessary disruption of service. Mainly, when the process gets the
SIGTERM signal it should close the listening socket (so the load-balancer
can route new incoming connections to a different machine) and then drain
the existing client connections gracefully. It shouldn't stop all at once
but let the clients disconnect when they are done with their sessions (and
optionally signal them to go away if the protocol supports it).
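
The shutdown sequence described above could look roughly like this with
Python's asyncio. The names run_service, echo, grace and bound_port are
hypothetical, the echo protocol and port 8080 are placeholders, and the
30-second grace period is an assumption rather than a recommendation.

```python
import asyncio
import signal


async def echo(reader, writer):
    """Placeholder session: echo lines until the client hangs up."""
    while data := await reader.readline():
        writer.write(data)
        await writer.drain()


async def run_service(handler, stop, host="127.0.0.1", port=0,
                      grace=30.0, bound_port=None):
    sessions = set()  # tasks for connections that are still open

    async def on_connect(reader, writer):
        task = asyncio.current_task()
        sessions.add(task)
        try:
            await handler(reader, writer)
        finally:
            sessions.discard(task)
            writer.close()

    server = await asyncio.start_server(on_connect, host, port)
    if bound_port is not None:
        bound_port.set_result(server.sockets[0].getsockname()[1])

    await stop.wait()   # set by the SIGTERM handler
    server.close()      # 1. stop listening; existing sessions stay open
    pending = set()
    if sessions:        # 2. let clients finish, up to the grace period
        _, pending = await asyncio.wait(sessions, timeout=grace)
        for task in pending:
            task.cancel()  # 3. cut off whoever is still connected
    return len(pending)


async def main():
    stop = asyncio.Event()
    # add_signal_handler is only available on Unix event loops.
    asyncio.get_running_loop().add_signal_handler(signal.SIGTERM, stop.set)
    await run_service(echo, stop, port=8080)
```

Starting it with asyncio.run(main()) and sending the process SIGTERM
should make new connections fail immediately while open sessions keep
working until the clients leave or the grace period runs out.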

A last thing regarding this approach: generally you need a way to control
the deploys; if all the service boxes are being upgraded at the same time
then the load-balancer doesn't have anywhere to route the traffic to. That
kind of control is also what you need if you want to do blue/green
deployments.
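
One way to control the deploys is a rolling upgrade: only a bounded number
of boxes are taken out at a time, and the rollout stops if a box doesn't
come back healthy. The sketch below is a toy model of that idea; the
rolling_upgrade function and its upgrade/healthy callbacks are hypothetical
and stand in for whatever deploy and health-check tooling you actually use.

```python
def rolling_upgrade(machines, upgrade, healthy, max_unavailable=1):
    """Upgrade machines in batches so some capacity always stays in service."""
    if not 0 < max_unavailable < len(machines):
        # Upgrading everything at once would leave the load-balancer
        # with nowhere to route traffic.
        raise ValueError("max_unavailable must leave at least one machine up")

    for start in range(0, len(machines), max_unavailable):
        batch = machines[start:start + max_unavailable]
        for machine in batch:
            upgrade(machine)
        failed = [m for m in batch if not healthy(m)]
        if failed:
            # Stop here: the machines not yet touched keep serving traffic.
            raise RuntimeError(f"upgrade failed on {failed}, rollout halted")
```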

I need to stop there for now but I also have a similar design answer on the
system level where processes get replaced gracefully.

Cheers,
z

On Sun, 27 Nov 2016 at 04:33 stewart mackenzie <setori88 at gmail.com> wrote:

> 9 9s not unheard of in these circles, Google uptimes are a joke not worthy
> of mention.
>
> There are systems that have been running for some 40 odd years in
> production that factor in changes to legal banking regulations, hardware,
> business logic etc. Erlang has a system called the Ericsson AXD301 which
> has achieved this time frame.
>
> Just because NixOS hasn't been around that long doesn't mean it can't have
> the primitives to allow for such feats. It's these primitives I'm enquiring
> about.
>
> So let's use a new, less controversial figure of 5 9s and keep on topic.
>
> The thing is, we're designing this system so that it's governed by nix and
> doesn't necessarily have to depend heavily on the runtime - I really don't
> want to go down the imperative route, by introducing imperative language
> concepts into our declarative language which is managed by another
> declarative language (nix). Besides, just bringing in a single component
> with an OS dependency demands we manage this change from the nix level.
>
> We currently have a hack in place, that will resolve dependencies and give
> us a path to load a correctly compiled shared object into memory:
> https://github.com/fractalide/fractalide/blob/master/components/nucleus/find/component/src/lib.rs#L43
> nasty and cringe worthy I know.
>
> Thanks for your pointer, I'll take a look at these activation scripts.
>
> Maybe this hack is the answer, and confine the dynamism to an ssh login à
> la Erlang...