Software is Crap

Escape from System D (2)


Episode II: Init versus the service management daemon

I was pleased that my announcement of another in-development init/service manager met with a mostly positive response. I plan to keep making semi-regular posts where I post both general discussion around the issues of service management and progress updates on my own effort, dubbed Dinit.

In this post I will give a little background on init systems and service management generally. I expect a lot of readers will not learn much, since it is already well understood, but it is worth laying out some background for reference in future posts/discussion.

What is “init”?

The init process, traditionally started from /sbin/init on the filesystem, is the first userspace process to launch on the system. As such it is the only process with no parent process. Most (if not all) operating systems give it a process ID of 1, making it easy to identify. There are two special things about the init process:

  1. First, it automatically becomes the new parent of otherwise orphaned processes. In particular processes which “daemonise” themselves by double-forking and letting the intermediate parent die get re-parented to the init process.
  2. If the init process terminates, for any reason, the kernel panics (so the whole system crashes).

The second point is in fact not necessarily true – it just so happens that, at least on Linux, if the init process dies then the system dies with it. I am not sure how the various *BSD systems react, but in general, it is not expected that the init process will terminate. This means that it is very, very important that the init process does not crash. However, the first point above has some implications as well, which we’ll get to shortly.

Notionally, the init system has two jobs: to reap its child processes when they have terminated (this is accomplished using the wait system call or one of its variants; reaping a terminated process ensures that its resources are freed and that it is no longer listed in the process table of the system) and also to start up the system, which it can potentially do just by running another process. An init may also be involved in the system shutdown process as well, though strictly speaking that’s not necessary.

You might be interested in Rich Felker’s example of a minimal init system, which is part of one of his blog posts (where he also discusses Systemd). It’s less than a screenful of text – small enough that it can be “obviously bug free” – a nice attribute to have for an init, for reasons outlined above.

So what is a “service manager”?

A service manager provides, at the most basic level, a means for stopping and starting individual services. Services quite typically run as a process – consider for example the ssh server daemon, sshd – but sometimes exist in some other form; having the network connection(s) up and operational, for example, could be enacted by means of a service. Typical modern systems have a service manager which is either started from the init process or incorporated in it (Systemd is an example of an init process which incorporates service management functionality, but there are various others which do the same).

Aside from just an interface to starting and stopping services, service managers may provide:

Since a service manager is naturally somewhat more complex than a standalone init system, it should be obvious that incorporating the two in one process has some inherent risks. If an init system terminates unexpectedly, the whole system will generally crash; not only is this inconvenient for the user, but it also makes analysing the bug that caused the crash more difficult.

Why combine them, then?

The obvious question: if it’s better to keep init as simple as possible, why does it get combined with service management? One reason is so that double-forking processes, which have re-parented to the init process, can be supervised; normal POSIX functions only allow receiving status notifications for direct child processes. (Various *BSDs support watching arbitrary process status via the kqueue system calls, but the interface has flaws – that I will perhaps discuss another time – and anyway, any mechanism to watch a non-immediate-child process by process ID, without co-ordination with the parent process, is prone to a race condition: at least in theory, a process with a given ID can die, and be reaped, and the process ID can be recycled, in between some other process discovering the process ID and setting up a watch for it or even worse sending a termination signal in an attempt to shut down a service).

Now we could just about argue that no service should double-fork, and this is eliminates any need for the service manager to run as the init process (PID 1). However, we can’t actually prevent processes from double-forking; on the other hand, there is a mechanism – at least on Linux – called cgroups, which allows for tracking process origin even through double-fork. Importantly, this can be used to track processes belonging to particular user sessions. One operation that we might naturally want to perform to a cgroup is to terminate it – or rather, terminate all processes in the cgroup – and this, once again, is racy unless we can co-ordinate with the parent process(es) of all processes in the cgroup (and by “coordinate” I mean that we want to prevent the parent process from reaping child processes which have terminated, to avoid the race where a process ID is recycled and the wrong process is then terminated, as described above),

(Some other systems might have functionality similar to cgroups – I have FreeBSD jails in mind, though I need to do some research to understand exactly how jails work and their limitations, and in particular if they also suffer the termination race problem described above).

So, for supervising double-forked processes, and for controlling user sessions, having control of the PID 1 (init) process is important for a service manager. However, there’s a hint in what hasn’t been said: while we may need co-operation between the init process and the service manager, it’s not absolutely necessary that they are the same process. One of the ideas I’d like to investigate with Dinit is whether we can keep a very simple init process and a separate, more complex, service manager / supervisor.

Dinit progress

For the most part reactions to my announcement of Dinit were positive. One comment on Reddit wondered how I was going to be able to achieve a “solid-as-a-rock stable” system using a non-memory-safe language (C++) and without having any tests. Of course, this wasn’t quite correct; I have always had tests for Dinit, but they were not automated. One thing that I’ve done since my initial announcement is implement a small number of automated tests (that you can run using “make check”). I plan to write many more tests, but this feels like a good start. I’ll discuss the reasons for using C++ at some point, but it needs to borne in mind that while C++ is not memory-safe it is still perfectly possible to write stable software in such a language; it just takes a little more effort!

I’ve also done a little refactoring, solved one or two minor bugs, and improved the man pages. My TODO list is slowly getting smaller and I think Dinit is approaching the stage where it can be considered a high-quality service manager, though it is a way off from being a full replacement for Systemd.

Please feel free to comment below and/or check out the source code on the Github repository.