Escape from System D, episode V

Well, yes, I’m still working on Dinit, my portable and “lightweight” service manager, intended as an alternative to Systemd. The first commit was on August 27, 2015 – just under three years ago – and my first announcement about Dinit on this blog was on June 14 last year. In looking up these dates, I’m surprised myself: I was working on Dinit for almost two years before I wrote the introductory blog post! It didn’t feel like that long, but it goes to show how long these things can take (when you’re working as a one-man development team in your spare time).

I recently issued a new release – 0.2.0, still considered alpha – with some new features (and bugfixes), and am planning a 0.3.0 release soon, but progress certainly has been slow. On the other hand, things really have come a long way, and I’m looking forward to being able to call the software “beta” rather than “alpha” at some point soon (though I suppose it’s an open question whether those terms really mean much anymore). One year on from that first announcement seems like a good time for a retrospective, so here it is; I’ll discuss a number of things that occur to me about the experience of developing some non-trivial software as a lone developer.

On software quality

One thing that’s always bothered me about open-source projects, although it’s not universally true, is that the quality isn’t always that great. There are a huge number of half-done software projects out there on GitHub (for example), but more importantly there are also a large number of 95% done projects – where they are basically working, but have a number of known bugs which have been sitting in the issue tracker for a year or more, and the documentation is mostly correct but a bit out-of-date and some of the newer features aren’t mentioned at all. Build documentation is often seen as optional; you can always “just run ./configure --help”, though of course it’s not entirely clear what all the options do or how they affect the result, and in my experience the chance that a configure script correctly checks for all the required dependencies is pretty low anyway.

Take the source of any major project, even an established one, and do a search for “TODO” and “XXX”, and the results are often a little disturbing. I try to avoid those in Dinit, though to be fair the count is not zero. There are some in Dasynq (the event-loop library which I’ve also released separately), and some in Dinit’s utility programs (dinitctl and shutdown), but at least there are none in the Dinit core daemon code. But keeping it that way means consistently going back over the code and fixing the things that are marked as needing fixing – or just avoiding creating such holes in the first place. By the time I release version 1.0 I’d like to have no TODO comments in any of the Dinit code.

Documentation is another thing that I’ve been very careful about. Whenever I add any feature, no matter how small, I make sure that the documentation gets updated in the same or the very next commit. I’m glad to say that the documentation is in really good shape; I plan to keep it that way.

Also, tests are important. I don’t enjoy writing them, but they are really the only way I can ensure that I don’t cause regressions when I make changes or add new features, and it is satisfying to see all those “PASSED” lines when I run “make check”. I still need to add more tests, though; some parts of the code, particularly the control protocol handling and much of the service description loading, don’t have tests yet.

On autoconf and feature checks and portability

Dinit doesn’t use autoconf and doesn’t have a “configure” script. Basic build settings like compiler and compiler switches are specified in a configuration file which must be hand-edited, though this process isn’t onerous and will generally take all of a whole minute. I wouldn’t be against having a script which would probe and determine those particular settings but I also don’t see a strong need for such a thing.

In terms of system call features, Dinit largely sticks to POSIX, and in the few cases where it doesn’t, it uses a preprocessor check for a specific system (eg “#if defined(__FreeBSD__)”). That probably isn’t ideal, but the danger of feature checks for system calls is that they can usually only check for the existence of a function with a particular name, and not that it does what we need it to do. I think I’d rather you had to specify explicitly in the build configuration that such-and-such a call is available with the right semantics than have the build just check that it exists and then blindly assume that it is what we think it is. Checking for specific systems seems like a nice compromise, at least during development.
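To make that compromise concrete, here is a minimal sketch. It is not code from Dinit: the macro EXAMPLE_HAVE_PIPE2 and the function example_cloexec_pipe() are hypothetical names, and pipe2() (long a non-POSIX extension) is just a convenient example of a call that some systems provide and others don’t. The build configuration can state explicitly whether the call is available; if it says nothing, the code falls back to checks for specific systems that, as far as I know, provide it.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical macro, not taken from Dinit: the build configuration may
// define EXAMPLE_HAVE_PIPE2 as 0 or 1 explicitly. If it does not, fall back
// to checking for specific systems assumed to provide pipe2().
#ifndef EXAMPLE_HAVE_PIPE2
#if defined(__linux__) || defined(__FreeBSD__) || defined(__OpenBSD__)
#define EXAMPLE_HAVE_PIPE2 1
#else
#define EXAMPLE_HAVE_PIPE2 0
#endif
#endif

// Create a pipe with the close-on-exec flag set on both ends.
// Returns 0 on success, -1 on failure.
int example_cloexec_pipe(int fds[2])
{
    assert(fds != nullptr);
#if EXAMPLE_HAVE_PIPE2
    // Non-POSIX (at the time of writing) call: sets the flag atomically.
    return pipe2(fds, O_CLOEXEC);
#else
    // Plain POSIX fallback: create the pipe, then set the flag on each end.
    if (pipe(fds) != 0) return -1;
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(fds[i], F_GETFD);
        if (flags == -1 || fcntl(fds[i], F_SETFD, flags | FD_CLOEXEC) == -1) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }
    return 0;
#endif
}
```

The point of the explicit override is that a human, rather than a probe script, vouches for the semantics of the call; the per-system checks are only the default.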

As it is now, if you run a current version of Linux, FreeBSD, OpenBSD or MacOS then you can build by editing a single file, uncommenting the appropriate section, and then running GNU make. I’ve also experimented briefly with building it on Sortix but ran into an issue that prevented me from getting it working.

On contributions (and lack thereof)

I’ve had one very minor contribution, from the one person other than myself who I know actually uses Dinit (he also maintains RPM packages of Dinit for Fedora and CentOS). I do sometimes wish that others would take an interest in the development of Dinit, but I’m not sure if there’s any way I can really make that happen, other than by trying to generate interest via blog posts like this one.

What I really should do, I guess, is clean up the presentation a bit – Dinit’s README is plain text, whereas a markdown version would look a lot more professional, and I really should create a web page for it that’s separate to the GitHub repository. But whatever I do, I know I can’t be certain that other contributors will step forward, nor even that more than a handful of people will ever use the software that I’m writing.

On burnout (and avoiding it)

Keeping the momentum up has been difficult, and there have been some longish periods where I haven’t made any commits. In truth, that’s probably to be expected for a solo, non-funded project, but I’m wary that a month of inactivity can easily become three, then six, and then before you know it you’ve actually stopped working on the project (and probably started on something else). I’m determined not to let that happen – Dinit will be completed. I think the key is to choose the right requirements for “completion” so that it can realistically happen; I’ve laid out some “required for 1.0” items in the TODO file in the repository and intend to implement them, but I do have to restrain myself from adding too much. It’s a balance between producing software that you are fully happy with and that feels complete and polished, and actually getting it finished.

On C++

I’ve always thought C++ was superior to C and I stand by that, though there are plenty who disagree. Most of the hate for C++ seems to be about its complexity. It’s true that C++ is a complex language, but that doesn’t mean the code you write in it needs to be difficult to understand. A lot of Dinit is basically “C with classes (and generic containers)”, though I have a few templates in the logging subsystem and particularly in Dasynq. I have to be very careful that the code is exception safe – that is, there’s nowhere that I might generate an exception and fail to catch it, since that would cause the process to terminate (disastrously if it is running as “init”) – but this turns out to be easy enough; most I/O uses POSIX/C interfaces rather than C++ streams, and memory allocation is carefully controlled (it needs to be in any case).
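As a rough illustration of what “exception safe” means here, the sketch below (not Dinit’s actual code) shows one way to keep an allocation failure from escaping. Everything in it is hypothetical: limited_allocator is a stand-in that fails deterministically once a single request exceeds a small byte limit, and service_list, try_add_service() and example_demo() are made-up names. The idea is that the function is declared noexcept, so it must catch the failure itself and report it through its return value.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical limit, purely for demonstration.
constexpr std::size_t example_max_bytes = 1024;

// Hypothetical allocator: refuses any single request larger than
// example_max_bytes by throwing std::bad_alloc, so that allocation failure
// can be demonstrated without actually exhausting memory.
template <typename T>
struct limited_allocator {
    using value_type = T;

    limited_allocator() noexcept = default;
    template <typename U>
    limited_allocator(const limited_allocator<U> &) noexcept {}

    T *allocate(std::size_t n)
    {
        if (n > example_max_bytes / sizeof(T)) throw std::bad_alloc();
        return static_cast<T *>(::operator new(n * sizeof(T)));
    }

    void deallocate(T *p, std::size_t) noexcept { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const limited_allocator<T> &, const limited_allocator<U> &) noexcept { return true; }

template <typename T, typename U>
bool operator!=(const limited_allocator<T> &, const limited_allocator<U> &) noexcept { return false; }

using service_list = std::vector<int, limited_allocator<int>>;

// Declared noexcept: an exception escaping from here would terminate the
// process, so the allocation failure is caught and turned into a result.
bool try_add_service(service_list &services, int id) noexcept
{
    try {
        services.push_back(id);
        return true;
    }
    catch (const std::bad_alloc &) {
        // push_back leaves the vector unchanged if allocation fails.
        return false;
    }
}

// Usage: fill the list up to the limit, then observe a handled failure.
void example_demo()
{
    const std::size_t capacity = example_max_bytes / sizeof(int);
    service_list services;
    services.reserve(capacity); // the largest block the allocator permits

    for (std::size_t i = 0; i < capacity; i++) {
        bool added = try_add_service(services, static_cast<int>(i));
        assert(added);
        (void) added;
    }

    // The next insertion needs a bigger block, which the allocator refuses.
    bool added = try_add_service(services, -1);
    assert(!added);
    (void) added;
    assert(services.size() == capacity);
}
```

The caller gets a plain success/failure result and can decide what to do (refuse the request, log it, retry later) while the process keeps running.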

I could have written Dinit in C, but the code would be quite a bit uglier in a number of places, and quite frankly I wouldn’t have enjoyed writing it nearly as much.

Of course there are other languages, but most of the “obvious” choices use garbage collection (I’d rather avoid this since it greatly increases memory use for comparable performance, and it often comes paired with a standard library / runtime that doesn’t allow for catching allocation failures). Rust might seem to be a potential alternative which offers memory safety without imposing garbage collection, but its designers made the unfortunate choice of having memory allocation failure cause termination – which is perhaps ok for some applications, but not in general for system programs, and certainly not for init. Even if it weren’t for that, Rust is still a young language and I feel like it has yet to find its feet properly; I’m worried it will mutate (causing maintenance burden) at a rate faster than the more established languages will. It also supports fewer platforms than C++ does, and I feel like non-Linux OSes are always going to be Rust’s second-class citizens. Of course I hope to be proved wrong, but the panic-on-OOM issue still makes Rust a non-starter for this particular project.

On Systemd

Even when I announced Dinit, after working on it for some time, I struggled to explain exactly why I don’t like Systemd. There have been some issues with its developers’ attitudes towards certain bugs, and their habit of changing defaults in ways which break established workflows and generally cause problems that are seen by many as unnecessary (the tmux/screen issue for example), but few specific technical issues that couldn’t be classified as one-off bugs.

I think what really bothers me is just the scope of the thing. Systemd isn’t an init system; it’s a software ecosystem, a whole slew of separate programs which are designed to work together and to manage various different aspects of the system, not simply to manage services. The problem is, despite the claims of modularity, it’s somewhat difficult to separate out the pieces. Right from the start, building Systemd, you have a number of dependencies and a huge set of components that you may or may not be able to disable; if you do disable certain components, it’s not clear what the ramifications might be, whether you need to replace them, and what you might be able to replace them with. I’d be less bothered if I could download a source bundle just for “Systemd, the init daemon” and compile that separately, and pick and choose the other parts on an individual basis in a similar way, but that’s just not possible – and this is telling; sure, it’s “modular” but clearly the modules are all designed to be used together. In theory you may be able to take the core and a few select pieces, but none of the distributions are doing that, and therefore it’s not clear that it really is possible.

Also, I think it’s worth saying that while Systemd has a lot of documentation, it’s not necessarily good documentation. For example (from here):

Slices do not contain processes themselves, but the services and slices contained in them do

Is it (a) slices do not contain processes or (b) slices do contain processes?

This is just one example of something that’s clearly incorrect, but I have read much of the Systemd documentation a number of times and still struggled to find the exact information I was looking for on any number of occasions. And if you’re ever looking for details of internals / non-public APIs – good luck.

Regardless of whether Systemd’s technical merits and flaws are real, having another option doesn’t seem like a bad thing; after all, if you don’t want to use it, you don’t have to. I’m writing Dinit because I see it as what Systemd could have been: a good and reliable standalone service manager with dependency management that can function as a system init.

On detractors and trolls

I guess you can’t take on something as important as an init system and not raise some eyebrows, at least. Plenty of comments have been made since I announced Dinit that are less than positive:

(for the record, not trolling, not a newbie – if that is even a bad thing. And it is both stable and crossplatform).

Or this one:

(If you say so, though I can see some irony in accusing someone of hubris and then immediately following up with a tweet essentially claiming that you yourself are the only person in the world who understands how to do multi-process supervision).

Maybe I brought the last one on myself to some degree by saying that I was aware I could be accused of NIH and that I didn’t care – I was trying to head off this sort of criticism before it began, but may have inadvertently had the opposite effect.

Then, there’s the ever-pleasant commentary on Hacker News:

>I’m making an init system

Awesome, maybe I won’t have to!

>C++

Whelp, nevermind.

(Dear Sir_Cmpwn of Hacker News: I am quietly confident that my real init system written in C++ is better than your vapour-ware init system that is written in nothing).

And of course on Reddit:

> It will be both efficient and maintainable. It will be stable. Solid-as-a-rock stable.

Author does not have any tests whatsoever and uses a memory unsafe language. I don’t see how he wants to achieve the above goals.

(I know that it is difficult to believe, but truly, it is possible to write tests after you have written other code).

Anyway, this is the internet; of course people will say bad (and stupid) things. There were plenty of positive comments too, such as this one from Hacker News:

I’m not a detractor, but there are many things systemd can still improve, but it feels we’re kind of stuck. I’m quite happy if we have some competition here.

Yes! Thank you. There were also some really good comments on my blog posts, and some good discussion elsewhere including on lobste.rs. Ultimately I’ve had probably as much positive as negative feedback, and that’s really helped to keep the motivation up.

The worst thing is, I’ve been guilty of trash-talking other projects myself in the past. I’ve only done so when I thought there were genuine technical issues, and usually out of frustration from wanting software to be better, but that’s no excuse; it doesn’t feel good when someone says bad things about software (or other work) that you created. If only one good thing comes from writing Dinit, it’s that I’ve learned to rein in my rants and focus on staying objective when discussing technical issues.

I guess that’s about a wrap – thanks for reading, as ever. Hopefully next time I write about Dinit it’ll be to report on all the great progress I’ve made since now!

9 thoughts on “Escape from System D, episode V”

    • Thanks! I’d be remiss to fail to point out that there are in fact other alternatives, too. OpenRC from Gentoo apparently works quite well; S6-RC by Laurent Bercot, and Nosh by Jonathan de Boyne Pollard, are both also impressive offerings. All have slightly different design philosophies, but the thing they do have in common (with Dinit also) is a well-defined and suitably restrained scope and feature set.

      • There’s also runit (http://github.com/madscientist42/runit), which got resurrected sometime last year, about 8 months ago. OpenRC, unfortunately, because of the current maintainer, has started taking on some of the notions that drive the decisions on systemd, so there’s some concern THERE as well. They’ve drifted from nice, tidy, init-only function to doing things like mounting volumes, etc. (Ouch… What is with all this feature creep that is so alluring to people? X-D)

        • Thanks for your comments. I know of runit but generally don’t consider it to be in the same league as the systems I listed above because it has no real support for dependency handling (I know you can fudge it to some degree but it is not as clean as proper support). For some people that’s a bonus: the system itself is much simpler (with the benefits that entails: easier to understand, less likely to contain bugs); personally I prefer some more intelligence built into the system (but I recognise that this is a personal choice).

          I’ve not looked at OpenRC recently in enough detail to be able to comment, but if the scope is growing out-of-control then I think that alternative projects (like Dinit) are ever more important.

  1. The C++ question is an interesting one. There are a lot of linux kernel targets that have pure C stacks, such as those built with busybox, router projects, etc. I think the primary issue is not so much C++ itself, but rather the additional runtime/memory overhead of the C++ std library, especially if no other services are using it. I used to get around this for embedded uses by explicitly disabling the c++ lib, stack unwinding, and disabling exception handling.

    • It’s true that the C++ library tends to add some overhead, though I’m not sure that it’s necessary to completely avoid it. A completely trivial statically-linked program weighs in at 626kB on my system in a quick test, whether it’s C or C++, which seems fantastically excessive – the problem is probably that I’m linking with Glibc and it’s not designed for compactness. Compare this to a dynamic link, which is a shade under 6kB. In the statically-linked case, since I haven’t used any of the C++ standard library in my test program, there’s basically no cost to it.

      I believe the iostreams library is particularly heavy-weight (statically-linked size goes up to 1.6MB immediately) and using std::string seems to add a little over 100kB. However, I’d expect a lighter-weight C library and std C++ library implementation would bring the size down by a huge amount, and linking dynamically is often an option (even with eg. OpenWRT). Nevertheless, reducing or eliminating use of the C++ standard library might be something to consider for the future; right now however I’d rather focus on getting the functionality complete.

      • There certainly is hubris involved in working on an init system, whether your own or someone else’s.

        As one of the developers for one of the “competing” ones brought back to life (runit), currently doing the slow process of security auditing the codebase there, I get that. (And I’m going to look at yours- I’m after a better answer than SysV Init regardless of where/how it comes to be. It’s just that systemd isn’t it.)

        I’ve similar issues as you. You fingered what’s really wrong in the blog post. All the issues people have, the niggling worries, all of the bitching out and about? They’re endemic from it being a monolithic platform for all intents and purposes. Your FIRST hint that while it’s “modular” it’s not usable in any simple lego-block fashion is that uselessd, the very thing that was intended to do just init…couldn’t keep up, couldn’t maintain itself because systemd, just like most everything its lead dev has touched, keeps growing, having scope creep, and metastasizing into something nasty, problematic, etc. It is “modular” in a sense that each piece is partitioned off, but you really, really can’t use the piece parts or take things like their udev support out and use the original or ANY differing daemon for that purpose.

        That’s why most LEFT Windows for Linux in the first place. If you wanted what Apple or Microsoft offered, thinking it “better”- use THEIR stuff. Don’t morph Linux into it. It’s its own thing and none of this travesty, really, makes containers more robust or better or moves us onto the desktop or, or…

        It’s just hubris beyond the simple one there for Init- it presumes you know it all and have ALL the answers for all of the services, platform support, etc. And they DON’T. They don’t know the first thing about mission-critical…which would be the OTHER cause for all the serious “one-off bugs” that keep showing up. (Guess they’re not one-off, are they?)
