
9th annual blog reflection

Today, it is my blog's 9th anniversary. Similar to previous years, this is a good opportunity for reflection.

A summary of 2019


There are a variety of things that I did in 2019 that can be categorized as follows:

Documenting the architecture of service-oriented systems


In the beginning of the year, most of my efforts were concentrated on software architecture-related aspects. At Mendix, I have spent quite a bit of time on documenting the architecture of the platform that my team and I work on.

I wrote a blog post about my architecture documentation approach, which is based on a small number of simple configuration management principles that I came up with several years ago. I also gave a company-wide presentation about this subject, which was well-received.

Later, I extended Dynamic Disnix, a self-adaptive deployment framework that I developed as part of my past research, with new tools to automate the architecture documentation approach that I presented previously -- with these new tools, it has become possible to automatically generate architecture diagrams and descriptions from Disnix service models.

Furthermore, the documentation tools can also group components (making it possible to decompose diagrams into layers) and generate architecture documentation catalogs for complete systems.

Improving Disnix's data handling and data integration capabilities


While implementing the architecture documentation tools in Dynamic Disnix, I ran into several limitations of Disnix's internal architecture.

To alleviate the burden of developing potential future integrations, I have "invented" NixXML, an XML file format that is moderately readable, can be easily converted from and to Nix expressions, and remains useful without Nix. I wrote a library called libnixxml that can be used to produce and consume NixXML data.

In addition to writing libnixxml, I also rewrote significant portions of Disnix and Dynamic Disnix to work with libnixxml (instead of hand-written parsing and Nix expression generation code). The result is that we need far less boilerplate code, have more robust data parsing, and get significantly better error reporting.

In the process of rewriting the Disnix data handling portions, I have been following a number of conventions that I have documented in this blog post.

Implementing a new input model transformation pipeline in Disnix


Besides data integration and data handling, I have also re-engineered the input model transformation pipeline in Disnix, which translates the Disnix input models (a services, infrastructure and distribution model) capturing aspects of a service-oriented system into an executable model (a deployment manifest/model) from which the deployment activities that need to run can be derived.

The old implementation evolved organically and has grown so much that it became too complicated to adjust and maintain. In contrast, the new transformation pipeline consists of well-defined intermediate models.

As a result of having a better, more well-defined transformation pipeline, more properties can be configured.

Moreover, it is now also possible to use a fourth and new input model: a packages model that can be used to directly deploy a set of packages to a target machine in the network.

The new data handling features and transformation pipeline were integrated into Disnix 0.9, which was released in September 2019.

An experimental process management framework


In addition to the release of Disnix 0.9 that contains a number of very big internal improvements, I have started working on a new experimental Nix-based process management framework.

I wrote a blog post about the first milestone: a functional organization that makes it possible to construct multiple instances of managed processes, that can be deployed by privileged and unprivileged users.

Miscellaneous


Earlier this year, I was also involved in several discussions about developer motivation. Since this is a common and recurring subject in the developer community, I have decided to write about my own experiences.

Some thoughts


Compared to previous years, this year was my least productive from a blogging perspective -- this can be attributed to several reasons. First and foremost, a number of major personal and work-related events contributed to a drop in productivity for a while.

Second, some of the blog posts that I wrote are the results of very big undertakings -- for example, the internal improvements to Disnix took me several months to complete.

As an improvement for next year, I will try to more carefully decompose my tasks, prioritize and order them in a better way. For example, implementing the architecture documentation tools was unnecessarily difficult before NixXML and the revised data models were implemented. I should have probably done these tasks in the opposite order.

Overall top 10 of my blog posts


As with previous years, I am going to publish the top 10 of my most frequently read blog posts. Surprisingly, not much has changed compared to last year:

  1. Managing private Nix packages outside the Nixpkgs tree. As with last year, this blog post remains the most popular. I believe this can be attributed to the fact that there is still no official, simple hands-on tutorial available that allows users to easily experiment with building packages.
  2. On Nix and GNU Guix. This blog post had been my most popular since 2012, but dropped to second place last year. I think this shows that the distinction between Nix and GNU Guix has become more clear these days.
  3. An evaluation and comparison of Snappy Ubuntu. This post remains very popular, but I have not heard much about Snappy lately, so it is still a bit of a mystery why it holds third place.
  4. Setting up a multi-user Nix installation on non-NixOS systems. Some substantial improvements have been made to support multi-user installations on other operating systems than NixOS, but I believe we can still do better. There are some ongoing efforts to improve the Nix installer even further.
  5. Yet another blog post about Object Oriented Programming and JavaScript. This blog post is still at the same spot compared to last year, and still seems to be read quite frequently. It seems that documenting my previous misunderstandings is quite helpful.
  6. An alternative explanation of the Nix package manager. This blog post explains Nix in a somewhat unconventional way that has my personal preference. It seems to catch quite a bit of attention.
  7. On NixOps, Disnix, service deployment and infrastructure deployment. This blog post is still as popular as last year. It seems that it is quite frequently used as an introduction to NixOps, although I did not write this blog post for that purpose.
  8. Asynchronous programming with JavaScript. A blog post I wrote several years ago, when I was learning about Node.js and asynchronous programming concepts. It still seems to remain popular even though the blog post is over six years old.
  9. Composing FHS-compatible chroot environments with Nix (or deploying Steam in NixOS). This blog post is still read quite frequently, probably because of Steam, which is a very popular application for gamers.
  10. Auto patching prebuilt binary software packages for deployment with the Nix package manager. This is the only new blog post in the overall top 10 compared to last year. Patching binaries to deploy software on NixOS is a complicated subject. The popularity of this blog post proves that developing solutions to make this process more convenient is valuable.

Conclusion


As with previous years, I am still not out of ideas so stay tuned!

There is one more thing I'd like to say, and that is:


HAPPY NEW YEAR!!!!!!!!!!!


Writing a well-behaving daemon in the C programming language

Slightly over one month ago, I wrote a blog post about a new experimental Nix-based process management framework that I have been developing. For this framework, I need to experiment with processes that run in the foreground (i.e. they block the shell of the user that invokes them for as long as they are running), and daemons -- processes that run in the background and are not directly controlled by the user.

Daemons are (still) a common practice in the UNIX world (although this is changing nowadays with process managers, such as systemd and launchd) to make system services, such as web servers, the secure shell, and FTP, available to end users.

To make experimentation more convenient, I wanted to write a very simple service that can run both in the foreground and as a daemon. Initially, I thought writing a daemon would be straightforward, but this turned out to be much more difficult than I anticipated.

I have learned that daemonizing a process is quite simple, but writing a well-behaving daemon is quite complicated. I have been studying a number of sources on how to properly write one, and none of them provided all the information that I needed. As a result, I decided to do some investigation myself and write a blog post about my findings.

The basics


As I have stated earlier, the basics of writing a daemon in the C programming language are simple. For example, I can write a very trivial service whose only purpose is to print Hello! on the terminal every second until it receives a terminate or interrupt signal:


#include <stdio.h>
#include <unistd.h>
#include <signal.h>

#define TRUE 1
#define FALSE 0

volatile sig_atomic_t terminated = FALSE;

static void handle_termination(int signum)
{
    terminated = TRUE;
}

static void init_service(void)
{
    signal(SIGINT, handle_termination);
    signal(SIGTERM, handle_termination);
}

static void run_main_loop(void)
{
    while(!terminated)
    {
        fprintf(stderr, "Hello!\n");
        sleep(1);
    }
}

The following trivial main method allows us to let the service run in "foreground mode":


int main()
{
    init_service();
    run_main_loop();
    return 0;
}

The above main method initializes the service (that configures the signal handlers) and invokes the main loop (as defined in the previous code example). The main loop keeps running until it receives a terminate (SIGTERM) or interrupt (SIGINT) signal that unblocks the main loop.

When we run the above program in a shell session, we should observe:


$ ./simpleservice
Hello!
Hello!
Hello!

We will see that the service prints Hello! every second until it gets terminated. Moreover, we will notice that the shell is blocked from receiving user input until we terminate the process. Furthermore, if we terminate the shell (for example by sending it a TERM signal from another shell session), the service gets terminated as well.

We can easily change the main method, shown earlier, to turn our trivial service (that runs in foreground mode) into a daemon:


#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

int main()
{
    pid_t pid = fork();

    if(pid == -1)
    {
        fprintf(stderr, "Can't fork daemon process!\n");
        return 1;
    }
    else if(pid == 0)
        run_main_loop();

    return 0;
}

The above code forks a child process, the child process executes the main loop, and the parent process terminates immediately.

When running the above program on the terminal, we should see that the ./simpleservice command returns almost immediately and a daemon process keeps running in the background. Stopping our shell session (e.g. with the exit command or killing it by sending a TERM signal to it), does not cause the daemon process to be stopped.

This behaviour can be easily explained -- the shell only waits for the completion of the process that it invokes (the parent process), and that process terminates directly after forking the child process, so the shell no longer blocks indefinitely.

The daemon process keeps running (even if we end our shell session), because it gets orphaned from the parent and adopted by the process that runs at PID 1 -- the init system.

Writing a well-behaving daemon


The above code fragments probably look very trivial. Is this really sufficient to create a daemon? You can probably already guess that the answer is: no.

To learn more about properly writing a daemon, I studied various sources. The first source I consulted was the Linux Daemon HOWTO, but that document turned out to be a bit outdated (to be precise: it was last updated in 2004). This document basically shows how to implement a very minimalistic version of a well-behaving daemon. It does much more than just forking a child process, for reasons that I will explain later in this blog post.

After some additional searching, I stumbled on systemd's recommendations for writing a traditional SysV daemon (this information can also be found by opening the following manual page: man 7 daemon). systemd's daemon manual page specifies even more steps. Contrary to the Linux Daemon HOWTO, it does not provide any code examples.

Despite the fact that the HOWTO implements more requirements than just a simple fork, it still looked quite simple. Implementing all systemd recommendations, however, turned out to be much more complicated than I expected.

It also made me wonder: why is all this stuff needed? None of the sources that I studied so far explain why all these additional steps need to be implemented.

After some thinking, I believe I understand why: a well-behaving daemon needs to be fully detached from user control, be controllable from an external program, and act safely and predictably.

In the following sections I will explain what I believe is the rationale for each step described in the systemd daemon manual page. Moreover, I will describe the means that I used to implement each requirement:

Closing all file descriptors, except the standard ones: stdin, stdout, stderr


Closing all but the standard file descriptors is a good practice, because the daemon process inherits all open files from the calling process (e.g. the shell session from which the daemon is invoked).

Not closing any additional open file descriptors may cause the file descriptors to remain open for an indefinite amount of time, making it impossible to cleanly unmount the partition on which these files are stored. Moreover, it also keeps file descriptors unnecessarily allocated.

The daemon manual page describes two strategies to close these non-standard file descriptors. On Linux, it is possible to iterate over the content of the /proc/self/fd directory. A portable, but less efficient, way is to iterate from file descriptor 3 up to the limit returned by getrlimit for RLIMIT_NOFILE.

I ended up implementing this step with the following function:


#include <sys/time.h>
#include <sys/resource.h>
#include <unistd.h>

static int close_non_standard_file_descriptors(void)
{
    unsigned int i;
    struct rlimit rlim;

    /* getrlimit() returns 0 on success and stores the limit in rlim */
    if(getrlimit(RLIMIT_NOFILE, &rlim) == -1)
        return FALSE;

    for(i = 3; i < rlim.rlim_cur; i++)
        close(i);

    return TRUE;
}
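
The Linux-specific strategy could be sketched as follows (a rough sketch of my own, not from the original sources; it reuses the TRUE and FALSE constants from the earlier examples):


#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

static int close_non_standard_file_descriptors_linux(void)
{
    struct dirent *entry;
    DIR *dir = opendir("/proc/self/fd");

    if(dir == NULL)
        return FALSE;

    while((entry = readdir(dir)) != NULL)
    {
        int fd = atoi(entry->d_name); /* "." and ".." yield 0 and get skipped */

        /* Skip the standard file descriptors and the directory stream itself */
        if(fd > 2 && fd != dirfd(dir))
            close(fd);
    }

    closedir(dir);
    return TRUE;
}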

Resetting all signal handlers to their defaults


Similar to file descriptors, the daemon process also inherits the signal handler configuration of the caller process. If signal handlers have been altered, then the daemon process may behave in a non-standard and unpredictable way.

For example, the TERM signal handler could have been overridden so that the daemon no longer cleanly shuts down when it receives a TERM signal. As a countermeasure, the signal handlers must be reset to their default behaviour.

The systemd daemon manual page suggests iterating over all signals up to the limit of _NSIG and resetting them to SIG_DFL.

I did some investigation, and it seems that this method is not standardized by e.g. POSIX -- _NSIG is a constant that glibc defines, and there is no guarantee that other libc implementations provide the same constant.

I ended up implementing the following function:


#include <signal.h>

static int reset_signal_handlers_to_default(void)
{
#if defined _NSIG
    unsigned int i;

    for(i = 1; i < _NSIG; i++)
    {
        if(i != SIGKILL && i != SIGSTOP)
            signal(i, SIG_DFL);
    }
#endif
    return TRUE;
}

The above implementation iterates from the first signal up to the maximum signal number. It ignores SIGKILL and SIGSTOP because their handlers cannot be overridden.

Unfortunately, this implementation will not work with libc implementations that lack the _NSIG constant. I am really curious whether somebody could suggest a standards-compliant way to reset all signal handlers.

Resetting the signal mask


It is also possible to completely block certain signals by adjusting the signal mask. The signal mask also gets inherited by the daemon from the calling process. To make a daemon act predictably, e.g. properly shut down when it receives the TERM signal, it is a good thing to reset the signal mask to the default configuration.

I ended up implementing this requirement with the following function:


static int clear_signal_mask(void)
{
    sigset_t set;

    return((sigemptyset(&set) == 0)
      && (sigprocmask(SIG_SETMASK, &set, NULL) == 0));
}

Sanitizing the environment block


Another property that a daemon process inherits from the caller are the environment variables. Some environment variables might negatively affect the behaviour of the daemon. Furthermore, environment variables may also contain privacy-sensitive information that could get exposed if the security of a daemon gets compromised.

As a countermeasure, it would be good to sanitize the environment block, for example by removing environment variables with clearenv() or by using a whitelisting approach.

For my trivial example case, I did not need to sanitize the environment block because no environment variables are used.
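
For daemons that do need it, a whitelisting approach could be sketched as follows (my own illustrative sketch; clearenv() is a glibc extension, and the whitelist contents are hypothetical):


#include <stdlib.h>
#include <string.h>

/* Hypothetical whitelist: only these variables survive sanitizing */
static const char *whitelist[] = { "PATH", "TZ", NULL };

static int sanitize_environment_block(void)
{
    unsigned int i;
    char *saved_values[2]; /* Must match the number of whitelist entries */

    /* Save the values of the whitelisted variables */
    for(i = 0; whitelist[i] != NULL; i++)
    {
        char *value = getenv(whitelist[i]);
        saved_values[i] = (value == NULL) ? NULL : strdup(value);
    }

    /* Remove all environment variables */
    if(clearenv() != 0)
        return FALSE;

    /* Restore the whitelisted variables */
    for(i = 0; whitelist[i] != NULL; i++)
    {
        if(saved_values[i] != NULL)
        {
            setenv(whitelist[i], saved_values[i], 1);
            free(saved_values[i]);
        }
    }

    return TRUE;
}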

Forking a background process


After closing all non-standard file descriptors, and resetting the signal handlers to their default behaviour, we can fork a background process. The primary reason to fork a background process, as explained earlier, is to get it orphaned from the parent so that it gets adopted by PID 1, the init system, and stays in the background.

We must actually fork twice, as I will explain later. First, I will fork a child process that I will call a helper process. The helper process will do some more housekeeping work and fork another child process, which will become our daemon process.

Detaching from the terminal


The child process is still attached to the terminal of the caller process, and can still read input from the terminal and send output to the terminal. To completely detach it from the terminal (and any user interaction), we must adjust the session ID:


if(setsid() == -1)
{
    /* Do some error handling */
}

and then we must fork again, so that the daemon can never re-acquire a terminal. The second fork will create the real daemon process. The helper process should terminate so that the newly created daemon process gets adopted by the init system (that runs as PID 1):


if(fork_daemon_process(pipefd[1],
  pid_file,
  data,
  initialize_daemon,
  run_main_loop) == -1)
{
    /* Do some error handling */
}

/*
 * Exit the helper process,
 * so that the daemon process gets adopted by PID 1
 */
exit(0);

Connecting /dev/null to standard input, output and error in the daemon process


Since we have detached from the terminal, we should connect /dev/null to the standard file descriptors in the daemon process, because these file descriptors are still connected to the terminal from which we have detached.

I implemented this requirement with the following function:


#include <fcntl.h>
#include <unistd.h>

#define NULL_DEV_FILE "/dev/null"

static int attach_standard_file_descriptors_to_null(void)
{
    int null_fd_read, null_fd_write;

    return(((null_fd_read = open(NULL_DEV_FILE, O_RDONLY)) != -1)
      && (dup2(null_fd_read, STDIN_FILENO) != -1)
      && ((null_fd_write = open(NULL_DEV_FILE, O_WRONLY)) != -1)
      && (dup2(null_fd_write, STDOUT_FILENO) != -1)
      && (dup2(null_fd_write, STDERR_FILENO) != -1));
}

Resetting the umask to 0 in the daemon process


The umask (a setting that globally alters file permissions of newly created files) may have been adjusted by the calling process, causing directories and files created by the daemon to have unpredictable file permissions.

As a countermeasure, we should reset the umask to 0 with the following function call:


umask(0);

Changing current working directory to / in the daemon process


The daemon process also inherits the current working directory of the caller process. It may happen that the current working directory refers to an external drive or partition. As a result, it can no longer be cleanly unmounted while the daemon is running.

To prevent this from happening, we should change the current working directory to the root folder, because that is the only partition that is guaranteed to stay mounted while the system is running:


if(chdir("/") == -1)
{
    /* Do some error handling */
}

Creating a PID file in the daemon process


Because a program that daemonizes forks another process and terminates immediately, there is no way for the caller (e.g. the shell) to know what the process ID (PID) of the daemon process is. The caller can only know the PID of the parent process, which terminates right after setting up the daemon.

A common practice to expose the PID of the daemon process is to write a PID file that contains its process ID. The PID file can be used to reliably terminate the service when it is no longer needed.

According to the systemd recommendations, a PID file must be created in a race-free fashion, e.g. when a daemon has already been started, it should not attempt to create another PID file with the same name.

I ended up implementing this requirement as follows:


#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>

static int create_pid_file(const char *pid_file)
{
    pid_t my_pid = getpid();
    char my_pid_str[12];
    int fd;

    snprintf(my_pid_str, sizeof(my_pid_str), "%d", my_pid);

    if((fd = open(pid_file, O_CREAT | O_EXCL | O_WRONLY, S_IRUSR | S_IWUSR)) == -1)
        return FALSE;

    if(write(fd, my_pid_str, strlen(my_pid_str)) == -1)
    {
        close(fd);
        return FALSE;
    }

    close(fd);

    return TRUE;
}

In the above implementation, the O_EXCL flag makes sure that a previously generated PID file cannot already exist. If a PID file happens to exist already, the initialization of the daemon fails.

Dropping privileges in the daemon process, if applicable


Since daemons are typically long-running and are typically started by the super user (root), they are also a security risk. If a process is started as root, the daemon process also has root privileges, giving an attacker full access to the entire filesystem if its security gets compromised.

For this reason, it is typically a good idea to drop privileges in the daemon process. There are a variety of restrictions you can impose, such as changing the ownership of the process to an unprivileged user:


if(setgid(100) == 0 && setuid(1000) == 0)
{
    /* Execute some code with restrictive user permissions */
    ...
}
else
{
    fprintf(stderr, "Cannot change user permissions!\n");
    exit(1);
}

In my trivial example case, I had no such requirement.

Notifying the parent process when the initialization of the daemon is complete


Another practical problem you may run into with daemons is that you do not know (for sure) when they are ready to be used. Because the parent process terminates immediately and delegates most of the work, including the initialization steps, to the daemon process (that runs in the background), you may already attempt to use it before the initialization is done. For example, if the daemon provides a network service, then right after starting the daemon, the network connection may not work yet.

Furthermore, there is no way to know for sure how long it will take before all the daemon's services become available. This is particularly inconvenient for scripting.

For me personally, notification was the most complicated requirement to implement.

systemd's daemon manual page suggests using an unnamed pipe. I ended up with an implementation that looks as follows:

Before doing any forking, I create a pipe and pass the corresponding file descriptors to the utility function that creates the helper process, as described earlier:


int pipefd[2];

if(pipe(pipefd) == -1)
    return STATUS_CANNOT_CREATE_PIPE;
else
{
    if(fork_helper_process(pipefd, pid_file, data, initialize_daemon, run_main_loop) == -1)
        return STATUS_CANNOT_FORK_HELPER_PROCESS;
    else
    {
        /* Wait for a notification from the helper or daemon process */
    }
}

The helper and daemon process will use the write end of the pipe to send notification messages. I ended up using it as follows:


static pid_t fork_helper_process(int pipefd[2],
  const char *pid_file,
  void *data,
  int (*initialize_daemon) (void *data),
  int (*run_main_loop) (void *data))
{
    pid_t pid = fork();

    if(pid == 0)
    {
        close(pipefd[0]); /* Close unneeded read-end */

        if(setsid() == -1)
        {
            notify_parent_process(pipefd[1], STATUS_CANNOT_SET_SID);
            exit(STATUS_CANNOT_SET_SID);
        }

        /* Fork again, so that the terminal can not be acquired again */
        if(fork_daemon_process(pipefd[1], pid_file, data, initialize_daemon, run_main_loop) == -1)
        {
            notify_parent_process(pipefd[1], STATUS_CANNOT_FORK_DAEMON_PROCESS);
            exit(STATUS_CANNOT_FORK_DAEMON_PROCESS);
        }

        exit(0); /* Exit the helper process, so that the daemon process gets adopted by PID 1 */
    }

    return pid;
}

If something fails, or the entire initialization process completes successfully, the helper and daemon processes invoke the notify_parent_process() function to send a message over the write end of the pipe to notify the parent. In case of an error, the helper or daemon process also terminates with the same exit status.

I implemented the notification function as follows:


static void notify_parent_process(int writefd, DaemonStatus message)
{
    char byte = (char)message;
    while(write(writefd, &byte, 1) == 0);
    close(writefd);
}

The above function simply sends a message (of only one byte in size) over the pipe and then closes the connection. The possible messages are encoded in the following enumeration:


typedef enum
{
    STATUS_INIT_SUCCESS = 0x0,
    STATUS_CANNOT_ATTACH_STD_FDS_TO_NULL = 0x1,
    STATUS_CANNOT_CHDIR = 0x2,
    ...
    STATUS_CANNOT_SET_SID = 0xc,
    STATUS_CANNOT_FORK_DAEMON_PROCESS = 0xd,
    STATUS_UNKNOWN_DAEMON_ERROR = 0xe
}
DaemonStatus;

The parent process will not terminate immediately, but waits for a notification message from the helper or daemon processes:


DaemonStatus exit_status;

close(pipefd[1]); /* Close unneeded write end */
exit_status = wait_for_notification_message(pipefd[0]);
close(pipefd[0]);
return exit_status;

When the parent receives a notification message, it will simply propagate the value as an exit status (which is 0 if everything succeeds, and non-zero when the process fails somewhere). The non-zero exit status corresponds to a value in the enumeration (shown earlier) allowing us to trace the origins of the error.

The function that waits for the notification of the daemon process is implemented as follows:


static DaemonStatus wait_for_notification_message(int readfd)
{
    char buf[1];
    ssize_t bytes_read = read(readfd, buf, 1);

    if(bytes_read == -1)
        return STATUS_CANNOT_READ_FROM_PIPE;
    else if(bytes_read == 0)
        return STATUS_UNKNOWN_DAEMON_ERROR;
    else
        return buf[0];
}

The above function reads from the pipe, blocks as long as no data was sent and the write end of the pipe has not been closed, and returns the byte that it has received.

Exiting the parent process after the daemon initialization is done


This requirement overlaps with the previous requirement and can be met by calling exit() after the notification message was sent (and/or the write end of the pipe was closed).

Discussion


I really did not expect that writing a well-behaving daemon (that follows systemd's recommendations) would be so difficult. I ended up writing 206 LOC to implement all the functionality listed above. Maybe I could reduce this amount a bit with some clever programming tricks, but my objective was to keep the code clear, decomposed into functions, and understandable.

There are solutions that alleviate the burden of creating a daemon. A prominent example would be BSD's daemon() function (that is also included with glibc). It is a single function call that can be used to automatically daemonize a process. Unfortunately, it does not seem to meet all requirements that systemd specifies.
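
For example, with daemon() the whole daemonization procedure collapses into a single call (a sketch; daemon() is a non-standard extension declared in unistd.h, and run_main_loop refers to the main loop from the first code example):


#include <stdio.h>
#include <unistd.h>

int main()
{
    /* Parameters 0, 0: change the current working directory to /
       and attach the standard file descriptors to /dev/null */
    if(daemon(0, 0) == -1)
    {
        fprintf(stderr, "Can't daemonize!\n");
        return 1;
    }

    run_main_loop();
    return 0;
}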

I also looked at many Stackoverflow posts, and although they correctly cite systemd's daemon manual page with requirements for a well-behaving daemon, none of the solutions that I could find fully meet all requirements -- in particular, I could not find any good examples that implement a protocol that notifies the parent process when the daemon process has been successfully initialized.

Because none of these Stackoverflow posts provided what I needed, I decided not to use any of them as an example, but to start from scratch and look up all relevant pieces myself.

One aspect that still puzzles me is how to "properly" iterate over all signal handlers. The solution hinted at by systemd is non-standard, requiring a glibc-specific constant. Some sources say that there is no standardized equivalent, so I am still curious whether there is a recipe that can reset all signal handlers to their default behaviour in a standards-compliant way.

In the introduction section, I mentioned that daemons are still a common practice in UNIX-like systems, such as Linux, but that this is changing. IMO this is for a good reason -- services typically need to reimplement the same kind of functionality over and over again. Furthermore, I have noticed that not all daemons meet all requirements and could behave incorrectly. For example, there is no guarantee that a daemon correctly writes a PID file with the PID of the daemon process.

For these reasons, systemd's daemon manual page also describes "new style daemons", that are considerably easier to implement with less boilerplate code. Apple has similar recommendations for launchd.

With "new style daemons", processes just spawn in foreground mode, and the process manager (e.g. systemd, launchd or supervisord) takes care of all "housekeeping tasks" -- the process manager makes sure that it runs in the background, drops user privileges etc.

Furthermore, because the process manager directly invokes the daemon process (and as a result knows its PID), controlling a daemon is also less fragile -- the requirement that a PID file needs to be properly created is also dropped.

Availability


The daemonize infrastructure described in this blog post will be released as part of the main example case that I will describe in the next blog post.

A declarative process manager-agnostic deployment framework based on Nix tooling

In a previous blog post written two months ago, I have introduced a new experimental Nix-based process management framework that provides the following features:

  • It uses the Nix expression language for configuring running process instances, including their dependencies. The configuration process is based on only a few simple concepts: function definitions to define constructors that generate process manager configurations, function invocations to compose running process instances, and Nix profiles to make collections of process configurations accessible from a single location.
  • The Nix package manager delivers all packages and configuration files and isolates them in the Nix store, so that they never conflict with other running processes and packages.
  • It identifies process dependencies, so that a process manager can ensure that processes are activated and deactivated in the right order.
  • The ability to deploy multiple instances of the same process, by making conflicting resources configurable.
  • Deploying processes/services as an unprivileged user.
  • Advanced concepts and features, such as namespaces and cgroups, are not required.

Another objective of the framework is that it should work with a variety of process managers on a variety of operating systems.

In my previous blog post, I was deliberately using sysvinit scripts (also known as LSB Init compliant scripts) to manage the life-cycle of running processes as a starting point, because they are universally supported on Linux and self-contained -- sysvinit scripts only require the right packages to be installed, but they do not rely on external programs that manage the processes' life-cycle. Moreover, sysvinit scripts can also be conveniently used as an unprivileged user.

I have also developed a Nix function that can be used to more conveniently generate sysvinit scripts. Traditionally, these scripts are written by hand and basically require that the implementer writes the same boilerplate code over and over again, such as the activities that start and stop the process.

The sysvinit script generator function can also be used to directly specify the implementation of all activities that manage the life-cycle of a process, such as:


{createSystemVInitScript, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createSystemVInitScript {
  name = instanceName;
  description = "Nginx";
  activities = {
    start = ''
      mkdir -p ${nginxLogDir}
      log_info_msg "Starting Nginx..."
      loadproc ${nginx}/bin/nginx -c ${configFile} -p ${stateDir}
      evaluate_retval
    '';
    stop = ''
      log_info_msg "Stopping Nginx..."
      killproc ${nginx}/bin/nginx
      evaluate_retval
    '';
    reload = ''
      log_info_msg "Reloading Nginx..."
      killproc ${nginx}/bin/nginx -HUP
      evaluate_retval
    '';
    restart = ''
      $0 stop
      sleep 1
      $0 start
    '';
    status = "statusproc ${nginx}/bin/nginx";
  };
  runlevels = [ 3 4 5 ];

  inherit dependencies instanceName;
}

In the above Nix expression, we specify five activities to manage the life-cycle of Nginx, a free/open source web server:

  • The start activity initializes the state of Nginx and starts the process (as a daemon that runs in the background).
  • stop stops the Nginx daemon.
  • reload instructs Nginx to reload its configuration.
  • restart restarts the process.
  • status shows whether the process is running or not.

Besides directly implementing activities, the Nix function invocation shown above can also be used on a much higher level -- typically, sysvinit scripts follow the same conventions. Nearly all sysvinit scripts implement the activities described above to manage the life-cycle of a process, and these typically need to be re-implemented over and over again.

We can also generate the implementations of these activities automatically from a high level specification, such as:


{createSystemVInitScript, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createSystemVInitScript {
  name = instanceName;
  description = "Nginx";
  initialize = ''
    mkdir -p ${nginxLogDir}
  '';
  process = "${nginx}/bin/nginx";
  args = [ "-c" configFile "-p" stateDir ];
  runlevels = [ 3 4 5 ];

  inherit dependencies instanceName;
}

You could basically say that the above createSystemVInitScript function invocation makes the configuration process of a sysvinit script "more declarative" -- you do not need to specify the activities that need to be executed to manage processes, but instead, you specify the relevant characteristics of a running process.

From this high level specification, the implementations for all required activities will be derived, using conventions that are commonly used to write sysvinit scripts.

After completing the initial version of the process management framework that works with sysvinit scripts, I have also been investigating other process managers. I discovered that their configuration processes have many things in common with the sysvinit approach. As a result, I have decided to explore these declarative deployment concepts a bit further.

In this blog post, I will describe a declarative process manager-agnostic deployment approach that we can integrate into the experimental Nix-based process management framework.

Writing declarative deployment specifications for managed running processes


As explained in the introduction, I have also been experimenting with other process managers than sysvinit. For example, instead of generating a sysvinit script that manages the life-cycle of a process, such as the Nginx server, we can also generate a supervisord configuration file to define Nginx as a program that can be managed with supervisord:


{createSupervisordProgram, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createSupervisordProgram {
  name = instanceName;
  command = "mkdir -p ${nginxLogDir}; " +
    "${nginx}/bin/nginx -c ${configFile} -p ${stateDir}";
  inherit dependencies;
}

Invoking the above function will generate a supervisord program configuration file, instead of a sysvinit script.

With the following Nix expression, we can generate a systemd unit file so that Nginx's life-cycle can be managed by systemd:


{createSystemdService, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createSystemdService {
  name = instanceName;
  Unit = {
    Description = "Nginx";
  };
  Service = {
    ExecStartPre = "+mkdir -p ${nginxLogDir}";
    ExecStart = "${nginx}/bin/nginx -c ${configFile} -p ${stateDir}";
    Type = "simple";
  };

  inherit dependencies;
}

What you may notice when comparing the above two Nix expressions with the last sysvinit example (which captures process characteristics instead of activities) is that they all contain very similar properties. Their main difference is a slightly different organization and naming convention, because each abstraction function is tailored towards the configuration conventions that each target process manager uses.

As discussed in my previous blog post about declarative programming and deployment, declarativity is a spectrum -- the above specifications are (somewhat) declarative because they do not capture the activities to manage the life-cycle of the process (the how). Instead, they specify what process we want to run. The process manager derives and executes all activities to bring that process in a running state.

sysvinit scripts themselves are not declarative, because they specify all activities (i.e. shell commands) that need to be executed to accomplish that goal. supervisord configurations and systemd service configuration files are (somewhat) declarative, because they capture process characteristics -- the process manager derives and executes all required activities to bring the process into a running state.

Despite the fact that I am not specifying any process management activities, these Nix expressions could still be considered somewhat of a "how specification", because each configuration is tailored towards a specific process manager. A process manager, such as sysvinit, is a means to accomplish something else: getting a running process whose life-cycle can be conveniently managed.

If I revise the above specifications to only express what kind of running process I want, disregarding the process manager, then I can simply write:


{createManagedProcess, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createManagedProcess {
  name = instanceName;
  description = "Nginx";
  initialize = ''
    mkdir -p ${nginxLogDir}
  '';
  process = "${nginx}/bin/nginx";
  args = [ "-c" configFile "-p" "${stateDir}/${instanceName}" ];

  inherit dependencies instanceName;
}

The above Nix expression simply states that we want to run a managed Nginx process (using certain command-line arguments) and before starting the process, we want to initialize the state by creating the log directory, if it does not exist yet.

I can translate the above specification to all kinds of configuration artifacts that can be used by a variety of process managers to accomplish the same outcome. I have developed six kinds of generators, allowing me to target the following process managers:

  • sysvinit scripts
  • BSD rc scripts
  • supervisord
  • systemd
  • launchd
  • cygrunsrv

Translating the properties of the process manager-agnostic configuration to process manager-specific properties is quite straightforward for most concepts -- in many cases, there is a direct mapping from a property in the process manager-agnostic configuration to a process manager-specific property.

For example, when we intend to target supervisord, then we can translate the process and args parameters to a command invocation. For systemd, we can translate process and args to the ExecStart property that refers to a command-line instruction that starts the process.
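
To illustrate, the process and args parameters of the Nginx example might end up in a generated supervisord program section roughly like the following (an illustrative sketch, not the generator's actual output; the Nix store paths are abbreviated):


[program:nginx]
command=/nix/store/...-nginx/bin/nginx -c /nix/store/...-nginx.conf -p /var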

Although the process manager-agnostic abstraction function supports enough features to get some well known system services working (e.g. Nginx, Apache HTTP service, PostgreSQL, MySQL etc.), it does not facilitate all possible features of each process manager -- it will provide a reasonable set of common features to get a process running and to impose some restrictions on it.

It is still possible to work around the feature limitations of process manager-agnostic deployment specifications -- we can influence the generation process by defining overrides to get process manager-specific properties supported:


{createManagedProcess, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createManagedProcess {
  name = instanceName;
  description = "Nginx";
  initialize = ''
    mkdir -p ${nginxLogDir}
  '';
  process = "${nginx}/bin/nginx";
  args = [ "-c" configFile "-p" "${stateDir}/${instanceName}" ];

  inherit dependencies instanceName;

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}

In the above example, we have added an override specifically for sysvinit to tell the init system that the process should be started in runlevels 3, 4 and 5 (which implies that the process should be stopped in the remaining runlevels: 0, 1, 2, and 6). The other process managers that I have worked with have no notion of runlevels.

Similarly, we can use an override to, for example, use systemd-specific features to run a process in a Linux namespace etc.

Simulating process manager-agnostic concepts with no direct equivalents


For some process manager-agnostic concepts, process managers do not always have direct equivalents. In such cases, there is still the possibility to apply non-trivial simulation strategies.

Foreground processes or daemons


What all deployment specifications shown in this blog post have in common is that their main objective is to bring a process in a running state. How these processes are expected to behave is different among process managers.

sysvinit and BSD rc scripts expect processes to daemonize -- on invocation, a process spawns another process that keeps running in the background (the daemon process). After the initialization of the daemon process is done, the parent process terminates. If processes do not daemonize, the execution of the startup script blocks indefinitely.

Daemons introduce another complexity from a process management perspective -- when invoking an executable from a shell session in background mode, the shell can tell you its process ID, so that it can be stopped when it is no longer necessary.

With daemons, an invoked process forks another child process (or, when it is supposed to really behave well, it double forks) that becomes the daemon process. The daemon process gets adopted by the init system, and thus remains in the background even if the shell session ends.

The shell that invokes the executable does not know the PIDs of the resulting daemon processes, because that value is only propagated to the daemon's parent process, not the calling shell session. To still be able to control it, a well-behaving daemon typically writes its process ID to a so-called PID file, so that it can be reliably terminated by a shell command when it is no longer required.

sysvinit and BSD rc scripts extensively use PID files to control daemons. By using a process' PID file, the managing sysvinit/BSD rc script can tell you whether a process is running or not and reliably terminate a process instance.
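
For example, terminating a daemon through its PID file typically boils down to a one-liner such as (assuming a conventional PID file location):


$ kill $(cat /var/run/nginx.pid)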

"More modern" process managers, such as launchd, supervisord, and cygrunsrv, do not work with processes that daemonize -- instead, these process managers are daemons themselves that invoke processes that work in "foreground mode".

One of the advantages of this approach is that services can be more reliably controlled -- because their PIDs are directly propagated to the controlling daemon from the fork() library call, it is no longer required to work with PID files, which may not always work reliably (for example: a process might abruptly terminate and never clean up its PID file, giving the system the false impression that it is still running).

systemd improves process control even further by using Linux cgroups -- although foreground processes may be controlled more reliably than daemons, they can still fork other processes (e.g. a web service that creates processes per connection). When the controlling parent process terminates and does not properly terminate its own child processes, they may keep running in the background indefinitely. With cgroups it is possible for the process manager to retain control over all processes spawned by a service and terminate them when a service is no longer needed.

systemd has another unique advantage over the other process managers -- it can work both with foreground processes and daemons, although foreground processes seem to have the preference according to the documentation, because they are much easier to control and develop.

Many common system services, such as OpenSSH, MySQL or Nginx, have the ability to both run as a foreground process and as a daemon, typically by providing a command-line parameter or defining a property in a configuration file.

To provide an optimal user experience for all supported process managers, it is typically a good thing in the process manager-agnostic deployment specification to specify both how a process can be used as a foreground process and as a daemon:


{createManagedProcess, nginx, stateDir, runtimeDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createManagedProcess {
  name = instanceName;
  description = "Nginx";
  initialize = ''
    mkdir -p ${nginxLogDir}
  '';
  process = "${nginx}/bin/nginx";
  args = [ "-p" "${stateDir}/${instanceName}" "-c" configFile ];
  foregroundProcessExtraArgs = [ "-g" "daemon off;" ];
  daemonExtraArgs = [ "-g" "pid ${runtimeDir}/${instanceName}.pid;" ];

  inherit dependencies instanceName;

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}

In the above example, we have revised the Nginx expression to specify both how the process can be started as a foreground process and as a daemon. The only thing that needs to be configured differently is one global directive in the Nginx configuration -- by default, Nginx runs as a daemon, but by adding the daemon off; directive to the configuration we can run it in foreground mode.

When we run Nginx as a daemon, we configure a PID file that refers to the instance name, so that multiple instances can co-exist.

To make this conveniently configurable, the above expression does the following:

  • The process parameter specifies the process that needs to be started both in foreground mode and as a daemon. The args parameter specifies common command-line arguments that both the foreground and daemon process will use.
  • The foregroundProcessExtraArgs parameter specifies additional command-line arguments that are only used when the process is started in foreground mode. In the above example, it is used to provide Nginx the global directive that disables the daemon setting.
  • The daemonExtraArgs parameter specifies additional command-line arguments that are only used when the process is started as a daemon. In the above example, it is used to provide Nginx a global directive with a PID file path that uniquely identifies the process instance.

For custom software and services implemented in a different language than C, e.g. Node.js, Java or Python, it is far less common that they have the ability to daemonize -- they can typically only be used as foreground processes.

Nonetheless, we can still daemonize foreground-only processes, by using an external tool, such as libslack's daemon command:


$ daemon -U -i myforegroundprocess

The above command daemonizes the foreground process and creates a PID file for it, so that it can be managed by the sysvinit/BSD rc utility scripts.

The opposite kind of "simulation" is also possible -- if a process can only be used as a daemon, then we can use a proxy process to make it appear as a foreground process:


export _TOP_PID=$$

# Handle the SIGTERM and SIGINT signals and forward them to the daemon process
_term()
{
    trap "exit 0" TERM
    kill -TERM "$pid"
    kill $_TOP_PID
}

_interrupt()
{
    kill -INT "$pid"
}

trap _term SIGTERM
trap _interrupt SIGINT

# Start process in the background as a daemon
${executable} "$@"

# Wait for the PID file to become available.
# Useful to work with daemons that don't behave well enough.
count=0

while [ ! -f "${_pidFile}" ]
do
    if [ $count -eq 10 ]
    then
        echo "The PID file does not seem to become available! Giving up!"
        exit 1
    fi

    echo "Waiting for ${_pidFile} to become available..."
    sleep 1

    count=$((count + 1))
done

# Determine the daemon's PID by using the PID file
pid=$(cat ${_pidFile})

# Wait in the background for the PID to terminate
${if stdenv.isDarwin then ''
  lsof -p $pid +r 3 &>/dev/null &
'' else if stdenv.isLinux || stdenv.isCygwin then ''
  tail --pid=$pid -f /dev/null &
'' else if stdenv.isBSD || stdenv.isSunOS then ''
  pwait $pid &
'' else
  throw "Don't know how to wait for process completion on system: ${stdenv.system}"}

# Wait for the blocker process to complete.
# We use wait, so that bash can still
# handle the SIGTERM and SIGINT signals that may be sent to it by
# a process manager
blocker_pid=$!
wait $blocker_pid

The idea of the proxy script shown above is that it runs as a foreground process as long as the daemon process is running and relays any relevant incoming signals (e.g. a terminate and interrupt) to the daemon process.

Implementing this proxy was a bit tricky:

  • In the beginning of the script we configure signal handlers for the TERM and INT signals so that the process manager can terminate the daemon process.
  • We must start the daemon and wait for it to become available. Although the parent process of a well-behaving daemon should only terminate when the initialization is done, this turns out not to be a hard guarantee -- to make the process a bit more robust, we deliberately wait for the PID file to become available before we attempt to wait for the termination of the daemon.
  • Then we wait for the PID to terminate. The bash shell has an internal wait command that can be used to wait for a background process to terminate, but this only works with processes in the same process group as the shell. Daemons are in a new session (with different process groups), so they cannot be monitored by the shell by using the wait command.

    From this Stackoverflow article, I learned that we can use the tail command of GNU Coreutils, or lsof on macOS/Darwin, and pwait on BSDs and Solaris/SunOS to monitor processes in other process groups.
  • When a command is being executed by a shell script (e.g. in this particular case: tail, lsof or pwait), the shell script can no longer respond to signals until the command completes. To still allow the script to respond to signals while it is waiting for the daemon process to terminate, we must run the previous command in background mode, and we use the wait instruction to block the script. While a wait command is running, the shell can respond to signals.

The generator function will automatically pick the best solution for the selected target process manager -- this means that when our target process managers are sysvinit or BSD rc scripts, the generator automatically picks the configuration settings that run the process as a daemon. For the remaining process managers, the generator picks the configuration settings that run it as a foreground process.

If a desired process model is not supported, then the generator will automatically simulate it. For instance, if we have a foreground-only process specification, then the generator will automatically configure a sysvinit script to call the daemon executable to daemonize it.

A similar process happens when a daemon-only process specification is deployed for a process manager that cannot work with it, such as supervisord.

State initialization


Another important aspect in process deployment is state initialization. Most system services require the presence of state directories in which they can store their PID, log and temp files. If these directories do not exist, the service may not work and refuse to start.

To cope with this problem, I typically make processes self-initializing -- before starting the process, I check whether the state has been initialized (e.g. check if the state directories exist) and initialize it if needed.

With most process managers, state initialization is easy to facilitate. For sysvinit and BSD rc scripts, we just use the generator to first execute the shell commands to initialize the state before the process gets started.

Supervisord allows you to execute multiple shell commands in a single command directive -- we can just execute a script that initializes the state before we execute the process that we want to manage.

systemd has an ExecStartPre directive that can be used to specify shell commands to execute before the main process starts.

Apple launchd and cygrunsrv, however, do not have a generic shell execution mechanism or some facility allowing you to execute things before a process starts. Nonetheless, we can still ensure that the state is going to be initialized by creating a wrapper script -- first the wrapper script does the state initialization and then executes the main process.

If a state initialization procedure was specified and the target process manager does not support scripting, then the generator function will transparently wrap the main process into a wrapper script that supports state initialization.
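
Conceptually, such a generated wrapper script could look like the following sketch (not the framework's actual output; the Nix store paths are abbreviated):


#!/bin/sh -e

# First, initialize the state...
mkdir -p /var/nginx/logs

# ...then replace the wrapper with the main process
exec /nix/store/...-nginx/bin/nginx -c /nix/store/...-nginx.conf -p /var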

Process dependencies


Another important generic concept is process dependency management. For example, Nginx can act as a reverse proxy for another web application process. To provide a functional Nginx service, we must be sure that the web application process gets activated as well, and that the web application is activated before Nginx.

If the web application process is activated after Nginx or missing completely, then Nginx is (temporarily) unable to redirect incoming requests to the web application process causing end-users to see bad gateway errors.

The process managers that I have experimented with all have a different notion of process dependencies.

sysvinit scripts can optionally declare dependencies in their comment sections. Tools that know how to interpret these dependency specifications can use them to decide the right activation order. Systems using sysvinit typically ignore this specification, however. Instead, they work with sequence numbers in the script file names -- in each runlevel configuration directory, the file names contain a prefix (S or K) followed by two numeric digits that define the start or stop order.

supervisord does not work with dependency specifications, but every program can optionally provide a priority setting that can be used to order the activation and deactivation of programs -- lower priority numbers have precedence over higher priority numbers.

From dependency specifications in a process management expression, the generator function can automatically derive sequence numbers for process managers that require it.
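
For instance, if Nginx depends on a webapp process, the derived supervisord priorities might end up in program sections along these lines (an illustrative sketch, not the generator's actual output; store paths abbreviated):


[program:webapp]
command=/nix/store/...-webapp/bin/webapp
priority=1

[program:nginx]
command=/nix/store/...-nginx/bin/nginx -c /nix/store/...-nginx.conf
priority=2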

Similar to sysvinit scripts, BSD rc scripts can also declare dependencies in their comment sections. Contrary to sysvinit scripts, BSD rc scripts can use the rcorder tool to parse these dependencies from the comments section and automatically derive the order in which the BSD rc scripts need to be activated.

cygrunsrv also allows you to directly specify process dependencies. The Windows service manager makes sure that services get activated in the right order and that all process dependencies are activated first. The only limitation is that cygrunsrv only allows up to 16 dependencies to be specified per service.

To simulate process dependencies with systemd, we can use two properties. The Wants property can be used to tell systemd that another service needs to be activated as well. The After property can be used to specify the activation ordering.
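
For example, the generated Nginx unit could include the following lines (illustrative), telling systemd to activate the webapp unit as well, and to start it before Nginx:

[Unit]
Wants=webapp.service
After=webapp.service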

Sadly, it seems that launchd has no notion of process dependencies at all -- processes can be activated by certain events, e.g. when a kernel module was loaded or through socket activation, but it does not seem to have the ability to configure process dependencies or the activation ordering. When our target process manager is launchd, then we simply have to inform the user that proper activation ordering cannot be guaranteed.

Changing user privileges


Another general concept, that has subtle differences in each process manager, is changing user privileges. Typically for the deployment of system services, you do not want to run these services as root user (that has full access to the filesystem), but as an unprivileged user.

sysvinit and BSD rc scripts have to change users through the su command. The su command can be used to change the user ID (UID), and will automatically adopt the primary group ID (GID) of the corresponding user.
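
For example, a generated rc script could drop privileges roughly as follows (an illustrative sketch with a hypothetical webapp user and executable):

su webapp -c '/path/to/webapp'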

Supervisord and cygrunsrv can also only change user IDs (UIDs), and will adopt the primary group ID (GID) of the corresponding user.

Systemd and launchd can both change the user IDs and group IDs of the processes that they invoke.

Because only changing UIDs is universally supported amongst process managers, I did not add a configuration property that allows you to change GIDs in a process manager-agnostic way.

Deploying process manager-agnostic configurations


With a processes Nix expression, we can define which process instances we want to run (and how they can be constructed from source code and their dependencies):


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };
  };

  nginxReverseProxy = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxy {
      webapps = [ webapp ];
      inherit port;
    } {};
  };
}

In the above Nix expression, we compose two running process instances:

  • webapp is a trivial web application process that will simply return a static HTML page by using the HTTP protocol.
  • nginxReverseProxy is a Nginx server configured as a reverse proxy server. It will forward incoming HTTP requests to the appropriate web application instance, based on the virtual host name. If the virtual host name is webapp.local, then Nginx forwards the request to the webapp instance.

To generate the configuration artifacts for the process instances, we refer to a separate constructors Nix expression. Each constructor will call the createManagedProcess function abstraction (as shown earlier) to construct a process configuration in a process manager-agnostic way.
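
To give an impression of what such a constructor looks like, the following sketch shows a simplified webapp constructor (the real constructors in the nix-processmgmt repository may take more parameters):

{createManagedProcess, tmpDir}:
{port, instanceSuffix ? ""}:

let
  instanceName = "webapp${instanceSuffix}";
  webapp = import ../../webapp;
in
createManagedProcess {
  name = instanceName;
  description = "Simple web application";
  inherit instanceName;

  # The web application can run both in foreground and daemon mode;
  # the generator picks whatever mode the process manager prefers
  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
    PID_FILE = "${tmpDir}/${instanceName}.pid";
  };
}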

With the following command-line instruction, we can generate sysvinit scripts for the webapp and Nginx processes declared in the processes expression, and run them as an unprivileged user with the state files managed in our home directory:


$ nixproc-build --process-manager sysvinit \
  --state-dir /home/sander/var \
  --force-disable-user-change processes.nix

By adjusting the --process-manager parameter we can also generate artefacts for a different process manager. For example, the following command will generate systemd unit config files instead of sysvinit scripts:


$ nixproc-build --process-manager systemd \
  --state-dir /home/sander/var \
  --force-disable-user-change processes.nix

The following command will automatically build and deploy all processes, using sysvinit as a process manager:


$ nixproc-sysvinit-switch --state-dir /home/sander/var \
  --force-disable-user-change processes.nix

We can also run a life-cycle management activity on all previously deployed processes. For example, to retrieve the statuses of all processes, we can run:


$ nixproc-sysvinit-runactivity status

We can also traverse the processes in reverse dependency order. This is particularly useful to reliably stop all processes, without breaking any process dependencies:


$ nixproc-sysvinit-runactivity -r stop

Similarly, there are command-line tools to use the other supported process managers. For example, to deploy systemd units instead of sysvinit scripts, you can run:


$ nixproc-systemd-switch processes.nix

Distributed process manager-agnostic deployment with Disnix


As shown in the previous process management framework blog post, it is also possible to deploy processes to machines in a network and have inter-dependencies between processes. These kinds of deployments can be managed by Disnix.

Compared to the previous blog post (in which we could only deploy sysvinit scripts), we can now also use any process manager that the framework supports. The Dysnomia toolset provides plugins that support all process managers that this framework supports:


{ pkgs, distribution, invDistribution, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange processManager;
  };

  processType =
    if processManager == "sysvinit" then "sysvinit-script"
    else if processManager == "systemd" then "systemd-unit"
    else if processManager == "supervisord" then "supervisord-program"
    else if processManager == "bsdrc" then "bsdrc-script"
    else if processManager == "cygrunsrv" then "cygrunsrv-service"
    else throw "Unknown process manager: ${processManager}";
in
rec {
  webapp = rec {
    name = "webapp";
    port = 5000;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
    };
    type = processType;
  };

  nginxReverseProxy = rec {
    name = "nginxReverseProxy";
    port = 8080;
    pkg = constructors.nginxReverseProxy {
      inherit port;
    };
    dependsOn = {
      inherit webapp;
    };
    type = processType;
  };
}

In the above expression, we have extended the previously shown processes expression into a Disnix service expression, in which every attribute in the attribute set represents a service that can be distributed to a target machine in the network.

The type attribute of each service indicates which Dysnomia plugin needs to manage its life-cycle. We can automatically select the appropriate plugin for our desired process manager by deriving it from the processManager parameter.

The above Disnix expression has a drawback -- in a heterogeneous network of machines (that run multiple operating systems and/or process managers), we need to compose all desired variants of each service with configuration files for each process manager that we want to use.

It is also possible to have target-agnostic services, by delegating the translation steps to the corresponding target machines. Instead of directly generating a configuration file for a process manager, we generate a JSON specification containing all parameters that are passed to createManagedProcess. We can use this JSON file to build the corresponding configuration artefacts on the target machine:


{ pkgs, distribution, invDistribution, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? null
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    name = "webapp";
    port = 5000;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
    };
    type = "managed-process";
  };

  nginxReverseProxy = rec {
    name = "nginxReverseProxy";
    port = 8080;
    pkg = constructors.nginxReverseProxy {
      inherit port;
    };
    dependsOn = {
      inherit webapp;
    };
    type = "managed-process";
  };
}

In the above services model, we have set the processManager parameter to null, causing the generator to emit JSON representations of the function parameters passed to createManagedProcess.

The managed-process type refers to a Dysnomia plugin that consumes the JSON specification and invokes the createManagedProcess function to convert the JSON configuration to a configuration file used by the preferred process manager.
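
To give an impression, such a JSON specification could roughly look as follows (a hypothetical sketch; the actual fields correspond to the parameters of createManagedProcess):

{
  "name": "webapp",
  "description": "Simple web application",
  "process": "/nix/store/...-webapp/bin/webapp",
  "daemonArgs": [ "-D" ],
  "environment": {
    "PORT": 5000
  }
}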

In the infrastructure model, we can configure the preferred process manager for each target machine:


{
  test1 = {
    properties = {
      hostname = "test1";
    };
    containers = {
      managed-process = {
        processManager = "sysvinit";
      };
    };
  };

  test2 = {
    properties = {
      hostname = "test2";
    };
    containers = {
      managed-process = {
        processManager = "systemd";
      };
    };
  };
}

In the above infrastructure model, the managed-process container on the first machine (test1) has been configured to use sysvinit scripts to manage processes. On the second test machine (test2), the managed-process container is configured to use systemd to manage processes.

If we distribute the services in the services model to targets in the infrastructure model as follows:


{infrastructure}:

{
  webapp = [ infrastructure.test1 ];
  nginxReverseProxy = [ infrastructure.test2 ];
}

and then deploy the system as follows:


$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Then the webapp process will be distributed to the test1 machine in the network and will be managed with a sysvinit script.

The nginxReverseProxy will be deployed to the test2 machine and managed as a systemd job. The Nginx reverse proxy forwards incoming connections to the webapp.local domain name to the web application process hosted on the first machine.

Discussion


In this blog post, I have introduced a process manager-agnostic function abstraction making it possible to target all kinds of process managers on a variety of operating systems.

By using a single set of declarative specifications, we can:

  • Target six different process managers on four different kinds of operating systems.
  • Implement various kinds of deployment scenarios: production deployments, and test deployments as an unprivileged user.
  • Construct multiple instances of processes.

In a distributed context, the advantage is that we can uniformly target all supported process managers and operating systems in a heterogeneous environment from a single declarative specification.

This is particularly useful to facilitate technology diversity -- for example, one of the key selling points of Microservices is that "any technology" can be used to implement them. In many cases, technology diversity is "restricted" to frameworks, programming languages, and storage technologies.

One particular aspect that is rarely changed is the choice of operating systems, because of the limitations of deployment tools -- most deployment solutions for Microservices are container-based and heavily rely on Linux-only concepts, such as Namespaces and cgroups.

With this process management framework and the recent Dysnomia plugin additions for Disnix, it is possible to target all kinds of operating systems that support the Nix package manager, making the operating system a selectable component as well. This allows you to pick the operating system that best implements a certain requirement -- for example, when performance is important you might pick Linux, and when there is a strong emphasis on security, you could pick OpenBSD to host a mission critical component.

Limitations


The following table summarizes the differences between the process manager solutions that I have investigated:

|  | sysvinit | bsdrc | supervisord | systemd | launchd | cygrunsrv |
| --- | --- | --- | --- | --- | --- | --- |
| Process type | daemon | daemon | foreground | foreground, daemon | foreground | foreground |
| Process control method | PID files | PID files | Process PID | cgroups | Process PID | Process PID |
| Scripting support | yes | yes | yes | yes | no | no |
| Process dependency management | Numeric ordering | Dependency-based | Numeric ordering | Dependency-based + dependency loading | None | Dependency-based + dependency loading |
| User changing capabilities | user | user | user and group | user and group | user and group | user |
| Unprivileged user deployments | yes* | yes* | yes | yes* | no | no |
| Operating system support | Linux | FreeBSD, OpenBSD, NetBSD | Many UNIX-like: Linux, macOS, FreeBSD, Solaris | Linux (+glibc) only | macOS (Darwin) | Windows (Cygwin) |

Although we can facilitate lifecycle management from a common specification with a variety of process managers, only the most important common features are supported.

Not every concept can be done in a process manager-agnostic way. For example, we cannot generically do any isolation of resources (except for packages, because we use Nix). It is difficult to generalize these concepts because they are not standardized -- e.g. the POSIX standard does not describe namespaces and cgroups (or similar concepts).

Furthermore, most process managers (with the exception of supervisord) are operating system specific. As a result, it still matters what process manager is picked.

Related work


Process manager-agnostic deployment is not entirely a new idea. Dysnomia has had a target-agnostic 'process' plugin for quite a while, that translates a simple deployment specification (consisting of key-value pairs) to a systemd unit configuration file or sysvinit script.

The features of Dysnomia's process plugin are much more limited compared to the createManagedProcess abstraction function described in this blog post. It does not support any process managers other than sysvinit and systemd, and it can only work with foreground processes.

Furthermore, target-agnostic configurations cannot be easily extended -- it is possible to (ab)use the templating mechanism, but it has no first-class override facilities.

I also found a project called pleaserun that has the objective to generate configuration files for a variety of process managers (both my approach and pleaserun support sysvinit scripts, systemd and launchd).

It seems to use template files to generate the configuration artefacts, and it does not seem to have a generic extension mechanism. Furthermore, it provides no framework to configure the location of shared resources, automatically install package dependencies or to compose multiple instances of processes.

Some remaining thoughts


Although the Nix package manager (not the NixOS distribution) should be portable amongst a variety of UNIX-like systems, it turns out that the only two operating systems that are well supported are Linux and macOS. Nix was reported to work on a variety of other UNIX-like systems in the past, but recently it seems that many things are broken.

To make Nix work on FreeBSD 12.1, I have used the latest stable Nix package manager version with patches from this repository. It turns out that there is still a patch missing to work around a bug in FreeBSD that incorrectly kills all processes in a process group. Fortunately, when we run Nix as an unprivileged user, this bug does not seem to cause any serious problems.

Recent versions of Nixpkgs turn out to be horribly broken on FreeBSD -- the FreeBSD stdenv does not seem to work at all. I tried switching back to stdenv-native (a stdenv environment that impurely uses the host system's compiler and core executables), but that also no longer seems to work in the last three major releases -- the Nix expression evaluation breaks in several places. Due to the sheer amount of changes and assumptions that the stdenv infrastructure currently makes, it was as good as impossible for me to fix the infrastructure.

As another workaround, I reverted to a very old version of Nixpkgs (version 17.03, to be precise), that still has a working stdenv-native environment. With some tiny adjustments (e.g. adding some shell aliases for some GNU variants of certain shell executables to stdenv-native), I have managed to get some basic Nix packages working, including Nginx on FreeBSD.

Surprisingly, running Nix on Cygwin was less painful than FreeBSD (because of all the GNUisms that Cygwin provides). Similar to FreeBSD, recent versions of Nixpkgs also appear to be broken, including the Cygwin stdenv environment. By reverting back to release-18.03 (that still has a somewhat working stdenv for Cygwin), I have managed to build a working Nginx version.

As a future improvement to Nixpkgs, I would like to propose a testing solution for stdenv-native. Although I understand that it is difficult to dedicate manpower to maintain all unconventional Nix/Nixpkgs ports, stdenv-native is something that we can conveniently test on Linux and prevent from breaking in the future.

Availability


The latest version of my experimental Nix-based process framework, that includes the process manager-agnostic configuration function described in this blog post, can be obtained from my GitHub page.

In addition, the repository also contains some example cases, including the web application system described in this blog post, and a set of common system services: MySQL, Apache HTTP server, PostgreSQL and Apache Tomcat.

Deploying container and application services with Disnix

As described in many previous blog posts, Disnix's purpose is to deploy service-oriented systems -- systems that can be decomposed into inter-connected service components, such as databases, web services, web applications and processes -- to networks of machines.

To use Disnix effectively, two requirements must be met:

  • A system must be decomposed into independently deployable services, and these services must be packaged with Nix.
  • Services may require other services that provide environments with essential facilities to run them. In Disnix terminology, these environments are called containers. For example, to host a MySQL database, Disnix requires a MySQL DBMS as a container, to run a Java web application archive you need a Java Servlet container, such as Apache Tomcat, and to run a daemon it needs a process manager, such as systemd, launchd or supervisord.

Disnix was originally designed to only deploy the (functional) application components (called services in Disnix terminology) of which a service-oriented system consists, but it was not designed to handle the deployment of any underlying container services.

In my PhD thesis, I called Disnix's problem domain service deployment. Another problem domain that I identified was infrastructure deployment that concerns the deployment of machine configurations, including container services.

The fact that these problem domains are separated means that, if we want to fully deploy a service-oriented system from scratch, we basically need to do infrastructure deployment first, e.g. install a collection of machines with system software and these container services, such as MySQL and Apache Tomcat, and once that is done, we can use these machines as deployment targets for Disnix.

There are a variety of solutions available to automate infrastructure deployment. Most notably, NixOps can be used to automatically deploy networks of NixOS configurations, and (if desired) automatically instantiate virtual machines in a cloud/IaaS environment, such as Amazon EC2.

Although combining NixOps for infrastructure deployment with Disnix for service deployment works great in many scenarios, there are still a number of concerns that are not adequately addressed:

  • Infrastructure and service deployment are still two (somewhat) separated processes. Although I have developed an extension toolset (called DisnixOS) to combine Disnix with the deployment concepts of NixOS and NixOps, we still need to run two kinds of deployment procedures. Ideally, it would be nice to fully automate the entire deployment process with only one command.
  • Although NixOS (and NixOps, that extends NixOS' concepts to networks of machines and the cloud) do a great job in fully automating the deployments of machines, we can only reap their benefits if we can permit ourselves to use NixOS, which is a particular Linux distribution flavour -- sometimes you may need to deploy services to conventional Linux distributions, or different kinds of operating systems (after all, one of the reasons to use service-oriented systems is to be able to use a diverse set of technologies).

    The Nix package manager also works on other operating systems than Linux, such as macOS, but there is no Nix-based deployment automation solution that can universally deploy infrastructure components to other operating systems (the only other infrastructure deployment solution that provides similar functionality to NixOS is the nix-darwin repository, that can only be used on macOS).
  • The NixOS module system does not facilitate the deployment of multiple instances of infrastructure components. Although this is probably a very uncommon use case, it is also possible to run two MySQL DBMS services on one machine and use both of them as Disnix deployment targets for databases.

In a Disnix-context, services have no specific meaning or shape and can basically represent anything -- a satellite tool providing a plugin system (called Dysnomia) takes care of most of their deployment steps, such as their activation and deactivation.

A couple of years ago, I have demonstrated with a proof of concept implementation that we can use Disnix and Dysnomia's features to deploy infrastructure components. This deployment approach is also capable of deploying multiple instances of container services to one machine.

Recently, I have revisited that idea again and extended it so that we can now deploy a service-oriented system including most underlying container services with a single command-line instruction.

About infrastructure deployment solutions


As described in the introduction, Disnix's purpose is service deployment and not infrastructure deployment. In the past, I have been using a variety of solutions to manage the underlying infrastructure of service-oriented systems:

  • In the very beginning, while working on my master thesis internship (in which I built the first prototype version of Disnix), there was not much automation at all -- for most of my testing activities I manually created VirtualBox virtual machines and manually installed NixOS on them, with all essential container servers, such as Apache Tomcat and MySQL, because these were the container services that my target system required.

    Even after some decent Nix-based automated solutions appeared, I still ended up doing manual deployments for non-NixOS machines. For example, I still remember the steps I had to perform to prepare myself for the demo I gave at NixCon 2015, in which I configured a small heterogeneous network consisting of an Ubuntu, NixOS, and Windows machine. It took me many hours of preparation time to get the demo right.
  • Some time later, for a research paper about declarative deployment and testing, we have developed a tool called nixos-deploy-network that deploys NixOS configurations in a network of machines and is driven by a networked NixOS configuration file.
  • Around the same time, I have also developed a similar tool called: disnixos-deploy-network that uses Disnix's deployment mechanisms to remotely deploy a network of NixOS configurations. It was primarily developed to show that Disnix's plugin system: Dysnomia, could also treat entire NixOS configurations as services.
  • When NixOps appeared (initially it was called Charon), I have also created facilities in the DisnixOS toolset to integrate with it -- for example DisnixOS can automatically convert a NixOps configuration to a Disnix infrastructure model.
  • And finally, I have created a proof of concept implementation that shows that Disnix can also treat every container service as a Disnix service and deploy it.

The idea behind the last approach is that we deploy two systems in sequential order with Disnix -- the former consisting of the container services and the latter of the application services.

For example, if we want to deploy a system that consists of a number of Java web applications and MySQL databases, such as the infamous Disnix StaffTracker example application (Java version), then we must first deploy a system with Disnix that provides the containers: the MySQL DBMS and Apache Tomcat:

$ disnix-env -s services-containers.nix \
  -i infrastructure-bare.nix \
  -d distribution-containers.nix \
  --profile containers

As described in earlier blog posts about Disnix, deployments are driven by three configuration files -- the services model captures all distributable components of which the system consists (called services in a Disnix-context), the infrastructure model captures all target machines in the network and their relevant properties, and the distribution model specifies the mappings of services in the services model to the target machines (and container services already available on the machines in the network).

All the container services in the services model shown above refer to systemd services that, in addition to running Apache Tomcat and MySQL, also do the following:

  • They bundle a Dysnomia plugin that can be used to manage the life-cycles of Java web applications and MySQL databases.
  • They bundle a Dysnomia container configuration file capturing all relevant container configuration properties, such as the MySQL TCP port the daemon listens to, and the Tomcat web application deployment directory.

For example, the Nix expression that configures Apache Tomcat has roughly the following structure:


{stdenv, dysnomia, httpPort, catalinaBaseDir, instanceSuffix ? ""}:

stdenv.mkDerivation {
  name = "simpleAppservingTomcat";
  ...
  postInstall = ''
    # Add Dysnomia container configuration file for a Tomcat web application
    mkdir -p $out/etc/dysnomia/containers
    cat > $out/etc/dysnomia/containers/tomcat-webapplication${instanceSuffix} <<EOF
    tomcatPort=${toString httpPort}
    catalinaBaseDir=${catalinaBaseDir}
    EOF

    # Copy the Dysnomia module that manages an Apache Tomcat web application
    mkdir -p $out/libexec/dysnomia
    ln -s ${dysnomia}/libexec/dysnomia/tomcat-webapplication $out/libexec/dysnomia
  '';
}

First, the Nix expression will build and configure Apache Tomcat (this is left out of the example to keep it short). After Apache Tomcat has been built and configured, the Nix expression generates the container configuration file and copies the tomcat-webapplication Dysnomia module from the Dysnomia toolset.

The disnix-env command-line instruction shown earlier deploys container services to target machines in the network, using a bare infrastructure model that does not provide any container services except the init system (which is systemd on NixOS). The profile parameter specifies a Disnix profile to tell the tool that we are deploying a different kind of system than the default.

If the command above succeeds, then we have all required container services at our disposal. The deployment architecture of the resulting system may look as follows:


In the above diagram, the light grey colored boxes correspond to machines in a network, the dark grey boxes to container environments, and white ovals to services.

As you may observe, we have deployed three services -- to the test1 machine we have deployed an Apache Tomcat service (that itself is managed by systemd), and to the test2 machine we have deployed both Apache Tomcat and the MySQL server (both their lifecycles are managed with systemd).

We can run the following command to generate a new infrastructure model that provides the properties of these newly deployed container services:

$ disnix-capture-infra infrastructure-bare.nix > infrastructure.nix

As shown earlier, the retrieved infrastructure model provides all relevant configuration properties of the MySQL and Apache Tomcat containers that we have just deployed, because they expose their configuration properties via container configuration files.
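
For example, the captured infrastructure model could roughly look as follows (an impression; the exact properties depend on the deployed container services):

{
  test1 = {
    properties = {
      hostname = "test1";
    };
    containers = {
      tomcat-webapplication = {
        tomcatPort = "8080";
        catalinaBaseDir = "/var/tomcat/webapps";
      };
    };
  };

  test2 = {
    properties = {
      hostname = "test2";
    };
    containers = {
      tomcat-webapplication = {
        tomcatPort = "8080";
        catalinaBaseDir = "/var/tomcat/webapps";
      };
      mysql-database = {
        mysqlPort = "3306";
      };
    };
  };
}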

By using the retrieved infrastructure model and running the following command, we can deploy our web application and database components:

$ disnix-env -s services.nix \
  -i infrastructure.nix \
  -d distribution.nix \
  --profile services

In the above command-line invocation, the services model contains all application components, and the distribution model maps these application components to the corresponding target machines and their containers.

As with the previous disnix-env command invocation, we provide a --profile parameter to tell Disnix that we are deploying a different system. If we were to use the same profile parameter as in the previous example, then Disnix would undeploy the container services and try to upgrade the system with the application services, which would obviously fail.

If the above command succeeds, then we have successfully deployed both the container and application services that our example system requires, resulting in a fully functional and activated system with a deployment architecture that may have the following structure:


As you may observe by looking at the diagram above, we have deployed a system that consists of a number of MySQL databases, Java web services and Java web applications.

The diagram uses the same notational conventions used in the previous diagram. The arrows denote inter-dependency relationships, telling Disnix that one service depends on another, and that dependency should be deployed first.

Exposing services as containers


The Disnix service container deployment approach that I just described works, but it is not an integrated solution -- it has a limitation that is comparable to the infrastructure and services deployment separation that I have explained earlier. It requires you to run two deployments: one for the containers and one for the services.

In the blog post that I wrote a couple of years ago, I also explained that in order to fully automate the entire process with a single command, this might eventually lead to "a layered deployment approach" -- the idea was to combine several system deployment processes into one. For example, you might want to deploy a service manager in the first layer, the container services for application components in the second, and in the third the application components themselves.

I also argued that it is probably not worth spending a lot of effort in automating multiple deployment layers -- for nearly all systems that I deployed, there were only two "layers" that I needed to keep track of: the infrastructure layer providing container services, and a service layer providing the application services. NixOps sufficed as a solution to automate the infrastructure parts for most of my use cases, except for deployment to non-NixOS machines, and deploying multiple instances of container services, which is a very uncommon use case.

However, I got inspired to revisit this problem after completing the work described in the previous blog post, in which I created a process manager-agnostic service management framework that works with a variety of process managers on a variety of operating systems.

Combining this framework with Disnix makes it possible to also easily deploy container services (most of them are daemons) to non-NixOS machines, including non-Linux machines, such as macOS and FreeBSD, from the same declarative specifications.

Moreover, this framework also provides facilities to easily deploy multiple instances of the same service to the same machine.

Revisiting this problem also made me think about the "layered approach" again, and after some thinking I have dropped the idea. The problem of using layers is that:

  • We need to develop another tool that integrates the deployment processes of all layers into one. In addition to the fact that we need to implement more automation, this introduces many additional technical challenges -- for example, if we want to deploy three layers and the deployment of the second fails, how are we going to do a rollback?
  • A layered approach is somewhat "imperative" -- each layer deploys services that include Dysnomia modules and Dysnomia container configuration files. The Disnix service on each target machine performs a lookup in the Nix profile that contains all packages of the containers layer to find the required Dysnomia modules and container configuration files.

    Essentially, Dysnomia modules and container configurations are stored in a global namespace. This means the order in which the deployment of the layers is executed is important and that each layer can imperatively modify the behaviour of each Dysnomia module.
  • Because we need to deploy the system on a layer-by-layer basis, we cannot, for example, deploy services in different layers that have no dependencies on each other in parallel, making the deployment process slower than it should be.

After some thinking, I came up with a much simpler approach -- I have introduced a new concept to the Disnix services model that makes it possible to annotate services with a specification of the container services that they provide. This information can be used by application services that need to be deployed to such a container.

For example, we can annotate the Apache Tomcat service in the Disnix services model as follows:

{ pkgs, system, distribution, invDistribution
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "systemd"
}:

let
  constructors = import ../../../nix-processmgmt/examples/services-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir cacheDir tmpDir forceDisableUserChange processManager;
  };

  # Provides the Java web application packages, such as GeolocationService
  customPkgs = import ../top-level/all-packages.nix {
    inherit system pkgs stateDir;
  };
in
rec {
  simpleAppservingTomcat = rec {
    name = "simpleAppservingTomcat";
    pkg = constructors.simpleAppservingTomcat {
      inherit httpPort;
      commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
    };
    httpPort = 8080;
    catalinaBaseDir = "/var/tomcat/webapps";
    type = "systemd-unit";
    providesContainers = {
      tomcat-webapplication = {
        httpPort = 8080;
        catalinaBaseDir = "/var/tomcat/webapps";
      };
    };
  };

  GeolocationService = {
    name = "GeolocationService";
    pkg = customPkgs.GeolocationService;
    dependsOn = {};
    type = "tomcat-webapplication";
  };

  ...
}

In the above example, the simpleAppservingTomcat service refers to an Apache Tomcat server that serves Java web applications for one particular virtual host. The providesContainers property tells Disnix that the service is a container provider, providing a container named: tomcat-webapplication with the following properties:

  • For HTTP traffic, Apache Tomcat should listen on TCP port 8080
  • The Java web application archives (WAR files) should be deployed to the Catalina Servlet container. By copying the WAR files to the /var/tomcat/webapps directory, they should be automatically hot-deployed.

The other service in the services model (GeolocationService) is a Java web application that should be deployed to an Apache Tomcat container service.

If in a Disnix distribution model, we map the Apache Tomcat service (simpleAppservingTomcat) and the Java web application (GeolocationService) to the same machine:

{infrastructure}:

{
  simpleAppservingTomcat = [ infrastructure.test1 ];
  GeolocationService = [ infrastructure.test1 ];
}

Disnix will automatically search for a suitable container service provider for each service.

In the above scenario, Disnix knows that simpleAppservingTomcat provides a tomcat-webapplication container. The GeolocationService uses the type tomcat-webapplication, indicating that it needs to be deployed to an Apache Tomcat Servlet container.

Because these services have been deployed to the same machine, Disnix will make sure that Apache Tomcat gets activated before the GeolocationService, and uses the Dysnomia module that is bundled with simpleAppservingTomcat to handle the deployment of the Java web application.

Furthermore, the properties that simpleAppservingTomcat exposes in the providesContainers attribute set, are automatically propagated as container parameters to the GeolocationService Nix expression, so that it knows where the WAR file should be copied to, to automatically hot-deploy the service.

If Disnix does not detect a service that provides a required container deployed to the same machine, then it will fall back to its original behaviour -- it automatically propagates the properties of a container in the infrastructure model, and assumes that the container service is already deployed by an infrastructure deployment solution.

Simplifications


The notation used for the simpleAppservingTomcat service (shown earlier) refers to an attribute set. An attribute set also makes it possible to specify multiple container instances. However, it is far more common that we only need one single container instance.

Moreover, there is some redundancy -- we need to specify certain properties in two places. Some properties belong both to the service itself and to the container properties that we want to propagate to the services that require them.

We can also use a shorter notation to expose only one single container:

simpleAppservingTomcat = rec {
  name = "simpleAppservingTomcat";
  pkg = constructors.simpleAppservingTomcat {
    inherit httpPort;
    commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
  };
  httpPort = 8080;
  catalinaBaseDir = "/var/tomcat/webapps";
  type = "systemd-unit";
  providesContainer = "tomcat-webapplication";
};

In the above example, we have rewritten the service configuration of simpleAppservingTomcat to use the providesContainer attribute referring to a string. This shorter notation will automatically expose all non-reserved service properties as container properties.

For our example above, this means that it will automatically expose httpPort and catalinaBaseDir, and ignore the remaining properties -- these remaining properties have a specific purpose for the Disnix deployment system.

Although the notation above simplifies things considerably, the above example still contains a bit of redundancy -- some of the container properties that we want to expose to application services, also need to be propagated to the constructor function requiring us to specify the same properties twice.

We can eliminate this redundancy by encapsulating the creation of the service properties attribute set in a constructor function. With a constructor function, we can simply write:

simpleAppservingTomcat = constructors.simpleAppservingTomcat {
  httpPort = 8080;
  commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
  type = "systemd-unit";
};

Example: deploying container and application services as one system


By applying the techniques described in the previous section to the StaffTracker example (e.g. distributing a simpleAppservingTomcat and mysql to the same machines that host Java web applications and MySQL databases), we can deploy the StaffTracker system including all its required container services with a single command-line instruction:

$ disnix-env -s services-with-containers.nix \
  -i infrastructure-bare.nix \
  -d distribution-with-containers.nix

The corresponding deployment architecture visualization may look as follows:


As you may notice, the above diagram looks very similar to the previously shown deployment architecture diagram of the services layer.

What has been added are the container services -- the ovals with the double borders denote services that are also container providers. The labels describe both the name of the service and the containers that it provides (behind the arrow ->).

Furthermore, all the services that are hosted inside a particular container environment (e.g. tomcat-webapplication) have a local inter-dependency on the corresponding container provider service (e.g. simpleAppservingTomcat), causing Disnix to activate Apache Tomcat before the web applications that are hosted inside it.

Another thing you might notice is that we have not completely eliminated the dependency on an infrastructure deployment solution -- the MySQL DBMS and Apache Tomcat services are deployed with the systemd-unit type, requiring the presence of systemd on the target system. Systemd should be provided as part of the target Linux distribution, and cannot be managed by Disnix because it runs as PID 1.

Example: deploying multiple container service instances and application services


One of my motivating reasons to use Disnix as a deployment solution for container services is to be able to deploy multiple instances of them to the same machine. This can also be done in a combined container and application services deployment approach.

To allow, for example, two instances of Apache Tomcat to co-exist on one machine, we must configure them in such a way that their resources do not conflict:

{ pkgs, system, distribution, invDistribution
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "systemd"
}:

let
  constructors = import ../../../nix-processmgmt/examples/service-containers-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir cacheDir tmpDir forceDisableUserChange processManager;
  };
in
rec {
  simpleAppservingTomcat-primary = constructors.simpleAppservingTomcat {
    instanceSuffix = "-primary";
    httpPort = 8080;
    httpsPort = 8443;
    serverPort = 8005;
    ajpPort = 8009;
    commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
    type = "systemd-unit";
  };

  simpleAppservingTomcat-secondary = constructors.simpleAppservingTomcat {
    instanceSuffix = "-secondary";
    httpPort = 8081;
    httpsPort = 8444;
    serverPort = 8006;
    ajpPort = 8010;
    commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
    type = "systemd-unit";
  };

  ...
}

The above partial services model defines two Apache Tomcat instances that have been configured to listen to different TCP ports (for example, the primary Tomcat instance listens for HTTP traffic on port 8080, whereas the secondary instance listens on port 8081), and serve web applications from different deployment directories. Because their properties do not conflict, they can co-exist on the same machine.

With the following distribution model, we can deploy multiple container providers to the same machine and distribute application services to them:

{infrastructure}:

{
  # Container providers

  mysql-primary = [ infrastructure.test1 ];
  mysql-secondary = [ infrastructure.test1 ];
  simpleAppservingTomcat-primary = [ infrastructure.test2 ];
  simpleAppservingTomcat-secondary = [ infrastructure.test2 ];

  # Application components

  GeolocationService = {
    targets = [
      { target = infrastructure.test2;
        container = "tomcat-webapplication-primary";
      }
    ];
  };
  RoomService = {
    targets = [
      { target = infrastructure.test2;
        container = "tomcat-webapplication-secondary";
      }
    ];
  };
  StaffTracker = {
    targets = [
      { target = infrastructure.test2;
        container = "tomcat-webapplication-secondary";
      }
    ];
  };
  staff = {
    targets = [
      { target = infrastructure.test1;
        container = "mysql-database-secondary";
      }
    ];
  };
  zipcodes = {
    targets = [
      { target = infrastructure.test1;
        container = "mysql-database-primary";
      }
    ];
  };
  ...
}

In the first four lines of the distribution model shown above, we distribute the container providers. As you may notice, we distribute two MySQL instances that should co-exist on machine test1 and two Apache Tomcat instances that should co-exist on machine test2.

In the remainder of the distribution model, we map Java web applications and MySQL databases to these container providers. As explained in the previous blog post about deploying multiple container service instances, if no container is specified in the distribution model, Disnix will automatically map the service to the container that has the same name as the service's type.

In the above example, we have two instances of each container service with a different name. As a result, we need to use the more verbose notation for distribution mappings to instruct Disnix to which container provider we want to deploy the service.
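
For example, if there had been only one MySQL container provider exposing a container named mysql-database, we could have used the short notation and let Disnix map the database automatically:

staff = [ infrastructure.test1 ];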

Deploying the system with the following command-line instruction:

$ disnix-env -s services-with-multicontainers.nix \
  -i infrastructure-bare.nix \
  -d distribution-with-multicontainers.nix

results in a running system that may have the following deployment architecture:


As you may notice, we have MySQL databases and Java web applications distributed over multiple container providers residing on the same machine. All services belong to the same system, deployed by a single Disnix command.

A more extreme example: multiple process managers


By exposing services as container providers in Disnix, my original requirements were met. Because the facilities are very flexible, I also discovered that there is much more I could do.

For example, on more primitive systems that do not have systemd, I could also extend the services and distribution models in such a way that supervisord is deployed first as a process manager (as a sysvinit-script that does not require any process manager service). supervisord is then used to manage MySQL and Apache Tomcat, and the Dysnomia plugin system deploys the databases and Java web applications to these container services managed by supervisord:


As you may notice, the deployment architecture above looks similar to the first combined deployment example, with supervisord added as an extra container provider service.

More efficient reuse: expose any kind of service as container provider


In addition to managed processes (which the MySQL DBMS and Apache Tomcat services are), any kind of Disnix service can act as a container provider.

An example of such a non-process managed container provider could be Apache Axis2. In the StaffTracker example, all data access is provided by web services. These web services are implemented as Java web applications (WAR files) embedding an Apache Axis2 container that embeds an Axis2 Application Archive (AAR file) providing the web service implementation.

Every web application that is a web service includes its own implementation of Apache Axis2.

It is also possible to deploy a single Axis2 web application to Apache Tomcat, and treat each Axis2 Application Archive as a separate deployment unit -- the shared Axis2 web application then acts as a container provider for any service of the type axis2-webservice:

{ pkgs, system, distribution, invDistribution
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "systemd"
}:

let
  constructors = import ../../../nix-processmgmt/examples/service-containers-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir cacheDir tmpDir forceDisableUserChange processManager;
  };

  customPkgs = import ../top-level/all-packages.nix {
    inherit system pkgs stateDir;
  };
in
rec {
  ### Container providers

  simpleAppservingTomcat = constructors.simpleAppservingTomcat {
    httpPort = 8080;
    commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
    type = "systemd-unit";
  };

  axis2 = customPkgs.axis2 {};

  ### Web services

  HelloService = {
    name = "HelloService";
    pkg = customPkgs.HelloService;
    dependsOn = {};
    type = "axis2-webservice";
  };

  HelloWorldService = {
    name = "HelloWorldService";
    pkg = customPkgs.HelloWorldService;
    dependsOn = {
      inherit HelloService;
    };
    type = "axis2-webservice";
  };

  ...
}

In the above partial services model, we have defined two container providers:

  • simpleAppservingTomcat that provides a Servlet container in which Java web applications (WAR files) can be hosted.
  • The axis2 service is a Java web application that acts as a container provider for Axis2 web services.

The remaining services are Axis2 web services that can be embedded inside the shared Axis2 container.

If we deploy the above example system, e.g.:

$ disnix-env -s services-optimised.nix \
  -i infrastructure-bare.nix \
  -d distribution-optimised.nix

then the result may be the following deployment architecture:


As may be observed when looking at the above architecture diagram, the web services deployed to the test2 machine, use a shared Axis2 container, that is embedded as a Java web application inside Apache Tomcat.

The above system has a far better degree of reuse, because it does not use redundant copies of Apache Axis2 for each web service.

Although it is possible to have a deployment architecture with a shared Axis2 container, this shared approach is not always desirable to use. For example, database connections managed by Apache Tomcat are shared between all web services embedded in an Axis2 container, which is not always desirable from a security point of view.

Moreover, an unstable web service embedded in an Axis2 container might also tear the container down causing the other web services to crash as well. Still, the deployment system does not make it difficult to use a shared approach, when it is desired.

Conclusion


With this new feature addition to Disnix -- exposing services as container providers -- it becomes possible to deploy both container services and application services as one integrated system.

Furthermore, it also makes it possible to:

  • Deploy multiple instances of container services and deploy services to them.
  • For process-based container services, we can use the process manager-agnostic framework described in the previous blog post, so that we can manage them with any process manager on any operating system that the framework supports.

The fact that Disnix can now also deploy containers does not mean that it no longer relies on external infrastructure deployment solutions. For example, you still need target machines at your disposal that have Nix and Disnix installed and that are remotely connectable, e.g. through SSH. For this, you still require an external infrastructure deployment solution, such as NixOps.

Furthermore, not all container services can be managed by Disnix. For example, systemd, which runs as a system's PID 1, cannot be installed by Disnix. Instead, it must already be provided by the target system's Linux distribution (in NixOS' case it is Nix that deploys it, but it is not managed by Disnix).

And there may also be other reasons why you may still want to use separated deployment processes for container and service deployment. For example, you may want to deploy to container services that cannot be managed by Nix/Disnix, or you may work in an organization in which two different teams take care of the infrastructure and the services.

Availability


The new features described in this blog post are part of the current development versions of Dysnomia and Disnix that can be obtained from my GitHub page. These features will become generally available in the next release.

Moreover, I have extended all my public Disnix examples with container deployment support (including the Java-based StaffTracker and composition examples shown in this blog post). These changes currently reside in the servicesascontainers Git branches.

The nix-processmgmt repository contains shared constructor functions for all kinds of system services, e.g. MySQL, Apache HTTP server, PostgreSQL and Apache Tomcat. These functions can be reused amongst all kinds of Disnix projects.

Deploying heterogeneous service-oriented systems locally with Disnix

In the previous blog post, I have shown a new useful application area that is built on top of the combination of my experimental Nix-based process management framework and Disnix.

Both of these underlying solutions have a number of similarities -- as their names obviously suggest, they both strongly depend on the Nix package manager to deploy all their package dependencies and static configuration artifacts, such as configuration files.

Furthermore, they are both driven by models written in the Nix expression language to automate the deployment processes of entire systems.

These models are built on a number of simple conventions that are frequently used in the Nix packages repository:

  • All units of which a system consists are defined as Nix expressions declaring a function. Each function parameter refers to a dependency or configuration property required to construct the unit from its sources.
  • To compose a particular variant of a unit, we must invoke the function that builds and configures the unit with parameters providing the dependencies and configuration properties that the unit needs.
  • To make all units conveniently accessible from a single location, the content of the configuration units is typically blended into a symlink tree called a Nix profile.

Besides these commonalities, their main difference is that the process management framework is specifically designed as a solution for systems that are composed out of running processes (i.e. daemons in UNIX terminology).

This framework makes it possible to construct multiple instances of running processes, isolate their resources (by avoiding conflicting resource configuration properties), and manage running processes with a variety of process management solutions, such as sysvinit scripts, BSD rc scripts, systemd, launchd and supervisord.

The process management framework is quite useful for single machine deployments and local experimentation, but it does not do any distributed deployment or heterogeneous service deployment -- it cannot (at least not conveniently) deploy units that are not daemons, such as databases, Java web applications deployed to a Servlet container, PHP applications deployed to a PHP-enabled web server, etc.

Disnix is a solution to automate the deployment processes of service-oriented systems -- distributed systems that are composed of components, using a variety of technologies, into a network of machines.

To accomplish full automation, Disnix integrates and combines a number of activities and tools, such as Nix for package management and Dysnomia for state management (Dysnomia takes care of the activation, deactivation steps for services, and can optionally manage snapshots and restores of state). Dysnomia provides a plugin system that makes it possible to manage a variety of component types, including processes and databases.

Disnix and Dysnomia can also include the features of the Nix process management framework for the deployment of services that are running processes, if desired.

The scope of Disnix is quite broad in comparison to the process management framework, but it can also be used to automate all kinds of sub problems. For example, it can also be used as a remote package deployment solution to build and deploy packages in a network of heterogeneous machines (e.g. Linux and macOS).

After comparing the properties of both deployment solutions, I have identified another interesting sub use case for Disnix -- deploying heterogeneous service-oriented systems (that are composed out of components using a variety of technologies) locally for experimentation purposes.

In this blog post, I will describe how Disnix can be used for local deployments.

Motivating example: deploying a Java-based web application and web service system


One of the examples I have shown in the previous blog post is an over-engineered Java-based web application and web service system whose only purpose is to display the string: "Hello world!".

The "Hello" string is returned by the HelloService and consumed by another service called HelloWorldService that composes the sentence "Hello world!" from the first message. The HelloWorld web application is the front-end responsible for displaying the sentence to the end user.

When deploying the system to a single target machine, it could have the following deployment architecture:


In the architecture diagram shown above, ovals denote services, arrows inter-dependency relationships (requiring that a service gets activated before another), the dark grey colored boxes container environments, and the light grey colored box a machine (which is only one machine in the above example).

As you may notice, only one service in the diagram shown above is a daemon, namely Apache Tomcat (simpleAppservingTomcat) that can be managed by the experimental Nix process management framework.

The remainder of the services have a different kind of form -- the web application front-end (HelloWorld) is a Java web application that is embedded in Catalina, the Servlet container that comes with Apache Tomcat. The web services are Axis2 archives that are deployed to the Axis2 container (that in turn is a web application managed by Apache Tomcat).

In the previous blog post, I have shown that we can deploy and distribute these services over a small network of machines.

It is also possible to completely deploy this system locally, without any external physical or virtual machines, or any network connectivity.

Configuring the client interface for local deployment


To execute deployment tasks remotely, Disnix invokes an external process that is called a client interface. By default, Disnix uses the disnix-ssh-client that remotely executes commands via SSH and transfers data via SCP.

It is also possible to use alternative client interfaces so that different communication protocols and methods can be used. For example, there is also an external package that provides a SOAP client (disnix-soap-client) and a NixOps client (disnix-nixops-client).

Communication with a local Disnix service instance can also be done with a client interface. For example, configuring the following environment variable:

$ export DISNIX_CLIENT_INTERFACE=disnix-client

instructs the Disnix tools to use the D-Bus client to communicate with a local Disnix service instance.

It is also possible to bypass the local Disnix service and directly execute all deployment activities with the following interface:

$ export DISNIX_CLIENT_INTERFACE=disnix-runactivity

The disnix-runactivity client interface is particularly useful for single-user/unprivileged user deployments. With the D-Bus client, you need a Disnix D-Bus daemon running in the background that authorizes the user to execute deployments. With disnix-runactivity, nothing is required beyond a single-user Nix installation.

Deploying the example system locally


As explained in earlier blog posts about Disnix, deployments are driven by three kinds of deployment specifications: a services model that captures all the services of which a system consists and how they depend on each other, an infrastructure model that captures all available target machines and their relevant configuration properties (including so-called container services that can host application services), and a distribution model that maps services in the services model to target machines in the infrastructure model (and to container services that a machine may provide).

Normally, Disnix deploys services to remote machines defined in the infrastructure model. For local deployments, we simply need to provide an infrastructure model with only one entry:

{
  localhost.properties.hostname = "localhost";
}

In the distribution model, we must map all services to the localhost target:

{infrastructure}:

{
  simpleAppservingTomcat = [ infrastructure.localhost ];
  axis2 = [ infrastructure.localhost ];

  HelloService = [ infrastructure.localhost ];
  HelloWorldService = [ infrastructure.localhost ];
  HelloWorld = [ infrastructure.localhost ];
}

With the above infrastructure and distribution model that facilitates local deployment, and the services model of the example system shown above, we can deploy the entire system on our local machine:

$ disnix-env -s services.nix -i infrastructure-local.nix -d distribution-local.nix

Deploying the example system locally as an unprivileged user


The deployment scenario shown earlier supports local deployment, but still requires super-user privileges. For example, to deploy Apache Tomcat, we must have write access to the state directory (/var) to configure Apache Tomcat's state and deploy the Java web application archives. An unprivileged user typically lacks the permissions to perform modifications in the /var directory.

One of the key features of the Nix process management framework is that it makes all state directories configurable. State directories can be changed in such a way that unprivileged users can also deploy services (e.g. by changing the state directory to a sub folder in the user's home directory).

Disnix service models can also define these process management configuration parameters:

{ pkgs, system, distribution, invDistribution
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "systemd"
}:

let
  processType =
    if processManager == null then "managed-process"
    else if processManager == "sysvinit" then "sysvinit-script"
    else if processManager == "systemd" then "systemd-unit"
    else if processManager == "supervisord" then "supervisord-program"
    else if processManager == "bsdrc" then "bsdrc-script"
    else if processManager == "cygrunsrv" then "cygrunsrv-service"
    else if processManager == "launchd" then "launchd-daemon"
    else throw "Unknown process manager: ${processManager}";

  constructors = import ../../../nix-processmgmt/examples/service-containers-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir cacheDir tmpDir forceDisableUserChange processManager;
  };

  customPkgs = import ../top-level/all-packages.nix {
    inherit system pkgs stateDir;
  };
in
rec {
  simpleAppservingTomcat = constructors.simpleAppservingTomcat {
    httpPort = 8080;
    type = processType;
  };
  ...
}

The above Nix expression shows a partial Nix services model for the Java example system. The first four function parameters: pkgs, system, distribution, and invDistribution are standard Disnix service model parameters.

The remainder of the parameters are specific to the process management framework -- they allow you to change the state directories, force disable user changing (this is useful for unprivileged user deployments) and the process manager it should use for daemons.

I have added a new command-line parameter (--extra-params) to the Disnix tools that can be used to propagate values for these additional parameters.

With the following command-line instruction, we change the base directory of the state directories to the user's home directory, force disable user changing (only a privileged user can do this), and change the process manager to sysvinit scripts:

$ disnix-env -s services.nix -i infrastructure-local.nix -d distribution-local.nix \
  --extra-params '{
    stateDir = "/home/sander/var";
    processManager = "sysvinit";
    forceDisableUserChange = true;
  }'

With the above command, we can deploy the example system completely as an unprivileged user, without requiring any process/service manager to manage Apache Tomcat.

Working with predeployed container services


In our examples so far, we have deployed systems that are entirely self contained. However, it is also possible to deploy services to container services that have already been deployed by other means. For example, you can install Apache Tomcat with your host system's package manager and use Dysnomia to integrate with it.
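
Dysnomia stores the properties of such pre-deployed container services in simple textual container configuration files. A hypothetical configuration for a Tomcat container could look as follows (the file name and property shown here are illustrative):

$ cat /etc/dysnomia/containers/tomcat-webapplication
tomcatPort=8080

The properties in these configuration files are propagated to the Dysnomia modules that manage the lifecycles of the services deployed to the corresponding container.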

To allow Disnix to deploy services to these containers, we need an infrastructure model that captures their properties. We can automatically generate such an infrastructure model from the Dysnomia container configuration files, by running:

$ disnix-capture-infra infrastructure.nix > \
infrastructure-captured.nix

and using the captured infrastructure model to locally deploy the system:

$ disnix-env -s services.nix -i infrastructure-captured.nix -d distribution-local.nix

Undeploying a system


For local experimentation, it is probably quite common that you want to completely undeploy the system as soon as you no longer need it. Normally, this is done by writing an empty distribution model and redeploying the system with it, but that is still a bit of a hassle.
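
Such an empty distribution model is nothing more than a function that returns an empty attribute set -- no services get mapped to any target, causing Disnix to deactivate everything that is currently deployed:

{infrastructure}:

{}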

In the latest development version of Disnix, an undeploy can be done with the following command-line instruction:

$ disnix-env --undeploy -i infrastructure.nix

Availability


The --extra-params and --undeploy Disnix command-line options are part of the current development version of Disnix and will become available in the next release.

Using Disnix as a simple and minimalistic dependency-based process manager

In my previous blog post I have demonstrated that I can deploy an entire service-oriented system locally with Disnix without the need of obtaining any external physical or virtual machines (or even Linux containers).

The fact that I could do this with relative ease is a benefit of using my experimental process manager-agnostic deployment framework that I have developed earlier, allowing you to target a variety of process management solutions with the same declarative deployment specifications.

Most notably, the fact that the framework can both work with processes that daemonize and let foreground processes automatically daemonize, makes it very convenient to do local unprivileged user deployments.

To refresh your memory: a process that daemonizes spawns another process that keeps running in the background while the invoking process terminates after the initialization is done. Since there is no way for the caller to know the PID of the daemon process, daemons typically follow the convention to write a PID file to disk (containing the daemon's process ID), so that it can eventually be reliably terminated.

In addition to spawning a daemon process that remains in the background, a service should also implement a number of steps to make it well-behaved, such as resetting signal handlers, clearing privacy-sensitive environment variables, and dropping privileges.

In earlier blog posts, I argued that managing foreground processes with a process manager is typically more reliable (e.g. the PID of a foreground process is always known to be correct).

On the other hand, processes that daemonize also have certain advantages:

  • They are self contained -- they do not rely on any external services to operate. This makes it very easy to run a collection of processes for local experimentation.
  • They have a standard means to notify the caller that the service is ready. By convention, the executable that spawns the daemon process is only supposed to terminate when the daemon has been successfully initialized. In contrast, foreground processes that are managed by systemd should invoke the non-standard sd_notify() function to notify systemd that they are ready.

Although these concepts are nice, properly daemonizing a process is the responsibility of the service implementer -- as a consequence, there is no guarantee that every service implements all the steps required to make a daemon well-behaved.

Since the management of daemons is straightforward and self contained, the Nix expression language provides all kinds of advantages over data-oriented configuration languages (e.g. JSON or YAML), and Disnix has a flexible deployment model that works with a dependency graph and a plugin system that can activate and deactivate all kinds of components, I realized that I could integrate these facilities to make my own simple dependency-based process manager.

In this blog post, I will describe how this process management approach works.

Specifying a process configuration


A simple Nix expression capturing a daemon deployment configuration might look as follows:

{writeTextFile, mydaemon}:

writeTextFile {
  name = "mydaemon";
  text = ''
    process=${mydaemon}/bin/mydaemon
    pidFile=/var/run/mydaemon.pid
  '';
  destination = "/etc/dysnomia/process";
}

The above Nix expression generates a textual configuration file:

  • The process field specifies the path to the executable to start (that in turn spawns a daemon process that keeps running in the background).
  • The pidFile field indicates the location of the PID file containing the process ID of the daemon process, so that it can be reliably terminated.

Most common system services (e.g. the Apache HTTP server, MySQL and PostgreSQL) can daemonize on their own and follow the same conventions. As a result, the deployment system can save you some configuration work by providing reasonable default values:

  • If no pidFile is provided, then the deployment system assumes that the daemon generates a PID file with the same name as the executable, residing in the directory that is commonly used for storing PID files: /var/run.
  • If a package provides only a single executable in the bin/ sub folder, then it is also not required to specify a process.

The fact that the configuration system provides reasonable defaults, means that for trivial services we do not have to specify any configuration properties at all -- simply providing a single executable in the package's bin/ sub folder suffices.

Do these simple configuration facilities really suffice to manage all kinds of system services? The answer is most likely no, because we may also want to manage processes that cannot daemonize on their own, or we may need to initialize some state first before the service can be used.

To provide these additional facilities, we can create a wrapper script around the executable and refer to it in the process field of the deployment specification.

The following Nix expression generates a deployment configuration for a service that requires state and only runs as a foreground process:

{stdenv, writeTextFile, writeScript, daemon, myForegroundService}:

let
  myForegroundServiceWrapper = writeScript "myforegroundservice-wrapper" ''
    #! ${stdenv.shell} -e

    mkdir -p /var/lib/myservice
    exec ${daemon}/bin/daemon -U -F /var/run/mydaemon.pid -- \
      ${myForegroundService}/bin/myservice
  '';
in
writeTextFile {
  name = "mydaemon";
  text = ''
    process=${myForegroundServiceWrapper}
    pidFile=/var/run/mydaemon.pid
  '';
  destination = "/etc/dysnomia/process";
}

As you may observe, the Nix expression shown above generates a wrapper script that does the following:

  • First, it creates the required state directory: /var/lib/myservice so that the service can work properly.
  • Then it invokes libslack's daemon command to automatically daemonize the service. The daemon command will automatically store a PID file containing the daemon's process ID, so that the configuration system knows how to terminate it. The value of the -F parameter passed to the daemon executable and the pidFile configuration property are the same.

Typically, in deployment systems that use a data-driven configuration language (such as YAML or JSON), obtaining a wrapped executable is a burden, but in the Nix expression language this is quite convenient -- the language allows you to automatically build packages and other static artifacts such as configuration files and scripts, and pass their corresponding Nix store paths as parameters to configuration files.

The combination of wrapper scripts and a simple configuration file suffices to manage all kinds of services, but it is fairly low-level -- to automate the deployment process of a system service, you basically need to re-implement the same kinds of configuration properties all over again.

In the Nix process management framework, I have developed a high-level abstraction function for creating managed processes that can be used to target all kinds of process managers:

{createManagedProcess, runtimeDir}:
{port}:

let
  webapp = import ../../webapp;
in
createManagedProcess rec {
  name = "webapp";
  description = "Simple web application";

  # This expression can both run in foreground or daemon mode.
  # The process manager can pick which mode it prefers.
  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
    PID_FILE = "${runtimeDir}/${name}.pid";
  };
}

The above Nix expression is a constructor function that generates a configuration for a web application process (with an embedded HTTP server) that returns a static HTML page.

The createManagedProcess abstraction function can be used to generate configuration artifacts for systemd, supervisord, and launchd and various kinds of scripts, such as sysvinit scripts and BSD rc scripts.

I can also easily adjust the generator infrastructure to generate the configuration files shown earlier (capturing the path of an executable and a PID file) with a wrapper script.

Managing daemons with Disnix


As explained in earlier blog posts about Disnix, services in a Disnix deployment model are abstract representations of basically any kind of deployment unit.

Every service is annotated with a type field. Disnix consults a plugin system named Dysnomia to invoke the corresponding plugin that can manage the lifecycle of that service, e.g. by activating or deactivating it.

Implementing a Dysnomia module for directly managing daemons is quite straightforward -- as an activation step I just have to start the process defined in the configuration file (or the single executable that resides in the bin/ sub folder of the package).

As a deactivation step (whose purpose is to stop a process) I simply need to send a TERM signal to the PID in the PID file, by running:

$ kill $(cat $pidFile)
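
To sketch the idea, a heavily simplified version of such a module could look as follows (hypothetical illustration code -- the real module also has to implement the default value conventions described earlier and the remaining Dysnomia activities):

#!/bin/bash -e
# Hypothetical sketch of a process Dysnomia module.
# Dysnomia passes the activity as the first parameter and the
# path to the component as the second parameter.

activity="$1"
component="$2"

# Read the process= and pidFile= settings from the textual configuration
source "$component/etc/dysnomia/process"

case "$activity" in
    activate)
        # Start the executable, that daemonizes on its own
        $process
        ;;
    deactivate)
        # Send a TERM signal to the PID in the PID file
        kill "$(cat "$pidFile")"
        ;;
esac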

Translation to a Disnix deployment specification


The last remaining bits of the puzzle are process dependency management and the translation to a Disnix services model, so that Disnix can carry out the deployment.

Deployments managed by the Nix process management framework are driven by so-called processes models that capture the properties of running process instances, such as:

{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "disnix"
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };
  };

  nginxReverseProxy = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};
  };
}

The above Nix expression is a simple example of a processes model defining two running processes:

  • The webapp process is the web application process described earlier that runs an embedded HTTP server and serves a static HTML page.
  • The nginxReverseProxy is an Nginx web server that acts as a reverse proxy server for the webapp process. To make this service work properly, it needs to be activated after the webapp process is activated. To ensure that the activation is done in the right order, webapp is passed as a process dependency to the nginxReverseProxyHostBased constructor function.

As explained in previous blog posts, Disnix deployments are driven by three kinds of deployment specifications: a services model that captures the service components of which a system consists, an infrastructure model that captures all available target machines and their configuration properties and a distribution model that maps services in the services model to machines in the infrastructure model.

The processes model and Disnix services model are quite similar -- the latter is actually a superset of the processes model.

We can translate process instances to Disnix services in a straightforward manner. For example, the nginxReverseProxy process can be translated into the following Disnix service configuration:

nginxReverseProxy = rec {
  name = "nginxReverseProxy";
  port = 8080;

  pkg = constructors.nginxReverseProxyHostBased {
    webapps = [ webapp ];
    inherit port;
  } {};

  activatesAfter = {
    inherit webapp;
  };

  type = "process";
};

In the above specification, the process configuration has been augmented with the following properties:

  • A name property because this is a mandatory field for every service.
  • In the process management framework all process instances are managed by the same process manager, but in Disnix services can have all kinds of shapes and forms and require a plugin to manage their life-cycles.

    To allow Disnix to manage daemons, we specify the type property to refer to our process Dysnomia module that starts and terminates a daemon from a simple textual specification.
  • The process dependencies are translated to Disnix inter-dependencies by using the activatesAfter property.

    In Disnix, inter-dependency parameters serve two purposes -- they provide the inter-dependent services with configuration parameters and they ensure the correct activation ordering.

    The activatesAfter parameter disregards the first purpose (providing configuration parameters), because we are already using the process management framework's convention for propagating process dependencies.

To allow Disnix to carry out the deployment of processes, a services model alone does not suffice. Since we are only interested in local deployment, we can simply provide an infrastructure model with only a localhost target and a distribution model that maps all services to localhost.

To accomplish this, we can use the same principles for local deployments described in the previous blog post.
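
For the webapp example, these two additional models could look as follows -- an infrastructure model with a single localhost target:

{
  localhost.properties.hostname = "localhost";
}

and a distribution model that maps both processes to it:

{infrastructure}:

{
  webapp = [ infrastructure.localhost ];
  nginxReverseProxy = [ infrastructure.localhost ];
}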

An example deployment scenario


I have added a new tool called nixproc-disnix-switch to the Nix process management framework that automatically converts processes models into Disnix deployment models and invokes Disnix to locally deploy a system.

The following command will carry out the complete deployment of our webapp example system, shown earlier, using Disnix as a simple dependency-based process manager:

$ nixproc-disnix-switch --state-dir /home/sander/var \
  --force-disable-user-change processes.nix

In addition to using Disnix for deploying processes, we can also use its other features. For example, another application of Disnix I typically find useful is the deployment visualization tool.

We can also use Disnix to generate a DOT graph from the deployment architecture of the currently deployed system and generate an image from it:

$ disnix-visualize > out.dot
$ dot -Tpng out.dot > out.png

Resulting in the following diagram:


In the first blog post that I wrote about the Nix process management framework (in which I explored a functional discipline using sysvinit-scripts as a basis), I was using hand-drawn diagrams to illustrate deployments.

With the Disnix backend, I can use Disnix's visualization tool to automatically generate these diagrams.

Discussion


In this blog post, I have shown that by implementing a few very simple concepts, we can use Disnix as a process management backend for the experimental Nix-based process management framework.

Although it was fun to develop a simple process management solution, my goal is not to compete with existing process management solutions (such as systemd, launchd or supervisord) -- this solution is primarily designed for simple use cases and local experimentation.

For production deployments, you probably still want to use a more sophisticated solution. For example, in production scenarios you also want to check the status of running processes and send them reload instructions. These are features that the Disnix backend does not support.

The Nix process management framework supports a variety of process managers, but none of them can be universally used on all platforms that Disnix can run on. For example, the sysvinit-script module works conveniently for local deployments, but is restricted to Linux only. Likewise, the bsdrc-script module only works on FreeBSD (and theoretically on NetBSD and OpenBSD). supervisord works on most UNIX-like systems, but is not self contained -- processes rely on the availability of the supervisord service to run.

This Disnix-based process management solution is simple and portable to all UNIX-like systems that Disnix has been tested on.

The process module described in this blog post is a replacement for the process module that already exists in the current release of Dysnomia. The reason why I want to replace it is that Dysnomia now provides better alternatives to the old process module.

For example, when it is desired to have your process managed by systemd, then the new systemd-unit module should be used that is more reliable, supports many more features and has a simpler implementation.

Furthermore, I made a couple of mistakes in the past. The old process module was originally implemented as a simple module that would start a foreground process in the background, by using the nohup command. At the time I developed that module, I did not know much about developing daemons, nor about the additional steps daemons need to carry out to make themselves well-behaved.

nohup is not a proper solution for daemonizing foreground processes, such as critical system services -- a process might inherit privacy-sensitive environment variables, it does not change the current working directory to the root folder (which may keep external drives mounted), and it could also behave unpredictably if signal handlers have been changed from the default behaviour.

At some point I believed that it is more reliable to use a process manager to manage the lifecycle of a process and adjusted the process module to do that. Originally I used Upstart for this purpose, and later I switched to systemd, with sysvinit-scripts (and the direct approach with nohup) as alternative implementations.

Basically, the process module provided three kinds of implementations, none of which provided an optimal deployment experience.

I made a similar mistake with Dysnomia's wrapper module. Originally, its only purpose was to delegate the execution of deployment activities to a wrapper script included with the component that needs to be deployed. Because I was using this script mostly to deploy daemons, I have also adjusted the wrapper module to use an external process manager to manage the lifecycle of the daemon that the wrapper script might spawn.

Because of these mistakes and poor separation of functionality, I have decided to deprecate the old process and wrapper modules. Since they are frequently used and I do not want to break compatibility with old deployments, they can still be used if Dysnomia is configured in legacy mode, which is the default setting for the time being.

When using the old modules, Dysnomia will display a warning message explaining that you should migrate to the better alternatives.

Availability


The process Dysnomia module described in this blog post is part of the current development version of Dysnomia and will become available in the next release.

The Nix process management framework (which is still a highly-experimental prototype) includes the disnix backend (described in this blog post), allowing you to automatically translate a processes model to Disnix deployment models and use Disnix to deploy a system.

A new input model transformation pipeline for Disnix (part 2)

Some time ago, I have revised the Disnix model transformation pipeline to cope with the increasing complexity of all the new features that were added to Disnix over the years.

The main reason why these model transformations are required is because Disnix takes multiple high-level models as input parameters. Each model captures a specific concern:

  • The services model captures all distributable components of which a service-oriented system consists including their inter-dependencies.
  • The infrastructure model captures all target machines in the network and all their relevant configuration properties.
  • The distribution model maps services in the services model, to target machines in the infrastructure model (and container services deployed to a machine).
  • The packages model supplements the machines with additional packages that are not required by any specific service.

The above specifications can be considered declarative specifications, because they specify relevant properties of a service-oriented system rather than the activities that need to be carried out to deploy the system, such as transferring the binary package versions of the services to the target machines in the network and the activation of the services.

The activities that need to be executed to deploy a system are derived automatically from the declarative input models.

To more easily derive these activities, Disnix translates the above input models to a single declarative specification (called a deployment model) that is also an executable specification -- in this model there is a one-to-one mapping between deployment artifacts (e.g. packages and snapshots) and deployment targets (e.g. machines and container services). For each of these mappings, we can easily derive the deployment activities that we need to execute.

In addition to a deployment model, Disnix can also optionally delegate package builds to target machines in the network by using a build model.

Transforming the input models to a deployment model is not very straightforward. To cope with the complexity, and to give the user more abilities to experiment and integrate with Disnix, it first converts the input models to two intermediate representations:

  • The deployment architecture model unifies the input models into one single declarative specification, and eliminates the distribution concern by attaching the targets in the distribution model to the corresponding services.
  • The normalized deployment architecture model augments all services and targets with missing default properties and translates/desugars high-level deployment properties into unambiguous low-level properties.

Although I am quite happy with this major revision and having these well-defined intermediate models (it allowed me to improve quality as well as expose more configuration options), the normalization of the deployment architecture model still remained relatively complicated.

With the addition of service container providers, the normalization process became so complicated that I was forced to revise it again.

"Using references" in the Nix expression language


After some thinking, I realized that my biggest problem was dealing with the absence of "references" and dynamic binding in the Nix expression language. For example, in the deployment architecture model, a service might "refer" to a target machine to indicate where it should be deployed to:


rec {
  services = {
    testServiceA = {
      name = "testServiceA";
      targets = [ infrastructure.test1 ];
      type = "process";
    };
  };

  infrastructure = {
    test1 = {
      properties = {
        hostname = "test1";
      };
      containers.process = {};
    };
  };
}

In the above deployment architecture model, the service: testServiceA has a property: targets that "refers" to machines in the infrastructure section to tell Disnix where the service should be deployed to.

Although the above notation might suggest that the element in the targets list "refers" to the property: infrastructure.test1, what happens from a logical point of view is that we attach a copy of that attribute set to the list. This is caused by the fact that in the Nix expression language, data structures are immutable -- they are not modified, but new instances are provided instead.

(As a sidenote: although the "reference" to the test1 target machine manifests itself as a logical copy to the language user, internally the Nix package manager may still use a data structure that resides in the same place in memory.

Older implementations of Nix use a technique called maximal laziness, that is based on the concept of maximal sharing, and as far as I know, current Nix implementations still share data structures in memory, although not as extensively anymore as the maximal laziness implementation).

Immutability has a number of consequences. For example, in Disnix we can override the deployment architecture model in the expression shown above to augment the target machines in the infrastructure section with some default properties (advanced machine properties that you would normally only specify for more specialized use cases):


let
  pkgs = import <nixpkgs> {};
  architecture = import ./architecture.nix;
in
pkgs.lib.recursiveUpdate architecture {
  infrastructure.test1.properties = {
    system = "x86_64-linux";
    clientInterface = "disnix-ssh-client";
    targetProperty = "hostname";
  };
}

The Nix expression takes the deployment architecture model shown in the previous code fragment, and augments the test1 target machine configuration in the infrastructure section with default values for some more specialized deployment settings:

  • The system attribute specifies the system architecture of the target machine, so that optionally Nix can delegate the build of a package to another machine that is capable of building it (the system architecture of the coordinator machine might be a different operating system, CPU architecture, or both).
  • The clientInterface refers to an executable that can establish a remote connection to the machine. By default Disnix uses SSH, but also other communication protocols can be used.
  • The targetProperty attribute refers to the property in properties that can be used as a connection string for the client interface.

Due to immutability, the result of evaluating the expression above is, logically speaking, not a modified attribute set, but a new attribute set containing the properties of the original deployment architecture with the advanced settings augmented.

The result of evaluating the expression shown above is the following:


{
  services = {
    testServiceA = {
      name = "testServiceA";
      targets = [
        {
          properties = {
            hostname = "test1";
          };
          containers.process = {};
        }
      ];
      type = "process";
    };
  };

  infrastructure = {
    test1 = {
      properties = {
        hostname = "test1";
        system = "x86_64-linux";
        clientInterface = "disnix-ssh-client";
        targetProperty = "hostname";
      };
      containers.process = {};
    };
  };
}

As you may notice, the test1 target machine configuration in the infrastructure section has been augmented with the specialized default properties shown earlier, but the "reference" (that in reality is not a reference) to the machine configuration in testServiceA still refers to the old configuration.

Simulating references


The result shown in the output above is not what I want. What I basically want is that target machine references get updated as well. To cope with this limitation in older Disnix versions, I implemented a very ad-hoc and tedious transformation strategy -- whenever I updated a target machine, I updated all the target machine "references" as well.

In early implementations of Disnix, applying this strategy was still somewhat manageable, but over the years things have grown considerably in complexity. In addition to target machines, services in Disnix can also have inter-dependency references to other services, and inter-dependencies also depend on the target machines where the dependencies have been deployed to.

Moreover, target references are not only references to target machines, but also references to container services hosted on that machine. By default, if no target container is specified, then Disnix uses the service's type property to do an automapping.

For example, the following targets mapping of testServiceA:


targets = [ infrastructure.test1 ];

is equivalent to:


targets = {
  targets = [
    { target = infrastructure.test1; container = "process"; }
  ];
};

In the revised normalization strategy, I first turn all references into reference specifications, which are composed of the attribute keys of the objects they refer to.

For example, consider the following deployment architecture model:


rec {
  services = {
    testServiceA = {
      name = "testServiceA";
      targets = [ infrastructure.test1 ];
      type = "process";
    };

    testServiceB = {
      name = "testServiceB";
      dependsOn = {
        inherit testServiceA;
      };
      targets = [ infrastructure.test1 ];
      type = "process";
    };
  };

  infrastructure = {
    test1 = {
      properties = {
        hostname = "test1";
      };
      containers.process = {};
    };
  };
}

The above model declares the following properties:

  • There is one target machine: test1 with a container service called: process. The container service is a generic container that starts and stops daemons.
  • There are two services defined: testServiceA is a service that corresponds to a running process, testServiceB is a process that has an inter-dependency on testServiceA -- in order to work properly, it needs to know how to reach it and it should be activated after testServiceA.

In the normalization process, the model shown above gets "referenized" as follows:


{
  services = {
    testServiceA = {
      name = "testServiceA";
      targets = [ { target = "test1"; container = "process"; } ];
      type = "process";
    };

    testServiceB = {
      name = "testServiceB";
      dependsOn = {
        testServiceA = {
          service = "testServiceA";
          targets = [
            { target = "test1"; container = "process"; }
          ];
        };
      };
      targets = [ { target = "test1"; container = "process"; } ];
      type = "process";
    };
  };

  infrastructure = {
    test1 = {
      properties = {
        hostname = "test1";
        system = "x86_64-linux";
        clientInterface = "disnix-ssh-client";
        targetProperty = "hostname";
      };
      containers.process = {};
    };
  };
}

In the above code fragment, the service's targets and inter-dependencies (dependsOn) get translated into reference specifications:

  • The targets property of each service gets translated into a list of attribute sets, in which the target attribute refers to the key of the target machine in the infrastructure attribute set, and the container attribute refers to the container in the containers sub attribute of the target machine.
  • The dependsOn property of each service gets translated into an inter-dependency reference specification. The service attribute refers to the key in the services section that provided the inter-dependency, and the targets refer to all the target containers where that inter-dependency is deployed to, in the same format as the targets property of a service.

    By default, an inter-dependency's targets attribute refers to the same targets as the corresponding service. It is also possible to limit the targets of a dependency to a subset.

To implement the "referenize" strategy shown above, there is a requirement imposed on the "logical units" in the model -- every attribute set representing an object (e.g. a service, or a target) needs a property that allows it to be uniquely identified, so that we can determine what the key in the attribute set is in which it is declared.

In the example above, every target can be uniquely identified by an attribute that serves as the connection string (e.g. hostname). With a lookup table (mapping connection strings to target keys) we can determine the corresponding key in the infrastructure section.

For the services, there is no unique property, so we have to introduce one: name -- the name attribute should correspond to the service attribute set's key.

(As a sidenote: introducing unique identification properties is, strictly speaking, not required -- whenever we encounter a parameter that should be treated as a reference, we could also scan the declaring attribute set for an object that has an identical structure, but this makes things more complicated and slower).
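
To illustrate the lookup table idea, the following minimal sketch (hypothetical helper code, not Disnix's actual implementation) maps connection strings back to the keys of the targets in the infrastructure section:

let
  pkgs = import <nixpkgs> {};

  infrastructure = {
    test1 = {
      properties.hostname = "test1";
      containers.process = {};
    };
  };

  # Yields: { "test1" = "test1"; } -- mapping hostnames to attribute keys
  targetKeysByConnectionString = pkgs.lib.mapAttrs' (key: target:
    pkgs.lib.nameValuePair target.properties.hostname key
  ) infrastructure;
in
targetKeysByConnectionString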

With all relevant objects turned into references, we can apply all our normalization rules, such as translating high-level properties to low-level properties and augmenting unspecified properties with reasonable default values, without requiring us to update all object "references" as well.

Consuming inter-dependency parameters in Disnix expressions


Eventually, we need to "dereference" these objects as well. Every service in the services model gets built and configured from source code and its relevant build inputs, including the service's inter-dependencies:


{stdenv, fetchurl}:
{testServiceA}:

stdenv.mkDerivation {
  name = "testServiceB";
  src = fetchurl {
    url = https://.../testServiceB.tar.gz;
    sha256 = "0242a...";
  };
  preConfigure = ''
    cat > config.json <<EOF
    targetServiceURL=http://${testServiceA.target.hostname}
    EOF

    cat > docs.txt <<EOF
    The name of the first dependency is: ${testServiceA.name}
    The system target architecture is: ${testServiceA.target.system}
    EOF
  '';
  configurePhase = "./configure --prefix=$out";
  buildPhase = "make";
  installPhase = "make install";
}

The Disnix expression shown above is used to build and configure the testServiceB from source code and all its build inputs. It follows a similar convention as ordinary Nix packages, in which a function header defines the build inputs and the body invokes a function that builds and configures the package:

  • The outer function header defines the local build and runtime dependencies of the services, also called: intra-dependencies in Disnix. In the example above, we use stdenv to compose an environment with standard build utilities and fetchurl to download a tarball from an external site.
  • The inner function header defines all the dependencies on services that may be deployed to remote machines in the network. These are called: inter-dependencies in Disnix. In this particular example, we have a dependency on service: testServiceA that is defined in the services section of the deployment architecture model.
  • In the body, we invoke the stdenv.mkDerivation function to build the service from source code and configure it in such a way that it can connect to testServiceA, by creating a config.json file that contains the connection properties. We also generate a docs.txt file that displays some configuration properties.

The testServiceA parameter refers to the normalized properties of testServiceA in the services section of the normalized deployment architecture model, in which we have "dereferenced" the properties that were previously turned into references.

We also make a few small modifications to the properties of an inter-dependency:

  • Since it is so common that a service will be deployed to a single target machine only, there is also a target property that refers to the first element in the targets list (see the example after this list).
  • The containers property of a target machine will not be accessible -- since you can map a service to one single container only, you can only use the container attribute set that provides access to the properties of the container where the service is mapped to.
  • We can also refer to transitive inter-dependencies via the connectsTo attribute, that provides access to the properties of all visible inter-dependencies in the same format.
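
For example, based on the properties described above, the following two expressions should evaluate to the same hostname (a sketch for illustration purposes):

testServiceA.target.hostname
(builtins.head testServiceA.targets).hostname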

Translation to a deployment model


A normalized deployment architecture model, in which all object references have been translated into reference specifications, can also be more easily translated to a deployment model (the XML representation of the deployment model is called a manifest file in Disnix terminology), an executable model that Disnix can use to derive the deployment activities that need to be executed:


{
  services = {
    "282ppq..." = {
      name = "testServiceA";
      ...
    };
    "q1283l..." = {
      name = "testServiceB";
      ...
    };
  };

  serviceMappings = [
    { service = "282ppq..."; container = "process"; target = "test1"; }
    { service = "q1283l..."; container = "process"; target = "test1"; }
  ];

  infrastructure = {
    test1 = {
      properties = {
        hostname = "test1";
        system = "x86_64-linux";
        clientInterface = "disnix-ssh-client";
        targetProperty = "hostname";
      };
      containers.process = {};
    };
  };
}

In the above partial deployment model, the serviceMappings list contains mappings of services in the services section, to target machines in the infrastructure section and containers as sub attributes of the target machine.

We can make a straightforward translation from the normalized deployment architecture model. The only thing that needs to be computed is the hash keys of the services, which are derived from the hashes of the Nix store paths of the package builds and some essential target properties, such as the service's type and inter-dependencies.

Discussion


In this blog post, I have described how I have reimplemented the normalization strategy of the deployment architecture model in Disnix, in which I turn sub objects that should represent references into reference specifications, then apply all the required normalizations, and finally dereference the objects when needed, so that most of the normalized properties can be used by the Disnix expressions that build and configure the services.

Aside from the fact that "referenizing" objects as a first step makes the normalization infrastructure IMO simpler, we can also expose more configuration properties to services -- in Disnix expressions, it is now also possible to refer to transitive inter-dependencies and the package builds of inter-dependencies. The latter is useful to ensure that if two services that have an inter-dependency on each other are deployed to the same machine, that the local process management system (e.g. sysvinit scripts, systemd, and many others) can also activate them in the right order after a reboot.

Not having references and dynamic binding is a general limitation of the Nix expression language, and coping with it is not a problem exclusive to Disnix. For example, the Nixpkgs collection originally declared one big top-level attribute set of function invocations, such as:


rec {
  stdenv = import ./stdenv {
    ...
  };

  zlib = import ./development/libraries/zlib {
    inherit stdenv;
  };

  file = import ./tools/misc/file {
    inherit stdenv zlib;
  };
}

The partial Nix expression shown above is a simplified representation of the top-level attribute set in Nixpkgs, containing three packages: stdenv is a standard environment providing basic build utilities, zlib is the zlib library used for compression and decompression, and file is the file tool, that can identify a file's type.

We can attempt to override the package set with a different version of zlib:


let
  pkgs = import ./top-level.nix;
in
pkgs // {
  zlib = import ./development/libraries/zlib-optimised {
    inherit (pkgs) stdenv;
  };
}

In the above expression we override the attribute zlib with an optimised variant. The outcome is comparable to what we have seen with the Disnix deployment architecture models -- the following command will build our optimized zlib package:


$ nix-build override.nix -A zlib

but building file (that has a dependency on zlib) still links against the old version (because it was statically bound to the original zlib declaration):


$ nix-build override.nix -A file

One way to cope with this limitation, and to bind dependencies dynamically, is to write the top-level expression as follows:


self:

{
  stdenv = import ./stdenv {
    ...
  };

  zlib = import ./development/libraries/zlib {
    inherit (self) stdenv;
  };

  file = import ./tools/misc/file {
    inherit (self) stdenv zlib;
  };
}

In the above expression, we have wrapped the attribute set that composes the packages in a function, introduced a parameter: self, and changed all the function arguments to bind to self.

To be able to build a package, we need another Nix expression that provides a representation of self:


let
  self = import ./composition1.nix self;
in
self

We can use the above expression to build any of the packages declared in the previous expression:


$ nix-build pkgs1.nix -A file

The construction used in the expression shown above was very mind blowing to me at first -- it is using a concept called fixed points. In this expression, I provide the variable that holds the end result as a parameter to the composition expression that declares a function. Laziness makes this work: the attribute values are only evaluated when they are needed, at which point self is fully defined. This expression (sort of) simulates the behaviour of a recursive attribute set (rec { ... }).

We can declare a second composition expression (using the same convention as the first) that defines our overrides:


self:

{
  zlib = import ./development/libraries/zlib-optimised {
    inherit (self) stdenv;
  };
}

We can "glue" both composition expressions together as follows, making sure that the composition expression shown above overrides the packages in the first composition expression:


let
  composition1 = import ./composition1.nix self;

  composition2 = import ./composition2.nix self;

  self = composition1 // composition2;
in
self

When we run the following command (pkgs2.nix refers to the Nix expression shown above):


$ nix-build pkgs2.nix -A file

The Nix package manager builds the file package using the optimized zlib library.

Nixpkgs goes much further than the simple dynamic binding strategy that I just described -- Nixpkgs provides a feature called overlays in which you can declare sets of packages as layers that use dynamic binding and can override packages when necessary. It also allows you to refer to all intermediate evaluation stages.
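
For example, the zlib replacement shown earlier could also be expressed as an overlay (a sketch; the relative paths are hypothetical):

import <nixpkgs> {
  overlays = [
    (self: super: {
      zlib = import ./development/libraries/zlib-optimised {
        inherit (self) stdenv;
      };
    })
  ];
}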

Furthermore, packages in the Nixpkgs set are also overridable -- you can take an existing function invocation and change their parameters, as well as the parameters to the derivation { } function invocation that builds a package.
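
For example, with these facilities, the zlib parameter of the file package can also be changed per invocation (a sketch, assuming an optimised derivation named zlibOptimised is in scope):

# Change a function parameter of the file package:
fileWithOptimisedZlib = file.override {
  zlib = zlibOptimised;
};

# Change an attribute passed to the derivation that builds the package:
fileWithChecks = file.overrideAttrs (oldAttrs: {
  doCheck = true;
});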

Since the ability to dynamically bind parameters is a highly useful feature, there have been quite a few discussions in the Nix community about the usefulness of the Nix expression language. The language is a domain-specific language for package/configuration management, but it lacks some of the features that we commonly use (such as dynamic binding). As a result, we heavily rely on custom developed abstractions.

In Disnix, I have decided to not adopt the dynamic binding strategy of Nixpkgs (and the example shown earlier). Instead, I am using a more restricted form of references -- the Disnix models can be considered lightweight embedded DSLs inside an external DSL (the Nix expression language).

The general dynamic binding strategy offers users a lot of freedom -- they can decide to not use the self parameter and implement custom tricks and abstractions, or by mistake incorrectly bind a parameter.

In Disnix, I do not want a target to refer to anything else but a target declared in the infrastructure model. The same thing applies to inter-dependencies -- they can only refer to services in the services model, targets in the infrastructure model, and containers deployed to a target machine.

Related work


I got quite a bit of inspiration from reading Russell O'Connor's blog post about creating fake dynamic bindings in Nix. In this blog post, he demonstrates how recursive attribute sets (rec { }) can be simulated with ordinary attribute sets and fixed points, and how this concept can expanded to simulate "virtual attribute sets" with "fake" dynamic bindings.

He also makes a proposal to extend the Nix expression language with a virtual attribute set (virtual { }) in which dynamic bindings become a first class concept.

On using Nix and Docker as deployment automation solutions: similarities and differences

As frequent readers of my blog may probably already know, I have been using Nix-related tools for quite some time to solve many of my deployment automation problems.

Although I have worked in environments in which Nix and its related sub projects are well-known, when I show some of Nix's use cases to larger groups of DevOps-minded people, a frequent answer that I have been hearing is that it looks very similar to Docker. People also often ask me what advantages Nix has over Docker.

So far, I have not even covered Docker once on my blog, despite its popularity, including very popular sister projects such as docker-compose and Kubernetes.

The main reason why I never wrote anything about Docker is not because I do not know about it or how to use it, but simply because I never had any notable use cases that would lead to something publishable -- most of the problems for which Docker could be a solution, I solved by other means, typically by using a Nix-based solution somewhere in the solution stack.

Docker is a container-based deployment solution. It was not the first (neither in the Linux world, nor in the UNIX world in general), but since its introduction in 2013 it has grown very rapidly in popularity. I believe its popularity can mainly be attributed to its ease of use and its extendable images ecosystem: Docker Hub.

In fact, Docker (and Kubernetes, a container orchestration solution that incorporates Docker) have become so popular, that they have set a new standard when it comes to organizing systems and automating deployment -- today, in many environments, I have the feeling that it is no longer the question what kind of deployment solution is best for a particular system and organization, but rather: "how do we get it into containers?".

The same thing applies to the "microservices paradigm" that should facilitate modular systems. If I compare the characteristics of microservices with the definition of a "software component" in Clemens Szyperski's Component Software book, then I would argue that they have more in common than they are different.

One of the reasons why I think microservices are considered to be a success (or at least considered moderately more successful by some over older concepts, such as web services and software components) is that they easily map to a container, that can be conveniently managed with Docker. For some people, a microservice and a Docker container are pretty much the same thing.

Modular software systems have all kinds of advantages, but their biggest disadvantage is that the deployment of a system becomes more complicated as the number of components and dependencies grows. With Docker containers this problem can be (somewhat) addressed in a convenient way.

In this blog post, I will provide my view on Nix and Docker -- I will elaborate about some of their key concepts, explain in what ways they are different and similar, and I will show some use-cases in which both solutions can be combined to achieve interesting results.

Application domains


Nix and Docker are both deployment solutions for slightly different, but also somewhat overlapping, application domains.

The Nix package manager (on the recently revised homepage) advertises itself as follows:

Nix is a powerful package manager for Linux and other Unix systems that makes package management reliable and reproducible. Share your development and build environments across different machines.

whereas Docker advertises itself as follows (in the getting started guide):

Docker is an open platform for developing, shipping, and running applications.

To summarize my interpretations of the descriptions:

  • Nix's chief responsibility is, as its description implies, package management: it provides a collection of software tools that automate the process of installing, upgrading, configuring, and removing computer programs for a computer's operating system in a consistent manner.

    There are two properties that set Nix apart from most other package management solutions. First, Nix is a source-based package manager -- it can be used as a tool to construct packages from source code and their dependencies, by invoking build scripts in "pure build environments".

    Second, it borrows concepts from purely functional programming languages to make deployments reproducible, reliable and efficient.
  • Docker's chief responsibility is much broader than package management -- Docker facilitates full process/service life-cycle management. Package management can be considered to be a sub problem of this domain, as I will explain later in this blog post.

Although both solutions map to slightly different domains, there is one prominent objective that both solutions have in common. They both facilitate reproducible deployment.

With Nix the goal is that if you build a package from source code and a set of dependencies and perform the same build with the same inputs on a different machine, their build results should be (nearly) bit-identical.

With Docker, the objective is to facilitate reproducible environments for running applications -- when running an application container on one machine that provides Docker, and running the same application container on another machine, they both should work in an identical way.

Although both solutions facilitate reproducible deployments, their reproducibility properties are based on different kinds of concepts. I will explain more about them in the next sections.

Nix concepts


As explained earlier, Nix is a source-based package manager that borrows concepts from purely functional programming languages. Packages are built from build recipes called Nix expressions, such as:


with import <nixpkgs> {};

stdenv.mkDerivation {
  name = "file-5.38";

  src = fetchurl {
    url = "ftp://ftp.astron.com/pub/file/file-5.38.tar.gz";
    sha256 = "0d7s376b4xqymnrsjxi3nsv3f5v89pzfspzml2pcajdk5by2yg2r";
  };

  buildInputs = [ zlib ];

  meta = {
    homepage = https://darwinsys.com/file;
    description = "A program that shows the type of files";
  };
}

The above Nix expression invokes the function: stdenv.mkDerivation that creates a build environment in which we build the package: file from source code:

  • The name parameter provides the package name.
  • The src parameter invokes the fetchurl function that specifies where to download the source tarball from.
  • buildInputs refers to the build-time dependencies that the package needs. The file package only uses one dependency: zlib that provides deflate compression support.

    The buildInputs parameter is used to automatically configure the build environment in such a way that zlib can be found as a library dependency by the build script.
  • The meta parameter specifies the package's meta data. Meta data is used by Nix to provide information about the package, but it is not used by the build script.

The Nix expression does not specify any build instructions -- if no build instructions are provided, the stdenv.mkDerivation function executes the standard GNU Autotools build procedure: ./configure; make; make install.

Nix combines several concepts to make builds more reliable and reproducible.

Foremost, packages managed by Nix are stored in a so-called Nix store (/nix/store) in which every package build resides in its own directory.

When we build the above Nix expression with the following command:


$ nix-build file.nix

then we may get the following Nix store path as output:


/nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38

Each entry in the Nix store has a SHA256 hash prefix (e.g. 6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g) that is derived from all build inputs used to build a package.

If we would build file, for example, with a different build script or a different version of zlib, then the resulting Nix store path will be different. As a result, we can safely store multiple versions and variants of the same package next to each other, because they will never share the same name.

Because each package resides in its own directory in the Nix store, rather than global directories that are commonly used on conventional Linux systems, such as /bin and /lib, we get stricter purity guarantees -- dependencies can typically not be found if they have not been specified in any of the search environment variables (e.g. PATH) or provided as build parameters.

In conventional Linux systems, package builds might still accidentally succeed if they unknowingly use an undeclared dependency. When deploying such a package to another system that does not have this undeclared dependency installed, the package might not work properly or not at all.

In simple single-user Nix installations, builds typically get executed in an environment in which most environment variables (including search path environment variables, such as PATH) are cleared or set to dummy values.

Build abstraction functions (such as stdenv.mkDerivation) will populate the search path environment variables (e.g. PATH, CLASSPATH, PYTHONPATH etc.) and configure build parameters to ensure that the dependencies in the Nix store can be found.

Builds are only allowed to write in the build directory or designated output folders in the Nix store.

When a build has completed successfully, its results are made immutable (by removing their write permission bits in the Nix store) and their timestamps are reset to 1 second after the epoch (to improve build determinism).

Storing packages in isolation and providing an environment with cleared environment variables is obviously not a guarantee that builds will be pure. For example, build scripts may still have hard-coded absolute paths to executables on the host system, such as /bin/install and a C compiler may still implicitly search for headers in /usr/include.

To alleviate the problem with hard-coded global directory references, some common build utilities, such as GCC, deployed by Nix have been patched to ignore global directories, such as /usr/include.

When using Nix in multi-user mode, extra precautions have been taken to ensure build purity:

  • Each build runs as an unprivileged user, that does not have any write access to any directory but its own build directory and the designated output Nix store paths.
  • On Linux, a build can optionally run in a chroot environment, that completely disables access to all global directories in the build process. In addition, the Nix store paths of all dependencies are bind mounted, so that the build process cannot access any undeclared dependencies in the Nix store (chances are slim that you will encounter such a build, but still...)
  • On Linux kernels that support namespaces, the Nix build environment will use them to improve build purity.

    The network namespace helps the Nix builder to prevent a build process from accessing the network -- when a build process downloads an undeclared dependency from a remote location, we cannot be sure that we get a predictable result.

    In Nix, only builds that are so-called fixed output derivations (whose output hashes need to be known in advance) are allowed to download files from remote locations, because their output results can be verified.

    (As a sidenote: namespaces are also intensively used by Docker containers, as I will explain in the next section.)
  • On macOS, builds can optionally be executed in an app sandbox, that can also be used to restrict access to various kinds of shared resources, such as network access.

Besides isolation, using hash code prefixes has another advantage. Because every build with the same hash code is (nearly) bit-identical, it also provides a nice optimization feature.

When we evaluate a Nix expression and the resulting hash code is identical to a valid Nix store path, then we do not have to build the package again -- because it is bit identical, we can simply return the Nix store path of the package that is already in the Nix store.
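
For example, when we run the build of the file package again, Nix simply returns the Nix store path that was built earlier, without doing any work:


$ nix-build file.nix
/nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38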

This property is also used by Nix to facilitate transparent binary package deployments. If we want to build a package with a certain hash prefix, and we know that another machine or binary cache already has this package in its Nix store, then we can download a binary substitute.
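
For example, the packages in the Nixpkgs collection are built by a build farm that populates the https://cache.nixos.org binary cache, which Nix consults by default. As a result, a build command such as the following will typically download a binary substitute instead of compiling anything:


$ nix-build '<nixpkgs>' -A file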

Another interesting benefit of using hash codes is that we can also identify the runtime dependencies that a package needs -- if a Nix store path contains references to other Nix store paths, then we know that these are runtime dependencies of the corresponding package.

Scanning for Nix store paths may sound scary, but there is a very slim chance that a hash code string represents something else. In practice, it works really well.

For example, the following shell command shows all the runtime dependencies of the file package:


$ nix-store -qR /nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38
/nix/store/y8n2b9nwjrgfx3kvi3vywvfib2cw5xa6-libunistring-0.9.10
/nix/store/fhg84pzckx2igmcsvg92x1wpvl1dmybf-libidn2-2.3.0
/nix/store/bqbg6hb2jsl3kvf6jgmgfdqy06fpjrrn-glibc-2.30
/nix/store/5x6l9xm5dp6v113dpfv673qvhwjyb7p5-zlib-1.2.11
/nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38

If we query the dependencies of another package that is built from the same Nix packages set, such as cpio:


$ nix-store -qR /nix/store/bzm0mszhvbr6hp4gmar4czsn52hz07q1-cpio-2.13
/nix/store/y8n2b9nwjrgfx3kvi3vywvfib2cw5xa6-libunistring-0.9.10
/nix/store/fhg84pzckx2igmcsvg92x1wpvl1dmybf-libidn2-2.3.0
/nix/store/bqbg6hb2jsl3kvf6jgmgfdqy06fpjrrn-glibc-2.30
/nix/store/bzm0mszhvbr6hp4gmar4czsn52hz07q1-cpio-2.13

When looking at the outputs above, you may probably notice that both file and cpio share the same kinds of dependencies (e.g. libidn2, libunistring and glibc), with the same hash code prefixes. Because they are the same Nix store paths, they are shared on disk (and in RAM, because the operating system caches the same files in memory) leading to more efficient disk and RAM usage.

The fact that we can detect references to Nix store paths is because packages in the Nix package repository use an unorthodox form of static linking.

For example, ELF executables built with Nix have the store paths of their library dependencies in their RPATH header values (the ld command in Nixpkgs has been wrapped to transparently append the paths of library dependencies to a binary's RPATH).
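
We can inspect these RPATH values ourselves with standard ELF tools, such as readelf from GNU binutils:


$ readelf -d /nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38/bin/file | grep PATH

The RPATH (or RUNPATH) entry in the output should refer to the Nix store paths of the library dependencies, such as zlib and glibc.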

Python programs (and other programs written in interpreted languages) typically use wrapper scripts that set the PYTHONPATH (or equivalent) environment variables to contain Nix store paths providing the dependencies.
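
As a rough sketch, such a wrapper script may look as follows (the store paths below are fictional and abbreviated for readability):


#!/nix/store/...-bash-4.4/bin/bash -e
# Make the Python dependencies in the Nix store visible to the interpreter
export PYTHONPATH=/nix/store/...-python-requests/lib/python3.7/site-packages
# Run the real (renamed) executable with the original arguments
exec /nix/store/...-myprogram/bin/.myprogram-wrapped "$@"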

Docker concepts


The Docker overview page states the following about what Docker can do:

When you use Docker, you are creating and using images, containers, networks, volumes, plugins, and other objects.

Although you can create many kinds of objects with Docker, the two most important objects are the following:

  • Images. The overview page states: "An image is a read-only template with instructions for creating a Docker container.".

    More accurately, this means that images are created from build recipes called Dockerfiles. They produce self-contained root file systems containing all necessary files to run a program, such as binaries, libraries, configuration files etc. The resulting image itself is immutable (read only) and cannot change after it has been built.
  • Containers. The overview gives the following description: "A container is a runnable instance of an image".

    More specifically, this means that the life-cycle (whether a container is in a started or stopped state) is bound to the life-cycle of a root process, that runs in a (somewhat) isolated environment using the content of a Docker image as its root file system.

Besides the object types explained above, there are many more kinds of objects, such as volumes (that can mount a directory from the host file system to a path in the container), and port forwardings from the host system to a container. For more information about these remaining objects, consult the Docker documentation.

Docker combines several concepts to facilitate reproducible and reliable container deployment. To be able to isolate containers from each other, it uses several kinds of Linux namespaces:

  • The mount namespace: this is, in my opinion, the most important namespace. After setting up a private mount namespace, every subsequent mount that we make will be visible in the container, but not to other containers/processes that are in a different mount namespace.

    A private mount namespace is used to mount a new root file system (the contents of the Docker image) with all essential system software and other artifacts to run an application, that is different from the host system's root file system.
  • The Process ID (PID) namespace facilitates process isolation. A process/container with a private PID namespace will not be able to see or control the host system's processes (the host system, however, can still see and control the container's processes).
  • The network namespace separates network interfaces from the host system. In a private network namespace, a container has one or more private network interfaces with their own IP addresses, port assignments and firewall settings.

    As a result, a service such as the Apache HTTP server in a Docker container can bind to port 80 without conflicting with another HTTP server that binds to the same port on the host system or in another container instance.
  • The Inter-Process Communication (IPC) namespace separates the ability for processes to communicate with each other via the SHM family of functions to establish a range of shared memory between the two processes.
  • The UTS namespace isolates kernel and version identifiers.

Another important concept that containers use is cgroups, which can be used to limit the amount of system resources that containers can use, such as the amount of RAM.

Finally, to optimize/reduce storage overhead, Docker uses layers and a union filesystem (there are a variety of file system options for this) to combine these layers by "stacking" them on top of each other.

A running container basically mounts an image's read-only layers on top of each other, and keeps the final layer writable so that processes in the container can create and modify files on the system.

Whenever you construct an image from a Dockerfile, each modification operation generates a new layer. Each layer is immutable (it will never change after it has been created) and is uniquely identifiable with a hash code, similar to Nix store paths.

For example, we can build an image with the following Dockerfile that deploys and runs the Apache HTTP server on a Debian Buster Linux distribution:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y apache2
ADD index.html /var/www/html
CMD ["apachectl", "-D", "FOREGROUND"]
EXPOSE 80/tcp

The above Dockerfile executes the following steps:

  • It takes the debian:buster image from Docker Hub as a base image.
  • It updates the Debian package database (apt-get update) and installs the Apache HTTPD server package from the Debian package repository.
  • It uploads an example page (index.html) to the document root folder.
  • It executes the: apachectl -D FOREGROUND command-line instruction to start the Apache HTTP server in foreground mode. The container's life-cycle is bound to the life-cycle of this foreground process.
  • It informs Docker that the container listens to TCP port: 80. Connecting to port 80 makes it possible for a user to retrieve the example index.html page.

With the following command we can build the image:


$ docker build . -t debian-apache

Resulting in the following layers:


$ docker history debian-apache:latest
IMAGE CREATED CREATED BY SIZE COMMENT
a72c04bd48d6 About an hour ago /bin/sh -c #(nop) EXPOSE 80/tcp 0B
325875da0f6d About an hour ago /bin/sh -c #(nop) CMD ["apachectl" "-D" "FO… 0B
35d9a1dca334 About an hour ago /bin/sh -c #(nop) ADD file:18aed37573327bee1… 129B
59ee7771f1bc About an hour ago /bin/sh -c apt-get install -y apache2 112MB
c355fe9a587f 2 hours ago /bin/sh -c apt-get update 17.4MB
ae8514941ea4 33 hours ago /bin/sh -c #(nop) CMD ["bash"] 0B
<missing> 33 hours ago /bin/sh -c #(nop) ADD file:89dfd7d3ed77fd5e0… 114MB

As may be observed, the base Debian Buster image and every change made in the Dockerfile results in a new layer with a new hash code, as shown in the IMAGE column.

Layers and Nix store paths share the similarity that they are immutable and they can both be identified with hash codes.

They are also different -- first, a Nix store path is the result of building a package or a static artifact, whereas a layer is the result of making a filesystem modification. Second, for a Nix store path, the hash code is derived from all inputs, whereas the hash code of a layer is derived from the output: its contents.

Furthermore, Nix store paths are always isolated because they reside in a unique directory (enforced by the hash prefixes), whereas a layer might have files that overlap with files in other layers. In Docker, when a conflict is encountered, the files in the layer that is stacked on top take precedence.

We can construct a second image using the same Debian Linux distribution image that runs Nginx with the following Dockerfile:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y nginx
ADD nginx.conf /etc
ADD index.html /var/www
CMD ["nginx", "-g", "daemon off;", "-c", "/etc/nginx.conf"]
EXPOSE 80/tcp

The above Dockerfile looks similar to the previous, except that we install the Nginx package from the Debian package repository and we use a different command-line instruction to start Nginx in foreground mode.

When building the image, its storage will be optimized -- both images share the same base layer (the Debian Buster Linux base distribution):


$ docker history debian-nginx:latest
IMAGE CREATED CREATED BY SIZE COMMENT
b7ae6f38ae77 2 hours ago /bin/sh -c #(nop) EXPOSE 80/tcp 0B
17027888ce23 2 hours ago /bin/sh -c #(nop) CMD ["nginx" "-g" "daemon… 0B
41a50a3fa73c 2 hours ago /bin/sh -c #(nop) ADD file:18aed37573327bee1… 129B
0f5b2fdcb207 2 hours ago /bin/sh -c #(nop) ADD file:f18afd18cfe2728b3… 189B
e49bbb46138b 2 hours ago /bin/sh -c apt-get install -y nginx 64.2MB
c355fe9a587f 2 hours ago /bin/sh -c apt-get update 17.4MB
ae8514941ea4 33 hours ago /bin/sh -c #(nop) CMD ["bash"] 0B
<missing> 33 hours ago /bin/sh -c #(nop) ADD file:89dfd7d3ed77fd5e0… 114MB

If you compare the above output with the previous docker history output, then you will notice that the bottom layer (last row) refers to the same layer using the same hash code behind the ADD file: statement in the CREATED BY column.

This ability to share the base distribution saves us from having to store another 114MB Debian Buster image, reducing both disk and RAM consumption.

Some common misconceptions


What I have noticed is that quite a few people compare containers to virtual machines (and even give containers that name, incorrectly suggesting that they are the same thing!).

A container is not a virtual machine, because it does not emulate or virtualize hardware -- virtual machines have a virtual CPU, virtual memory, virtual disk etc. that have similar capabilities and limitations as real hardware.

Furthermore, containers do not run a full operating system -- they run processes managed by the host system's Linux kernel. As a result, Docker containers will only deploy software that runs on Linux, and not software that was built for other operating systems.

(As a sidenote: Docker can also be used on Windows and macOS -- on these non-Linux platforms, a virtualized Linux system is used for hosting the containers, but the containers themselves are not separated by using virtualization).

Containers cannot even be considered "light weight virtual machines".

The means to isolate containers from each other only apply to a limited number of potentially shared resources. For example, a resource that cannot be unshared is the system's clock, although this may change in the near future, because in March 2020 a time namespace has been added to the newest Linux kernel version. I believe this namespace is not yet offered as a generally available feature in Docker.

Moreover, namespaces, which normally provide separation/isolation between containers, are objects and these objects can also be shared among multiple container instances (this is an uncommon use-case, because by default every container has its own private namespaces).

For example, it is also possible for two containers to share the same IPC namespace -- then processes in both containers will be able to communicate with each other with a shared-memory IPC mechanism, but they cannot do any IPC with processes on the host system or containers not sharing the same namespace.
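
For example, with the --ipc parameter, we can instruct Docker to let a second container join the IPC namespace of a container named: first:


$ docker run -d --name first debian:buster sleep infinity
$ docker run --ipc=container:first -it debian:buster /bin/sh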

Finally, certain system resources are not constrained by default unlike a virtual machine -- for example, a container is allowed to consume all the RAM of the host machine unless a RAM restriction has been configured. An unrestricted container could potentially affect the machine's stability as a whole and other containers running on the same machine.
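
For example, with the -m parameter, we can restrict the amount of RAM that the Nginx container (constructed earlier) is allowed to consume:


$ docker run -m 256m -p 8080:80 -it debian-nginx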

A comparison of use cases


As mentioned in the introduction, when I show people Nix, then I often get a remark that it looks very similar to Docker.

In this section, I will compare some of their common use cases.

Managing services


In addition to building a Docker image, I believe the most common use case for Docker is to manage services, such as custom REST API services (that are self-contained processes with an embedded web server), web servers or database management systems.

For example, after building an Nginx Docker image (as shown in the section about Docker concepts), we can also launch a container instance using the previously constructed image to serve our example HTML page:


$ docker run -p 8080:80 --name nginx-container -it debian-nginx

The above command creates a new container instance using our Nginx image as a root file system and then starts the container in interactive mode -- the command's execution will block and display the output of the Nginx process on the terminal.

If we want the container to run in the background instead, we can replace the -it parameters with -d.

The -p parameter configures a port forwarding from the host system to the container: traffic to the host system's port 8080 gets forwarded to port 80 in the container, where the Nginx server listens.

We should be able to see the example HTML page, by opening the following URL in a web browser:


http://localhost:8080

After stopping the container, its state will be retained. We can, for example, stop the container and start it again later:
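

$ docker stop nginx-container
$ docker start nginx-container

We can remove the container permanently, by running: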


$ docker rm nginx-container

The Nix package manager has no equivalent use case for managing running processes, because its purpose is package management and not process/service life-cycle management.

However, some projects based on Nix can fulfill this role. For example, NixOS -- a Linux distribution built around the Nix package manager that uses a single declarative configuration file to capture a machine's configuration -- generates systemd unit files to manage the life-cycles of services.

The Nix package manager can also be used on other operating systems, such as conventional Linux distributions, macOS and other UNIX-like systems. There is no universal solution that allows you to complement Nix with service management support on all platforms that Nix supports.

Experimenting with packages


Another common use case is using Docker to experiment with packages that should not remain permanently installed on a system.

One way of doing this is by directly pulling a Linux distribution image (such as Debian Buster):


$ docker pull debian:buster

and then starting a container in an interactive shell session, in which we install the packages that we want to experiment with:


$ docker run --name myexperiment -it debian:buster /bin/sh
# apt-get update
# apt-get install -y file
# file --version
file-5.22
magic file from /etc/magic:/usr/share/misc/magic

The above example suffices to experiment with the file package, but its deployment is not guaranteed to be reproducible.

For example, the result of running my apt-get instructions shown above is file version 5.22. If I would, for example, run the same instructions a week later, then I might get a different version (e.g. 5.23).

The Docker-way of making such a deployment scenario reproducible, is by installing the packages in a Dockerfile as part of the container's image construction process:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y file

we can build the container image with our file package, as follows:


$ docker build . -t file-experiment

and then deploy a container that uses that image:


$ docker run --name myexperiment -it file-experiment /bin/sh

As long as we deploy a container with the same image, we will always have the same version of the file executable:


$ docker run --name myexperiment -it file-experiment /bin/sh
# file --version
file-5.22
magic file from /etc/magic:/usr/share/misc/magic

With Nix, generating reproducible development environments with packages is a first-class feature.

For example, to launch a shell session providing the file package from the Nixpkgs collection, we can simply run:


$ nix-shell -p file
$ file --version
file-5.39
magic file from /nix/store/j4jj3slm15940mpmympb0z99a2ghg49q-file-5.39/share/misc/magic

As long as the Nix expression sources remain the same (e.g. the Nix channel is not updated, or NIX_PATH is hardwired to a certain Git revision of Nixpkgs), the deployment of the development environment is reproducible -- we should always get the same file package with the same Nix store path.
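
For example, with the -I parameter we can hardwire nixpkgs to a specific release tarball (assuming that the 20.03 release provides the file version that we want), so that everyone deploying this environment gets the same result:


$ nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/20.03.tar.gz -p file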

Building development projects/arbitrary packages


As shown in the section about Nix's concepts, one of Nix's key features is to generate build environments for building packages and other software projects. I have shown that with a simple Nix expression consisting of only a few lines of code, we can build the file package from source code and its build dependencies in such a dedicated build environment.

In Docker, only building images is a first-class concept. However, building arbitrary software projects and packages is also something you can do by using Docker containers in a specific way.

For example, we can create a bash script that builds the same example package (file) shown in the section that explains Nix's concepts:


#!/bin/bash -e

mkdir -p /build
cd /build

wget ftp://ftp.astron.com/pub/file/file-5.38.tar.gz

tar xfv file-5.38.tar.gz
cd file-5.38
./configure --prefix=/opt/file
make
make install

tar cfvz /out/file-5.38-binaries.tar.gz /opt/file

Compared to its Nix expression counterpart, the build script above does not use any abstractions -- as a consequence, we have to explicitly write all the steps that are required to build the package:

  • Create a dedicated build directory.
  • Download the source tarball from the FTP server.
  • Unpack the tarball.
  • Execute the standard GNU Autotools build procedure: ./configure; make; make install and install the binaries in an isolated folder (/opt/file).
  • Create a binary tarball from the /opt/file folder and store it in the /out directory (that is a volume shared between the container and the host system).

To create a container that runs the build script and to provide its dependencies in a reproducible way, we need to construct an image from the following Dockerfile:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y wget gcc make libz-dev
ADD ./build.sh /
CMD /build.sh

The above Dockerfile builds an image using the Debian Buster Linux distribution, installs all mandatory build utilities (wget, gcc, and make) and library dependencies (libz-dev), and executes the build script shown above.

With the following command, we can build the image:


$ docker build . -t buildenv

and with the following command, we can create and launch the container that executes the build script (and automatically discard it as soon as it finishes its task):


$ docker run -v $(pwd)/out:/out --rm -t buildenv

To make sure that we can keep our resulting binary tarball after the container gets discarded, we have created a shared volume that maps the out directory in our current working directory onto the /out directory in the container.

When the build script finishes, the output directory should contain our generated binary tarball:


$ ls out/
file-5.38-binaries.tar.gz

Although Nix and Docker can both provide reproducible environments for building packages (in the case of Docker, we need to make sure that all dependencies are provided by the Docker image), builds performed in a Docker container are not guaranteed to be pure, because Docker does not take the same precautions that Nix takes:

  • In the build script, we download the source tarball without checking its integrity. This might cause an impurity, because the tarball on the remote server could change (this could happen for non-malicious as well as malicious reasons).
  • While running the build, we have unrestricted network access. The build script might unknowingly download all kinds of undeclared/unknown dependencies from external sites whose results are not deterministic.
  • We do not reset any timestamps -- as a result, when performing the same build twice in a row, the second result might be slightly different because of the timestamps integrated in the build product.

Coping with these impurities in a Docker workflow is the responsibility of the build script implementer. With Nix, most of it is transparently handled for you.

Moreover, the build script implementer is also responsible for retrieving the build artifact and storing it somewhere, e.g. in a directory outside the container, or uploading it to a remote artifact repository.

In Nix, the result of a build process is automatically stored in isolation in the Nix store. We can also quite easily turn a Nix store into a binary cache and let other Nix consumers download from it, e.g. by installing nix-serve, Hydra: the Nix-based continuous integration service, cachix, or by manually generating a static binary cache.
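
As a sketch (assuming that a machine named: cacheserver is reachable over the network and that signature verification is disabled for brevity), we can serve a Nix store with nix-serve and let a client machine consume substitutes from it:


$ nix-serve -p 8080 # on the cache server

$ nix-build file.nix \
  --option substituters http://cacheserver:8080 \
  --option require-sigs false # on the client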

Beyond the ability to execute builds, Nix has another great advantage for building packages from source code. On Linux systems, the Nixpkgs collection is entirely bootstrapped, except for the bootstrap binaries -- this provides us almost full traceability of all dependencies and transitive dependencies used at build-time.

With Docker you typically do not have such insights -- images get constructed from binaries obtained from arbitrary locations (e.g. binary packages that originate from Linux distributions' package repositories). As a result, it is impossible to get any insights on how these package dependencies were constructed from source code.

For most people, knowing exactly from which sources a package has been built is not considered important, but it can still be useful for more specialized use cases. For example, to determine if your system is constructed from trustable/audited sources and whether you did not violate a license of a third-party library.

Combined use cases


As explained earlier in this blog post, Nix and Docker are deployment solutions for slightly different application domains.

There are quite a few solutions developed by the Nix community that can combine Nix and Docker in interesting ways.

In this section, I will show some of them.

Experimenting with the Nix package manager in a Docker container


Since Docker is such a common solution to provide environments in which users can experiment with packages, the Nix community also provides a Nix Docker image, that allows you to conveniently experiment with the Nix package manager in a Docker container.

We can pull this image as follows:


$ docker pull nixos/nix

Then launch a container interactively:


$ docker run -it nixos/nix

And finally, pull the package specifications from the Nix channel and install any Nix package that we want in the container:


$ nix-channel --add https://nixos.org/channels/nixpkgs-unstable nixpkgs
$ nix-channel --update
$ nix-env -f '<nixpkgs>' -iA file
$ file --version
file-5.39
magic file from /nix/store/bx9l7vrcb9izgjgwkjwvryxsdqdd5zba-file-5.39/share/misc/magic

Using the Nix package manager to deliver the required packages to construct an image


In the examples that construct Docker images for Nginx and the Apache HTTP server, I use the Debian Buster Linux distribution as base images in which I add the required packages to run the services from the Debian package repository.

This is a common practice to construct Docker images -- as I have already explained in the section that covers Docker's concepts, package management is a subproblem of the process/service life-cycle management problem, but Docker leaves solving this problem to the Linux distribution's package manager.

Instead of using conventional Linux distributions and their package management solutions, such as Debian, Ubuntu (using apt-get), Fedora (using yum) or Alpine Linux (using apk), it is also possible to use Nix.

The following Dockerfile can be used to create an image that uses Nginx deployed by the Nix package manager:


FROM nixos/nix

RUN nix-channel --add https://nixos.org/channels/nixpkgs-unstable nixpkgs
RUN nix-channel --update
RUN nix-env -f '<nixpkgs>' -iA nginx

RUN mkdir -p /var/log/nginx /var/cache/nginx /var/www
ADD nginx.conf /etc
ADD index.html /var/www

CMD ["nginx", "-g", "daemon off;", "-c", "/etc/nginx.conf"]
EXPOSE 80/tcp
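
We can build and run this image in the same way as the Debian-based examples shown earlier (the tag: nginx-via-nix is arbitrarily chosen):


$ docker build . -t nginx-via-nix
$ docker run -p 8080:80 -it nginx-via-nix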

Using Nix to build Docker images


Earlier, I have shown that the Nix package manager can also be used in a Dockerfile to obtain all required packages to run a service.

In addition to building software packages, Nix can also build all kinds of static artifacts, such as disk images, DVD ROM ISO images, and virtual machine configurations.

The Nixpkgs repository also contains an abstraction function to build Docker images that does not require any Docker utilities.

For example, with the following Nix expression, we can build a Docker image that deploys Nginx:


with import <nixpkgs> {};

dockerTools.buildImage {
  name = "nginxexp";
  tag = "test";

  contents = nginx;

  runAsRoot = ''
    ${dockerTools.shadowSetup}
    groupadd -r nogroup
    useradd -r nobody -g nogroup -d /dev/null
    mkdir -p /var/log/nginx /var/cache/nginx /var/www
    cp ${./index.html} /var/www/index.html
  '';

  config = {
    Cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
    Expose = {
      "80/tcp" = {};
    };
  };
}

The above expression propagates the following parameters to the dockerTools.buildImage function:

  • The name of the image is: nginxexp using the tag: test.
  • The contents parameter specifies all Nix packages that should be installed in the Docker image.
  • The runAsRoot parameter refers to a script that runs as the root user in a QEMU virtual machine. This virtual machine is used to provide the dynamic parts of a Docker image, setting up user accounts and configuring the state of the Nginx service.
  • The config parameter specifies image configuration properties, such as the command to execute and which TCP ports should be exposed.

Running the following command:


$ nix-build
/nix/store/qx9cpvdxj78d98rwfk6a5z2qsmqvgzvk-docker-image-nginxexp.tar.gz

Produces a compressed tarball that contains all files belonging to the Docker image. We can load the image into Docker with the following command:


$ docker load -i \
/nix/store/qx9cpvdxj78d98rwfk6a5z2qsmqvgzvk-docker-image-nginxexp.tar.gz

and then launch a container instance that uses the Nix-generated image:


$ docker run -p 8080:80/tcp -it nginxexp:test

When we look at the Docker images overview:


$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
nginxexp test cde8298f025f 50 years ago 61MB

There are two properties that stand out when you compare the Nix generated Docker image to conventional Docker images:

  • The first odd property is that the overview says that the image was created 50 years ago. This is explainable: to make Nix builds pure and deterministic, time stamps are typically reset to 1 second after the epoch (January 1st 1970), to ensure that we always get the same bit-identical build result.
  • The second property is the size of the image: 61MB is considerably smaller than our Debian-based Docker image.

    To give you a comparison: the docker history command-line invocation (shown earlier in this blog post) that displays the layers of which the Debian-based Nginx image consists, shows that the base Linux distribution image consumes 114 MB, the update layer 17.4 MB and the layer that provides the Nginx package is 64.2 MB.

The reason why Nix-generated images are so small is because Nix exactly knows all runtime dependencies required to run Nginx. As a result, we can restrict the image to only contain Nginx and its required runtime dependencies, leaving all unnecessary software out.
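
We can verify this ourselves by querying the runtime closure of the Nginx package, in the same way as we have inspected the file package earlier:


$ nix-store -qR $(nix-build '<nixpkgs>' -A nginx)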

The Debian-based Nginx container is much bigger, because it also contains a base Debian Linux system with all kinds of command-line utilities and libraries, that are not required to run Nginx.

The same limitation also applies to the Nix Docker image shown in the previous sections -- the Nix Docker image was constructed from an Alpine Linux image and contains a small, but fully functional Linux distribution. As a result, it is bigger than the Docker image directly generated from a Nix expression.

Although a Nix-generated Docker image is smaller than most conventional images, one of its disadvantages is that the image consists of only one single layer -- as we have seen in the section about Nix concepts, many services typically share the same runtime dependencies (such as glibc). Because these common dependencies are not in a reusable layer, they cannot be shared.

To optimize reuse, it is also possible to build layered Docker images with Nix:


with import <nixpkgs> {};

dockerTools.buildLayeredImage {
  name = "nginxexp";
  tag = "test";

  contents = nginx;

  maxLayers = 100;

  extraCommands = ''
    mkdir -p var/log/nginx var/cache/nginx var/www
    cp ${./index.html} var/www/index.html
  '';

  config = {
    Cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
    Expose = {
      "80/tcp" = {};
    };
  };
}

The above Nix expression is similar to the previous one, but uses dockerTools.buildLayeredImage to construct a layered image.

We can build and load the image as follows:


$ docker load -i $(nix-build layered.nix)

When we retrieve the history of the image, then we will see the following:


$ docker history nginxexp:test
IMAGE CREATED CREATED BY SIZE COMMENT
b91799a04b99 50 years ago 1.47kB store paths: ['/nix/store/snxpdsksd4wxcn3niiyck0fry3wzri96-nginxexp-customisation-layer']
<missing> 50 years ago 200B store paths: ['/nix/store/6npz42nl2hhsrs98bq45aqkqsndpwvp1-nginx-root.conf']
<missing> 50 years ago 1.79MB store paths: ['/nix/store/qsq6ni4lxd8i4g9g4dvh3y7v1f43fqsp-nginx-1.18.0']
<missing> 50 years ago 71.3kB store paths: ['/nix/store/n14bjnksgk2phl8n69m4yabmds7f0jj2-source']
<missing> 50 years ago 166kB store paths: ['/nix/store/jsqrk045m09i136mgcfjfai8i05nq14c-source']
<missing> 50 years ago 1.3MB store paths: ['/nix/store/4w2zbpv9ihl36kbpp6w5d1x33gp5ivfh-source']
<missing> 50 years ago 492kB store paths: ['/nix/store/kdrdxhswaqm4dgdqs1vs2l4b4md7djma-pcre-8.44']
<missing> 50 years ago 4.17MB store paths: ['/nix/store/6glpgx3pypxzb09wxdqyagv33rrj03qp-openssl-1.1.1g']
<missing> 50 years ago 385kB store paths: ['/nix/store/7n56vmgraagsl55aarx4qbigdmcvx345-libxslt-1.1.34']
<missing> 50 years ago 324kB store paths: ['/nix/store/1f8z1lc748w8clv1523lma4w31klrdpc-geoip-1.6.12']
<missing> 50 years ago 429kB store paths: ['/nix/store/wnrjhy16qzbhn2qdxqd6yrp76yghhkrg-gd-2.3.0']
<missing> 50 years ago 1.22MB store paths: ['/nix/store/hqd0i3nyb0717kqcm1v80x54ipkp4bv6-libwebp-1.0.3']
<missing> 50 years ago 327kB store paths: ['/nix/store/79nj0nblmb44v15kymha0489sw1l7fa0-fontconfig-2.12.6-lib']
<missing> 50 years ago 1.7MB store paths: ['/nix/store/6m9isbbvj78pjngmh0q5qr5cy5y1kzyw-libxml2-2.9.10']
<missing> 50 years ago 580kB store paths: ['/nix/store/2xmw4nxgfximk8v1rkw74490rfzz2gjp-libtiff-4.1.0']
<missing> 50 years ago 404kB store paths: ['/nix/store/vbxifzrl7i5nvh3h505kyw325da9k47n-giflib-5.2.1']
<missing> 50 years ago 79.8kB store paths: ['/nix/store/jc5bd71qcjshdjgzx9xdfrnc9hsi2qc3-fontconfig-2.12.6']
<missing> 50 years ago 236kB store paths: ['/nix/store/9q5gjvrabnr74vinmjzkkljbpxi8zk5j-expat-2.2.8']
<missing> 50 years ago 482kB store paths: ['/nix/store/0d6vl8gzwqc3bdkgj5qmmn8v67611znm-xz-5.2.5']
<missing> 50 years ago 6.28MB store paths: ['/nix/store/rmn2n2sycqviyccnhg85zangw1qpidx0-gcc-9.3.0-lib']
<missing> 50 years ago 1.98MB store paths: ['/nix/store/fnhsqz8a120qwgyyaiczv3lq4bjim780-freetype-2.10.2']
<missing> 50 years ago 757kB store paths: ['/nix/store/9ifada2prgfg7zm5ba0as6404rz6zy9w-dejavu-fonts-minimal-2.37']
<missing> 50 years ago 1.51MB store paths: ['/nix/store/yj40ch9rhkqwyjn920imxm1zcrvazsn3-libjpeg-turbo-2.0.4']
<missing> 50 years ago 79.8kB store paths: ['/nix/store/1lxskkhsfimhpg4fd7zqnynsmplvwqxz-bzip2-1.0.6.0.1']
<missing> 50 years ago 255kB store paths: ['/nix/store/adldw22awj7n65688smv19mdwvi1crsl-libpng-apng-1.6.37']
<missing> 50 years ago 123kB store paths: ['/nix/store/5x6l9xm5dp6v113dpfv673qvhwjyb7p5-zlib-1.2.11']
<missing> 50 years ago 30.9MB store paths: ['/nix/store/bqbg6hb2jsl3kvf6jgmgfdqy06fpjrrn-glibc-2.30']
<missing> 50 years ago 209kB store paths: ['/nix/store/fhg84pzckx2igmcsvg92x1wpvl1dmybf-libidn2-2.3.0']
<missing> 50 years ago 1.63MB store paths: ['/nix/store/y8n2b9nwjrgfx3kvi3vywvfib2cw5xa6-libunistring-0.9.10']

As you may notice, all Nix store paths are in their own layers. If we would also build a layered Docker image for the Apache HTTP service, we end up using less disk space (because common dependencies such as glibc can be reused), and less RAM (because these common dependencies can be shared in RAM).

Mapping Nix store paths onto layers obviously has limitations -- there is a maximum number of layers that Docker can use (in the Nix expression, I have imposed a limit of 100 layers, recent versions of Docker support a somewhat higher number).

Complex systems packaged with Nix typically have many more dependencies than the number of layers that Docker can mount. To cope with this limitation, the dockerTools.buildLayeredImage abstraction function tries to merge infrequently used dependencies into a shared layer. More information about this process can be found in Graham Christensen's blog post.

Besides the use cases shown in the examples above, there is much more you can do with the dockerTools functions in Nixpkgs -- you can also pull images from Docker Hub (with the dockerTools.pullImage function) and use the dockerTools.buildImage function to use existing Docker images as a basis to create hybrids combining conventional Linux software with Nix packages.

Conclusion


In this blog post, I have elaborated about using Nix and Docker as deployment solutions.

What they both have in common is that they facilitate reliable and reproducible deployment.

They can be used for a variety of use cases in two different domains (package management and process/service management). Some of these use cases are common to both Nix and Docker.

Nix and Docker can also be combined in several interesting ways -- Nix can be used as a package manager to deliver package dependencies in the construction process of an image, and Nix can also be used directly to build images, as a replacement for Dockerfiles.

This table summarizes the conceptual differences between Nix and Docker covered in this blog post:

                              Nix                                Docker
Application domain            Package management                 Process/service management
Storage units                 Package build results              File system changes
Storage model                 Isolated Nix store paths           Layers + union file system
Component addressing          Hashes computed from inputs        Hashes computed from a layer's contents
Service/process management    Unsupported                        First-class feature
Package management            First-class support                Delegated to a distro's package manager
Development environments      nix-shell                          Create image with dependencies + run shell session in container
Build management (images)     dockerTools.buildImage {},         Dockerfile
                              dockerTools.buildLayeredImage {}
Build management (packages)   First-class function support       Implementer's responsibility, can be simulated
Build environment purity      Many precautions taken             Only images provide some reproducibility, implementer's responsibility
Full source traceability      Yes (on Linux)                     No
OS support                    Many UNIX-like systems             Linux (real system or virtualized)

I believe the last item in the table deserves a bit of clarification -- Nix works on other operating systems than Linux, e.g. macOS, and can also deploy binaries for those platforms.

Docker can be used on Windows and macOS, but it still deploys Linux software -- on Windows and macOS containers are deployed to a virtualized Linux environment. Docker containers can only work on Linux, because they heavily rely on Linux-specific concepts: namespaces and cgroups.

Aside from the functional parts, Nix and Docker also have some fundamental non-functional differences. One of them is usability.

Although I am a long-time Nix user (since 2007), I can see why Docker is very popular: it is well-known and provides quite an optimized user experience. It does not deviate much from the way traditional Linux systems are managed -- this probably explains why so many users incorrectly call containers "virtual machines", because they manifest themselves as units that provide almost fully functional Linux distributions.

From my own experiences, it is typically more challenging to convince a new audience to adopt Nix -- getting an audience used to the fact that a package build can be modeled as a pure function invocation (in which the function parameters are a package's build inputs) and that a specialized Nix store is used to store all static artifacts, is sometimes difficult.

Both Nix and Docker support reuse: the former by means of using identical Nix store paths and the latter by using identical layers. For both solutions, these objects can be identified with hash codes.

In practice, reuse with Docker is not always optimal -- for frequently used services, such as Nginx and the Apache HTTP server, it is not a common practice to manually derive these images from a Linux distribution base image.

Instead, most Docker users will obtain specialized Nginx and Apache HTTP images. The official Docker Nginx images are constructed from Debian Buster and Alpine Linux, whereas the official Apache HTTP images only support Alpine Linux. Sharing common dependencies between these two images will only be possible if we use the Alpine Linux-based images.

In practice, it happens quite frequently that people run images constructed from all kinds of different base images, making it very difficult to share common dependencies.

Another impractical aspect of Nix is that it works conveniently for software compiled from source code, but packaging and deploying pre-built binaries is typically a challenge -- ELF binaries typically do not work out of the box and need to be patched, or deployed to an FHS user environment in which dependencies can be found in their "usual" locations (e.g. /bin, /lib etc.).

Related work


In this blog post, I have restricted my analysis to Nix and Docker. Both tools are useful on their own, but they are also the foundations of entire solution eco-systems. I did not elaborate much about solutions in these extended eco-systems.

For example, Nix does not do any process/service management, but there are Nix-related projects that can address this concern. Most notably, NixOS -- a Linux distribution fully managed by Nix -- uses systemd to manage services.

For Nix users on macOS, there is a project called nix-darwin that integrates with launchd, which is the default service manager on macOS.

There also used to be an interesting cross-over project between Nix and Docker (called nix-docker) combining Nix's package management capabilities with Docker's isolation capabilities and supervisord's ability to manage multiple services in a container -- it takes a configuration file (that looks similar to a NixOS configuration) defining a set of services, fully generates a supervisord configuration (with all required services and dependencies) and deploys them to a container. Unfortunately, the project is no longer maintained.

Nixery is a Docker-compatible container registry that is capable of transparently building and serving container images using Nix.
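
For example (if I understand Nixery's conventions correctly), the image name simply encodes the Nix packages that the container should provide:


$ docker run -ti nixery.dev/shell/file bash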

Docker is also an interesting foundation for an entire eco-system of solutions. Most notably Kubernetes, a container-orchestrating system that works with a variety of container tools including Docker. docker-compose makes it possible to manage collections of Docker containers and dependencies between containers.

There are also many solutions available to make building development projects with Docker (and other container technologies) more convenient than my file package build example. Gitlab CI, for example, provides first-class Docker integration. Tekton is a Kubernetes-based framework that can be used to build CI/CD systems.

There are also quite a few Nix cross-over projects that integrate with the extended containers eco-system, such as Kubernetes and docker-compose. For example, arion can generate docker-compose configuration files with specialized containers from NixOS modules. KuberNix can be used to bootstrap a Kubernetes cluster with the Nix package manager, and Kubenix can be used to build Kubernetes resources with Nix.

As explained in my comparisons, package management is not something that Docker supports as a first-class feature, but Docker has been an inspiration for package management solutions as well.

Most notably, several years ago I did a comparison between Nix and Ubuntu's Snappy package manager. The latter deploys every package (and all its required dependencies) as a container.

In this comparison blog post, I raised a number of concerns about reuse. Snappy does not have any means to share common dependencies between packages, and as a result, Snaps can be quite disk space and memory consuming.

Flatpak can be considered an alternative and more open solution to Snappy.

I still do not understand why these Docker-inspired package management solutions have not used Nix (e.g. storing packages in isolated folders) or Docker (e.g. using layers) as an inspiration to optimize reuse and simplify the construction of packages.

Future work


In the next blog post, I will elaborate more about integrating the Nix package manager with tools that can address the process/service management concern.


Experimenting with Nix and the service management properties of Docker

In the previous blog post, I have analyzed Nix and Docker as deployment solutions and described in what ways these solutions are similar and different.

To summarize my findings:

  • Nix is a source-based package manager responsible for obtaining, installing, configuring and upgrading packages in a reliable and reproducible manner and facilitating the construction of packages from source code and their dependencies.
  • Docker's purpose is to fully manage the life-cycle of applications (services and ordinary processes) in a reliable and reproducible manner, including their deployments.

As explained in my previous blog post, two prominent goals both solutions have in common is to facilitate reliable and reproducible deployment. They both use different kinds of techniques to accomplish these goals.

Although Nix and Docker can be used for a variety of comparable use cases (such as constructing images, deploying test environments, and constructing packages from source code), one prominent feature that the Nix package manager does not provide is process (or service) management.

In a Nix-based workflow you need to augment Nix with another solution that can facilitate process management.

In this blog post, I will investigate how Docker could fulfill this role -- it is pretty much the opposite of the combined use case scenarios I have shown in the previous blog post, in which Nix takes over the role of a conventional package manager in supplying packages for the construction of an image, or even drives the complete construction process of images.

Existing Nix integrations with process management


Although Nix does not do any process management, there are sister projects that can, such as:

  • NixOS builds entire machine configurations from a single declarative deployment specification and uses the Nix package manager to deploy and isolate all static artifacts of a system. It will also automatically generate and deploy systemd units for services defined in a NixOS configuration.
  • nix-darwin can be used to specify a collection of services in a deployment specification and uses the Nix package manager to deploy all services and their corresponding launchd configuration files.

Although both projects do a great job (e.g. they both provide a big collection of deployable services), what I consider a disadvantage is that they are platform specific -- both solutions only work on a single operating system (Linux and macOS) and a single process management solution (systemd and launchd).

If you are using Nix in a different environment, such as a different operating system, a conventional (non-NixOS) Linux distribution, or a different process manager, then there is no off-the-shelf solution that will help you manage services for packages provided by Nix.

Docker functionality


Docker could be considered a multi-functional solution for application management. I can categorize its functionality as follows:

  • Process management. The life-cycle of a container is bound to the life-cycle of a root process that needs to be started or stopped.
  • Dependency management. To ensure that applications have all the dependencies that they need and that no dependency is missing, Docker uses images containing a complete root filesystem with all required files to run an application.
  • Resource isolation is heavily used for a variety of different reasons:
    • Foremost, to ensure that the root filesystem of the container does not conflict with the host system's root filesystem.
    • It is also used to prevent conflicts with other kinds of resources. For example, the isolated network interfaces allow services to bind to the same TCP ports that may also be in use by the host system or other containers.
    • It offers some degree of protection. For example, a malicious process will not be able to see or control a process belonging to the host system or a different container.
  • Resource restriction can be used to limit the amount of system resources that a process can consume, such as the amount of RAM.

    Resource restriction can be useful for a variety of reasons, for example, to prevent a service from eating up all the system's resources affecting the stability of the system as a whole.
  • Integrations with the host system (e.g. volumes) and other services.

As described in the previous blog post, Docker uses a number of key concepts to implement the functionality shown above, such as layers, namespaces and cgroups.

Developing a Nix-based process management solution


For quite some time, I have been investigating the process management domain and worked on a prototype solution to provide a more generalized infrastructure that complements Nix with process management -- I came up with an experimental Nix-based process manager-agnostic framework that has the following objectives:

  • It uses Nix to deploy all required packages and other static artifacts (such as configuration files) that a service needs.
  • It integrates with a variety of process managers on a variety of operating systems. So far, it can work with: sysvinit scripts, BSD rc scripts, supervisord, systemd, cygrunsrv and launchd.

    In addition to process managers, it can also automatically convert a processes model to deployment specifications that Disnix can consume.
  • It uses declarative specifications to define functions that construct managed processes and process instances.

    Processes can be declared in a process-manager specific and process-manager agnostic way. The latter makes it possible to target all six supported process managers with the same declarative specification, albeit with a limited set of features.
  • It allows you to run multiple instances of processes, by introducing a convention to cope with potential resource conflicts between process instances -- instance properties and potential conflicts can be configured with function parameters and can be changed in such a way that they do not conflict.
  • It can facilitate unprivileged user deployments by using Nix's ability to perform unprivileged package deployments and introducing a convention that allows you to disable user switching.

To summarize how the solution works from a user point of view, we can write a process manager-agnostic constructor function as follows:


{createManagedProcess, tmpDir}:
{port, instanceSuffix ? "", instanceName ? "webapp${instanceSuffix}"}:

let
  webapp = import ../../webapp;
in
createManagedProcess {
  name = instanceName;
  description = "Simple web application";
  inherit instanceName;

  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
    PID_FILE = "${tmpDir}/${instanceName}.pid";
  };
  user = instanceName;
  credentials = {
    groups = {
      "${instanceName}" = {};
    };
    users = {
      "${instanceName}" = {
        group = instanceName;
        description = "Webapp";
      };
    };
  };

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}

The Nix expression above is a nested function that defines in a process manager-agnostic way a configuration for a web application process containing an embedded web server serving a static HTML page.

  • The outer function header (first line) refers to parameters that are common to all process instances: createManagedProcess is a function that can construct process manager configurations and tmpDir refers to the directory in which temp files are stored (which is /tmp in conventional Linux installations).
  • The inner function header (second line) refers to instance parameters -- when it is desired to construct multiple instances of this process, we must make sure that we have configured these parameters in such a way that they do not conflict with other processes.

    For example, when we assign a unique TCP port and a unique instance name (a property used by the daemon tool to create unique PID files) we can safely have multiple instances of this service co-existing on the same system.
  • In the body, we invoke the createManagedProcess function to generate configurations files for a process manager.
  • The process parameter specifies the executable that we need to run to start the process.
  • The daemonArgs parameter specifies command-line instructions passed to the process executable, when the process should daemonize itself (the -D parameter instructs the webapp process to daemonize).
  • The environment parameter specifies all environment variables. Environment variables are used as a generic configuration facility for the service.
  • The user parameter specifies the name the process should run as (each process instance has its own user and group with the same name as the instance).
  • The credentials parameter is used to automatically create the group and user that the process needs.
  • The overrides parameter makes it possible to override the parameters generated by the createManagedProcess function with process manager-specific overrides, to configure features that are not universally supported.

    In the example above, we use an override to configure the runlevels in which the service should run (runlevels 3-5 are typically used to boot a system that is network capable). Runlevels are a sysvinit-specific concept.

In addition to defining constructor functions (that allow us to construct zero or more process instances), we also need to define the actual process instances. These can be declared in a processes model:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };
  };

  nginxReverseProxy = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};
  };
}

The above Nix expression defines two process instances and uses the following conventions:

  • The first line is a function header in which the function parameters correspond to adjustable properties that apply to all process instances:
    • stateDir allows you to globally override the base directory in which all state is stored (the default value is: /var).
    • We can also change the location of each individual state directory (tmpDir, cacheDir, logDir, runtimeDir etc.) if desired.
    • forceDisableUserChange can be enabled to prevent the process manager from changing user permissions and creating users and groups. This is useful to facilitate unprivileged user deployments in which the user typically has no rights to change user permissions.
    • The processManager parameter allows you to pick a process manager. All process configurations will be automatically generated for the selected process manager.

      For example, if we pick systemd, then all configurations get translated to systemd units, whereas supervisord causes all configurations to be translated to supervisord configuration files.
  • To get access to constructor functions, we import a constructors expression that composes all constructor functions by calling them with their common parameters (not shown in this blog post).

    The constructors expression also contains a reference to the Nix expression that deploys the webapp service, shown in our previous example. A sketch of what a constructors expression may look like is shown after this list.
  • The processes model defines two processes: a webapp instance that listens to TCP port 5000 and Nginx that acts as a reverse proxy forwarding requests to webapp process instances based on the virtual host name.
  • webapp is declared as a dependency of the nginxReverseProxy service (by passing webapp as a parameter to the constructor function of Nginx). This causes webapp to be activated before the nginxReverseProxy.
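
The following sketch gives an impression of what such a constructors expression may look like (hypothetical and heavily simplified; the real composition expression passes more parameters and composes many more constructors):


{ pkgs, stateDir, runtimeDir, logDir, tmpDir
, forceDisableUserChange, processManager
}:

let
  # Hypothetical import: createManagedProcess is provided by the framework
  # and generates configuration artifacts for the selected processManager.
  createManagedProcess = import ../../nixproc/create-managed-process {
    inherit pkgs runtimeDir logDir tmpDir forceDisableUserChange processManager;
  };
in
{
  # Compose the webapp constructor shown earlier with its common parameters
  webapp = import ./webapp.nix {
    inherit createManagedProcess tmpDir;
  };
}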

To deploy all process instances with a process manager, we can invoke a variety of tools that are bundled with the experimental Nix process management framework.

The processes model can be deployed as sysvinit scripts for an unprivileged user, with the following command:


$ nixproc-sysvinit-switch --state-dir /home/sander/var \
--force-disable-user-change processes.nix

The above command automatically generates sysvinit scripts, changes the base directory of all state folders to a directory in the user's home directory: /home/sander/var, and disables user changing (and creation) so that an unprivileged user can run it.

The following command uses systemd as a process manager with the default parameters, for production deployments:


$ nixproc-systemd-switch processes.nix

The above command automatically generates systemd unit files and invokes systemd to deploy the processes.

In addition to the examples shown above, the framework contains many more tools, such as: nixproc-supervisord-switch, nixproc-launchd-switch, nixproc-bsdrc-switch, nixproc-cygrunsrv-switch, and nixproc-disnix-switch that all work with the same processes model.

Integrating Docker into the process management framework


Both Docker and the Nix-based process management framework are multi-functional solutions. After comparing the functionality of Docker and the process management framework, I realized that it is possible to integrate Docker into this framework as well, if I use it in an unconventional way, by disabling or substituting some of its conflicting features.

Using a shared Nix store


As explained in the beginning of this blog post, Docker's primary means to provide dependencies is by using images that are self-contained root file systems containing all necessary files (e.g. packages, configuration files) to allow an application to work.

In the previous blog post, I have also demonstrated that instead of using traditional Dockerfiles to construct images, we can also use the Nix package manager as a replacement. A Docker image built by Nix is typically smaller than a conventional Docker image built from a base Linux distribution, because it only contains the runtime dependencies that an application actually needs.

A major disadvantage of using Nix constructed Docker images is that they only consist of one layer -- as a result, there is no reuse between container instances running different services that use common libraries. To alleviate this problem, Nix can also build layered images, in which common dependencies are isolated in separate layers as much as possible.
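
For example, a layered Nix-built Docker image could be constructed as follows (a minimal sketch using the dockerTools.buildLayeredImage function in Nixpkgs, which accepts a configuration similar to that of dockerTools.buildImage):


with import <nixpkgs> {};

dockerTools.buildLayeredImage {
  name = "nginxexp";
  tag = "test";
  # Common dependencies of the contents end up in separate, reusable layers
  contents = [ nginx ];

  config = {
    Cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" ];
  };
}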

There is even a more optimal reuse strategy possible -- when running Docker on a machine that also has Nix installed, we do not need to put anything that is in the Nix store in a disk image. Instead, we can share the host system's Nix store between Docker containers.

This may sound scary, but as I have explained in the previous blog post, paths in the Nix store are prefixed with SHA256 hash codes. When two Nix store paths with identical hash codes are built on two different machines, their build results should be (nearly) bit-identical. As a result, it is safe to share the same Nix store path between multiple machines and containers.

A hacky solution to build a container image, without actually putting any of the Nix built packages in the container, can be done with the following expression:


with import <nixpkgs> {};

let
  cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
in
dockerTools.buildImage {
  name = "nginxexp";
  tag = "test";

  runAsRoot = ''
    ${dockerTools.shadowSetup}
    groupadd -r nogroup
    useradd -r nobody -g nogroup -d /dev/null
    mkdir -p /var/log/nginx /var/cache/nginx /var/www
    cp ${./index.html} /var/www/index.html
  '';

  config = {
    Cmd = map (arg: builtins.unsafeDiscardStringContext arg) cmd;
    Expose = {
      "80/tcp" = {};
    };
  };
}

The above expression is quite similar to the Nix-based Docker image example shown in the previous blog post, that deploys Nginx serving a static HTML page.

The only difference is how I configure the start command (the Cmd parameter). In the Nix expression language, strings have context -- any string that contains a value that evaluates to a Nix store path carries that path in its context, and if such a string is passed to a build function, the corresponding Nix store paths automatically become dependencies of the package that the build function builds.

By using the unsafe builtins.unsafeDiscardStringContext function I can discard the context of strings. As a result, the Nix packages that the image requires are still built. However, because their context is discarded, they are no longer considered dependencies of the Docker image. As a consequence, they will not be integrated into the image that dockerTools.buildImage creates.

(As a sidenote: there are still two Nix store paths that end up in the image, namely bash and glibc (a runtime dependency of bash). This is caused by the fact that the internals of the dockerTools.buildImage function make a reference to bash without discarding its context. In theory, it is possible to eliminate this dependency as well).

To run the container and make sure that the required Nix store paths are available, I can mount the host system's Nix store as a shared volume:


$ docker run -p 8080:80 -v /nix/store:/nix/store -it nginxexp:latest

By mounting the host system's Nix store (with the -v parameter), Nginx should still behave as expected -- it is not provided by the image, but referenced from the shared Nix store.

(As a sidenote: mounting the host system's Nix store for sharing is not a new idea. It has already been intensively used by the NixOS test driver for many years to rapidly create QEMU virtual machines for system integration tests).

Using the host system's network


As explained in the previous blog post, every Docker container by default runs in its own private network namespace making it possible for services to bind to any port without conflicting with the services on the host system or services provided by any other container.

The Nix process management framework does not work with private networks, because it is not a generalizable concept (i.e. namespaces are a Linux-only feature). Aside from Docker, the only other process manager supported by the framework that can work with namespaces is systemd.

To prevent ports and other dynamic resources from conflicting with each other, the process management framework makes it possible to configure them through instance function parameters. If the instance parameters have unique values, they will not conflict with other process instances (based on the assumption that the packager has identified all possible conflicts that a process might have).
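
For example (a sketch that reuses the webapp constructor shown earlier in this post), two webapp instances can safely co-exist on the host system's network, as long as their instance parameters have unique values:


rec {
  webapp1 = rec {
    port = 5000;

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
  };

  webapp2 = rec {
    port = 5001;

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
  };
}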

Because we already have a framework that prevents conflicts, we can also instruct Docker to use the host system's network with the --network host parameter:


$ docker run -v /nix/store:/nix/store --network host -it nginxexp:latest

The only thing the framework cannot provide you is protection -- in a private network namespace, malicious services cannot connect to ports used by other containers or the host system, but when using the host system's network, the framework cannot protect you from that.

Mapping a base directory for storing state


Services that run in containers are not always stateless -- they may rely on data that should be persistently stored, such as databases. The Docker recommendation to handle persistent state is not to store it in a container's writable layer, but on a shared volume on the host system.

Data stored outside the container makes it possible to reliably upgrade a container -- when it is desired to install a newer version of an application, the container can be discarded and recreated from a new image.

For the Nix process management framework, integration with a state directory outside the container is also useful. With an extra shared volume, we can mount the host system's state directory:


$ docker run -v /nix/store:/nix/store \
-v /var:/var --network host -it nginxexp:latest

Orchestrating containers


The last piece in the puzzle is to orchestrate the containers: we must create or discard them, start or stop them, and perform all required steps in the right order.

Moreover, to prevent the Nix packages that a container needs from being garbage collected, we need to make sure that they are a dependency of a package that is registered as in use.

I came up with my own convention to implement the container deployment process. When building the processes model for the docker process manager, the following files are generated that help me orchestrate the deployment process:


01-webapp-docker-priority
02-nginx-docker-priority
nginx-docker-cmd
nginx-docker-createparams
nginx-docker-settings
webapp-docker-cmd
webapp-docker-createparams
webapp-docker-settings

In the above list, we have the following kinds of files:

  • The files that have a -docker-settings suffix contain general properties of the container, such as the image that needs to be used as a template.
  • The files that have a -docker-createparams suffix contain the command line parameters that are propagated to docker create to create the container. If a container with the same name already exists, the container creation is skipped and the existing instance is used instead. (An example of such a file is shown after this list.)
  • To prevent the Nix packages that a Docker container needs from being garbage collected the generator creates a file with a -docker-cmd suffix containing the Cmd instruction including the full Nix store paths of the packages that a container needs.

    Because the strings' contexts are not discarded in the generation process, the packages become a dependency of the configuration file. As long as this configuration file is deployed, the packages will not get garbage collected.
  • To ensure that the containers are activated in the right order, we have two files that are prefixed with two numeric digits and have a -docker-priority suffix. The numeric digits determine in which order the containers should be activated -- in the above example, the webapp process gets activated before Nginx (that acts as a reverse proxy).
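
For example, the nginx-docker-createparams file could have the following content (hypothetical; the exact parameters depend on the container's configuration), in which every docker create parameter is placed on a separate line:


--name
nixproc-nginx
--network
host
-v
/nix/store:/nix/store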

With the following command, we can automatically generate the configuration files shown above for all processes in the processes model, and use them to create and start Docker containers for all process instances:


$ nixproc-docker-switch processes.nix
55d833e07428: Loading layer [==================================================>] 46.61MB/46.61MB
Loaded image: webapp:latest
f020f5ecdc6595f029cf46db9cb6f05024892ce6d9b1bbdf9eac78f8a178efd7
nixproc-webapp
95b595c533d4: Loading layer [==================================================>] 46.61MB/46.61MB
Loaded image: nginx:latest
b195cd1fba24d4ec8542c3576b4e3a3889682600f0accc3ba2a195a44bf41846
nixproc-nginx

The result is two running Docker containers that correspond to the process instances shown in the processes model:


$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b195cd1fba24 nginx:latest "/nix/store/j3v4fz9h…" 15 seconds ago Up 14 seconds nixproc-nginx
f020f5ecdc65 webapp:latest "/nix/store/b6pz847g…" 16 seconds ago Up 15 seconds nixproc-webapp

We should be able to access the example HTML page by opening the following URL in a web browser: http://localhost:8080.

Deploying Docker containers in a heterogeneous and/or distributed environment


As explained in my previous blog posts about the experimental Nix process management framework, the processes model is a subset of a Disnix services model. When it is desired to deploy processes to a network of machines or combine processes with other kinds of services, we can easily turn a processes model into a services model.

For example, I can change the processes model shown earlier into a services model that deploys Docker containers:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange;
    processManager = "docker";
  };
in
rec {
  webapp = rec {
    name = "webapp";

    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };

    type = "docker-container";
  };

  nginxReverseProxy = rec {
    name = "nginxReverseProxy";

    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};

    type = "docker-container";
  };
}

In the above example, I have added a name attribute to each process (a required property for Disnix service models) and a type attribute referring to: docker-container.

In Disnix, a service could take any form. A plugin system (named Dysnomia) is responsible for managing the life-cycle of a service, such as activating or deactivating it. The type attribute is used to tell Disnix that we should use the docker-container Dysnomia module. This module will automatically create and start the container on activation, and stop and discard the container on deactivation.

To deploy the above services to a network of machines, we require an infrastructure model (that captures the available machines and their relevant deployment properties):


{
  test1.properties.hostname = "test1";
}

The above infrastructure model contains only one target machine: test1 with a hostname that is identical to the machine name.

We also require a distribution model that maps services in the services model to machines in the infrastructure model:


{infrastructure}:

{
  webapp = [ infrastructure.test1 ];
  nginxReverseProxy = [ infrastructure.test1 ];
}

In the above distribution model, we map all the processes in the services model to the test1 target machine in the infrastructure model.

With the following command, we can deploy our Docker containers to the remote test1 target machine:


$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

When the above command succeeds, the test1 target machine provides running webapp and nginxReverseProxy containers.

(As a sidenote: to make Docker container deployments work with Disnix, the Docker service already needs to be predeployed to the target machines in the infrastructure model, or the Docker daemon needs to be deployed as a container provider).

Deploying conventional Docker containers with Disnix


The nice thing about the docker-container Dysnomia module is that it is generic enough to also work with conventional Docker containers (that work with images, not a shared Nix store).

For example, we can deploy Nginx as a regular container built with the dockerTools.buildImage function:


{dockerTools, stdenv, nginx}:

let
  dockerImage = dockerTools.buildImage {
    name = "nginxexp";
    tag = "test";
    contents = nginx;

    runAsRoot = ''
      ${dockerTools.shadowSetup}
      groupadd -r nogroup
      useradd -r nobody -g nogroup -d /dev/null
      mkdir -p /var/log/nginx /var/cache/nginx /var/www
      cp ${./index.html} /var/www/index.html
    '';

    config = {
      Cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
      Expose = {
        "80/tcp" = {};
      };
    };
  };
in
stdenv.mkDerivation {
  name = "nginxexp";
  buildCommand = ''
    mkdir -p $out
    cat > $out/nginxexp-docker-settings <<EOF
    dockerImage=${dockerImage}
    dockerImageTag=nginxexp:test
    EOF

    cat > $out/nginxexp-docker-createparams <<EOF
    -p
    8080:80
    EOF
  '';
}

In the above example, instead of using the process manager-agnostic createManagedProcess, I directly construct a Docker-based Nginx image (by using the dockerImage attribute) and container configuration files (in the buildCommand parameter) to make the container deployments work with the docker-container Dysnomia module.

It is also possible to deploy containers from images that are constructed with Dockerfiles. After we have built an image in the traditional way, we can export it from Docker with the following command:


$ docker save nginx-debian -o nginx-debian.tar.gz

and then we can use the following Nix expression to deploy a container using our exported image:


{dockerTools, stdenv, nginx}:

stdenv.mkDerivation {
  name = "nginxexp";
  buildCommand = ''
    mkdir -p $out
    cat > $out/nginxexp-docker-settings <<EOF
    dockerImage=${./nginx-debian.tar.gz}
    dockerImageTag=nginxexp:test
    EOF

    cat > $out/nginxexp-docker-createparams <<EOF
    -p
    8080:80
    EOF
  '';
}

In the above expression, the dockerImage property refers to our exported image.

Although Disnix is flexible enough to also orchestrate Docker containers (thanks to its generalized plugin architecture), I did not develop the docker-container Dysnomia module to make Disnix compete with existing container orchestration solutions, such as Kubernetes or Docker Swarm.

Disnix is a heterogeneous deployment tool that can be used to integrate units that have all kinds of shapes and forms on all kinds of operating systems -- having a docker-container module makes it possible to mix Docker containers with other service types that Disnix and Dysnomia support.

Discussion


In this blog post, I have demonstrated that we can integrate Docker as a process management backend option into the experimental Nix process management framework, by substituting some of its conflicting features.

Moreover, because a Disnix service model is a superset of a processes model, we can also use Disnix as a simple Docker container orchestrator and integrate Docker containers with other kinds of services.

Compared to Docker, the Nix process management framework supports a number of features that Docker does not:

  • Docker is heavily developed around Linux-specific concepts, such as namespaces and cgroups. As a result, it can only be used to deploy software built for Linux.

    The Nix process management framework should work on any operating system that is supported by the Nix package manager (e.g. Nix also has first class support for macOS, and can also be used on other UNIX-like operating systems such as FreeBSD). The same also applies to Disnix.
  • The Nix process management framework can work with sysvinit, BSD rc and Disnix process scripts that do not require any external service to manage a process' life-cycle. This is convenient for local unprivileged user deployments. To deploy Docker containers, you need to have the Docker daemon installed first.
  • Docker has an experimental rootless deployment mode, but in the Nix process management framework facilitating unprivileged user deployments is a first class concept.

On the other hand, the Nix process management framework does not take over all responsibilities of Docker:

  • Docker heavily relies on namespaces to prevent resource conflicts, such as overlapping TCP ports and global state directories. The Nix process management framework solves conflicts by avoiding them (i.e. configuring properties in such a way that they are unique). The conflict avoidance approach works as long as a service is well-specified. Unfortunately, conflict avoidance is not a hard guarantee that the tool can provide.
  • Docker also provides some degree of protection by using namespaces and cgroups. The Nix process management framework does not support this out of the box, because these concepts are not generalizable over all the process management backends it supports. (As a sidenote: it is still possible to use these concepts by defining process manager-specific overrides).

From a functionality perspective, docker-compose comes close to the features that the experimental Nix process management framework supports. docker-compose allows you to declaratively define container instances and their dependencies, and automatically deploy them.

However, as its name implies, docker-compose is specifically designed for deploying Docker containers, whereas the Nix process management framework is more general -- it should work with all kinds of process managers, uses Nix as the primary means to provide dependencies, uses the Nix expression language for configuration, and should work on a variety of operating systems.

The fact that Docker (and containers in general) are multi-functional solutions is not an observation only made by me. For example, this blog post also demonstrates that containers can work without images.

Availability


The Docker backend has been integrated into the latest development version of the Nix process management framework.

To use the docker-container Dysnomia module (so that Disnix can deploy Docker containers), you need to install the latest development version of Dysnomia.

Assigning unique IDs to services in Disnix deployment models

As described in some of my recent blog posts, one of the more advanced features of Disnix as well as the experimental Nix process management framework is to deploy multiple instances of the same service to the same machine.

To make running multiple service instances on the same machine possible, these tools rely on conflict avoidance rather than isolation (typically used for containers). To allow multiple service instances to co-exist on the same machine, they need to be configured in such a way that they do not allocate any conflicting resources.

Although for small systems it is doable to configure multiple instances by hand, this process gets tedious and time consuming for larger and more technologically diverse systems.

One particular kind of conflicting resource that can be configured automatically is numeric IDs, such as TCP/UDP port numbers, user IDs (UIDs), and group IDs (GIDs).

In this blog post, I will describe how multiple service instances are configured (in Disnix and the process management framework) and how we can automatically assign unique numeric IDs to them.

Configuring multiple service instances


To facilitate conflict avoidance in Disnix and the Nix process management framework, services are configured as follows:


{createManagedProcess, tmpDir}:
{port, instanceSuffix ? "", instanceName ? "webapp${instanceSuffix}"}:

let
  webapp = import ../../webapp;
in
createManagedProcess {
  name = instanceName;
  description = "Simple web application";
  inherit instanceName;

  # This expression can both run in foreground or daemon mode.
  # The process manager can pick which mode it prefers.
  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
    PID_FILE = "${tmpDir}/${instanceName}.pid";
  };
  user = instanceName;
  credentials = {
    groups = {
      "${instanceName}" = {};
    };
    users = {
      "${instanceName}" = {
        group = instanceName;
        description = "Webapp";
      };
    };
  };

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}

The Nix expression shown above is a nested function that describes how to deploy a simple self-contained REST web application with an embedded HTTP server:

  • The outer function header (first line) specifies all common build-time dependencies and configuration properties that the service needs:

    • createManagedProcess is a function that can be used to define process manager agnostic configurations that can be translated to configuration files for a variety of process managers (e.g. systemd, launchd, supervisord etc.).
    • tmpDir refers to the temp directory in which temp files are stored.
  • The inner function header (second line) specifies all instance parameters -- these are the parameters that must be configured in such a way that conflicts with other process instances are avoided:

    • The instanceName parameter (that can be derived from the instanceSuffix) is a value used by some of the process management backends (e.g. the ones that invoke the daemon command) to derive a unique PID file for the process. When running multiple instances of the same process, each of them requires a unique PID file name.
    • The port parameter specifies to which TCP port the service binds. Binding the service to a port that is already taken by another service causes the deployment of this service to fail.
  • In the function body, we invoke the createManagedProcess function to construct configuration files for all supported process manager backends to run the webapp process:

    • As explained earlier, the instanceName is used to configure the daemon executable in such a way that it allocates a unique PID file.
    • The process parameter specifies which executable we need to run, both as a foreground process or daemon.
    • The daemonArgs parameter specifies which command-line parameters need to be propagated to the executable when the process should daemonize on its own.
    • The environment parameter specifies all environment variables. The webapp service uses these variables for runtime property configuration.
    • The user parameter is used to specify that the process should run as an unprivileged user. The credentials parameter is used to configure the creation of the user account and corresponding user group.
    • The overrides parameter is used to override the process manager-agnostic parameters with process manager-specific parameters. For the sysvinit backend, we configure the runlevels in which the service should run.

Although the convention shown above makes it possible to avoid conflicts (assuming that all potential conflicts have been identified and exposed as function parameters), these parameters are typically configured manually:


{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp1 = rec {
    name = "webapp1";
    port = 5000;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
    type = processType;
  };

  webapp2 = rec {
    name = "webapp2";
    port = 5001;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
  };
}

The above Nix expression shows both a valid Disnix services model as well as a valid processes model that composes two web application process instances that can run concurrently on the same machine by invoking the nested constructor function shown in the previous example:

  • Each webapp instance has its own unique instance name, by specifying a unique numeric instanceSuffix that gets appended to the service name.
  • Every webapp instance binds to a unique TCP port (5000 and 5001) that should not conflict with system services or other process instances.

Previous work: assigning port numbers


Although configuring two process instances is still manageable, the configuration process becomes more tedious and time consuming when the amount and the kind of processes (each having their own potential conflicts) grow.

Five years ago, I already identified a resource that could be automatically assigned to services: port numbers.

I have created a very simple port assigner tool that allows you to specify a global ports pool and a target-specific pool. The former is used to assign globally unique port numbers to all services in the network, whereas the latter assigns port numbers that are unique to the target machine where the service is deployed to (this is to cope with the scarcity of port numbers).

Although the tool is quite useful for systems that do not consist of too many different kinds of components, I ran into a number of limitations when I wanted to manage a more diverse set of services:

  • Port numbers are not the only numeric IDs that services may require. When deploying systems that consist of self-contained executables, you typically want to run them as unprivileged users for security reasons. User accounts on most UNIX-like systems require unique user IDs, and the corresponding users' groups require unique group IDs.
  • We typically want to manage multiple resource pools, for a variety of reasons. For example, when we have a number of HTTP server instances and a number of database instances, then we may want to pick port numbers in the 8000-9000 range for the HTTP servers, whereas for the database servers we want to use a different pool, such as 5000-6000.

Assigning unique numeric IDs


To address these shortcomings, I have developed a replacement tool that acts as a generic numeric ID assigner.

This new ID assigner tool works with ID resource configuration files, such as:


rec {
  ports = {
    min = 5000;
    max = 6000;
    scope = "global";
  };

  uids = {
    min = 2000;
    max = 3000;
    scope = "global";
  };

  gids = uids;
}

The above ID resource configuration file (idresources.nix) defines three resource pools: ports is a resource that represents port numbers to be assigned to the webapp processes, uids refers to user IDs and gids to group IDs. The group IDs' resource configuration is identical to the users' IDs configuration.

Each resource attribute refers to the following configuration properties:

  • The min value specifies the minimum ID to hand out, max the maximum ID.
  • The scope value specifies the scope of the resource pool. global (which is the default option) means that the IDs assigned from this resource pool to services are globally unique for the entire system.

    The machine scope can be used to assign IDs that are unique for the machine where a service is distributed to. When the latter option is used, services that are distributed to two separate machines may have the same ID.
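
For example, a machine-scoped resource pool can be declared as follows (a small sketch, not part of the example above):


{
  ports = {
    min = 5000;
    max = 6000;
    scope = "machine"; # IDs only need to be unique per target machine
  };
}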

We can adjust the services/processes model in such a way that every service will use dynamically assigned IDs and that each service specifies for which resources it requires a unique ID:


{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp1 = rec {
    name = "webapp1";
    port = ids.ports.webapp1 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };

  webapp2 = rec {
    name = "webapp2";
    port = ids.ports.webapp2 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };
}

In the above services/processes model, we have made the following changes:

  • In the beginning of the expression, we import the dynamically generated ids.nix expression that provides ID assignments for each resource. If the ids.nix file does not exist, we fall back to an empty attribute set. We implement this construction (in which the absence of ids.nix can be tolerated) to allow the ID assigner to bootstrap the ID assignment process.
  • Every hardcoded port attribute of every service is replaced by a reference to the ids attribute set that is dynamically generated by the ID assigner tool. To allow the ID assigner to open the services model in the first run, we provide a fallback port value of 0.
  • Every service specifies for which resources it requires a unique ID through the requiresUniqueIdsFor attribute. In the above example, both service instances require a unique port number, a unique user ID for the user, and a unique group ID for the group.

The port assignments are propagated as function parameters to the constructor functions that configure the services (as shown earlier in this blog post).

We could also implement a similar strategy with the UIDs and GIDs, but a more convenient mechanism is to compose the function that creates the credentials, so that it transparently uses our uids and gids assignments.

As shown in the expression above, the ids attribute set is also propagated to the constructors expression. The constructors expression indirectly composes the createCredentials function as follows:


{pkgs, ids ? {}, ...}:

{
  createCredentials = import ../../create-credentials {
    inherit (pkgs) stdenv;
    inherit ids;
  };

  ...
}

The ids attribute set is propagated to the function that composes the createCredentials function. As a result, it will automatically assign the UIDs and GIDs in the ids.nix expression when the user configures a user or group with a name that exists in the uids and gids resource pools.

To make these UIDs and GIDs assignments go smoothly, it is recommended to give a process instance the same process name, instance name, user and group names.
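
To illustrate the idea, the composition could look roughly as follows (a heavily simplified sketch, not the actual create-credentials implementation):


{ids ? {}}:

{users ? {}, groups ? {}}:

let
  optionalAttrs = cond: as: if cond then as else {};
in
{
  # Inject the assigned UID when the user name occurs in the uids pool
  users = builtins.mapAttrs (name: user:
    user // optionalAttrs (ids ? uids && builtins.hasAttr name ids.uids) {
      uid = ids.uids.${name};
    }) users;

  # Inject the assigned GID when the group name occurs in the gids pool
  groups = builtins.mapAttrs (name: group:
    group // optionalAttrs (ids ? gids && builtins.hasAttr name ids.gids) {
      gid = ids.gids.${name};
    }) groups;
}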

Using the ID assigner tool


By combining the ID resources specification with the three Disnix models: a services model (that defines all distributable services, shown above), an infrastructure model (that captures all available target machines and their relevant properties) and a distribution model (that maps services to target machines in the network), we can automatically generate an ids configuration that contains all ID assignments:


$ dydisnix -s services.nix -i infrastructure.nix -d distribution.nix \
--id-resources idresources.nix --output-file ids.nix

The above command will generate an ids configuration file (ids.nix) that provides, for each resource in the ID resources model, a unique assignment to services that are distributed to a target machine in the network. (Services that are not distributed to any machine in the distribution model will be skipped, to not waste too many resources).

The output file (ids.nix) has the following structure:


{
  "ids" = {
    "gids" = {
      "webapp1" = 2000;
      "webapp2" = 2001;
    };
    "uids" = {
      "webapp1" = 2000;
      "webapp2" = 2001;
    };
    "ports" = {
      "webapp1" = 5000;
      "webapp2" = 5001;
    };
  };
  "lastAssignments" = {
    "gids" = 2001;
    "uids" = 2001;
    "ports" = 5001;
  };
}

  • The ids attribute contains for each resource (defined in the ID resources model) the unique ID assignments per service. As shown earlier, both service instances require unique IDs for ports, uids and gids. The above attribute set stores the corresponding ID assignments.
  • The lastAssignments attribute memorizes the last ID assignment per resource. Once an ID is assigned, it will not be immediately reused. This is to allow rollbacks and to prevent data from incorrectly getting owned by the wrong user accounts. Once the maximum ID limit is reached, the ID assigner will start searching for a free assignment from the beginning of the resource pool.
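
For example, when the ports pool ranges from 5000 to 6000 and the last assigned port is 6000, the next request will be served by searching from 5000 upward for an ID that is not already in use.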

In addition to assigning IDs to services that are distributed to machines in the network, it is also possible to assign IDs to all services (regardless of whether they have been deployed or not):


$ dydisnix -s services.nix \
--id-resources idresources.nix --output-file ids.nix

Since the above command does not know anything about the target machines, it only works with an ID resources configuration that defines global scope resources.

When you intend to upgrade an existing deployment, you typically want to retain already assigned IDs, remove obsolete ID assignments, and assign new IDs to services that have none yet. This is to prevent unnecessary redeployments.

When removing the first webapp service and adding a third instance:


{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp2 = rec {
    name = "webapp2";
    port = ids.ports.webapp2 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };

  webapp3 = rec {
    name = "webapp3";
    port = ids.ports.webapp3 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "3";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };
}

And running the following command (that provides the current ids.nix as a parameter):


$ dydisnix -s services.nix -i infrastructure.nix -d distribution.nix \
--id-resources idresources.nix --ids ids.nix --output-file ids.nix

we will get the following ID assignment configuration:


{
  "ids" = {
    "gids" = {
      "webapp2" = 2001;
      "webapp3" = 2002;
    };
    "uids" = {
      "webapp2" = 2001;
      "webapp3" = 2002;
    };
    "ports" = {
      "webapp2" = 5001;
      "webapp3" = 5002;
    };
  };
  "lastAssignments" = {
    "gids" = 2002;
    "uids" = 2002;
    "ports" = 5002;
  };
}

As may be observed, since the webapp2 process is in both the current and the previous configuration, its ID assignments will be retained. webapp1 gets removed because it is no longer in the services model. webapp3 gets the next numeric IDs from the resources pools.

Because the configuration of webapp2 stays the same, it does not need to be redeployed.

The models shown earlier are valid Disnix services models. As a consequence, they can be used with Dynamic Disnix's ID assigner tool: dydisnix-id-assign.

Although these Disnix services models are also valid processes models (used by the Nix process management framework), not every processes model is guaranteed to be compatible with a Disnix service model.

For process models that are not compatible, it is possible to use the nixproc-id-assign tool that acts as a wrapper around the dydisnix-id-assign tool:


$ nixproc-id-assign --id-resources idresources.nix processes.nix

Internally, the nixproc-id-assign tool converts a processes model to a Disnix service model (augmenting the process instance objects with missing properties) and propagates it to the dydisnix-id-assign tool.
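
To give an impression, the augmentation roughly boils down to the following (hypothetical sketch; the real conversion adds more properties):


# processes refers to the attribute set of process instances in a processes model
processes:

builtins.mapAttrs (name: process:
  # Disnix requires every service to have a name attribute; add it from the
  # attribute name, while properties already declared by the process win.
  { inherit name; } // process
) processes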

A more advanced example


The webapp processes example is fairly trivial and only needs unique IDs for three kinds of resources: port numbers, UIDs, and GIDs.

I have also developed a more complex example for the Nix process management framework that exposes several commonly used system services on Linux systems, such as:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir forceDisableUserChange processManager ids;
  };
in
rec {
  apache = rec {
    port = ids.httpPorts.apache or 0;

    pkg = constructors.simpleWebappApache {
      inherit port;
      serverAdmin = "root@localhost";
    };

    requiresUniqueIdsFor = [ "httpPorts" "uids" "gids" ];
  };

  postgresql = rec {
    port = ids.postgresqlPorts.postgresql or 0;

    pkg = constructors.postgresql {
      inherit port;
    };

    requiresUniqueIdsFor = [ "postgresqlPorts" "uids" "gids" ];
  };

  influxdb = rec {
    httpPort = ids.influxdbPorts.influxdb or 0;
    rpcPort = httpPort + 2;

    pkg = constructors.simpleInfluxdb {
      inherit httpPort rpcPort;
    };

    requiresUniqueIdsFor = [ "influxdbPorts" "uids" "gids" ];
  };
}

The above processes model exposes three service instances: an Apache HTTP server (that works with a simple configuration that serves web applications from a single virtual host), PostgreSQL and InfluxDB. Each service requires a unique user ID and group ID so that their privileges are separated.

To make these services more accessible/usable, we do not use a shared ports resource pool. Instead, each service type consumes port numbers from their own resource pools.

The following ID resources configuration can be used to provision the unique IDs to the services above:


rec {
  uids = {
    min = 2000;
    max = 3000;
  };

  gids = uids;

  httpPorts = {
    min = 8080;
    max = 8085;
  };

  postgresqlPorts = {
    min = 5432;
    max = 5532;
  };

  influxdbPorts = {
    min = 8086;
    max = 8096;
    step = 3;
  };
}


The above ID resources configuration defines a shared UIDs and GIDs resource pool, but separate ports resource pools for each service type. This has the following implications if we deploy multiple instances of each service type:

  • All Apache HTTP server instances get a TCP port assignment between 8080-8085.
  • All PostgreSQL server instances get a TCP port assignment between 5432-5532.
  • All InfluxDB server instances get a TCP port assignment between 8086-8096. An InfluxDB instance allocates two port numbers: one for the HTTP server and one for the RPC service (the latter's port number is the base port number + 2). We use a step count of 3 so that we can retain this convention for each InfluxDB instance.
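
For example, with this step count the first InfluxDB instance gets httpPort 8086 (and rpcPort 8088), a second instance gets httpPort 8089 (and rpcPort 8091), and so on.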

Conclusion


In this blog post, I have described a new tool: dydisnix-id-assign that can be used to automatically assign unique numeric IDs to services in Disnix service models.

Moreover, I have described: nixproc-id-assign that acts as a thin wrapper around this tool to automatically assign numeric IDs to services in the Nix process management framework's processes model.

This tool replaces the old dydisnix-port-assign tool in the Dynamic Disnix toolset (described in the blog post written five years ago), which is much more limited in its capabilities.

Availability


The dydisnix-id-assign tool is available in the current development version of Dynamic Disnix. The nixproc-id-assign tool is part of the current implementation of the Nix process management framework prototype.

Transforming Disnix models to graphs and visualizing them

In my previous blog post, I have described a new tool in the Dynamic Disnix toolset that can be used to automatically assign unique numeric IDs to services in a Disnix service model. Unique numeric IDs can represent all kinds of useful resources, such as TCP/UDP port numbers, user IDs (UIDs), and group IDs (GIDs).

Although I am quite happy to have this tool at my disposal, implementing it was much more difficult and time consuming than I expected. Aside from the fact that the problem is not as obvious as it may sound, the main reason is that the Dynamic Disnix toolset was originally developed as a proof of concept implementation for a research paper under very high time pressure. As a result, it has accumulated quite a bit of technical debt that, as of today, is still at a fairly high level (but much better than it was when I completed the PoC).

For the ID assigner tool, I needed to make changes to the foundations of the tools, such as the model parsing libraries. As a consequence, all kinds of related aspects in the toolset started to break, such as the deployment planning algorithm implementations.

Fixing some of these algorithm implementations was much more difficult than I expected -- they were not properly documented, not decomposed into functions, had little to no reuse of common concepts and as a result, were difficult to understand and change. I was forced to re-read the papers that I used as a basis for these algorithms.

To prevent myself from having to go through such a painful process again, I have decided to revise them in such a way that they are better understandable and maintainable.

Dynamically distributing services


The deployment models in the core Disnix toolset are static. For example, the distribution of services to machines in the network is done in a distribution model in which the user has to manually map services in the services model to target machines in the infrastructure model (and optionally to container services hosted on the target machines).

Each time a condition changes, e.g. the system needs to scale up or a machine crashes and the system needs to recover, a new distribution model must be configured and the system must be redeployed. For big complex systems that need to be reconfigured frequently, manually specifying new distribution models becomes very impractical.

As I have already explained in older blog posts, to cope with the limitations of static deployment models (and other static configuration aspects), I have developed Dynamic Disnix, in which various configuration aspects can be automated, including the distribution of services to machines.

A strategy for dynamically distributing services to machines can be specified in a QoS model, that typically consists of two phases:

  • First, a candidate target selection must be made, in which for each service the appropriate candidate target machines are selected.

    Not all machines are capable of hosting a certain service for functional and non-functional reasons -- for example, an i686-linux machine is not capable of running a binary compiled for an x86_64-linux machine.

    A machine can also be exposed to the public internet, and as a result, may not be suitable to host a service that exposes privacy-sensitive information.
  • After the suitable candidate target machines are known for each service, we must decide to which candidate machine each service gets distributed.

    This can be done in many ways. The strategy that we want to use is typically based on all kinds of non-functional requirements.

    For example, we can optimize a system's reliability by minimizing the amount of network links between services, requiring a strategy in which services that depend on each other are mapped to the same machine, as much as possible.

Graph-based optimization problems


In the Dynamic Disnix toolset, I have implemented various kinds of distribution algorithms/strategies for all kinds of purposes.

I did not "invent" most of them -- for some, I got inspiration from papers in the academic literature.

Two of the more advanced deployment planning algorithms are graph-based, to accomplish the following goals:

  • Reliable deployment. Network links are a potential source of unreliability in a distributed system -- connections may fail, become slow, or get interrupted frequently. By minimizing the number of network links between services (by co-locating them on the same machine), their impact can be reduced. To keep deployments from becoming too expensive, this should be done with a minimal number of machines.

    As described in the paper: "Reliable Deployment of Component-based Applications into Distributed Environments" by A. Heydarnoori and F. Mavaddat, this problem can be transformed into a graph problem: the multiway cut problem (which is NP-hard).

    It can only be solved in polynomial time with an approximation algorithm that comes close to the optimal solution, unless a proof that P = NP exists.
  • Fragile deployment. Inspired by the above deployment problem, I also came up with the opposite problem (as my own "invention") -- how can we make every connection between services a true network link (not local), so that we can test a system for robustness, using a minimal number of machines?

    This problem can be modeled as a graph coloring problem (which is an NP-hard problem as well). I used one of the approximation algorithms described in the paper: "New Methods to Color the Vertices of a Graph" by D. Brélaz to implement a solution.

To work with these graph-based algorithms, I originally did not apply any transformations -- because of time pressure, I directly worked with objects from the Disnix models (e.g. services, target machines) and somewhat "glued" these together with generic data structures, such as lists and hash tables.

As a result, when looking at the implementation, it is very hard to get an understanding of the process and how an implementation aspect relates to a concept described in the papers shown above.

In my revised version, I have implemented a general purpose graph library that can be used to solve all kinds of general graph related problems.

Aside from using a general graph library, I have also separated the graph-based generation processes into the following steps:

  • After opening the Disnix input models (such as the services, infrastructure, and distribution models) I transform the models to a graph representing an instance of the problem domain.
  • After the graph has been generated, I apply the approximation algorithm to the graph data structure.
  • Finally, I transform the resolved graph back to a distribution model that should provide our desired distribution outcome.

This new organization provides better separation of concerns, common concepts can be reused (such as graph operations), and as a result, the implementations are much closer to the approximation algorithms described in the papers.

Visualizing the generation process


Another advantage of having a reusable graph implementation is that we can easily extend it to visualize the problem graphs.

When I combine these features together with my earlier work that visualizes services models, and a new tool that visualizes infrastructure models, I can make the entire generation process transparent.

For example, the following services model:


{system, pkgs, distribution, invDistribution}:

let
  customPkgs = import ./pkgs { inherit pkgs system; };
in
rec {
  testService1 = {
    name = "testService1";
    pkg = customPkgs.testService1;
    type = "echo";
  };

  testService2 = {
    name = "testService2";
    pkg = customPkgs.testService2;
    dependsOn = {
      inherit testService1;
    };
    type = "echo";
  };

  testService3 = {
    name = "testService3";
    pkg = customPkgs.testService3;
    dependsOn = {
      inherit testService1 testService2;
    };
    type = "echo";
  };
}

can be visualized as follows:


$ dydisnix-visualize-services -s services.nix


The above services model and corresponding visualization capture the following properties:

  • They describe three services (as denoted by ovals).
  • The arrows denote inter-dependency relationships (the dependsOn attribute in the services model).

    When a service has an inter-dependency on another service, it means that the latter service has to be activated first, and that the dependent service needs to know how to reach it.

    testService2 depends on testService1 and testService3 depends on both the other two services.

We can also visualize the following infrastructure model:


{
  testtarget1 = {
    properties = {
      hostname = "testtarget1";
    };
    containers = {
      mysql-database = {
        mysqlPort = 3306;
      };
      echo = {};
    };
  };

  testtarget2 = {
    properties = {
      hostname = "testtarget2";
    };
    containers = {
      mysql-database = {
        mysqlPort = 3306;
      };
    };
  };

  testtarget3 = {
    properties = {
      hostname = "testtarget3";
    };
  };
}

with the following command:


$ dydisnix-visualize-infra -i infrastructure.nix

resulting in the following visualization:


The above infrastructure model declares three machines. Each target machine provides a number of container services (such as a MySQL database server, and echo that acts as a testing container).

With the following command, we can generate a problem instance for the graph coloring problem using the above services and infrastructure models as inputs:


$ dydisnix-graphcol -s services.nix -i infrastructure.nix \
--output-graph

resulting in the following graph:


The graph shown above captures the following properties:

  • Each service translates to a node.
  • When an inter-dependency relationship exists between services, it gets translated to a (bi-directional) link representing a network connection (the rationale is that services that have an inter-dependency relationship interact with each other by using a network connection).
  • Each target machine translates to a color that we can represent with a numeric index -- 0 is testtarget1, 1 is testtarget2 and so on.

The following command generates the resolved problem instance graph in which each vertex has a color assigned:


$ dydisnix-graphcol -s services.nix -i infrastructure.nix \
--output-resolved-graph

resulting in the following visualization:


(As a sidenote: in the above graph, colors are represented by numbers. In theory, I could also use real colors, but if I want the graph to remain visually appealing, I need to solve a color picking problem, which is beyond the scope of my refactoring objective).

The resolved graph can be translated back into the following distribution model:


$ dydisnix-graphcol -s services.nix -i infrastructure.nix
{
  "testService2" = [
    "testtarget2"
  ];
  "testService1" = [
    "testtarget1"
  ];
  "testService3" = [
    "testtarget3"
  ];
}

As you may notice, every service is distributed to a separate machine, so that every network link between services is a real network connection between machines.

We can also visualize the problem instance of the multiway cut problem. For this, we also need a distribution model that declares, for each service, which target machines are candidates.

The following distribution model makes all three target machines in the infrastructure model a candidate for every service:


{infrastructure}:

{
  testService1 = [ infrastructure.testtarget1 infrastructure.testtarget2 infrastructure.testtarget3 ];
  testService2 = [ infrastructure.testtarget1 infrastructure.testtarget2 infrastructure.testtarget3 ];
  testService3 = [ infrastructure.testtarget1 infrastructure.testtarget2 infrastructure.testtarget3 ];
}

With the following command we can generate a problem instance representing a host-application graph:


$ dydisnix-multiwaycut -s services.nix -i infrastructure.nix \
-d distribution.nix --output-graph

providing me the following output:


The above problem graph has the following properties:

  • Each service translates to an app node (prefixed with app:) and each candidate target machine to a host node (prefixed with host:).
  • When a network connection between two services exists (implicitly derived from having an inter-dependency relationship), an edge is generated with a weight of 1.
  • When a target machine is a candidate target for a service, then an edge is generated with a weight of n² representing a very large number.

The objective of solving the multiway cut problem is to cut edges in the graph in such a way that each terminal (host node) is disconnected from the other terminals (host nodes), in which the total weight of the cuts is minimized.

When applying the approximation algorithm in the paper to the above graph:


$ dydisnix-multiwaycut -s services.nix -i infrastructure.nix \
-d distribution.nix --output-resolved-graph

we get the following resolved graph:


that can be transformed back into the following distribution model:


$ dydisnix-multiwaycut -s services.nix -i infrastructure.nix \
-d distribution.nix
{
"testService2" = [
"testtarget1"
];
"testService1" = [
"testtarget1"
];
"testService3" = [
"testtarget1"
];
}

As you may notice by looking at the resolved graph (in which the terminals testtarget2 and testtarget3 are disconnected) and the distribution model output, all services are distributed to the same machine: testtarget1, making all connections between the services local connections.

In this particular case, the solution is not only close to the optimal solution, but it is the optimal solution.

Conclusion


In this blog post, I have described how I have revised the deployment planning algorithm implementations in the Dynamic Disnix toolset. Their concerns are now much better separated, and the graph-based algorithms now use a general purpose graph library, that can also be used for generating visualizations of the intermediate steps in the generation process.

This revision was not on my short-term planned features list, but I am happy that I did the work. Retrospectively, I regret that I never took the time to finish things up properly after the submission of the paper. Although Dynamic Disnix's quality is still not where I want it to be, it is quite a step forward in making the toolset more usable.

Sadly, it is almost 10 years ago that I started Dynamic Disnix and there is still no official release; the technical debt in Dynamic Disnix is one of the important reasons why I never made one. Hopefully, with this step I can do it some day. :-)

The good news is that I made the paper submission deadline and that the paper got accepted for presentation. It brought me to the SEAMS 2011 conference (co-located with ICSE 2011) in Honolulu, Hawaii, allowing me to take pictures such as this one:


Availability


The graph library and new implementations of the deployment planning algorithms described in this blog post are part of the current development version of Dynamic Disnix.

The paper: "A Self-Adaptive Deployment Framework for Service-Oriented Systems" describes the Dynamic Disnix framework (developed 9 years ago) and can be obtained from my publications page.

Acknowledgements


To generate the visualizations I used the Graphviz toolset.

Building multi-process Docker images with the Nix process management framework

Some time ago, I described my experimental Nix-based process management framework that makes it possible to automatically deploy running processes (sometimes also ambiguously called services) from declarative specifications written in the Nix expression language.

The framework is built around two concepts. As its name implies, the Nix package manager is used to deploy all required packages and static artifacts, and a process manager of choice (e.g. sysvinit, systemd, supervisord and others) is used to manage the life-cycles of the processes.

Moreover, it is built around flexible concepts allowing integration with solutions that do not qualify as process managers (but can still be used as such), such as Docker -- each process instance can be deployed as a Docker container with a shared Nix store using the host system's network.

As explained in an earlier blog post, Docker has become so popular that it is now a de facto standard for deploying (micro)services (often as a utility in the Kubernetes solution stack).

When deploying a system that consists of multiple services with Docker, a typical strategy (and recommended practice) is to use multiple containers that have only one root application process. Advantages of this approach are that Docker can control the life-cycles of the applications, and that each process is (somewhat) isolated/protected from other processes and the host system.

By default, containers are isolated, but if they need to interact with other processes, then they can use all kinds of integration facilities -- for example, they can share namespaces, or use shared volumes.

In some situations, it may also be desirable to deviate from the one root process per container practice -- for some systems, processes may need to interact quite intensively (e.g. with IPC mechanisms, shared files or shared memory, or a combination of these) in which case the container boundaries introduce more inconveniences than benefits.

Moreover, when running multiple processes in a single container, common dependencies can typically also be shared more efficiently, leading to lower disk and RAM consumption.

As explained in my previous blog post (that explores various Docker concepts), sharing dependencies between containers only works if containers are constructed from images that share the same layers with the same shared libraries. In practice, this form of sharing is not always as efficient as we want it to be.

Configuring a Docker image to run multiple application processes is somewhat cumbersome -- the official Docker documentation describes two solutions: one that relies on a wrapper script that starts multiple processes in the background and a loop that waits for the "main process" to terminate, and the other is to use a process manager, such as supervisord.

I realised that I could solve this problem much more conveniently by combining the dockerTools.buildImage {} function in Nixpkgs (that builds Docker images with the Nix package manager) with the Nix process management abstractions.

I have created my own abstraction function: createMultiProcessImage that builds multi-process Docker images, managed by any supported process manager that works in a Docker container.

In this blog post, I will describe how this function is implemented and how it can be used.

Creating images for single root process containers


As shown in earlier blog posts, creating a Docker image with Nix for a single root application process is very straightforward.

For example, we can build an image that launches a trivial web application service with an embedded HTTP server (as shown in many of my previous blog posts), as follows:


{dockerTools, webapp}:

dockerTools.buildImage {
name = "webapp";
tag = "test";

runAsRoot = ''
${dockerTools.shadowSetup}
groupadd webapp
useradd webapp -g webapp -d /dev/null
'';

config = {
Env = [ "PORT=5000" ];
Cmd = [ "${webapp}/bin/webapp" ];
Expose = {
"5000/tcp" = {};
};
};
}

The above Nix expression (default.nix) invokes the dockerTools.buildImage function to automatically construct an image with the following properties:

  • The image has the following name: webapp and the following version tag: test.
  • The web application service requires some state to be initialized before it can be used. To configure state, we can run instructions in a QEMU virtual machine with root privileges (runAsRoot).

    In the above deployment Nix expression, we create an unprivileged user and group named: webapp. For production deployments, it is typically recommended to drop root privileges, for security reasons.
  • The Env directive is used to configure environment variables. The PORT environment variable is used to configure the TCP port where the service should bind to.
  • The Cmd directive starts the webapp process in foreground mode. The life-cycle of the container is bound to this application process.
  • Expose exposes TCP port 5000 to the public so that the service can respond to requests made by clients.

We can build the Docker image as follows:


$ nix-build

load it into Docker with the following command:


$ docker load -i result

and launch a container instance using the image as a template:


$ docker run -it -p 5000:5000 webapp:test

If the deployment of the container succeeded, we should get a response from the webapp process, by running:


$ curl http://localhost:5000
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5000
</body>
</html>

Creating multi-process images


As shown in previous blog posts, the webapp process is part of a bigger system, namely: a web application system with an Nginx reverse proxy forwarding requests to multiple webapp instances:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
sharedConstructors = import ../services-agnostic/constructors.nix {
inherit pkgs stateDir runtimeDir logDir cacheDir tmpDir forceDisableUserChange processManager;
};

constructors = import ./constructors.nix {
inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager;
};
in
rec {
webapp = rec {
port = 5000;
dnsName = "webapp.local";

pkg = constructors.webapp {
inherit port;
};
};

nginx = rec {
port = 8080;

pkg = sharedConstructors.nginxReverseProxyHostBased {
webapps = [ webapp ];
inherit port;
} {};
};
}

The Nix expression above shows a simple processes model variant of that system, that consists of only two process instances:

  • The webapp process is (as shown earlier) an application that returns a static HTML page.
  • nginx is configured as a reverse proxy to forward incoming connections to multiple webapp instances using the virtual host header property (dnsName).

    If somebody connects to the nginx server with the following host name: webapp.local, then the request is forwarded to the webapp service.

Configuration steps


To allow all processes in the process model shown to be deployed to a single container, we need to execute the following steps in the construction of an image:

  • Instead of deploying a single package, such as webapp, we need to refer to a collection of packages and/or configuration files that can be managed with a process manager, such as sysvinit, systemd or supervisord.

    The Nix process management framework provides all kinds of Nix function abstractions to accomplish this.

    For example, the following function invocation builds a configuration profile for the sysvinit process manager, containing a collection of sysvinit scripts (also known as LSB Init compliant scripts):


    profile = import ../create-managed-process/sysvinit/build-sysvinit-env.nix {
    exprFile = ./processes.nix;
    stateDir = "/var";
    };

  • Similar to single root process containers, we may also need to initialize state. For example, we need to create common FHS state directories (e.g. /tmp, /var etc.) in which services can store their relevant state files (e.g. log files, temp files).

    This can be done by running the following command:


    nixproc-init-state --state-dir /var
  • Another property that multi-process containers have in common is that they may also require the presence of unprivileged users and groups, for security reasons.

    With the following commands, we can automatically generate all required users and groups specified in a deployment profile:


    ${dysnomia}/bin/dysnomia-addgroups ${profile}
    ${dysnomia}/bin/dysnomia-addusers ${profile}
  • Instead of starting a (single root) application process, we need to start a process manager that manages the processes that we want to deploy. As already explained, the framework allows you to pick multiple options.

Starting a process manager as a root process


From all the process managers that the framework currently supports, the most straightforward option to use in a Docker container is: supervisord.

To use it, we can create a symlink to the supervisord configuration in the deployment profile:


ln -s ${profile} /etc/supervisor

and then start supervisord as a root process with the following command directive:


Cmd = [
"${pkgs.pythonPackages.supervisor}/bin/supervisord"
"--nodaemon"
"--configuration" "/etc/supervisor/supervisord.conf"
"--logfile" "/var/log/supervisord.log"
"--pidfile" "/var/run/supervisord.pid"
];

(As a sidenote: creating a symlink is not strictly required, but makes it possible to control running services with the supervisorctl command-line tool).

Supervisord is not the only option. We can also use sysvinit scripts, but doing so is a bit tricky. As explained earlier, the life-cycle of a container is bound to a running root process (in foreground mode).

sysvinit scripts do not run in the foreground, but start processes that daemonize and terminate immediately, leaving daemon processes behind that remain running in the background.

As described in an earlier blog post about translating high-level process management concepts, it is also possible to run "daemons in the foreground" by creating a proxy script. We can also make a similar foreground proxy for a collection of daemons:


#!/bin/bash -e

_term()
{
nixproc-sysvinit-runactivity -r stop ${profile}
kill "$pid"
exit 0
}

nixproc-sysvinit-runactivity start ${profile}

# Keep process running, but allow it to respond to the TERM and INT
# signals so that all scripts are stopped properly

trap _term TERM
trap _term INT

tail -f /dev/null & pid=$!
wait "$pid"

The above proxy script does the following:

  • It first starts all sysvinit scripts by invoking the nixproc-sysvinit-runactivity start command.
  • Then it registers a signal handler for the TERM and INT signals. The corresponding callback triggers a shutdown procedure.
  • We invoke a dummy command that keeps running in the foreground without consuming too many system resources (tail -f /dev/null) and we wait for it to terminate.
  • The signal handler properly deactivates all processes in reverse order (with the nixproc-sysvinit-runactivity -r stop command), and finally terminates the dummy command causing the script (and the container) to stop.

In addition to supervisord and sysvinit, we can also use Disnix as a process manager by using a similar strategy with a foreground proxy.

Other configuration properties


The above configuration properties suffice to get a multi-process container running. However, to make working with such containers more practical from a user perspective, we may also want to:

  • Add basic shell utilities to the image, so that you can control the processes, investigate log files (in case of errors), and do other maintenance tasks.
  • Add a .bashrc configuration file to make file coloring work for the ls command, and to provide a decent prompt in a shell session (a minimal sketch is shown below).
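
A minimal .bashrc sketch that accomplishes this could look as follows (illustrative only -- the actual file that the image construction function installs may differ):


# Enable color support for ls and provide a simple, informative prompt
alias ls='ls --color=auto'
export PS1='\u@\h:\w\$ '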

Usage


The configuration steps described in the previous section are wrapped into a function named: createMultiProcessImage, which itself is a thin wrapper around the dockerTools.buildImage function in Nixpkgs -- it accepts the same parameters, plus a number of additional parameters that are specific to multi-process configurations.

The following function invocation builds a multi-process container deploying our example system, using supervisord as a process manager:


let
pkgs = import <nixpkgs> {};
system = builtins.currentSystem;

createMultiProcessImage = import ../../nixproc/create-multi-process-image/create-multi-process-image.nix {
inherit pkgs system;
inherit (pkgs) dockerTools stdenv;
};
in
createMultiProcessImage {
name = "multiprocess";
tag = "test";
exprFile = ./processes.nix;
stateDir = "/var";
processManager = "supervisord";
}

After building the image, and deploying a container, with the following commands:


$ nix-build
$ docker load -i result
$ docker run -it --network host multiprocess:test

we should be able to connect to the webapp instance via the nginx reverse proxy:


$ curl -H 'Host: webapp.local' http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5000
</body>
</html>

As explained earlier, the constructed image also provides extra command-line utilities to do maintenance tasks, and control the lifecycle of the individual processes.

For example, we can "connect" to the running container, and check which processes are running:


$ docker exec -it mycontainer /bin/bash
# supervisorctl
nginx RUNNING pid 11, uptime 0:00:38
webapp RUNNING pid 10, uptime 0:00:38
supervisor>

If we change the processManager parameter to sysvinit, we can deploy a multi-process image in which the foreground proxy script is used as a root process (that starts and stops sysvinit scripts).

We can control the life-cycle of each individual process by directly invoking the sysvinit scripts in the container:


$ docker exec -it mycontainer /bin/bash
$ /etc/rc.d/init.d/webapp status
webapp is running with Process ID(s) 33.

$ /etc/rc.d/init.d/nginx status
nginx is running with Process ID(s) 51.

Although having extra command-line utilities to do administration tasks is useful, a disadvantage is that they considerably increase the size of the image.

To save storage costs, it is also possible to disable interactive mode to exclude these packages:


let
pkgs = import <nixpkgs> {};
system = builtins.currentSystem;

createMultiProcessImage = import ../../nixproc/create-multi-process-image/create-multi-process-image.nix {
inherit pkgs system;
inherit (pkgs) dockerTools stdenv;
};
in
createMultiProcessImage {
name = "multiprocess";
tag = "test";
exprFile = ./processes.nix;
stateDir = "/var";
processManager = "supervisord";
interactive = false; # Do not install any additional shell utilities
}

Discussion


In this blog post, I have described a new utility function in the Nix process management framework: createMultiProcessImage -- a thin wrapper around the dockerTools.buildImage function that can be used to conveniently build multi-process Docker images, using any Docker-capable process manager that the Nix process management framework supports.

Besides the fact that we can conveniently construct multi-process images, this function also has the advantage (similar to the dockerTools.buildImage function) that Nix is only required for the construction of the image. To deploy containers from a multi-process image, Nix is not a requirement.

There is also a drawback: similar to "ordinary" multi-process container deployments, when it is desired to upgrade a process, the entire container needs to be redeployed, also requiring a user to terminate all other running processes.

Availability


The createMultiProcessImage function is part of the current development version of the Nix process management framework that can be obtained from my GitHub page.

Constructing a simple alerting system with well-known open source projects


Some time ago, I experimented with all kinds of monitoring and alerting technologies. For example, with the following technologies, I can develop a simple alerting system with relative ease:

  • Telegraf is an agent that can be used to gather measurements and transfer the corresponding data to all kinds of storage solutions.
  • InfluxDB is a time series database platform that can store, manage and analyze timestamped data.
  • Kapacitor is a real-time streaming data processing engine, that can be used for a variety of purposes. I can use Kapacitor to analyze measurements and see if a threshold has been exceeded so that an alert can be triggered.
  • Alerta is a monitoring system that can store and de-duplicate alerts, and arrange blackouts.
  • Grafana is a multi-platform open source analytics and interactive visualization web application.

These technologies appear to be quite straightforward to use. However, as I was learning more about them, I discovered a number of oddities that may have big implications.

Furthermore, testing and making incremental changes also turns out to be much more challenging than expected, making it very hard to diagnose and fix problems.

In this blog post, I will describe how I built a simple monitoring and alerting system, and elaborate about my learning experiences.

Building the alerting system


As described in the introduction, I can combine several technologies to create an alerting system. I will explain them in more detail in the upcoming sections.

Telegraf


Telegraf is a pluggable agent that gathers measurements from a variety of inputs (such as system metrics, platform metrics, database metrics etc.) and sends them to a variety of outputs, typically storage solutions (database management systems such as InfluxDB, PostgreSQL or MongoDB). Telegraf has a large plugin eco-system that provides all kinds of integrations.

In this blog post, I will use InfluxDB as an output storage backend. For the inputs, I will restrict myself to capturing only a subset of system metrics.

With the following telegraf.conf configuration file, I can capture a variety of system metrics every 10 seconds:


[agent]
interval = "10s"

[[outputs.influxdb]]
urls = [ "http://test1:8086" ]
database = "sysmetricsdb"
username = "sysmetricsdb"
password = "sysmetricsdb"

[[inputs.system]]
# no configuration

[[inputs.cpu]]
## Whether to report per-cpu stats or not
percpu = true
## Whether to report total system cpu stats or not
totalcpu = true
## If true, collect raw CPU time metrics.
collect_cpu_time = false
## If true, compute and report the sum of all non-idle CPU states.
report_active = true

[[inputs.mem]]
# no configuration

With the above configuration file, I can collect the following metrics:
  • System metrics, such as the hostname and system load.
  • CPU metrics, such as how much the CPU cores on a machine are utilized, including the total CPU activity.
  • Memory (RAM) metrics.

The data will be stored in an InfluxDB database named: sysmetricsdb, hosted on a remote machine with host name: test1.

InfluxDB


As explained earlier, InfluxDB is a timeseries platform that can store, manage and analyze timestamped data. In many ways, InfluxDB resembles relational databases, but there are also some notable differences.

The query language that InfluxDB uses is called InfluxQL (that shares many similarities with SQL).

For example, with the following query I can retrieve the first three data points from the cpu measurement, that contains the CPU-related measurements collected by Telegraf:


> precision rfc3339
> select * from "cpu" limit 3

providing me the following result set:


name: cpu
time cpu host usage_active usage_guest usage_guest_nice usage_idle usage_iowait usage_irq usage_nice usage_softirq usage_steal usage_system usage_user
---- --- ---- ------------ ----------- ---------------- ---------- ------------ --------- ---------- ------------- ----------- ------------ ----------
2020-11-16T15:36:00Z cpu-total test2 10.665258711721098 0 0 89.3347412882789 0.10559662090813073 0 0 0.10559662090813073 0 8.658922914466714 1.79514255543822
2020-11-16T15:36:00Z cpu0 test2 10.665258711721098 0 0 89.3347412882789 0.10559662090813073 0 0 0.10559662090813073 0 8.658922914466714 1.79514255543822
2020-11-16T15:36:10Z cpu-total test2 0.1055966209080346 0 0 99.89440337909197 0 0 0 0.10559662090813073 0 0 0

As you may notice by looking at the output above, every data point has a timestamp and a number of tags and fields capturing CPU metrics:

  • cpu identifies the CPU core.
  • host contains the host name of the machine.
  • The remainder of the fields contain all kinds of CPU metrics, e.g. how much CPU time is consumed by the system (usage_system), the user (usage_user), by waiting for IO (usage_iowait) etc.
  • The usage_active field contains the total CPU activity percentage, which is going to be useful to develop an alert that will warn us if there is too much CPU activity for a long period of time.

Aside from the fact that all data is timestamp based, data in InfluxDB has another notable difference compared to relational databases: an InfluxDB database is schemaless. You can add an arbitrary number of fields and tags to a data point without having to adjust the database structure (and migrating existing data to the new database structure).

Fields and tags can contain arbitrary data, such as numeric values or strings. Tags are also indexed so that you can search for these values more efficiently. Furthermore, tags can be used to group data.

For example, the cpu measurement collection has the following tags:


> SHOW TAG KEYS ON "sysmetricsdb" FROM "cpu";
name: cpu
tagKey
------
cpu
host

As shown in the above output, cpu and host are tags in the cpu measurement.

We can use these tags to search for all data points related to a CPU core and/or host machine. Moreover, we can use these tags for grouping, allowing us to compute aggregate values, such as the mean value per CPU core and host.
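
For example, the following query computes the mean CPU activity percentage per host and CPU core over all collected data (a sketch that follows the conventions of the queries shown earlier):


> SELECT mean("usage_active") FROM "cpu" GROUP BY "host", "cpu"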

Beyond storing and retrieving data, InfluxDB has many useful additional features:

  • You can automatically sample data and run continuous queries that generate and store sampled data in the background.
  • You can configure retention policies so that data is no longer stored for an indefinite amount of time. For example, you can configure a retention policy to drop raw data after a certain amount of time, but retain the corresponding sampled data (a sketch is shown after this list).
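
For example, the following statements sketch what such a configuration could look like -- the retention policy and continuous query names are hypothetical:


> CREATE RETENTION POLICY "oneweek" ON "sysmetricsdb" DURATION 1w REPLICATION 1 DEFAULT
> CREATE CONTINUOUS QUERY "cpu_hourly_mean" ON "sysmetricsdb" BEGIN SELECT mean("usage_active") INTO "cpu_hourly" FROM "cpu" GROUP BY time(1h), * END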

InfluxDB has an "open core" development model. The free and open source edition (FOSS) of the InfluxDB server (that is MIT licensed) allows you to host multiple databases on a single server.

However, if you also want horizontal scalability and/or high availability, then you need to switch to the hosted InfluxDB versions -- data in InfluxDB is partitioned into so-called shards of a fixed size (the default shard size is 168 hours).

These shards can be distributed over multiple InfluxDB servers. It is also possible to deploy multiple read replicas of the same shard to multiple InfluxDB servers improving read speed.

Kapacitor


Kapacitor is a real-time streaming data processing engine developed by InfluxData -- the same company that also develops InfluxDB and Telegraf.

It can be used for all kinds of purposes. In my example cases, I will only use it to determine whether some threshold has been exceeded and an alert needs to be triggered.

Kapacitor works with custom tasks that are written in a domain-specific language called the TICK script language. There are two kinds of tasks: stream and batch tasks. Both task types have advantages and disadvantages.

We can easily develop an alert that gets triggered if the CPU activity level is high for a relatively long period of time (more than 75% on average over 1 minute).

To implement this alert as a stream job, we can write the following TICK script:


dbrp "sysmetricsdb"."autogen"

stream
|from()
.measurement('cpu')
.groupBy('host', 'cpu')
.where(lambda: "cpu" != 'cpu-total')
|window()
.period(1m)
.every(1m)
|mean('usage_active')
|alert()
.message('Host: {{ index .Tags "host" }} has high cpu usage: {{ index .Fields "mean" }}')
.warn(lambda: "mean" > 75.0)
.crit(lambda: "mean" > 85.0)
.alerta()
.resource('{{ index .Tags "host" }}/{{ index .Tags "cpu" }}')
.event('cpu overload')
.value('{{ index .Fields "mean" }}')

A stream job is built around the following principles:

  • A stream task does not execute queries on an InfluxDB server. Instead, it creates a subscription to InfluxDB -- whenever a data point gets inserted into InfluxDB, it gets forwarded to Kapacitor as well.

    To make subscriptions work, both InfluxDB and Kapacitor need to be able to connect to each other with a public IP address.
  • A stream task defines a pipeline consisting of a number of nodes (connected with the | operator). Each node can consume data points, filter, transform, aggregate, or execute arbitrary operations (such as calling an external service), and produce new data points that can be propagated to the next node in the pipeline.
  • Every node also has property methods (such as .measurement('cpu')) making it possible to configure parameters.

The TICK script example shown above does the following:

  • The from node consumes cpu data points from the InfluxDB subscription, groups them by host and cpu and filters out data points with the cpu-total label, because we are only interested in the CPU consumption per core, not the total amount.
  • The window node states that we should aggregate data points over the last 1 minute and pass the resulting (aggregated) data points to the next node after one minute in time has elapsed. To aggregate data, Kapacitor will buffer data points in memory.
  • The mean node computes the mean value for usage_active for the aggregated data points.
  • The alert node is used to trigger an alert of a specific severity level (WARNING if the mean activity percentage is greater than 75%, CRITICAL if it is greater than 85%). In any other case, the status is considered OK. The alert is sent to Alerta.

It is also possible to write a similar kind of alerting script as a batch task:


dbrp "sysmetricsdb"."autogen"

batch
|query('''
SELECT mean("usage_active")
FROM "sysmetricsdb"."autogen"."cpu"
WHERE "cpu" != 'cpu-total'
''')
.period(1m)
.every(1m)
.groupBy('host', 'cpu')
|alert()
.message('Host: {{ index .Tags "host" }} has high cpu usage: {{ index .Fields "mean" }}')
.warn(lambda: "mean" > 75.0)
.crit(lambda: "mean" > 85.0)
.alerta()
.resource('{{ index .Tags "host" }}/{{ index .Tags "cpu" }}')
.event('cpu overload')
.value('{{ index .Fields "mean" }}')

The above TICK script looks similar to the stream task shown earlier, but instead of using a subscription, the script queries the InfluxDB database (with an InfluxQL query) for data points over the last minute with a query node.

Which approach for writing a CPU alert is best, you may wonder? Each of these two approaches has its pros and cons:

  • Stream tasks offer low latency responses -- when a data point appears, a stream task can immediately respond, whereas a batch task needs to query every minute all the data points to compute the mean percentage over the last minute.
  • Stream tasks maintain a buffer for aggregating the data points making it possible to only send incremental updates to Alerta. Batch tasks are stateless. As a result, they need to update the status of all hosts and CPUs every minute.
  • Processing data points is done synchronously and in sequential order -- if an update round to Alerta takes too long (which is more likely to happen with a batch task), then the next processing run may overlap with the previous, causing all kinds of unpredictable results.

    It may also cause Kapacitor to eventually crash due to growing resource consumption.
  • Batch tasks may also miss data points -- while querying data over a certain time window, it may happen that a new data point gets inserted in that time window (that is being queried). This new data point will not be picked up by Kapacitor.

    A subscription made by a stream task, however, will never miss any data points.
  • Stream tasks can only work with data points that appear from the moment Kapacitor is started -- they cannot work with data points from the past.

    For example, if Kapacitor is restarted and some important event is triggered in the restart time window, Kapacitor will not notice that event, causing the alert to remain in its previous state.

    To work effectively with stream tasks, a continuous data stream is required that frequently reports on the status of a resource. Batch tasks, on the other hand, can work with historical data.
  • The fact that nodes maintain a buffer may also cause the RAM consumption of Kapacitor to grow considerably, if the data volumes are big.

    A batch task on the other hand, does not buffer any data and is more memory efficient.

    Another compelling advantage of batch tasks over stream tasks is that InfluxDB does all the work. The hosted version of InfluxDB can also horizontally scale.
  • Batch tasks can also aggregate data more efficiently (e.g. computing the mean value or sum of values over a certain time period).

I consider neither of these script types the optimal solution. However, for implementing the alerts I tend to have a slight preference for stream jobs, because of their low latency and incremental update properties.

Alerta


As explained in the introduction, Alerta is a monitoring system that can store and de-duplicate alerts, and arrange blackouts.

The Alerta server provides a REST API that can be used to query and modify alerting data and uses MongoDB or PostgreSQL as a storage database.

There are also a variety of Alerta clients: there is the alerta-cli that allows you to control the service from the command-line, and there is a web user interface that I will show later in this blog post.
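
For example, assuming that the ALERTA_ENDPOINT environment variable points to the Alerta server (the same convention that the test script later in this post uses), the currently reported alerts can be queried from the command-line as follows:


$ export ALERTA_ENDPOINT="http://test1"
$ alerta query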

Running experiments


With all the components described above in place, we can start running experiments to see if the CPU alert will work as expected. To gain better insights in the process, I can install Grafana that allows me to visualize the measurements that are stored in InfluxDB.

Configuring a dashboard and panel for visualizing the CPU activity rate was straightforward. I configured a new dashboard, with the following variables:


The above variables allow me to select, for each machine in the network, which CPU core's activity percentage I want to visualize.
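
The variables can be populated with InfluxQL tag queries. The following sketch shows what such variable queries could look like:


SHOW TAG VALUES FROM "cpu" WITH KEY = "host"
SHOW TAG VALUES FROM "cpu" WITH KEY = "cpu"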

I have configured the CPU panel as follows:


In the above configuration, I query the usage_active field from the cpu measurement collection, using the dashboard variables: cpu and host to filter for the right target machine and CPU core.

I have also configured the field unit to be a percentage value (between 0 and 100).
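
The panel query roughly corresponds to the following InfluxQL sketch, in which $host, $cpu and $timeFilter refer to Grafana template variables and macros (the exact query that the panel editor generates may differ):


SELECT "usage_active" FROM "cpu" WHERE "host" =~ /^$host$/ AND "cpu" =~ /^$cpu$/ AND $timeFilter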

When running the following command-line instruction on a test machine that runs Telegraf (test2), I can deliberately hog the CPU:


$ dd if=/dev/zero of=/dev/null

The above command reads zero bytes (one-by-one) and discards them by sending them to /dev/null, causing the CPU to remain utilized at a high level:


In the graph shown above, it is clearly visible that CPU core 0 on the test2 machine remains utilized at 100% for several minutes.

(As a sidenote, we can also hog both the CPU and consume RAM at the same time with a simple command line instruction).
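For example, a dd invocation with a large block size allocates a large copy buffer (consuming RAM) while continuously copying data (consuming CPU). This is a sketch, not necessarily the exact command from the referenced post:


$ dd if=/dev/zero of=/dev/null bs=1G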

If we keep hogging the CPU and wait for at least a minute, the Alerta web interface dashboard will show a CRITICAL alert:


If we stop the dd command, then the TICK script should eventually notice that the mean percentage drops below the WARNING threshold causing the alert to go back into the OK state and disappearing from the Alerta dashboard.

Developing test cases


Being able to trigger an alert with a simple command-line instruction is useful, but not always convenient or effective -- one of the inconveniences is that we always have to wait at least one minute to get feedback.

Moreover, when an alert does not work, it is not always easy to find the root cause. I have encountered the following problems that contribute to a failing alert:

  • Telegraf may not be running and, as a result, not capturing the data points that need to be analyzed by the TICK script.
  • A subscription cannot be established between InfluxDB and Kapacitor. This may happen when Kapacitor cannot be reached through a public IP address.
  • There are data points collected, but only the wrong kinds of measurements.
  • The TICK script is functionally incorrect.

Fortunately, for stream tasks it is relatively easy to quickly find out whether an alert is functionally correct or not -- we can generate test cases that almost instantly trigger each possible outcome with a minimal amount of data points.

An interesting property of stream tasks is that they have no notion of time -- the window node's period of 1 minute may suggest that Kapacitor computes the mean value of the data points every minute, but that is not what it actually does. Instead, Kapacitor only looks at the timestamps of the data points that it receives.

When Kapacitor sees that the timestamps of the data points fit in the 1 minute time window, then it keeps buffering. As soon as a data point appears that is outside this time window, the window node relays an aggregated data point to the next node (that computes the mean value, which in turn is consumed by the alert node deciding whether an alert needs to be raised or not).

We can exploit that knowledge, to create a very minimal bash test script that triggers every possible outcome: OK, WARNING and CRITICAL:


influxCmd="influx -database sysmetricsdb -host test1"

export ALERTA_ENDPOINT="http://test1"

### Trigger CRITICAL alert

# Force the average CPU consumption to be 100%
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 0000000000"
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 60000000000"
# This data point triggers the alert
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 120000000000"

sleep 1
actualSeverity=$(alerta --output json query | jq -r '.[0].severity')

if [ "$actualSeverity" != "critical" ]
then
echo "Expected severity: critical, but we got: $actualSeverity" >&2
exit 1
fi

### Trigger WARNING alert

# Force the average CPU consumption to be 80%
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=80 180000000000"
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=80 240000000000"
# This data point triggers the alert
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=80 300000000000"

sleep 1
actualSeverity=$(alerta --output json query | jq -r '.[0].severity')

if [ "$actualSeverity" != "warning" ]
then
echo "Expected severity: warning, but we got: $actualSeverity" >&2
exit 1
fi

### Trigger OK alert

# Force the average CPU consumption to be 0%
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=0 300000000000"
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=0 360000000000"
# This data point triggers the alert
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=0 420000000000"

sleep 1
actualSeverity=$(alerta --output json query | jq -r '.[0].severity')

if [ "$actualSeverity" != "ok" ]
then
echo "Expected severity: ok, but we got: $actualSeverity" >&2
exit 1
fi

The shell script shown above automatically triggers all three possible outcomes of the CPU alert:

  • CRITICAL is triggered by generating data points that force a mean activity percentage of 100%.
  • WARNING is triggered by a mean activity percentage of 80%.
  • OK is triggered by a mean activity percentage of 0%.

It uses the Alerta CLI to connect to the Alerta server to check whether the alert's severity level has the expected value.

We need three data points to trigger each alert type -- the first two data points are on the boundaries of the 1 minute window (0 seconds and 60 seconds), forcing the mean value to become the specified CPU activity percentage.

The third data point is deliberately outside the time window (of 1 minute), forcing the alert node to be triggered with a mean value over the previous two data points.

Although the above test strategy works to quickly validate all possible outcomes, one impractical aspect is that the timestamps in the above example start with 0 (meaning 0 seconds after the epoch: January 1st 1970 00:00 UTC).

If we also want to observe the data points generated by the above script in Grafana, we need to configure the panel to go back in time 50 years.

Fortunately, I can also easily adjust the script to start with a base timestamp that is 1 hour in the past:


offset="$(($(date +%s) - 3600))"
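
The offset can then be added to each relative timestamp in the INSERT statements. The following sketch shows how the first two adjusted inserts could look (InfluxDB expects timestamps in nanoseconds, hence the multiplication):


$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 $((offset * 1000000000))"
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 $(((offset + 60) * 1000000000))"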

With this tiny adjustment, we should see the following CPU graph (displaying data points from the last hour) after running the test script:


As you may notice, the CPU activity level quickly goes from 100%, to 80%, to 0%, using only 9 data points.

Although testing stream tasks (from a functional perspective) is quick and convenient, testing batch tasks in a similar way is difficult. Contrary to the stream task implementation, the query node in the batch task does have a notion of time (because of the WHERE clause that includes the now() expression).

Moreover, the embedded InfluxQL query evaluates the mean values every minute, but the test script does not exactly know when this event triggers.

The only way I could think of to (somewhat reliably) validate the outcomes is by creating a test script that continuously inserts data points for at least double the time window size (2 minutes) until Alerta reports the right alert status (if it does not after a while, I can conclude that the alert is incorrectly implemented).
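
A minimal sketch of this polling strategy, reusing the influxCmd and Alerta CLI conventions from the test script shown earlier, could look as follows:


# Keep inserting data points (10 seconds apart) for a bit more than
# double the window size, until Alerta reports the expected severity
for i in $(seq 1 15)
do
    $influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 $(($(date +%s) * 1000000000))"
    actualSeverity=$(alerta --output json query | jq -r '.[0].severity')
    if [ "$actualSeverity" = "critical" ]; then break; fi
    sleep 10
done

[ "$actualSeverity" = "critical" ] || exit 1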

Automating the deployment


As you have probably guessed, to be able to conveniently experiment with all these services, and to reliably run tests in isolation, some form of deployment automation is an absolute must-have.

Most people who do not know anything about my deployment technology preferences will probably go for Docker or docker-compose, but I have decided to use a variety of solutions from the Nix project.

NixOps is used to automatically deploy a network of NixOS machines -- I have created a logical and physical NixOps configuration that deploys two VirtualBox virtual machines.

With the following command I can create and deploy the virtual machines:


$ nixops create network.nix network-virtualbox.nix -d test
$ nixops deploy -d test

The first machine: test1 is responsible for hosting the entire monitoring infrastructure (InfluxDB, Kapacitor, Alerta, Grafana), and the second machine (test2) runs Telegraf and the load tests.

Disnix (my own deployment tool) is responsible for deploying all services, such as InfluxDB, Kapacitor, Alerta, and the database storage backends. Contrary to docker-compose, Disnix does not work with containers (or other Docker objects, such as networks or volumes), but with arbitrary deployment units that are managed with a plugin system called Dysnomia.

Moreover, Disnix can also be used for distributed deployment in a network of machines.

I have packaged all the services and captured them in a Disnix services model that specifies all deployable services, their types, and their inter-dependencies.

If I combine the services model with the NixOps network models, and a distribution model (that maps Telegraf and the test scripts to the test2 machine and the remainder of the services to the first: test1), I can deploy the entire system:


$ export NIXOPS_DEPLOYMENT=test
$ export NIXOPS_USE_NIXOPS=1

$ disnixos-env -s services.nix \
-n network.nix \
-n network-virtualbox.nix \
-d distribution.nix

The following diagram shows a possible deployment scenario of the system:


The above diagram describes the following properties:

  • The light-grey colored boxes denote machines. In the above diagram, we have two of them: test1 and test2 that correspond to the VirtualBox machines deployed by NixOps.
  • The dark-grey colored boxes denote containers in a Disnix-context (not to be confused with Linux or Docker containers). These are environments that manage other services.

    For example, a container service could be the PostgreSQL DBMS managing a number of PostgreSQL databases or the Apache HTTP server managing web applications.
  • The ovals denote services that could be any kind of deployment unit. In the above example, we have services that are running processes (managed by systemd), databases and web applications.
  • The arrows denote inter-dependencies between services. When a service has an inter-dependency on another service (i.e. the arrow points from the former to the latter), then the latter service needs to be activated first. Moreover, the former service also needs to know how the latter can be reached.
  • Services can also be container providers (as denoted by the arrows in the labels), stating that other services can be embedded inside this service.

    As already explained, the PostgreSQL DBMS is an example of such a service, because it can host multiple PostgreSQL databases.

Although the process components in the diagram above can also be conveniently deployed with Docker-based solutions (i.e. as I have explained in an earlier blog post, containers are somewhat confined and restricted processes), the non-process integrations need to be managed by other means, such as writing extra shell instructions in Dockerfiles.

In addition to deploying the system to machines managed by NixOps, it is also possible to use the NixOS test driver -- the NixOS test driver automatically generates QEMU virtual machines with a shared Nix store, so that no disk images need to be created, making it possible to quickly spawn networks of virtual machines, with very small storage footprints.

I can also create a minimal distribution model that only deploys the services required to run the test scripts -- Telegraf, Grafana and the front-end applications are not required, resulting in a much smaller deployment:


As can be seen in the above diagram, there are far fewer components required.

In this virtual network that runs a minimal system, we can run automated tests for rapid feedback. For example, the following test driver script (implemented in Python) will run my test shell script shown earlier:


test2.succeed("test-cpu-alerts")

With the following command I can automatically run the tests on the terminal:


$ nix-build release.nix -A tests

Availability


The deployment recipes, test scripts and documentation describing the configuration steps are stored in the monitoring playground repository that can be obtained from my GitHub page.

Besides the CPU activity alert described in this blog post, I have also developed a memory alert that triggers if too much RAM is consumed for a longer period of time.

In addition to virtual machines and services, there is also deployment automation in place allowing you to also easily deploy Kapacitor TICK scripts and Grafana dashboards.

To deploy the system, you need to use the very latest version of Disnix (version 0.10) that was released very recently.

Acknowledgements


I would like to thank my employer: Mendix, for making it possible to write this blog post -- Mendix allows developers to work two days per month on research projects, making projects like these possible.

Blog reflection over the last decade


Today it is exactly ten years ago that I started this blog. As with previous years, I will do a reflection, but this time it will be over the last decade.

What was holding me back


The idea to have my own blog was already there for a long time. I always thought it was an interesting medium. For example, I considered it a good instrument to express my thoughts on the technical work I do, and in particular, I liked having the ability to get feedback.

The main reason why it still took me so long to start was that I never considered it "the right time". For example, I was already close to starting a blog 15 years ago (while I was still early in my studies) when web development was still one of my main technical interests, but I still refrained from doing so.

At that time I did some interesting "discoveries" and I had some random ideas I could elaborate about, but these ideas never materialized enough so that I could write a story about it.

Moreover, I also did not feel comfortable enough yet to express myself, because I did not have much writing experience in English. Retrospectively, I learned that there is never a right time for having a blog; I should have just started.

Entering the research domain


A couple of years later, while I was working on my master's thesis, I made the decision to go for a PhD degree, because I was genuinely interested in the research domain of my master's thesis: software deployment, mostly because of my prior experience in industry and building Linux distributions from scratch.

Even before starting my PhD, I already knew that writing is an important component in research -- as a researcher, you have to regularly report about your work by means of research papers that typically need to be anonymously peer reviewed.

In most scientific disciplines, academic papers are published in journals. In the computer science domain, it is more common to publish papers in conference proceedings.

Only a certain percentage of paper submissions that are considered good quality (as judged by the peer reviews) are accepted for publication. Rejection of a paper typically requires you to make revisions and submit the paper to a different conference.

For top general conferences in the software engineering domain, the acceptance rate is typically lower than 20% (this ratio used to be even lower, close to 15%).

In my PhD, I had a very quick publication start -- in the first month, a paper about atomic upgrades for distributed systems was accepted that covered an important topic of my master's thesis.

Roughly half a year later, my co-workers and I published a research paper about the objectives of the Pull Deployment of Services (PDS) research project (of which my research was one of the sub topics) funded by NWO/Jacquard.

Although I had a very good start, I slowly started to learn (the hard way) that you cannot simply publish research papers about all the work you do -- as a matter of fact, papers only represent a modest subset of your daily work.

To write a good research paper, it takes quite a bit of time and effort to decide about the topic (including the paper's title) and to get all the details right. I had all kinds of interesting ideas but many of these ideas were not considered novel -- they were interesting engineering efforts but they did not add interesting new (significant) scientific knowledge.

Moreover, in a research paper, you also need to put your contribution in context (e.g. explain/show how it compares to similar work and how it expands existing knowledge), and provide validation (this can be a proof, but in most cases you evaluate in what degree your contribution meets its claims, for example, by providing empirical data).

After instant acceptance of the first two papers, things did not work out that smoothly anymore. I had several paper rejections in a row -- one paper was badly rejected because I did not put it into the right context (for example, I ignored some important related work) and I did not make my contribution very clear (I basically left it open to the interpretation of the reader, which is a bad thing).

Fortunately, I learned a lot from this rejection. The reviewers even suggested me an alternative conference where I could submit my revised paper to. After addressing the reviewers' criticisms, the paper got accepted.

Another paper was rejected twice in a row for IMO very weak reasons. Most notably, it turned out that many reviewers believed that the subject was not really software engineering related (which is strange, because software deployment is explicitly listed as one of the subjects in the conference's call for papers).

When I explained this peculiarity to Eelco Visser (one of my supervisors and co-promotor), he suggested that I should have more frequent interaction with the scientific community and write about the subject on my blog. Software deployment is generally a neglected subject in the software engineering research community.

Eventually, we managed to publish the problematic papers: one about Disnix, the tool implementation of my research, and the other about the testing aspect of the previously rejected paper.

After that problematic period, I have managed to publish two more papers that got instantly accepted bringing me to all kinds of interesting conferences.

The decision to start my blog


Although having a 3 paper acceptance streak and traveling to the conferences to present them felt nice for a while, I still was not too happy.

In late 2010, one day before new year's eve (I typically reflect over things in the past year at new year's eve), I realized that research papers alone are just a very narrow representation of the work that I do as a researcher (although the number of papers and their impact are typically used as the only metric to judge the performance of a researcher).

In addition to getting research papers accepted and doing the required writing, there is much more that the work of an academic researcher (in particular in the software engineering domain) is about:

  • Research in software engineering is about constructing tools. For example, the paper: 'Research Paradigms in Computer Science' by Peter Wegner from Brown University says:

    Research in engineering is directed towards the efficient accomplishment of specific tasks and towards the development of tools that will enable classes of tasks to be accomplished more efficiently.

    In addition to the problems that tools try solve or optimize, the construction of these tools is typically also very challenging, similar to conventional software development projects.

    Although the construction aspects of tools may not always be novel and too detailed for a research paper (that typically has a page limit), it is definitely useful to work towards a good and stable design and implementation. Writing about these aspects can be very useful for yourself, your colleagues and peers in the field.

    Moreover, having a tool that is usable and works also mattered to me and to the people in my research group. For example, my deployment research was built on top of the Nix package manager, that in addition to research, was also used to solve our internal deployment problems.

  • I did not completely start all the development work of my tooling from scratch -- I was building my deployment tooling on top of the Nix package manager that was both a research project, and an open source project (more accurately called a community project) with a small group of external contributors.

    (As a sidenote: the Nix package manager was started by Eelco Dolstra, who was a Postdoc in the same research project and one of my university supervisors).

    I considered my blog a good instrument to communicate with the Nix community about ideas and implementation aspects.
  • Research is also about having frequent interaction with your peers that work for different universities, companies and/or research institutes.

    A research paper is useful to get feedback, but at the same time, it is also quite an inaccessible medium -- people can obtain a copy from publishers (typically behind a paywall) or from your personal homepage and communicate by e-mail, but the barrier is typically high.
  • I was also frequently in touch with software engineering practitioners, such as former study friends, open source communities and people from our research project's industry partner: Philips Healthcare.

    I regularly received all kinds of interesting questions related to the practical aspects of my work. For example, how to apply our research tools to industry problems or how our research tools compare to conventional tools.

    Not all of these questions could be transformed into research papers, but they were definitely useful to investigate and write about.
  • Being in academia is more than just working on publications. You also travel to conferences, get involved in all kinds of different (and sometimes related) research subjects of your colleagues and peers and you may also help in teaching. These subjects are also worth writing about.

Because of the above reasons, I was finally convinced that the time was right to start my blog.

The beginning: catching up with my research papers


Since I was already working on my PhD research for more than 2 years, there was still a lot of catching up I had to do. It did not make sense to just randomly start writing about something technical or research related. Basically, I wanted all information on my blog "to fit together".

For the first half year, my blog was basically about writing down things that I had already done and published about.


After my blog announcement, I started explaining what the Pull Deployment of Services research project is about, then explaining the Nix package manager that serves as the fundamental basis of all the deployment tooling that I was developing, followed by NixOS: a Linux distribution that is entirely managed by the Nix package manager and can be deployed from a single declarative specification.

The next blog post was about declarative deployment and testing with NixOS. It was used as an ingredient for a research paper that had already been published, and for a talk with the same title at FOSDEM: the Free and Open source Software Developers' European Meeting in Brussels. Writing about the subject on my blog was a useful preparation session for my talk.

After giving my talk at FOSDEM, there was more catching up to do. After explaining the basic Nix concepts, I could finally elaborate on Disnix: the tool that I have been developing as part of my research, which uses Nix to extend deployment to the domain of service-oriented systems.

After writing about the self-adaptive deployment framework built on top of Disnix (I had submitted the paper at the beginning of that year, and it got accepted shortly before writing the corresponding blog post), I was basically up-to-date with all research aspects.

Using my blog for research


After my catch up phase was completed, I could finally start writing about things that were not directly related to any research papers already written in the past.

One of the things I have been struggling with for a while was making our tools work with .NET technology. The Nix package manager (and sister projects, such as Disnix) were primarily developed for UNIX-like operating systems (most notably Linux) and technologies that run on these operating systems.

Our industry partner, Philips Healthcare, mostly uses Microsoft technologies in their development stack: .NET as a runtime, C# for coding, SQL Server for storage, and IIS as a web server.

At that time, .NET was heavily tied to the Windows ecosystem (Mono already existed and provided a somewhat compatible runtime for operating systems other than Windows, but it did not provide compatible implementations of all the libraries needed to work with the Philips platform).

With some small modifications, I could use Nix on Cygwin to build .NET projects. However, running .NET applications that rely on shared libraries (called library assemblies in .NET terminology) was still a challenge. I could only provide a number of suboptimal solutions, none of which was ideal.

I wrote about it on my blog, and during my trip to ICSE 2011 in Hawaii, I learned from a discussion with a co-attendee that you could also use an event listener that triggers when a library assembly is missing. The reflection API can be used in this event handler to load the missing assemblies, efficiently solving my dependency problem and making it possible to use both Nix and Disnix to deploy .NET services on Windows without any serious obstacles.


I also managed to discuss one of my biggest frustrations in the research community: the fact that software deployment is a neglected subject. Thanks to spreading the blog post on Twitter (where it got retweeted by all kinds of people in the research community), it attracted quite a few visitors and a large number of helpful comments. I even got in touch with a company that develops a software deployment automation solution as their main product.

Another investigation that I did as part of my blog (without publishing in mind) was addressing a common criticism from various communities, such as the Debian community, that Nix would not qualify as a viable package management solution because it does not comply with the Filesystem Hierarchy Standard (FHS).

I also did a comparison with the deployment properties of GoboLinux, another Linux distribution that deliberately deviates from the FHS to show that a different filesystem organisation has clear benefits for making deployments more reliable and reproducible. The GoboLinux blog post appeared on Reddit (both the NixOS and Linux channels) and attracted quite a few visitors.

From these practical investigations I wrote a blog post that draws some general conclusions.

Reaching the end of my PhD research


After an interesting year, both from a research and blogging perspective, I was reaching the final year of my PhD research (in the Netherlands, a contract of a PhD student is typically only valid for 4 years).

I had already slowly started writing my PhD thesis, but there was still some unfinished business. There were four (!!!) more research ideas that I wanted to publish about (which was, in retrospect, a very overambitious goal).

One of these papers was a collaboration project in which we combined our knowledge about software deployment and construction with license compliance engineering to determine which source files are actually used in a binary so that we could detect whether it meets the terms and conditions of free and open-source licenses.

Although our contribution looked great and we were able to detect a compliance issue in FFmpeg, a widely used open source project, the paper was rejected twice in a row. The second time, the reviews were really vague and not helpful at all. One of my co-authors called the reviewers extremely incompetent.

After the second rejection, I was (sort of) done with it and extremely disappointed. I did not even want to revise it and submit it anywhere else. Nonetheless, I have published the paper as a technical report, reported about it on my blog, and added it as a chapter to my PhD thesis.

(As a sidenote: more than 2 years later, we did another attempt to resurrect the paper. The revisions were quite a bit of work, but the third version finally got accepted at ASE 2014: one of the top general conferences in the software engineering domain.

This was a happy moment for me -- I was so disappointed about the process, and I was happy to see that there were people who could motivate and convince me that we should not give up).

Another research idea was formalizing infrastructure deployment. Sadly, the idea was not really considered novel -- it was mostly just an incremental improvement over our earlier work. As a result, I got two paper rejections in a row. After the second rejection, I have abolished the idea to publish about it, but I still wrote a chapter about it in my PhD thesis.

All the above rejections (and the corresponding reviews) really started to negatively impact my motivation. I wrote two blog posts about my observations: one blog post was about a common reason for rejecting a paper: the complaint that a contribution is engineering, but not science (which is quite weird for research in software engineering). Another blog post was about the difficulties in connecting academic research with software engineering practice. From my experiences thus far, I concluded that there is a huge gap between the two.

Fortunately, I still managed to gather enough energy to finish my third idea. I already had a proof-of-concept implementation for managing the state of services deployed by Disnix for a while. By pulling a few all-nighters, I managed to write a research paper (all by myself) and submitted it to HotSWUp 2012. That paper got instantly accepted, which was a good boost for my motivation.

In the last few months, the only thing I could basically do was finish up my PhD thesis. To still keep my blog somewhat active, I have written a number of posts about my conference experiences.


Although I already had a very early proof-of-concept implementation, I never managed to finish my fourth research paper idea. This was not a problem for finishing my PhD thesis, as I already had enough material to complete it, but I still consider it one of the more interesting research ideas that I never got to finish. As of today, I still have not finished or published about it (neither on my blog nor in a research paper).

Leaving academia, working for industry


A couple of weeks before my contract with the university was about to expire, I finished the first draft of my PhD thesis and submitted it to the reading committee for review.

Although the idea of having an academic research career crossed my mind several times, I ultimately decided that this was not something I wanted to pursue, for a variety of reasons. Most notably, the discrepancy between topics suitable for publishing and things that could be applied in practice was one of the major reasons.

All that was left was looking for a new job. After an interesting month of job searching, I joined Conference Compass, a startup company that consisted of fewer than 10 people when I joined.

One of the interesting technical challenges they were facing was setting up a product-line for their mobile conference apps. My past experience with deployment technologies turned out to come in quite handy.

The Nix project did not disappear after all the people involved in the PDS project left the university (besides me, Eelco Dolstra (the author of the Nix package manager) and Rob Vermaas also joined industrial companies) -- the project moved to GitHub, increasing its popularity and the number of contributors.

Because the Nix project continued, and because blogging had so many advantages for me personally, I decided to resume my blog. The only thing that changed was that my blog was no longer in service of a research project, but just a personal means to dive into technical subjects.

Reintroducing Nix to different audiences


Almost at the same time that the Nix project moved to GitHub, the GNU Guix project was announced: GNU Guix is a package manager with similar objectives to the Nix package manager, but with some notable differences too: instead of the Nix expression language, it uses Scheme as a configuration language.

Moreover, the corresponding software distribution, GuixSD, exclusively provides free software.

GNU Guix reuses the Nix daemon, and related components such as the Nix store from the Nix package manager to organize and isolate software packages.

I wrote a comparison blog post that was posted on Reddit and Hacker News, attracting a huge number of visitors. The number of visitors was several orders of magnitude higher than for any of the blog posts I had written before. As of today, this blog post is still in my overall top 10.

One of the things I did in the first month at Conference Compass is explaining the Nix package manager to my colleagues who did not have much system administration experience or knowledge about package managers.

I decided to use a programming language-centered Nix explanation recipe, as opposed to a system administration-centered explanation. In many ways, I consider this explanation recipe the best of the three that I wrote.

This blog post also got posted on Reddit and Hacker News, attracting a huge number of visitors. In only one month, with two blog posts, I attracted more visitors to my blog than with all my previous blog posts combined.

Developing an app building infrastructure



As explained earlier, Conference Compass was looking into developing a product-line for mobile conference apps.

I did some of the work in the open, by using a variety of tools from the Nix project and making contributions to the Nix project.

I have packaged many components of the Android SDK and developed a function abstraction that automatically builds Android APKs. Similarly, I also built a function for iOS apps (that works both with the simulator and real devices), and for Appcelerator Titanium: a JavaScript-based cross-platform framework allowing you to target a variety of mobile platforms, including Android and iOS.

In addition to the Nix-based app building infrastructure, I have also described how you can set up Hydra: a Nix-based continuous integration service to automatically build mobile apps and other software projects.

It turns out that in addition to ordinary software projects, Hydra also works well for distributing bleeding edge builds of mobile apps -- for example, you can use your phone or tablet's web browser to automatically download and install any bleeding edge build that you want.

The only thing that was a bit of a challenge was distributing apps to iOS devices with Hydra, but with some workarounds that was also possible.

I have also developed a Node.js package to conveniently integrate custom applications with Hydra.

Finishing up my PhD and defending my thesis



Although I left academia, the transition to industry was actually very gradual -- as explained earlier, while being employed at Conference Compass, I still had to finish and defend my PhD thesis.

Several weeks before my planned defence date, I received feedback from my reading committee about my draft that I finished in my last month at the university. This was a very stressful period -- in addition to making revisions to my PhD thesis, I also had to arrange the printing and the logistics of the ceremony.

I also wrote three more blog posts about my thesis and the defence process: I provided a summary of my PhD thesis as a blog post, I wrote about the defence ceremony, and about my PhD thesis propositions.

Writing thesis propositions is also a tradition in the Netherlands. Earlier that year, my former colleague Felienne Hermans decided to blog and tweet about her PhD thesis propositions, and I did the same thing.

PhD thesis propositions are typically not supposed to have a direct relationship to your PhD thesis, but they should be defendable. In addition to your thesis, the committee members are also allowed to ask you questions about your propositions.

The blog post about my PhD thesis propositions (as of today) still regularly attracts visitors. The number of visitors to this blog post heavily outnumbers that of the summary blog post about my PhD thesis.

In addition to my PhD thesis, there were more interesting post-academia research events: a journal paper submission finally got officially published (4 years after submitting the first draft!) and we have managed to get our paper about discovering license compliance inconsistencies accepted at ASE 2014, that was previously rejected twice.

Learning Node.js and more about JavaScript



In addition to the app building infrastructure at Conference Compass, I also spent considerable amounts of time learning about Node.js and its underlying concept: the asynchronous event loop. Although I already had some JavaScript programming experience, all my knowledge thus far was limited to the web browser.

I learned about all kinds of new concepts, such as callbacks (and function-level scoping), promises, asynchronous programming (in general) and mixing callbacks with promises. Moreover, I also learned that (despite my earlier experiences in the concepts of programming languages course) working with prototypes in JavaScript was more difficult than expected. I have decided to address my earlier shortcomings in my teachings with a blog post.

With Titanium (the cross-platform mobile app development framework that uses JavaScript as an implementation language), beyond regular development work, I investigated how to port a Node.js-based XMPP library to Titanium and how to separate concerns well enough to build a simple, reliable chat application.

Building a service management platform and implementing major Disnix improvements


At Conference Compass, somewhere in the middle of 2013, we decided to shift away from a single monolithic backend application for all our apps to a more modular approach in which each app has its own backend and its own storage.

After a couple of brief experiments with Heroku, we shifted to a Nix-based approach in mid 2014. NixOps was used to automatically deploy virtual machines in the cloud (using Amazon's EC2 service), and Disnix became responsible for deploying all services to these virtual machines.

In the Nix community, there was quite a bit of confusion about these two tools, because both use the Nix package manager and are designed for distributed deployment. I wrote a blog post to explain in what ways they are similar and different.

Over the course of 2015, most of my company work was concentrated on the service management platform. In addition to automating the deployment of all machines and services, I also implemented the following functionality:



In late 2015, the first NixCon conference was organized, in which I gave a presentation about Disnix and explained how it can be used for the deployment of microservices. I received all kinds of useful feedback that I implemented in the first half of 2016:


Over time, I did many more interesting Disnix developments:


Furthermore, the Dynamic Disnix framework (an extension toolset that I developed for a research paper many years ago), also got all kinds of updates. For example, it was extended to automatically assign TCP/UDP port numbers and to work with state migrations.

While working on the service management platform, five new Disnix versions were released (the first was 0.3, the last 0.8). I wrote a blog post for the 0.5 release that explains all previously released versions, including the first two prototype iterations.

Brief return to web technology


As explained in the introduction, I already had the idea to start my blog while I was still actively doing web development.

At some point, I needed to make some updates to web applications that I had developed for my voluntary work, which still used pieces of my old custom web framework.

I had already released some pieces of it (most notably the layout manager) on my GitHub page as a side project, but at some point I decided to release the remaining components as well.


I also wrote a blog post about my struggles with composing a decent layout and some pointers on "rational" layout decisions.

Working on Nix generators


In addition to JavaScript development at Conference Compass, I was also using Nix-related tools for automating deployments of Node.js projects.

Eventually, I created node2nix to make deployments with the NPM package manager possible in Nix expressions (at the time this was already possible with npm2nix, but node2nix was developed to address important shortcomings of npm2nix, such as circular dependencies).

Over time, I faced many more challenges that were node2nix/NPM related:


When I released my custom web framework, I also did the same for PHP: I created composer2nix to allow PHP Composer projects to be deployed with the Nix package manager.

In addition to building these generators, I also invested heavily in identifying common concepts and implementation decisions for both node2nix and composer2nix.

Both tools use an internal DSL to generate Nix expressions (NiJS for JavaScript, and PNDP for PHP) as opposed to using strings.

Both tools implement a domain model (that is close to NPM and Composer concepts) that gets translated to an object structure in the Nix expression language with a generic translation strategy.

Joining Mendix, working on Nix concepts and improvements


Slightly over 2 years ago, I joined Mendix, a company that develops a low-code application development platform and related services.

While I was learning about Mendix, I wrote a blog post that explains its basic concepts.

In addition, as a crafting project, I also automated the deployment of Mendix applications with Nix technologies (and even wrote about it on the public Mendix blog).


While learning about the Mendix cloud deployment platform, I also got heavily involved in documenting its architecture. I wrote a blog post about my practices (the notation that I used was inspired by the diagrams that I generate with Disnix). I even implemented some of these ideas in the Dynamic Disnix toolset.

When I had just joined Mendix, I was mostly learning about the company and its development stack. In my spare time, I made quite a few random Nix contributions:


Furthermore, I made some structural Disnix improvements as well:


Side projects



In addition to all the major themes above, there are also many in between projects and blog posts about all kinds of random subjects.

For example, one of my long-running side projects is the IFF file format experiments project (a container file format commonly used on the Commodore Amiga) that I already started in the middle of my PhD.

In addition to the viewer, I also developed a hacky Nix function to build software projects on AmigaOS, explained how to emulate the Amiga graphics modes, ported the project to SDL 2.0, and to Visual Studio so that it could run on Windows.

I also wrote many general Nix-related blog posts between major projects, such as:


I also covered developers' cultural aspects, such as my experiences with Agile software development and Scrum, and developer motivation.

Some thoughts


In this blog post, I have explained my motivation for starting my blog 10 years ago and covered all the major projects that I have been working on, including most of the blog posts that I have written.

If you are a PhD student or a more seasoned researcher, then I would definitely encourage you to start a blog -- it gave me the following benefits:

  • It makes your job much more interesting. All aspects of your research and teaching get attention, not just the papers, which typically only reflect a modest subset of your work.
  • It is a good and more accessible means to get frequent interaction with peers, practitioners, and outsiders who might have an interest in your work.
  • It improves your writing skills, which is also useful for writing papers.
  • It helps you structure your work, by working on focused goals one at a time. You can use some of these pieces as ingredients for a research paper and/or your PhD thesis.
  • It may attract more visitors than research papers.

About the last benefit: in academia, there are all kinds of metrics to measure the impact of a researcher, such as the G-index and H-index. These metrics are sometimes taken very seriously, for example, by organizations that decide whether you can get a research grant or not.

To give you a comparison: my most "popular" research paper titled: "Software deployment in a dynamic cloud: From device to service orientation in a hospital environment" was only downloaded (at the time of writing this blog post) 625 times from the ACM digital library and 240 times from IEEE Xplore. According to Google Scholar, it got 28 citations.

My most popular blog post (that I wrote as an ingredient for my PhD research) is: On Nix, NixOS and the Filesystem Hierarchy Standard (FHS), which attracted 5654 views -- roughly an order of magnitude more than my most popular research paper. In addition, I wrote several more research-related blog posts that got a comparable number of views, such as the blog post about my PhD thesis propositions.

After completing my PhD research, I wrote blog posts that attracted even several orders of magnitude more visitors than the two blog posts mentioned above.

(As a sidenote: I personally am not a big believer in the relevance of these numbers. What matters to me is the quality of my work, not quantity).

Regularly writing for yourself as part of your job is not unique to me. For example, the famous computer scientist Edsger Dijkstra wrote more than 1300 manuscripts (called EWDs) about topics that he considered important, without publication in mind.

In EWD 1000, he says:

If there is one "scientific" discovery I am proud of, it is the discovery of the habit of writing without publication in mind. I experience it as a liberating habit: without it, doing the work becomes one thing and writing it down becomes another one, which is often viewed as an unpleasant burden. When working and writing have merged, that burden has been taken away.

If you feel hesitant to start your blog, he says the following about a writer's block:

I only freed myself from that writer's block by the conscious decision not to write for a target audience but to write primarily to please myself.

For software engineering practitioners (which I effectively became after leaving academia) a blog has benefits too:

  • I consider good writing skills important for practitioners as well, for example to write specifications, API documentation, other technical documentation and end-user documentation. A blog helps you develop them.
  • Structuring your thoughts and work is also useful for software development projects, in particular free and open source projects.
  • It is also a good instrument to get in touch with development and open source communities. In addition to the Nix community, I also got a bit of attention in the Titanium community (with my XMPP library porting project), the JavaScript community (for example, with my blog post about prototypes) and more recently: the InfluxData community (with my monitoring playground project).

Concluding remarks


In this blog post, I covered most of my blog posts written in the last decade, but I did not elaborate much about 2020. Since 2020 is a year that will definitely not go unnoticed in the history books, I will (as an exception) write an annual reflection over 2020 tomorrow.

Moreover, after browsing through all my blog posts since the beginning of my blog, I also realized that it is a bit hard to find relevant old information.

To alleviate that problem, I have reorganized/standardized all my labels so that you can more easily search on subjects. On my homepage, I have added an overview of all labels that I am currently using.

Annual blog reflection over 2020

In my previous blog post that I wrote yesterday, I celebrated my blog's 10th anniversary and did a reflection over the last decade. However, I did not elaborate much about 2020.

Because 2020 is a year for the history books, I have decided to also do an annual reflection over the last year (similar to my previous annual blog reflections).

A summary of blog posts written in 2020


Nearly all of the blog posts that I have written this year were in service of only two major goals: developing the Nix process management framework and implementing service container support in Disnix.

Both of them took a substantial amount of development effort. Much more than I initially anticipated.

Investigating process management


I already started working on this topic last year. In November 2019, I wrote a blog post about packaging sysvinit scripts and a Nix-based functional organization for configuring process instances, that could potentially also be applied to other process management solutions, such as systemd and supervisord.

After building my first version of the framework (which was already a substantial leap towards reaching my full objective), I thought it would not take me that much time to finish all the details that I originally planned. It turned out that I had heavily underestimated the complexity.

To test my framework, I needed a simple test program that could daemonize on its own, which was (and still is) a common practice for running services on Linux and many other UNIX-like operating systems.

I thought writing such a tool that daemonizes would be easy, but after some research, I discovered that it is actually quite complicated to do it properly. I wrote a blog post about my findings.
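
To give a rough idea of what is involved (a shell approximation under stated assumptions: a proper daemon performs these steps itself in code, setsid is the util-linux utility, and myprogram is a hypothetical service executable):


# Start a new session detached from the controlling terminal, run from a
# safe working directory, and disconnect standard input/output -- a crude
# approximation of the classic daemonization procedure.
$ setsid sh -c 'cd / && exec myprogram </dev/null >/dev/null 2>&1' &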

It took me roughly three months to finish the first implementation of the process manager-agnostic abstraction layer that makes it possible to write a high-level specification of a running process, that could universally target all kinds of process managers, such as sysvinit, systemd, supervisord and launchd.

After completing the abstraction layer, I also discovered that a sufficiently high-level deployment specification of running processes could also target other kinds of deployment solutions.

I have developed two additional backends for the Nix process management framework: one that uses Docker and another that uses Disnix. Neither solution technically qualifies as a process manager, but they can still be used as such by using only a limited set of their features.

To be able to develop the Docker backend, I needed to dive deep into the underlying concepts of Docker. I wrote a blog post about the relevant Docker deployment concepts, and also gave a presentation about it at Mendix.

While implementing more examples, I also realized that to more securely run long-running services, they typically need to run as unprivileged users. To get predictable results, these unprivileged users require stable user IDs and group IDs.

Several years ago, I had already worked on a port assigner tool that could assign unique TCP/UDP port numbers to services, so that multiple instances can co-exist.

I have extended the port assigner tool to assign arbitrary numeric IDs to solve this problem generically. It turned out that implementing this tool was much more difficult than expected -- the Dynamic Disnix toolset was originally developed under very high time pressure and had accumulated a substantial amount of technical debt.

In order to implement the numeric ID assigner tool, I needed to revise the model parsing libraries, which broke the implementations of some of the deployment planner algorithms.

To fix them, I was forced to study how these algorithms worked again. I wrote a blog post about the graph-based deployment planning algorithms and a new implementation that should be more maintainable. In retrospect, I wish I had done my homework better when I wrote the original implementation.

In September, I gave a talk about the Nix process management framework at NixCon 2020, that was held online.

I pretty much reached all the objectives that I initially set for the Nix process management framework, but there is still some leftover work to bring it to an acceptable usability level -- to make it easier to add new backends (somebody suggested s6 as an option), the transformation layer needs to be standardized.

Moreover, I still need to develop a test strategy for services so that you can be (reasonably) sure that they work with a variety of process managers and under a variety of conditions (e.g. unprivileged user deployments).

Exposing services as containers in Disnix


Disnix is a Nix-based distributed service deployment tool. Services can basically be any kind of deployment unit whose life-cycle can be managed by a companion tool called Dysnomia.

There is one practical problem though: in order to deploy a service-oriented system with Disnix, it typically requires the presence of already deployed containers (not to be confused with Linux containers), which are environments in which services are managed by another service.

Some examples of container providers and corresponding services are:

  • The MySQL DBMS (as a container) and multiple hosted MySQL databases (as services)
  • Apache Tomcat (as a container) and multiple hosted Java web applications (as services)
  • systemd (as a container) and multiple hosted systemd unit configuration files (as services)

Disnix deploys the services (as described above), but not the containers. These need to be deployed by other means first.

In the past, I have been working on solutions that manage the underlying infrastructure of services as well (I typically used to call this problem domain: infrastructure deployment). For example, NixOps can deploy a network of NixOS machines that also expose container services that can be used by Disnix. It is also possible to deploy the containers as services, in a separate deployment layer managed by Disnix.

When the Nix process management framework became more usable, I wanted to make the deployment of container providers also a more accessible feature. I heavily revised Disnix with a new feature that makes it possible to expose services as container providers, making it possible to deploy both the container services and application services from a single deployment model.

To make this feature work reliably, I was again forced to revise the model transformation pipeline. This time I concluded that the lack of references in the Nix expression language was an impediment.

Another nice feature of combining the Nix process management framework and Disnix is that you can more easily deploy a heterogeneous system locally.

I have released a new version of Disnix: version 0.10, that provides all these new features.

The Monitoring playground


Besides working on the two major topics shown above, the only other thing I did was a Mendix crafting project in which I developed a monitoring playground, allowing me to locally experiment with alerting scripts, including visualization and testing.

Some thoughts


From a blogging perspective, I am happy with what I have accomplished this year -- not only have I managed to reach my usual level of productivity again (last year was somewhat disappointing), I also managed to both develop a working Nix process management framework (making it possible to use all kinds of process managers) and to use Disnix to deploy both container and application services. Both of these features had been on my wish list for many years.

In the Nix community, having the ability to use other process managers than systemd is something we have been discussing since late 2014.

However, there are also two major things that kept me mentally occupied in the last year.

Open source work


Many blog posts are about the open source work I do. Some of my open source work is done as part of my day job as a software engineer -- sometimes we can write a useful new feature, make an extension to an existing project that may come in handy, or do something as part of an investigation/learning project.

However, the majority of my open source work is done in my spare time -- in many cases, my motivation is not as altruistic as people may think: typically I need something to solve my own problems, or there is some technical concept that I would like to explore. Nevertheless, I still do a substantial amount of work to help other people or for the "greater good".

Open source projects are typically quite satisfying to work on, but they also have negative aspects (typically, the negative aspects are negligible in the early stages of a project). Sadly, as projects get more popular and gain more exposure, the negativity attached to them also grows.

For example, although I got quite a few positive reactions on my work on the Nix process management framework (especially at NixCon 2020), I know that not everybody is or will be happy about it.

I have worked with people in the past who consider this kind of work a complete waste of time -- in their opinion, we already have Kubernetes, which has already solved all relevant service management problems (some people even think it is a solution to all problems).

I have to admit that, while Kubernetes can be used to solve similar kinds of problems (and what is not supported as a first-class feature can still be scripted in ad-hoc ways), there is still much to think about:

  • As explained in my blog post about Docker concepts, the Nix store typically supports much more efficient sharing of common dependencies between packages than layered Docker images, resulting in much lower disk space consumption and RAM consumption of processes that have common dependencies.
  • Docker containers support the deployment of so-called microservices because of a common denominator: processes. Almost all modern operating systems and programming languages have a notion of processes.

    As a consequence, lots of systems nowadays typically get constructed in such a way that they can be easily decomposed into processes (translating to container instances), imposing substantial overhead on each process instance (because these containers typically need to embed common sub services).

    Services can also be more efficiently deployed (in terms of storage and RAM usage) as units managed by a common runtime (e.g. multiple Java web applications managed by Apache Tomcat, or multiple PHP applications managed by the Apache HTTP server).

    The latter form of reuse is now slowly disappearing, because it does not fit nicely in a container model. In Disnix, this form of reuse is a first-class concept.
  • Microservices managed by Docker (somewhat) support technology diversity, because all major programming languages support the concept of processes.

    However, one particular kind of technology that you cannot choose is the operating system -- Docker/Kubernetes relies on non-standardized Linux-only concepts.

    I have also been thinking about the option to pick your operating system as well: if you need security, pick OpenBSD; if you want performance, pick Linux; and so on. The Nix process management framework allows you to also target process managers on operating systems other than Linux, such as BSD rc scripts and Apple's launchd.

I personally believe that these goals are still important, and that keeps me motivated to work on it.

Furthermore, I also believe that it is important to have multiple implementations of tools that solve the same or similar kinds of problems -- in the open source world, there are lots of "battles" between communities about which technology should be the standard for a certain problem.

My favourite example of such a battle is the system's process manager -- many Linux distributions nowadays have adopted systemd, but this is not without any controversy, such as in the Debian project.

It took them many years to come to the decision to adopt it, and there are still people who want to discuss "init system diversity". Likewise, there are people who find the systemd-adoption decision unacceptable and have forked Debian into Devuan, providing a Debian-based distribution without systemd.

With the Nix process management framework, the fact that systemd exists (and may not be everybody's first choice) is not a big deal -- you can actually switch to other solutions, if desired. A battle between service managers is not required. A sufficiently high-level specification of a well-understood problem allows you to target multiple solutions.

Another problem I face is that these two projects are not the only projects I have been working on or maintain. There are many other projects that I have worked on in the past.

Sadly, I am also a very bad multitasker. If problems are reported with my other projects and the fix is straightforward, or there is a straightforward pull request, then it is typically no big deal to respond.

However, I also learned that for some of the problems other people face, there is no quick fix. Sometimes I get pull requests that partially solve a problem, or in other cases fix a specific problem but break other features. These pull requests cannot always be directly accepted and also need a substantial amount of my time for reviewing.

For certain kinds of reported problems, I need to work on a fundamental revision that requires a substantial amount of development effort -- however, it is impossible to pick up such a task while working on another "major project".

Alternatively, I need to make the decision to abandon what I am currently working on and make the switch. However, this option also does not have my preference because I know it will significantly delay my original goal.

I have noticed that lots of people get dissatisfied and frustrated, including myself. Moreover, I also consider it a bad thing to feel pressured about the things I am working on in my spare time.

So what to do about it? Maybe I can write a separate blog post on this subject.

Anyway, I was not planning to abandon or stop anything. Eventually, I will pick up these other problems as well -- my strategy for now, is to do it when I am ready. People simply have to wait (so if you are reading this and waiting for something: yes, I will pick it up eventually, just be patient).

The COVID-19 crisis


The other subject that kept me (and pretty much everybody in the world) busy is the COVID-19 crisis.

I still remember the beginning of 2020 -- for me personally, it started out very well. I visited some friends that I had not seen in a long time, and then FOSDEM came: the Free and Open source Software Developers' European Meeting.

Already in January, I heard on the news about this new virus that was rapidly spreading in the Wuhan region. At that time, nobody in the Netherlands (or in Europe) was really worried yet. Even to questions such as: "what will happen when it reaches Europe?", people typically responded with: "ah yes, well influenza has an impact on people too, it will not be worse!".

A few weeks later, it started to spread to countries closer to Europe. The first problematic country I heard about was Iran, and a couple of weeks later it reached Italy. In Italy, it spread so rapidly that within only a few weeks, the intensive care capacity was completely drained, forcing medical personnel to choose who could be helped and who could not.

By then, it sounded pretty serious to me. Furthermore, I was already quite sure that it was only a matter of time before it would reach the Netherlands. And indeed, at the end of February, the first COVID-19 case was reported. Apparently, this person had contracted the virus in Italy.

Then the spreading went quickly -- every day, more and more COVID-19 cases were reported, and this number grew exponentially. Similar to other countries, we also slowly ran into capacity problems in hospitals (materials, equipment, personnel, intensive care capacity etc.). In particular, the intensive care capacity reached a very critical level. Fortunately, there were hospitals in Germany willing to help us out.

In March, a country-wide lockdown was announced -- basically all group activities were forbidden, schools and non-essential shops were closed, and everybody who was capable of working from home had to work from home. As a consequence, since March, I have been permanently working from home.

As with pretty much everybody in the world, COVID-19 had negative consequences for me as well. Fortunately, I do not have much to complain about -- I did not become unemployed, I did not get sick, and nobody in my direct surroundings ran into any serious problems.

The biggest implication of the COVID-19 pandemic for me concerns social contacts -- despite the lockdown, I still regularly meet up with family and frequent acquaintances, but I have barely met any new people. For example, at Mendix, I typically came in contact with all kinds of people in the company, especially those who do not work in the same team.

Moreover, I also learned that quite a few of my contacts got isolated because all group activities were canceled -- for example, I have not had any music rehearsals in a while, causing me not to see or speak to any of my friends there.

The same thing happened with conferences and meetups -- because most of them were canceled or turned into online events, it is very difficult to have good interactions with new people.

I also did not do any traveling -- my summer holiday was basically a staycation. Fortunately, in the summer, we managed to minimize the number of infections, making it possible to open up public places. I visited some tourist attractions in the Netherlands that are normally crowded with people from abroad. That by itself was quite interesting -- I normally tend to neglect national tourist sites.

Although the COVID-19 pandemic brought all kinds of negative things, there were also a couple of things that I consider a good thing:

  • At Mendix, we have an open office space that typically tends to be very crowded and noisy. It is not that I cannot work in such an environment, but I also realize that I do appreciate silence, especially for programming tasks that require concentration. At home, it is quiet, I have far fewer distractions, and I typically feel much less tired after a busy work day.
  • I also typically used to neglect home improvements a lot. The COVID-19 crisis helped me to finally prioritize some non-urgent home improvements tasks -- for example, on the attic, where my musical instruments are stored, I finally took the time to organize everything in such a way that I can rehearse conveniently.
  • Besides the fact that rehearsals and concerts were cancelled, I actually practiced a lot -- I even studied many advanced solo pieces that I have not looked at in years. Playing music became a standard activity between my programming tasks, to clear my mind. Normally, I would use this time to talk to people at the coffee machine in the office.
  • During busy times, I also tended to neglect housekeeping tasks a lot. I still remember (many years ago), when I had just moved into my first house, that doing the dishes was already a problem (I had no dishwasher at that time). When working from home, it is not a problem to keep everything tidy.
  • It is also much easier to maintain healthy daily habits. In the first lockdown (that was in spring), cycling/walking/running was a daily routine that I could maintain with ease.

In the Netherlands, we managed to overcome the first lockdown in just a matter of weeks by social distancing. Sadly, after the restrictions were relaxed, we got sloppy, and at the end of the summer the infection rate started to grow. We also ran into all kinds of problems mitigating the infections -- limited test capacity, people who got tired of all the countermeasures and stopped following the rules, illegal parties, etc.

For a couple of weeks now, we have been in our second lockdown, with a comparable level of strictness -- again, the schools and non-essential shops are closed, etc. The second lockdown feels a lot worse than the first -- now it is winter, people are no longer motivated (the number of people revolting in the Netherlands has grown substantially, including people spreading the news that everything is a hoax and/or one big scheme organized by left-wing politicians), and it is already taking much longer than the first.

Fortunately, there is a tiny light at the end of the tunnel. In Europe, one vaccine (the Pfizer vaccine) has been approved and more submissions are pending (with good results). By Monday, the authorities will start to vaccinate people in the Netherlands.

If we can keep the infection rates and the mutations under control (such as the mutation that appeared in England) then we will eventually build up the required group immunity to finally get the situation under control (this probably is going to take many more months, but at least it is a start).

Conclusion


This elaborate reflection blog post (that is considerably longer than all my previous yearly reflections combined) reflects on 2020: a year that will probably not go unnoticed in the history books.

I hope everybody remains in good health and stays motivated to do what is needed to get the virus under control.

Moreover, when the crisis is over, I also hope we can retain the positive things learned in this crisis, such as making it more of a habit to allow people to work (at least partially) from home. The open-source complaint in this blog post is just a minor inconvenience compared to the COVID-19 crisis and the impact that it has on many people in the world.

The final thing I would like to say is:


HAPPY NEW YEAR!!!!!!!!!!!!

Developing an s6-rc backend for the Nix process management framework

One of my major blog topics last year was my experimental Nix process management framework, that is still under heavy development.

As explained in many of my earlier blog posts, one of its major objectives is to facilitate high-level deployment specifications of running processes that can be translated to configurations for all kinds of process managers and deployment solutions.

The backends that I have implemented so far, were picked for the following reasons:

  • Multiple operating systems support. The most common process management solution was chosen for each operating system: on Linux, sysvinit (because this used to be the most common solution) and systemd (because it is used by many conventional Linux distributions today); bsdrc on FreeBSD; launchd on macOS; and cygrunsrv on Cygwin.
  • Supporting unprivileged user deployments. To supervise processes without requiring a service that runs on PID 1, that also works for unprivileged users, supervisord is very convenient because it was specifically designed for this purpose.
  • Docker was selected because it is a very popular solution for managing services, and process management is one of its sub responsibilities.
  • Universal process management. Disnix was selected because it can be used as a primitive process management solution that works on any operating system supported by the Nix package manager. Moreover, the Disnix services model is a super set of the processes model used by the process management framework.

Not long after writing my blog post about the process manager-agnostic abstraction layer, somebody opened an issue on GitHub with the suggestion to also support s6-rc. Although I was already aware that more process/service management solutions exist, s6-rc was a solution that I did not know about.

Recently, I have implemented the suggested s6-rc backend. Although deploying s6-rc services now works quite conveniently, getting to know s6-rc and its companion tools was somewhat challenging for me.

In this blog post, I will elaborate on my learning experiences and explain how the s6-rc backend was implemented.

The s6 tool suite


s6-rc is a software project published on skarnet.org and part of a bigger tool ecosystem. s6-rc is a companion tool of s6: skarnet.org's small & secure supervision software suite.

On Linux and many other UNIX-like systems, the initialization process (typically /sbin/init) is a highly critical program:

  • It is the first program loaded by the kernel and responsible for setting the remainder of the boot procedure in motion. This procedure is responsible for mounting additional file systems, loading device drivers, and starting essential system services, such as SSH and logging services.
  • The PID 1 process supervises all processes that were directly loaded by it, as well as indirect child processes that get orphaned -- when this happens they get automatically adopted by the process that runs as PID 1.

    As explained in an earlier blog post, traditional UNIX services that daemonize on their own, deliberately orphan themselves so that they remain running in the background.
  • When a child process terminates, the parent process must take notice or the terminated process will stay behind as a zombie process.

    Because the PID 1 process is the common ancestor of all other processes, it is required to automatically reap all relevant zombie processes that become a child of it.
  • The PID 1 process runs with root privileges and, as a result, has full access to the system. When the security of the PID 1 process gets compromised, the entire system is at risk.
  • If the PID 1 process crashes, the kernel panics, taking the entire system down with it.

There are many kinds of programs that you can use as a system's PID 1. For example, you can directly use a shell, such as bash, but it is far more common to use an init system, such as sysvinit or systemd.

According to the author of s6, an init system is made out of four parts:

  1. /sbin/init: the first userspace program that is run by the kernel at boot time (not counting an initramfs).
  2. pid 1: the program that will run as process 1 for most of the lifetime of the machine. This is not necessarily the same executable as /sbin/init, because /sbin/init can exec into something else.
  3. a process supervisor.
  4. a service manager.

In the s6 tool eco-system, most of these parts are implemented by separate tools:

  • The first userspace program: s6-linux-init takes care of the coordination of the initialization process. It does a variety of one-time boot things: for example, it traps the ctrl-alt-del keyboard combination, it starts the shutdown daemon (that is responsible for eventually shutting down the system), and runs the initial boot script (rc.init).

    (As a sidenote: this is almost true -- the /sbin/init process is a wrapper script that "execs" into s6-linux-init with the appropriate parameters).
  • When the initialization is done, s6-linux-init execs into a process called s6-svscan provided by the s6 toolset. s6-svscan's task is to supervise an entire process supervision tree, which I will explain later.
  • Starting and stopping services is done by a separate service manager started from the rc.init script. s6-rc is the most prominent option (that we will use in this blog post), but also other tools can be used.

Many conventional init systems implement most (or sometimes all) of these aspects in a single executable.

In particular, the s6 author is highly critical of systemd: the init system that is widely used by many conventional Linux distributions today -- he dedicated an entire page with criticisms about it.

The author of s6 advocates a number of design principles for his tool eco-system (that systemd violates in many ways):

  • The Unix philosophy: do one job and do it well.
  • Doing less instead of more (preventing feature creep).
  • Keeping tight quality control over every tool by opening up repository access to small teams only (or rather, a single person).
  • Integration support: he is against the bazaar approach on project level, but in favor of the bazaar approach on an eco-system level in which everybody can write their own tools that integrate with existing tools.

The concepts implemented by the s6 tool suite were not completely "invented" from scratch. daemontools is what the author considers the ancestor of s6 (if you look at the web page, you will notice that the concept of a "supervision tree" was pioneered there, and that some of the tools listed resemble the same tools in the s6 tool suite), and runit is its cousin (also heavily inspired by daemontools).

A basic usage scenario of s6 and s6-rc


Although it is possible to use Linux distributions in which the init system, supervisor and service manager are all provided by skarnet tools, a subset of s6 and s6-rc can also be used on any Linux distribution and on other supported operating systems, such as the BSDs.

Root privileges are not required to experiment with these tools.

For example, with the following command we can use the Nix package manager to deploy the s6 supervision toolset in a development shell session:


$ nix-shell -p s6

In this development shell session, we can start the s6-svscan service as follows:


$ mkdir -p $HOME/var/run/service
$ s6-svscan $HOME/var/run/service

s6-svscan is a service that supervises an entire process supervision tree, including processes that may accidentally become its children, such as orphaned processes.

The directory parameter is a scan directory that maintains the configurations of the processes that are currently supervised. So far, no supervised processes have been deployed yet.
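
Note that s6-svscan runs in the foreground. For experimentation purposes, it may be convenient to start it in a separate terminal, or to run it in the background of the current shell (a quick sketch):


$ s6-svscan $HOME/var/run/service &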

We can actually deploy services by using the s6-rc toolset.

For example, I can easily configure my trivial example system used in previous blog posts, which consists of one or more web application processes (with an embedded HTTP server) returning static HTML pages, and an Nginx reverse proxy that forwards requests to one of the web application processes based on the appropriate virtual host header.

Contrary to the other process management solutions that I have investigated earlier, s6-rc does not have an elaborate configuration language. It does not implement a parser (for very good reasons, as explained by the author: parsers introduce extra complexity and bugs).

Instead, you have to create directories with text files, in which each file represents a configuration property.

With the following command, I can spawn a development shell with all the required utilities to work with s6-rc:


$ nix-shell -p s6 s6-rc execline

The following shell commands create an s6-rc service configuration directory and a configuration for a single webapp process instance:


$ mkdir -p sv/webapp
$ cd sv/webapp

$ echo "longrun" > type

$ cat > run <<EOF
#!$(type -p execlineb) -P

envfile $HOME/envfile
exec $HOME/webapp/bin/webapp
EOF

The above shell script creates a configuration directory for a service named: webapp with the following properties:

  • It creates a service with type: longrun. A longrun service deploys a process that runs in the foreground, which will get supervised by s6.
  • The run file refers to an executable that starts the service. For s6-rc services it is common practice to implement wrapper scripts using execline: a non-interactive scripting language.

    The execline script shown above loads an environment variable config file with the following content: PORT=5000. This environment variable configures the TCP port number to which the service should bind. The script then "execs" into a new process that runs the webapp.

    (As a sidenote: although it is a common habit to use execline for writing wrapper scripts, this is not a hard requirement -- any executable implemented in any language can be used. For example, we could also write the above run wrapper script as a bash script, as sketched below).
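
For illustration, a bash variant of the above run script could look roughly as follows (a sketch, assuming the same envfile containing: PORT=5000):


#!/bin/bash -e

# Export all variables defined in the environment file
set -a
source $HOME/envfile
set +a

# Replace the shell with the webapp process, so that s6 supervises it directly
exec $HOME/webapp/bin/webapp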

We can also configure the Nginx reverse proxy service in a similar way:


$ mkdir -p ../nginx
$ cd ../nginx

$ echo "longrun"> type
$ echo "webapp"> dependencies

$ cat > run <<EOF
#!$(type -p execlineb) -P

foreground { mkdir -p $HOME/var/nginx/logs $HOME/var/cache/nginx }
exec $(type -p nginx) -p "$HOME/var/nginx" -c "$HOME/nginx/nginx.conf" -g "daemon off;"
EOF

The above shell script creates a configuration directory for a service named: nginx with the following properties:

  • It again creates a service of type: longrun because Nginx should be started as a foreground process.
  • It declares the webapp service (that we have configured earlier) a dependency, ensuring that webapp is started before nginx. This dependency relationship is important to prevent Nginx from forwarding requests to a service that does not exist yet.
  • The run script first creates all mandatory state directories and finally execs into the Nginx process, with a configuration file using the above state directories, and turning off daemon mode so that it runs in the foreground.

In addition to configuring the above services, we also want to deploy the system as a whole. This can be done by creating bundles that encapsulate collections of services:


$ mkdir -p ../default
$ cd ../default

$ echo "bundle" > type

$ cat > contents <<EOF
webapp
nginx
EOF

The above shell instructions create a bundle named: default referring to both the webapp and nginx reverse proxy service that we have configured earlier.

Our s6-rc configuration directory structure looks as follows:


$ find ./sv
./sv
./sv/default
./sv/default/contents
./sv/default/type
./sv/nginx
./sv/nginx/dependencies
./sv/nginx/run
./sv/nginx/type
./sv/webapp
./sv/webapp/run
./sv/webapp/type

If we want to deploy the service directory structure shown above, we first need to compile it into a configuration database. This can be done with the following command:


$ mkdir -p $HOME/etc/s6/rc
$ s6-rc-compile $HOME/etc/s6/rc/compiled-1 $HOME/sv

The above command compiles the service configuration directory: $HOME/sv into a configuration database stored in: $HOME/etc/s6/rc/compiled-1.

With the following command we can initialize the s6-rc system with our compiled configuration database:


$ s6-rc-init -c $HOME/etc/s6/rc/compiled-1 -l $HOME/var/run/s6-rc \
$HOME/var/run/service

The above command generates a "live directory" in: $HOME/var/run/s6-rc containing the state of s6-rc.

With the following command, we can start all services in the: default bundle:


$ s6-rc -l $HOME/var/run/s6-rc -u change default

The above command deploys a running system with the following process tree:


As can be seen in the diagram above, the entire process tree is supervised by s6-svscan (the program that we have started first). Every longrun service deployed by s6-rc is supervised by a process named: s6-supervise.
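
A rough textual sketch of this supervision tree (based on the description above, with only the two services from the default bundle deployed) looks as follows:


s6-svscan
├── s6-supervise (webapp)
│   └── webapp
└── s6-supervise (nginx)
    └── nginx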

Managing service logging


Another important property of s6 and s6-rc is the way it handles logging. By default, all output that the supervised processes produce on the standard output and standard error is captured by s6-svscan and written to a single log stream (in our case, it will be redirected to the terminal).

When it is desired to capture the output of a service into its own dedicated log file, you need to configure the service in such a way that it writes all relevant information to a pipe. A companion logging service is required to capture the data that is sent over the pipe.

The following command-line instructions modify the webapp service (that we have created earlier) to let it send its output to another service:


$ cd sv
$ mv webapp webapp-srv
$ cd webapp-srv

$ echo "webapp-log"> producer-for
$ cat > run <<EOF
#!$(type -p execlineb) -P

envfile $HOME/envfile
fdmove -c 2 1
exec $HOME/webapp/bin/webapp
EOF

In the script above, we have changed the webapp service configuration as follows:

  • We rename the service from: webapp to webapp-srv. Using such a suffix is a commonly used convention for s6-rc services that have a logging companion service.
  • With the producer-for property, we specify that the webapp-srv is a service that produces output for another service named: webapp-log. We will configure this service later.
  • We create a new run script that adds the following command: fdmove -c 2 1.

    The purpose of this added instruction is to redirect all output that is sent over the standard error (file descriptor: 2) to the standard output (file descriptor: 1). This redirection makes it possible for the log companion service to capture all data.

We can configure the log companion service: webapp-log with the following command-line instructions:


$ mkdir ../webapp-log
$ cd ../webapp-log

$ echo "longrun"> type
$ echo "webapp-srv"> consumer-for
$ echo "webapp"> pipeline-name
$ echo 3 > notification-fd

$ cat > run <<EOF
#!$(type -p execlineb) -P

foreground { mkdir -p $HOME/var/log/s6-log/webapp }
exec -c s6-log -d3 $HOME/var/log/s6-log/webapp
EOF

The service configuration created above does the following:

  • We create a service named: webapp-log that is a long running service.
  • We declare the service to be a consumer for the webapp-srv (earlier, we have already declared the companion service: webapp-srv to be a producer for this logging service).
  • We configure a pipeline name: webapp, causing s6-rc to automatically generate a bundle named: webapp that contains all involved services.

    This generated bundle allows us to always manage the service and logging companion as a single deployment unit.
  • The s6-log service supports readiness notifications. File descriptor: 3 is configured as the file descriptor over which s6-log sends its readiness notification.
  • The run script creates the log directory in which the output should be stored and starts the s6-log service to capture the output and store the data in the corresponding log directory.

    The -d3 parameter instructs it to send a readiness notification over file descriptor 3.

After modifying the configuration files in such a way that each longrun service has a logging companion, we need to compile a new database that provides s6-rc our new configuration:


$ s6-rc-compile $HOME/etc/s6/rc/compiled-2 $HOME/sv

The above command creates a database with a new filename in: $HOME/etc/s6/rc/compiled-2. We are required to give it a new name -- the old configuration database (compiled-1) must be retained to make the upgrade process work.

With the following command, we can upgrade our running configuration:


$ s6-rc-update -l $HOME/var/run/s6-rc $HOME/etc/s6/rc/compiled-2

The result is the following process supervision tree:


As you may observe by looking at the diagram above, every service has a companion s6-log service that is responsible for capturing and storing its output.

The log files of the services can be found in $HOME/var/log/s6-log/webapp and $HOME/var/log/s6-log/nginx.
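
s6-log stores the most recent log data in a file named: current in each log directory. For example, we can follow the webapp's log output as follows:


$ tail -f $HOME/var/log/s6-log/webapp/current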

One shot services


In addition to longrun services, that are useful for managing system services, a boot process typically requires more aspects to be automated, such as mounting file systems.

These kinds of tasks can be automated with oneshot services, that execute an up script on startup, and optionally, a down script on shutdown.

The following service configuration can be used to mount the kernel's /proc filesystem:


$ mkdir -p ../mount-proc
$ cd ../mount-proc

$ echo "oneshot" > type

$ cat > up <<EOF
#!$(type -p execlineb) -P
foreground { mount -t proc proc /proc }
EOF

Chain loading


The execline scripts shown in this blog post resemble shell scripts in many ways. One particular aspect that sets execline scripts apart from shell scripts is that all commands make intensive use of a concept called chain loading.

Every instruction in an execline script executes a task, may imperatively modify the environment (e.g. by changing environment variables, or changing the current working directory etc.) and then "execs" into a new chain loading task.

The last parameter of each command-line instruction refers to the command-line instruction that it needs to "exec" into -- typically, this command-line instruction is put on the next line.

The execline package, as well as many other packages in the s6 ecosystem, contains many programs that support chain loading.

It is also possible to implement custom chain loaders that follow the same protocol.
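
To demonstrate the idea, consider the following execline script (a sketch; the shebang path is illustrative -- in the heredocs shown earlier, it was computed with: type -p execlineb):


#!/usr/bin/execlineb -P

# cd changes the current working directory, then "execs" into the next command
cd /tmp

# export adds a variable to the environment, then "execs" into the next command
export MESSAGE hello

# importas substitutes the value of the MESSAGE environment variable and
# "execs" into echo: the final program in the chain
importas MESSAGE MESSAGE
echo $MESSAGE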

Developing s6-rc function abstractions for the Nix process management framework


In the Nix process management framework, I have added function abstractions for each s6-rc service type: longrun, oneshot and bundle.

For example, with the following Nix expression we can generate an s6-rc longrun configuration for the webapp process:


{createLongRunService, writeTextFile, execline, webapp}:

let
  envFile = writeTextFile {
    name = "envfile";
    text = ''
      PORT=5000
    '';
  };
in
createLongRunService {
  name = "webapp";
  run = writeTextFile {
    name = "run";
    executable = true;
    text = ''
      #!${execline}/bin/execlineb -P

      envfile ${envFile}
      fdmove -c 2 1
      exec ${webapp}/bin/webapp
    '';
  };
  autoGenerateLogService = true;
}

Evaluating the Nix expression above does the following:

  • It generates a service directory that corresponds to the: name parameter, with a longrun type property file.
  • It generates a run execline script, that uses a generated envFile for configuring the service's port number, redirects the standard error to the standard output and starts the webapp process (that runs in the foreground).
  • The autoGenerateLogService parameter is a concept I introduced myself, to conveniently configure a companion log service, because this is a very common operation -- I cannot think of any scenario in which you do not want to have a dedicated log file for a long running service.

    Enabling this option causes the service to automatically become a producer for the log companion service (having the same name with a -log suffix) and automatically configures a logging companion service that consumes from it.

In addition to constructing long run services from Nix expressions, there are also abstraction functions to create one shots: createOneShotService and bundles: createServiceBundle.

The function that generates a log companion service can also be directly invoked with: createLogServiceForLongRunService, if desired.
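
For example, a oneshot variant of the mount-proc service shown earlier might be written as follows (a sketch: the up parameter name is an assumption, analogous to the run parameter of createLongRunService):


{createOneShotService, writeTextFile, execline}:

createOneShotService {
  name = "mount-proc";
  # The up script is executed once on startup (parameter name is an assumption)
  up = writeTextFile {
    name = "up";
    executable = true;
    text = ''
      #!${execline}/bin/execlineb -P
      foreground { mount -t proc proc /proc }
    '';
  };
}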

Generating an s6-rc service configuration from a process manager-agnostic configuration


The following Nix expression is a process manager-agnostic configuration for the webapp service, that can be translated to a configuration for any supported process manager in the Nix process management framework:


{createManagedProcess, tmpDir}:
{port, instanceSuffix ? "", instanceName ? "webapp${instanceSuffix}"}:

let
webapp = import ../../webapp;
in
createManagedProcess {
name = instanceName;
description = "Simple web application";
inherit instanceName;

process = "${webapp}/bin/webapp";
daemonArgs = [ "-D" ];

environment = {
PORT = port;
};

overrides = {
sysvinit = {
runlevels = [ 3 4 5 ];
};
};
}

The Nix expression above specifies the following high-level configuration concepts:

  • The name and description attributes are just meta data. The description property is ignored by the s6-rc generator, because s6-rc has no equivalent configuration property for capturing a description.
  • A process manager-agnostic configuration can specify how the service can be started, both as a foreground process and as a process that daemonizes itself.

    In the above example, the process attribute specifies that the same executable needs to be invoked for both a foregroundProcess and a daemon. The daemonArgs parameter specifies the command-line arguments that need to be propagated to the executable to let it daemonize itself.

    s6-rc has a preference for managing foreground processes, because these can be more reliably managed. When a foregroundProcess executable can be inferred, the generator will automatically compose a longrun service making it possible for s6 to supervise it.

    If only a daemon can be inferred, the generator will compose a oneshot service that starts the daemon with the up script, and on shutdown, terminates the daemon by dereferencing the PID file in the down script.
  • The environment attribute set parameter is automatically translated to an envfile that the generated run script consumes.
  • Similar to the sysvinit backend, it is also possible to override the generated arguments for the s6-rc backend, if desired.

As already explained in the blog post that covers the framework's concepts, the Nix expression above needs to be complemented with a constructors expression that composes the common parameters of every process configuration and a processes model that constructs process instances that need to be deployed.

The following processes model can be used to deploy a webapp process and an nginx reverse proxy instance that connects to it:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };
  };

  nginx = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};
  };
}

With the following command-line instruction, we can automatically create a scan directory and start s6-svscan:


$ nixproc-s6-svscan --state-dir $HOME/var

The --state-dir parameter causes the scan directory to be created in the user's home directory, making unprivileged deployments possible.

With the following command, we can deploy the entire system, that will get supervised by the s6-svscan service that we just started:


$ nixproc-s6-rc-switch --state-dir $HOME/var \
--force-disable-user-change processes.nix

The --force-disable-user-change parameter prevents the deployment system from creating users and groups and changing user privileges, allowing the deployment as an unprivileged user to succeed.

The result is a running system that allows us to connect to the webapp service via the Nginx reverse proxy:


$ curl -H 'Host: webapp.local' http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5000
</body>
</html>

Constructing multi-process Docker images supervised by s6


Another feature of the Nix process management framework is constructing multi-process Docker images in which multiple process instances are supervised by a process manager of choice.

s6 can also be used as a supervisor in a container. To accomplish this, we can use s6-linux-init as an entry point.

The following attribute generates a skeleton configuration directory:


let
  skelDir = pkgs.stdenv.mkDerivation {
    name = "s6-skel-dir";
    buildCommand = ''
      mkdir -p $out
      cd $out

      cat > rc.init <<EOF
      #! ${pkgs.stdenv.shell} -e
      rl="\$1"
      shift

      # Stage 1
      s6-rc-init -c /etc/s6/rc/compiled /run/service

      # Stage 2
      s6-rc -v2 -up change default
      EOF

      chmod 755 rc.init

      cat > rc.shutdown <<EOF
      #! ${pkgs.stdenv.shell} -e

      exec s6-rc -v2 -bDa change
      EOF

      chmod 755 rc.shutdown

      cat > rc.shutdown.final <<EOF
      #! ${pkgs.stdenv.shell} -e
      # Empty
      EOF
      chmod 755 rc.shutdown.final
    '';
  };

The skeleton directory generated by the above sub expression contains three configuration files:

  • rc.init is the script that the init system starts, right after starting the supervisor: s6-svscan. It is responsible for initializing the s6-rc system and starting all services in the default bundle.
  • The rc.shutdown script is executed on shutdown and instructs s6-rc to stop all previously started services.
  • rc.shutdown.final runs at the very end of the shutdown procedure, after all processes have been killed and all file systems have been unmounted. In the above expression, it does nothing.

In the initialization process of the image (the runAsRoot parameter of dockerTools.buildImage), we need to execute a number of dynamic initialization steps.

First, we must initialize s6-linux-init to read its configuration files from /etc/s6/current using the skeleton directory (that we have configured in the sub expression shown earlier) as its initial contents (the -f parameter) and run the init system in container mode (the -C parameter):


mkdir -p /etc/s6
s6-linux-init-maker -c /etc/s6/current -p /bin -m 0022 -f ${skelDir} -N -C -B /etc/s6/current
mv /etc/s6/current/bin/* /bin
rmdir /etc/s6/current/bin

s6-linux-init-maker generates a /bin/init script that we can use as the container's entry point.

I want the logging services to run as an unprivileged user (s6-log), which requires me to create the user and corresponding group first:


groupadd -g 2 s6-log
useradd -u 2 -d /dev/null -g s6-log s6-log

We must also compile a database from the s6-rc configuration files, by running the following command-line instructions:


mkdir -p /etc/s6/rc
s6-rc-compile /etc/s6/rc/compiled ${profile}/etc/s6/sv

As can be seen in the rc.init script that we have generated earlier, the compiled database: /etc/s6/rc/compiled is propagated to s6-rc-init as a command-line parameter.

With the following Nix expression, we can build an s6-rc managed multi-process Docker image that deploys all the process instances in the processes model that we have written earlier:


let
  pkgs = import <nixpkgs> {};

  createMultiProcessImage = import ../../nixproc/create-multi-process-image/create-multi-process-image-universal.nix {
    inherit pkgs system;
    inherit (pkgs) dockerTools stdenv;
  };
in
createMultiProcessImage {
  name = "multiprocess";
  tag = "test";
  exprFile = ./processes.nix;
  stateDir = "/var";
  processManager = "s6-rc";
}

With the following command, we can build the image:


$ nix-build

and load the image into Docker with the following command:


$ docker load -i result

Discussion


With the addition of the s6-rc backend in the Nix process management framework, we have a modern alternative to systemd at our disposal.

We can easily let services be managed by s6-rc using the same agnostic high-level deployment configurations that can also be used to target other process management backends, including systemd.

What I particularly like about the s6 tool ecosystem (and this also applies to some extent to its ancestor: daemontools and its cousin project: runit) is the idea of constructing the entire system's initialization process and its sub concerns (process supervision, logging and service management) from separate tools, each having a clear, fixed scope.

This kind of design reminds me of microkernels -- in a microkernel design, the kernel is basically split into multiple collaborating processes each having their own responsibilities (e.g. file systems, drivers).

The microkernel is the only process that has full access to the system and typically only has very few responsibilities (e.g. memory management, task scheduling, interrupt handling).

When a process crashes, such as a driver, this failure should not tear the entire system down. Systems can even recover from problems, by restarting crashed processes.

Furthermore, these non-kernel processes typically have very few privileges. If a process' security gets compromised (such as a leaky driver), the system as a whole will not be affected.

Aside from a number of functional differences compared to systemd, there are also some non-functional differences.

systemd can only be used on Linux with glibc as the system's libc, whereas s6 can also be used on different operating systems (e.g. the BSDs) with different libc implementations, such as musl.

Moreover, the supervisor service (s6-svscan) can also be used as a user-level supervisor that does not need to run as PID 1. Although systemd supports user sessions (allowing service deployments by unprivileged users), it still requires systemd to be the init system running as the system's PID 1.

Improvement suggestions


Although the s6 ecosystem provides useful tools and has all kinds of powerful features, I also have a number of improvement suggestions. They are mostly usability related:

  • I have noticed that the command-line tools have very brief help pages -- they only enumerate the available options, but they do not provide any additional information explaining what these options do.

    I have also noticed that there are no official manpages, but there is a third-party initiative that seems to provide them.

    The "official" source of reference are the HTML pages. For me personally, it is not always convenient to access HTML pages on limited machines with no Internet connection and/or only terminal access.
  • Although each individual tool is well documented (albeit in HTML), I had quite a few difficulties figuring out how to use them together -- because every tool has a very specific purpose, you typically need to combine them in interesting ways to do something meaningful.

    For example, I could not find any clear documentation on skarnet describing typical combined usage scenarios, such as how to use s6-rc on a conventional Linux distribution that already has a different service management solution.

    Fortunately, I discovered a Linux distribution that turned out to be immensely helpful: Artix Linux. Artix Linux provides s6 as one of its supported process management solutions. I ended up installing Artix Linux in a virtual machine and reading their documentation.

    This lack of clarity seems somewhat analogous to common criticisms of microkernels: one of Linus Torvalds' criticisms is that in a microkernel design, the individual pieces are simpler, but coordinating the system as a whole is more difficult.
  • Updating existing service configurations is difficult and cumbersome. Each time I want to change something (e.g. add a new service), I need to compile a new database, make sure that the newly compiled database co-exists with the previous database, and then run s6-rc-update.

    It is very easy to make mistakes. For example, I ended up overwriting the previous database several times. When this happens, the upgrade process gets stuck.

    systemd, on the other hand, allows you to put a new service configuration file in the configuration directory, such as: /etc/systemd/system. We can conveniently reload the configuration with a single command-line instruction:


    $ systemctl daemon-reload
    I believe that the updating process can still be somewhat simplified in s6-rc. Fortunately, I have managed to hide that complexity in the nixproc-s6-rc-deploy tool.
  • It was also difficult to find out all the available configuration properties for s6-rc services -- I ended up looking at the examples and studying the documentation pages for s6-rc-compile, s6-supervise and service directories.

    I think that it could be very helpful to write a dedicated documentation page that describes all configurable properties of s6-rc services.
  • I believe it is also very common that for each longrun service (with a -srv suffix), you want a companion logging service (with a -log suffix).

    As a matter of fact, I can hardly think of a situation in which you do not want this. Maybe it helps to introduce a convenience property to automatically facilitate the generation of log companion services.

Availability


The s6-rc backend described in this blog post is part of the current development version of the Nix process management framework, that is still under heavy development.

The framework can be obtained from my GitHub page.

Deploying mutable multi-process Docker containers with the Nix process management framework (or running Hydra in a Docker container)

In a blog post written several months ago, I have shown that the Nix process management framework can also be used to conveniently construct multi-process Docker images.

Although Docker is primarily used for managing single root application process containers, multi-process containers can sometimes be useful to deploy systems that consist of multiple, tightly coupled, processes.

The Docker manual has a section that describes how to construct images for multi-process containers, but IMO the configuration process is a bit tedious and cumbersome.

To make this process more convenient, I have built a wrapper function: createMultiProcessImage around the dockerTools.buildImage function (provided by Nixpkgs) that does the following:

  • It constructs an image that runs a Linux and Docker compatible process manager as an entry point. Currently, it supports supervisord, sysvinit, disnix and s6-rc.
  • The Nix process management framework is used to build a configuration for a system that consists of multiple processes, that will be managed by any of the supported process managers.

Although the framework makes the construction of multi-process images convenient, a big drawback of multi-process Docker containers is that upgrading them is inconvenient -- for example, in a Debian-based container you can imperatively upgrade packages by connecting to the container:


$ docker exec -it mycontainer /bin/bash

and upgrade the desired packages, such as file:


$ apt install file

The upgrade instruction above is not reproducible -- apt may install file version 5.38 today, and 5.39 tomorrow.

To cope with these kinds of side-effects, Docker works with images that snapshot the outcomes of all the installation steps. Constructing a container from the same image will always provide the same versions of all dependencies.

As a consequence, to perform a reproducible container upgrade, it is required to construct a new image, discard the container and reconstruct the container from the new image version, causing the system as a whole to be terminated, including the processes that have not changed.

For a while, I have been thinking about this limitation and developed a solution that makes it possible to upgrade multi-process containers without stopping and discarding them. The only exception is the process manager.

To make deployments reproducible, it combines the reproducibility properties of Docker and Nix.

In this blog post, I will describe how this solution works and how it can be used.

Creating a function for building mutable Docker images


As explained in an earlier blog post, that compares the deployment properties of Nix and Docker, both solutions support reproducible deployment, albeit for different application domains.

Moreover, their reproducibility properties are built around different concepts:

  • Docker containers are reproducible, because they are constructed from images that consist of immutable layers identified by hash codes derived from their contents.
  • Nix package builds are reproducible, because they are stored in isolation in a Nix store and made immutable (the files' permissions are set read-only). In the construction process of the packages, many side effects are mitigated.

    As a result, when the hash code prefix of a package (derived from all build inputs) is the same, then the build output is also (nearly) bit-identical, regardless of the machine on which the package was built.

By taking these reproducibility properties into account, we can create a reproducible deployment process for upgradable containers by using a specific separation of responsibilities.

Deploying the base system


For the deployment of the base system that includes the process manager, we can stick to the traditional Docker deployment workflow based on images (the only unconventional aspect is that we use Nix to build a Docker image, instead of Dockerfiles).

The process manager that the image provides deploys its configuration from a dynamic configuration directory.

To support supervisord, we can invoke the following command as the container's entry point:


supervisord --nodaemon \
--configuration /etc/supervisor/supervisord.conf \
--logfile /var/log/supervisord.log \
--pidfile /var/run/supervisord.pid

The above command starts the supervisord service (in foreground mode), using the supervisord.conf configuration file stored in /etc/supervisor.

The supervisord.conf configuration file has the following structure:


[supervisord]

[include]
files=conf.d/*

The above configuration automatically loads all program definitions stored in the conf.d directory. This directory is writable and initially empty. It can be populated with configuration files generated by the Nix process management framework.
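
For example, a program definition that ends up in the conf.d directory could look roughly as follows (a sketch; the Nix store path and exact set of properties are illustrative):


[program:webapp]
command=/nix/store/...-webapp/bin/webapp
environment=PORT="5000"
autorestart=true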

For the other process managers that the framework supports (sysvinit, disnix and s6-rc), we follow a similar strategy -- we configure the process manager in such a way that the configuration is loaded from a source that can be dynamically updated.

Deploying process instances


Deployment of the process instances is not done in the construction of the image, but by the Nix process management framework and the Nix package manager running in the container.

To allow a processes model deployment to refer to packages in the Nixpkgs collection and install binary substitutes, we must configure a Nix channel, such as the unstable Nixpkgs channel:


$ nix-channel --add https://nixos.org/channels/nixpkgs-unstable
$ nix-channel --update

(As a sidenote: it is also possible to subscribe to a stable Nixpkgs channel or a specific Git revision of Nixpkgs).
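
For example, subscribing to a stable channel could be done as follows (the channel version shown is just an example):


$ nix-channel --add https://nixos.org/channels/nixos-20.09 nixpkgs
$ nix-channel --update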

The processes model (and relevant sub models, such as ids.nix that contains numeric ID assignments) is copied into the Docker image.

We can deploy the processes model for supervisord as follows:


$ nixproc-supervisord-switch

The above command will deploy the processes model that the NIXPROC_PROCESSES environment variable refers to, which defaults to: /etc/nixproc/processes.nix:

  • First, it builds supervisord configuration files from the processes model (this step also includes deploying all required packages and service configuration files)
  • It creates symlinks for each configuration file belonging to a process instance in the writable conf.d directory
  • It instructs supervisord to reload the configuration so that only obsolete processes get deactivated and new services activated, causing unchanged processes to remain untouched.

(For the other process managers, we have equivalent tools: nixproc-sysvinit-switch, nixproc-disnix-switch and nixproc-s6-rc-switch).

Initial deployment of the system


Because only the process manager is deployed as part of the image (with an initially empty configuration), the system is not yet usable when we start a container.

To solve this problem, we must perform an initial deployment of the system on first startup.

I used the lessons learned from the chain loading techniques in s6 (described in the previous blog post) and developed a hacky generated bootstrap script (/bin/bootstrap) that serves as the container's entry point:


cat > /bin/bootstrap <<EOF
#! ${pkgs.stdenv.shell} -e

# Configure Nix channels
nix-channel --add ${channelURL}
nix-channel --update

# Deploy the processes model (in a child process)
nixproc-${input.processManager}-switch &

# Overwrite the bootstrap script, so that it simply just
# starts the process manager the next time we start the
# container
cat > /bin/bootstrap <<EOR
#! ${pkgs.stdenv.shell} -e
exec ${cmd}
EOR

# Chain load the actual process manager
exec ${cmd}
EOF
chmod 755 /bin/bootstrap

The generated bootstrap script does the following:

  • First, a Nix channel is configured and updated so that we can install packages from the Nixpkgs collection and obtain substitutes.
  • The next step is deploying the processes model by running the nixproc-*-switch tool for a supported process manager. This process is started in the background (as a child process) -- we can use this trick to force the managing bash shell to load our desired process supervisor as soon as possible.

    Ultimately, we want the process manager to become responsible for supervising any other process running in the container.
  • After the deployment process is started in the background, the bootstrap script is overridden by a bootstrap script that becomes our real entry point -- the process manager that we want to use, such as supervisord.

    Overriding the bootstrap script makes sure that the next time we start the container, it will start instantly without attempting to deploy the system again.
  • Finally, the bootstrap script "execs" into the real process manager, becoming the new PID 1 process. When the deployment of the system is done (the nixproc-*-switch process that still runs in the background), the process manager becomes responsible for reaping it.

With the above script, the workflow of deploying an upgradable/mutable multi-process container is the same as deploying an ordinary container from a Docker image -- the only (minor) difference is that the first time that we start the container, it may take some time before the services become available, because the multi-process system needs to be deployed by Nix and the Nix process management framework.

A simple usage scenario


Similar to my previous blog posts about the Nix process management framework, I will use the trivial web application system to demonstrate how the functionality of the framework can be used.

The web application system consists of one or more webapp processes (with an embedded HTTP server) that only return static HTML pages displaying their identities.

An Nginx reverse proxy forwards incoming requests to the appropriate webapp instance -- each webapp service can be reached by using its unique virtual host value.

To construct a mutable multi-process Docker image with Nix, we can write the following Nix expression (default.nix):


let
  pkgs = import <nixpkgs> {};

  nix-processmgmt = builtins.fetchGit {
    url = https://github.com/svanderburg/nix-processmgmt.git;
    ref = "master";
  };

  createMutableMultiProcessImage = import "${nix-processmgmt}/nixproc/create-image-from-steps/create-mutable-multi-process-image-universal.nix" {
    inherit pkgs;
  };
in
createMutableMultiProcessImage {
  name = "multiprocess";
  tag = "test";
  contents = [ pkgs.mc ];
  exprFile = ./processes.nix;
  idResourcesFile = ./idresources.nix;
  idsFile = ./ids.nix;
  processManager = "supervisord"; # sysvinit, disnix, s6-rc are also valid options
}

The above Nix expression invokes the createMutableMultiProcessImage function that constructs a Docker image that provides a base system with a process manager, and a bootstrap script that deploys the multi-process system:

  • The name, tag, and contents parameters specify the image name, tag and the packages that need to be included in the image.
  • The exprFile parameter refers to a processes model that captures the configurations of the process instances that need to be deployed.
  • The idResourcesFile parameter refers to an ID resources model that specifies from which resource pools unique IDs need to be selected.
  • The idsFile parameter refers to an IDs model that contains the unique ID assignments for each process instance. Unique IDs resemble TCP/UDP port assignments, user IDs (UIDs) and group IDs (GIDs).
  • We can use the processManager parameter to select the process manager we want to use. In the above example it is supervisord, but other options are also possible.

We can use the following processes model (processes.nix) to deploy a small version of our example system:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  nix-processmgmt = builtins.fetchGit {
    url = https://github.com/svanderburg/nix-processmgmt.git;
    ref = "master";
  };

  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  sharedConstructors = import "${nix-processmgmt}/examples/services-agnostic/constructors/constructors.nix" {
    inherit pkgs stateDir runtimeDir logDir cacheDir tmpDir forceDisableUserChange processManager ids;
  };

  constructors = import "${nix-processmgmt}/examples/webapps-agnostic/constructors/constructors.nix" {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
  };
in
rec {
  webapp = rec {
    port = ids.webappPorts.webapp or 0;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };

    requiresUniqueIdsFor = [ "webappPorts" "uids" "gids" ];
  };

  nginx = rec {
    port = ids.nginxPorts.nginx or 0;

    pkg = sharedConstructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};

    requiresUniqueIdsFor = [ "nginxPorts" "uids" "gids" ];
  };
}

The above Nix expression configures two process instances, one webapp process that returns a static HTML page with its identity and an Nginx reverse proxy that forwards connections to it.

A notable difference between the expression shown above and the processes models of the same system shown in my previous blog posts is that this expression does not contain any references to files on the local filesystem, with the exception of the ID assignments expression (ids.nix).

We obtain all required functionality from the Nix process management framework by invoking builtins.fetchGit. Eliminating local references is required to allow the processes model to be copied into the container and deployed from within the container.
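
To give an impression, the ids.nix expression for this system could look roughly as follows (a sketch; the actual file is generated by the framework and also contains the uids and gids assignments):


{
  ids = {
    webappPorts = {
      webapp = 5000;
    };
    nginxPorts = {
      nginx = 8080;
    };
  };
}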

We can build a Docker image as follows:


$ nix-build

load the image into Docker:


$ docker load -i result

and create and start a Docker container:


$ docker run -it --name webapps --network host multiprocess:test
unpacking channels...
warning: Nix search path entry '/nix/var/nix/profiles/per-user/root/channels' does not exist, ignoring
created 1 symlinks in user environment
2021-02-21 15:29:29,878 CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file. If you intend to run as root, you can set user=root in the config file to avoid this message.
2021-02-21 15:29:29,878 WARN No file matches via include "/etc/supervisor/conf.d/*"
2021-02-21 15:29:29,897 INFO RPC interface 'supervisor' initialized
2021-02-21 15:29:29,897 CRIT Server 'inet_http_server' running without any HTTP authentication checking
2021-02-21 15:29:29,898 INFO supervisord started with pid 1
these derivations will be built:
/nix/store/011g52sj25k5k04zx9zdszdxfv6wy1dw-credentials.drv
/nix/store/1i9g728k7lda0z3mn1d4bfw07v5gzkrv-credentials.drv
/nix/store/fs8fwfhalmgxf8y1c47d0zzq4f89fz0g-nginx.conf.drv
/nix/store/vxpm2m6444fcy9r2p06dmpw2zxlfw0v4-nginx-foregroundproxy.sh.drv
/nix/store/4v3lxnpapf5f8297gdjz6kdra8g7k4sc-nginx.conf.drv
/nix/store/mdldv8gwvcd5fkchncp90hmz3p9rcd99-builder.pl.drv
/nix/store/r7qjyr8vr3kh1lydrnzx6nwh62spksx5-nginx.drv
/nix/store/h69khss5dqvx4svsc39l363wilcf2jjm-webapp.drv
/nix/store/kcqbrhkc5gva3r8r0fnqjcfhcw4w5il5-webapp.conf.drv
/nix/store/xfc1zbr92pyisf8lw35qybbn0g4f46sc-webapp.drv
/nix/store/fjx5kndv24pia1yi2b7b2bznamfm8q0k-supervisord.d.drv
these paths will be fetched (78.80 MiB download, 347.06 MiB unpacked):
...

As may be noticed by looking at the output, on first startup the Nix process management framework is invoked to deploy the system with Nix.

After the system has been deployed, we should be able to connect to the webapp process via the Nginx reverse proxy:


$ curl -H 'Host: webapp.local' http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5000
</body>
</html>

When it is desired to upgrade the system, we can change the system's configuration by connecting to the container instance:


$ docker exec -it webapps /bin/bash

In the container, we can edit the processes.nix configuration file:


$ mcedit /etc/nixproc/processes.nix

and make changes to the configuration of the system. For example, we can change the processes model to include a second webapp process:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  nix-processmgmt = builtins.fetchGit {
    url = https://github.com/svanderburg/nix-processmgmt.git;
    ref = "master";
  };

  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  sharedConstructors = import "${nix-processmgmt}/examples/services-agnostic/constructors/constructors.nix" {
    inherit pkgs stateDir runtimeDir logDir cacheDir tmpDir forceDisableUserChange processManager ids;
  };

  constructors = import "${nix-processmgmt}/examples/webapps-agnostic/constructors/constructors.nix" {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
  };
in
rec {
  webapp = rec {
    port = ids.webappPorts.webapp or 0;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };

    requiresUniqueIdsFor = [ "webappPorts" "uids" "gids" ];
  };

  webapp2 = rec {
    port = ids.webappPorts.webapp2 or 0;
    dnsName = "webapp2.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };

    requiresUniqueIdsFor = [ "webappPorts" "uids" "gids" ];
  };

  nginx = rec {
    port = ids.nginxPorts.nginx or 0;

    pkg = sharedConstructors.nginxReverseProxyHostBased {
      webapps = [ webapp webapp2 ];
      inherit port;
    } {};

    requiresUniqueIdsFor = [ "nginxPorts" "uids" "gids" ];
  };
}

In the above processes model, a new process instance named: webapp2 was added, that listens on a unique port and can be reached with the webapp2.local virtual host value.

By running the following command, the system in the container gets upgraded:


$ nixproc-supervisord-switch

resulting in two webapp process instances running in the container:


$ supervisorctl
nginx RUNNING pid 847, uptime 0:00:08
webapp RUNNING pid 459, uptime 0:05:54
webapp2 RUNNING pid 846, uptime 0:00:08
supervisor>

The first instance: webapp was left untouched, because its configuration was not changed.

The second instance: webapp2 can be reached as follows:


$ curl -H 'Host: webapp2.local' http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5001
</body>
</html>

After upgrading the system, the new configuration should also get reactivated after a container restart.
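
We can verify this by restarting the container and repeating the curl request shown earlier:


$ docker restart webapps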

A more interesting example: Hydra


As explained earlier, to create upgradable containers we require a fully functional Nix installation in a container. This observation made me think about a more interesting example than the trivial web application system.

A prominent example of a system that requires Nix and is composed of multiple tightly integrated processes is Hydra: the Nix-based continuous integration service.

To make it possible to deploy a minimal Hydra service in a container, I have packaged all its relevant components for the Nix process management framework.

The processes model looks as follows:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  nix-processmgmt = builtins.fetchGit {
    url = https://github.com/svanderburg/nix-processmgmt.git;
    ref = "master";
  };

  nix-processmgmt-services = builtins.fetchGit {
    url = https://github.com/svanderburg/nix-processmgmt-services.git;
    ref = "master";
  };

  constructors = import "${nix-processmgmt-services}/services-agnostic/constructors.nix" {
    inherit nix-processmgmt pkgs stateDir runtimeDir logDir tmpDir cacheDir forceDisableUserChange processManager;
  };

  instanceSuffix = "";
  hydraUser = hydraInstanceName;
  hydraInstanceName = "hydra${instanceSuffix}";
  hydraQueueRunnerUser = "hydra-queue-runner${instanceSuffix}";
  hydraServerUser = "hydra-www${instanceSuffix}";
in
rec {
  nix-daemon = {
    pkg = constructors.nix-daemon;
  };

  postgresql = rec {
    port = 5432;
    postgresqlUsername = "postgresql";
    postgresqlPassword = "postgresql";
    socketFile = "${runtimeDir}/postgresql/.s.PGSQL.${toString port}";

    pkg = constructors.simplePostgresql {
      inherit port;
      authentication = ''
        # TYPE  DATABASE  USER  ADDRESS  METHOD
        local   hydra     all            ident map=hydra-users
      '';
      identMap = ''
        # MAPNAME     SYSTEM-USERNAME            PG-USERNAME
        hydra-users   ${hydraUser}               ${hydraUser}
        hydra-users   ${hydraQueueRunnerUser}    ${hydraUser}
        hydra-users   ${hydraServerUser}         ${hydraUser}
        hydra-users   root                       ${hydraUser}
        # The postgres user is used to create the pg_trgm extension for the hydra database
        hydra-users   postgresql                 postgresql
      '';
    };
  };

  hydra-server = rec {
    port = 3000;
    hydraDatabase = hydraInstanceName;
    hydraGroup = hydraInstanceName;
    baseDir = "${stateDir}/lib/${hydraInstanceName}";
    inherit hydraUser instanceSuffix;

    pkg = constructors.hydra-server {
      postgresqlDBMS = postgresql;
      user = hydraServerUser;
      inherit nix-daemon port instanceSuffix hydraInstanceName hydraDatabase hydraUser hydraGroup baseDir;
    };
  };

  hydra-evaluator = {
    pkg = constructors.hydra-evaluator {
      inherit nix-daemon hydra-server;
    };
  };

  hydra-queue-runner = {
    pkg = constructors.hydra-queue-runner {
      inherit nix-daemon hydra-server;
      user = hydraQueueRunnerUser;
    };
  };

  apache = {
    pkg = constructors.reverseProxyApache {
      dependency = hydra-server;
      serverAdmin = "admin@localhost";
    };
  };
}

In the above processes model, each process instance represents a component of a Hydra installation:

  • The nix-daemon process is a service that comes with the Nix package manager to facilitate multi-user package installations. The nix-daemon carries out builds on behalf of a user.

    Hydra requires it to perform builds as an unprivileged Hydra user and uses the Nix protocol to more efficiently orchestrate large builds.
  • Hydra uses a PostgreSQL database backend to store data about projects and builds.

    The postgresql process refers to the PostgreSQL database management system (DBMS) that is configured in such a way that the Hydra components are authorized to manage and modify the Hydra database.
  • hydra-server is the front-end of the Hydra service that provides a web user interface. The initialization procedure of this service is responsible for initializing the Hydra database.
  • The hydra-evaluator regularly updates the repository checkouts and evaluates the Nix expressions to decide which packages need to be built.
  • The hydra-queue-runner builds all jobs that were evaluated by the hydra-evaluator.
  • The apache server is used as a reverse proxy server forwarding requests to the hydra-server.

With the following commands, we can build the image, load it into Docker, and deploy a container that runs Hydra:


$ nix-build hydra-image.nix
$ docker load -i result
$ docker run -it --name hydra-test --network host hydra:test

After deploying the system, we can connect to the container:


$ docker exec -it hydra-test /bin/bash

and observe that all processes are running and managed by supervisord:


$ supervisorctl
apache RUNNING pid 1192, uptime 0:00:42
hydra-evaluator RUNNING pid 1297, uptime 0:00:38
hydra-queue-runner RUNNING pid 1296, uptime 0:00:38
hydra-server RUNNING pid 1188, uptime 0:00:42
nix-daemon RUNNING pid 1186, uptime 0:00:42
postgresql RUNNING pid 1187, uptime 0:00:42
supervisor>

With the following commands, we can create our initial admin user:


$ su - hydra
$ hydra-create-user sander --password secret --role admin
creating new user `sander'

We can connect to the Hydra front-end in a web browser by opening http://localhost (this works because the container uses host networking):


and configure a jobset to build a project, such as libprocreact:


Another nice bonus feature of having multiple process managers supported is that if we build Hydra's Nix process management configuration for Disnix, we can also visualize the deployment architecture of the system with disnix-visualize:


The above diagram displays the following properties:

  • The outer box indicates that we are deploying to a single machine: localhost
  • The inner box indicates that all components are managed as processes
  • The ovals correspond to process instances in the processes model and the arrows denote dependency relationships.

    For example, the apache reverse proxy has a dependency on hydra-server, meaning that the latter process instance should be deployed first, otherwise the reverse proxy is not able to forward requests to it.
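
The diagram above could, for example, be generated with an invocation like the following (a sketch, assuming that a deployment manifest file was generated in advance and that the dot tool from Graphviz is available):


$ disnix-visualize manifest.xml | dot -Tpng > architecture.png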

Building a Nix-enabled container image


As explained in the previous section, mutable Docker images require a fully functional Nix package manager in the container.

Since this may also be an interesting sub use case, I have created a convenience function: createNixImage that can be used to build an image whose only purpose is to provide a working Nix installation:


let
  pkgs = import <nixpkgs> {};

  nix-processmgmt = builtins.fetchGit {
    url = https://github.com/svanderburg/nix-processmgmt.git;
    ref = "master";
  };

  createNixImage = import "${nix-processmgmt}/nixproc/create-image-from-steps/create-nix-image.nix" {
    inherit pkgs;
  };
in
createNixImage {
  name = "foobar";
  tag = "test";
  contents = [ pkgs.mc ];
}

The above Nix expression builds a Docker image with a working Nix setup and a custom package: the Midnight Commander.

Conclusions


In this blog post, I have described a new function in the Nix process management framework: createMutableMultiProcessImage that creates reproducible mutable multi-process container images, by combining the reproducibility properties of Docker and Nix. With the exception of the process manager, process instances in a container can be upgraded without bringing the entire container down.

With this new functionality, the deployment workflow of a multi-process container configuration has become very similar to how physical and virtual machines are managed with NixOS -- you can edit a declarative specification of a system and run a single command-line instruction to deploy the new configuration.

Moreover, this new functionality allows us to deploy a complex, tightly coupled multi-process system, such as Hydra: the Nix-based continuous integration service. In the Hydra example case, we are using Nix for three deployment aspects: constructing the Docker image, deploying the multi-process system configuration and building the projects that are configured in Hydra.

A big drawback of mutable multi-process images is that no sharing is possible between multiple multi-process containers. Because the images are not built from common layers, the Nix store is private to each container, and all packages are deployed in the writable custom layer, which may lead to substantial disk and RAM overhead per container instance.

Deploying the processes model to a container instance can probably be made more convenient by using Nix flakes -- a new Nix feature that is still experimental. With flakes we can easily deploy an arbitrary number of Nix expressions to a container and pin the deployment to a specific version of Nixpkgs.

Another interesting observation is the word: mutable. I am not completely sure if it is appropriate -- both the layers of a Docker image, as well as the Nix store paths are immutable and never change after they have been built. For both solutions, immutability is an important ingredient in making sure that a deployment is reproducible.

I have decided to still call these deployments mutable, because I am looking at the problem from a Docker perspective -- the writable layer of the container (that is mounted on top of the immutable layers of an image) is modified each time that we upgrade a system.

Future work


Although I am quite happy with the ability to create mutable multi-process containers, there is still quite a bit of work that needs to be done to make the Nix process management framework more usable.

Most importantly, trying to deploy Hydra revealed all kinds of regressions in the framework. To cope with all these breaking changes, a structured testing approach is required. Currently, such an approach is completely absent.

I could also (in theory) automate the still missing parts of Hydra. For example, I have not automated the process that updates the garbage collector roots, which needs to run in a timely manner. To solve this, I need to use a cron service or systemd timer units, which is beyond the scope of my experiment.

Availability


The createMutableMultiProcessImage function is part of the experimental Nix process management framework GitHub repository that is still under heavy development.

Because the number of services that can be deployed with the framework has grown considerably, I have moved all non-essential services (not required for testing) into a separate repository. The Hydra constructor functions can be found in this repository as well.

Using the Nix process management framework as an infrastructure deployment solution for Disnix

As explained in many previous blog posts, I have developed Disnix as a solution for automating the deployment of service-oriented systems -- it deploys heterogeneous systems, that consist of many different kinds of components (such as web applications, web services, databases and processes) to networks of machines.

The deployment models for Disnix are typically not fully self-contained. Foremost, a precondition that must be met before a service-oriented system can be deployed is that all target machines in the network have the Nix package manager, Disnix, and a remote connectivity service (e.g. SSH) present.

For multi-user Disnix installations, in which the user does not have super-user privileges, the Disnix service is required to carry out deployment operations on behalf of a user.

Moreover, the services in the services model typically need to be managed by other services, called containers in Disnix terminology (not to be confused with Linux containers).

Examples of container services are:

  • The MySQL DBMS container can manage multiple databases deployed by Disnix.
  • The Apache Tomcat servlet container can manage multiple Java web applications deployed by Disnix.
  • systemd can act as a container that manages multiple systemd units deployed by Disnix.

Managing the life-cycles of services in containers (such as activating or deactivating them) is done by a companion tool called Dysnomia.

In addition to Disnix, these container services also typically need to be deployed in advance to the target machines in the network.

The problem domain that Disnix works in is called service deployment, whereas the deployment of machines (bare metal or virtual machines) and the container services is called infrastructure deployment.

Disnix can be complemented with a variety of infrastructure deployment solutions:

  • NixOps can deploy networks of NixOS machines, both physical and virtual machines (in the cloud), such as Amazon EC2.

    As part of a NixOS configuration, the Disnix service can be deployed that facilitates multi-user installations. The Dysnomia NixOS module can expose all relevant container services installed by NixOS as container deployment targets.
  • disnixos-deploy-network is a tool that is included with the DisnixOS extension toolset. Since services in Disnix can be any kind of deployment unit, it is also possible to deploy an entire NixOS configuration as a service. This tool is mostly developed for demonstration purposes.

    A limitation of this tool is that it cannot instantiate virtual machines and bootstrap Disnix.
  • Disnix itself. The above solutions are all NixOS-based, a software distribution that is Linux-based and fully managed by the Nix package manager.

    Although NixOS is very powerful, it has two drawbacks for Disnix:

    • NixOS uses the NixOS module system for configuring system aspects. Although it is very powerful, you can only deploy one instance of each system service -- Disnix can also work with multiple container instances of the same type on a machine.
    • Services in NixOS cannot be deployed to other kinds of software distributions: conventional Linux distributions, or other operating systems, such as macOS and FreeBSD.

    To overcome these limitations, Disnix can also be used as a container deployment solution on any operating system that is capable of running Nix and Disnix. Services deployed by Disnix can automatically be exposed as container providers.

    Similar to disnixos-deploy-network, a limitation of this approach is that it cannot be used to bootstrap Disnix.

Last year, I also added a major new feature to Disnix that makes it possible to deploy both application and container services from the same Disnix deployment models, minimizing the infrastructure deployment problem -- the only requirement is to have machines with Nix, Disnix, and a remote connectivity service (such as SSH) pre-installed on them.

Although this integrated feature is quite convenient, in particular for test setups, a separate infrastructure deployment process (that includes container services) still makes sense in many scenarios:

  • The infrastructure parts and service parts can be managed by different people with different specializations. For example, configuring and tuning an application server is a different responsibility than developing a Java web application.
  • The service parts typically change more frequently than the infrastructure parts. As a result, they typically have different kinds of update cycles.
  • The infrastructure components can typically be reused between projects (e.g. many systems use a database backend such as PostgreSQL or MySQL), whereas the service components are typically very project specific.

I also realized that my other project, the Nix process management framework, can serve as a partial infrastructure deployment solution -- it can be used to bootstrap Disnix and deploy container services.

Moreover, it can also deploy multiple instances of container services, and be used on any operating system that the Nix process management framework supports, including conventional Linux distributions and other operating systems, such as macOS and FreeBSD.

Deploying and exposing the Disnix service with the Nix process management framework


As explained earlier, to allow Disnix to deploy services to a remote machine, a machine needs to have Disnix installed (and run the Disnix service for a multi-user installation), and be remotely connectible, e.g. through SSH.

I have packaged all required services as constructor functions for the Nix process management framework.

The following process model captures the configuration of a basic multi-user Disnix installation:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, spoolDir ? "${stateDir}/spool"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  ids = if builtins.pathExists ./ids-bare.nix then (import ./ids-bare.nix).ids else {};

  constructors = import ../../services-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
  };
in
rec {
  sshd = {
    pkg = constructors.sshd {
      extraSSHDConfig = ''
        UsePAM yes
      '';
    };

    requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  dbus-daemon = {
    pkg = constructors.dbus-daemon {
      services = [ disnix-service ];
    };

    requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  disnix-service = {
    pkg = constructors.disnix-service {
      inherit dbus-daemon;
    };

    requiresUniqueIdsFor = [ "gids" ];
  };
}

The above processes model (processes.nix) captures three process instances:

  • sshd is the OpenSSH server that makes it possible to remotely connect to the machine by using the SSH protocol.
  • dbus-daemon runs a D-Bus system daemon, which is a requirement for the Disnix service. The disnix-service is propagated as a parameter, so that its service directory gets added to the D-Bus system daemon configuration.
  • disnix-service is a service that executes deployment operations on behalf of an authorized unprivileged user. The disnix-service has a dependency on dbus-daemon, making sure that the latter gets activated first.

We can deploy the above configuration on a machine that has the Nix process management framework already installed.

For example, to deploy the configuration on a machine that uses supervisord, we can run:


$ nixproc-supervisord-switch processes.nix

Resulting in a system that consists of the following running processes:


$ supervisorctl
dbus-daemon RUNNING pid 2374, uptime 0:00:34
disnix-service RUNNING pid 2397, uptime 0:00:33
sshd RUNNING pid 2375, uptime 0:00:34

As may be noticed, the above supervised services correspond to the processes in the processes model.

On the coordinator machine, we can write a bootstrap infrastructure model (infra-bootstrap.nix) that only contains connectivity settings:


{
  test1.properties.hostname = "192.168.2.1";
}

and use the bootstrap model to capture the full infrastructure model of the system:


$ disnix-capture-infra infra-bootstrap.nix

resulting in the following configuration:


{
  "test1" = {
    properties = {
      "hostname" = "192.168.2.1";
      "system" = "x86_64-linux";
    };
    containers = {
      echo = {
      };
      fileset = {
      };
      process = {
      };
      supervisord-program = {
        "supervisordTargetDir" = "/etc/supervisor/conf.d";
      };
      wrapper = {
      };
    };
    "system" = "x86_64-linux";
  };
}

Despite the fact that we have not configured any containers explicitly, the above configuration (infrastructure.nix) already exposes a number of container services:

  • The echo, fileset and process container services are built-in container providers that any Dysnomia installation includes.

    The process container can be used to automatically deploy services that daemonize. Services that daemonize themselves do not require the presence of any external service.
  • The supervisord-program container refers to the process supervisor that manages the services deployed by the Nix process management framework. It can also be used as a container for processes deployed by Disnix.

With the above infrastructure model, we can deploy any system that depends on the above container services, such as the trivial Disnix proxy example:


{ system, distribution, invDistribution, pkgs
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "supervisord"
, nix-processmgmt ? ../../../nix-processmgmt
}:

let
  customPkgs = import ../top-level/all-packages.nix {
    inherit system pkgs stateDir logDir runtimeDir tmpDir forceDisableUserChange processManager nix-processmgmt;
  };

  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  processType = import "${nix-processmgmt}/nixproc/derive-dysnomia-process-type.nix" {
    inherit processManager;
  };
in
rec {
  hello_world_server = rec {
    name = "hello_world_server";
    port = ids.ports.hello_world_server or 0;
    pkg = customPkgs.hello_world_server { inherit port; };
    type = processType;
    requiresUniqueIdsFor = [ "ports" ];
  };

  hello_world_client = {
    name = "hello_world_client";
    pkg = customPkgs.hello_world_client;
    dependsOn = {
      inherit hello_world_server;
    };
    type = "package";
  };
}

The services model shown above (services.nix) captures two services:

  • The hello_world_server service is a simple service that listens on a TCP port for a "hello" message and responds with a "Hello world!" message.
  • The hello_world_client service is a package providing a client executable that automatically connects to the hello_world_server.
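
The port number of the hello_world_server service comes from an ids.nix file that the framework's ID assignment tooling can generate to guarantee unique port assignments (the expression falls back to port 0 when the file does not exist yet). As a minimal, hypothetical sketch of what such a file may look like (the real file is generated and may contain additional attributes):


{
  ids = {
    ports = {
      hello_world_server = 5000; # hypothetical port value assigned by the ID assignment tooling
    };
  };
}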

With the following distribution model (distribution.nix), we can map all the services to our deployment machine (that runs the Disnix service managed by the Nix process management framework):


{infrastructure}:

{
  hello_world_client = [ infrastructure.test1 ];
  hello_world_server = [ infrastructure.test1 ];
}

and deploy the system by running the following command:


$ disnix-env -s services-without-proxy.nix \
  -i infrastructure.nix \
  -d distribution.nix \
  --extra-params '{ processManager = "supervisord"; }'

The last parameter (--extra-params) configures the services model (which indirectly invokes the createManagedProcess abstraction function from the Nix process management framework) in such a way that supervisord configuration files are generated.

(As a sidenote: without the --extra-params parameter, the process instances are built for the disnix process manager, generating configuration files that can be deployed to the process container, which expects programs to daemonize on their own and leave a PID file behind with the daemon's process ID. Although this approach is convenient for experiments, because no external service is required, it is not as reliable as managing supervised processes.)

The result of the above deployment operation is that the hello-world-server service is deployed as a process that is also managed by supervisord:


$ supervisorctl
dbus-daemon RUNNING pid 2374, uptime 0:09:39
disnix-service RUNNING pid 2397, uptime 0:09:38
hello-world-server RUNNING pid 2574, uptime 0:00:06
sshd RUNNING pid 2375, uptime 0:09:39

and we can use the hello-world-client executable on the target machine to connect to the service:


$ /nix/var/nix/profiles/disnix/default/bin/hello-world-client
Trying 192.168.2.1...
Connected to 192.168.2.1.
Escape character is '^]'.
hello
Hello world!

Deploying container providers and exposing them


With Disnix, it is also possible to deploy systems that are composed of different kinds of components, such as web services and databases.

For example, the Java variant of the ridiculous Staff Tracker example consists of the following services:


The services in the diagram above have the following purpose:

  • The StaffTracker service is the front-end web application that shows an overview of staff members and their locations.
  • The StaffService service is a web service with a SOAP interface that provides read and write access to the staff records. The staff records are stored in the staff database.
  • The RoomService service provides read access to the room records, which are stored in a separate rooms database.
  • The ZipcodeService service provides read access to zip codes, which are stored in a separate zipcodes database.
  • The GeolocationService infers the location of a staff member from their IP address using the GeoIP service.

To deploy the system shown above, we need a target machine that provides Apache Tomcat (for managing the web application front-end and web services) and MySQL (for managing the databases) as container provider services:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, spoolDir ? "${stateDir}/spool"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  ids = if builtins.pathExists ./ids-tomcat-mysql.nix then (import ./ids-tomcat-mysql.nix).ids else {};

  constructors = import ../../services-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
  };

  containerProviderConstructors = import ../../service-containers-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
  };
in
rec {
  sshd = {
    pkg = constructors.sshd {
      extraSSHDConfig = ''
        UsePAM yes
      '';
    };

    requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  dbus-daemon = {
    pkg = constructors.dbus-daemon {
      services = [ disnix-service ];
    };

    requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  tomcat = containerProviderConstructors.simpleAppservingTomcat {
    commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
    webapps = [
      pkgs.tomcat9.webapps # Include the Tomcat example and management applications
    ];

    properties.requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  mysql = containerProviderConstructors.mysql {
    properties.requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  disnix-service = {
    pkg = constructors.disnix-service {
      inherit dbus-daemon;
      containerProviders = [ tomcat mysql ];
    };

    requiresUniqueIdsFor = [ "gids" ];
  };
}

The processes model above is an extension of the previous processes model, adding two container provider services:

  • tomcat is the Apache Tomcat server. The constructor function simpleAppservingTomcat composes a configuration for a supported process manager, such as supervisord.

    Moreover, it bundles a Dysnomia container configuration file and a Dysnomia module (tomcat-webapplication) that can be used to manage the life-cycles of Java web applications embedded in the servlet container.
  • mysql is the MySQL DBMS server. The constructor function also creates a process manager configuration file, and bundles a Dysnomia container configuration file and module that manages the life-cycles of databases.
  • The container services above are propagated as containerProviders to the disnix-service. This function parameter is used to update the search paths for container configuration files and modules, so that services can be deployed to these containers by Disnix.

After deploying the above processes model, we should see the following infrastructure model after capturing it:


$ disnix-capture-infra infra-bootstrap.nix
{
  "test1" = {
    properties = {
      "hostname" = "192.168.2.1";
      "system" = "x86_64-linux";
    };
    containers = {
      echo = {
      };
      fileset = {
      };
      process = {
      };
      supervisord-program = {
        "supervisordTargetDir" = "/etc/supervisor/conf.d";
      };
      wrapper = {
      };
      tomcat-webapplication = {
        "tomcatPort" = "8080";
        "catalinaBaseDir" = "/var/tomcat";
      };
      mysql-database = {
        "mysqlPort" = "3306";
        "mysqlUsername" = "root";
        "mysqlPassword" = "";
        "mysqlSocket" = "/var/run/mysqld/mysqld.sock";
      };
    };
    "system" = "x86_64-linux";
  };
}

As may be observed, the tomcat-webapplication and mysql-database containers (with their relevant configuration properties) were added to the infrastructure model.

With the following command we can deploy the example system's services to the containers in the network:


$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

resulting in a fully functional system:


Deploying multiple container provider instances


As explained in the introduction, a limitation of the NixOS module system is that it is only possible to construct one instance of a service on a machine.

Process instances in a processes model deployed by the Nix process management framework, as well as services in a Disnix services model, are instantiated from constructor functions that make conflicting properties configurable, making it possible to deploy multiple instances of the same service to the same machine.

The following processes model was modified from the previous example to deploy two MySQL servers and two Apache Tomcat servers to the same machine:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, spoolDir ? "${stateDir}/spool"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  ids = if builtins.pathExists ./ids-tomcat-mysql-multi-instance.nix then (import ./ids-tomcat-mysql-multi-instance.nix).ids else {};

  constructors = import ../../services-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
  };

  containerProviderConstructors = import ../../service-containers-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
  };
in
rec {
  sshd = {
    pkg = constructors.sshd {
      extraSSHDConfig = ''
        UsePAM yes
      '';
    };

    requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  dbus-daemon = {
    pkg = constructors.dbus-daemon {
      services = [ disnix-service ];
    };

    requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  tomcat-primary = containerProviderConstructors.simpleAppservingTomcat {
    instanceSuffix = "-primary";
    httpPort = 8080;
    httpsPort = 8443;
    serverPort = 8005;
    ajpPort = 8009;
    commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
    webapps = [
      pkgs.tomcat9.webapps # Include the Tomcat example and management applications
    ];
    properties.requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  tomcat-secondary = containerProviderConstructors.simpleAppservingTomcat {
    instanceSuffix = "-secondary";
    httpPort = 8081;
    httpsPort = 8444;
    serverPort = 8006;
    ajpPort = 8010;
    commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
    webapps = [
      pkgs.tomcat9.webapps # Include the Tomcat example and management applications
    ];
    properties.requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  mysql-primary = containerProviderConstructors.mysql {
    instanceSuffix = "-primary";
    port = 3306;
    properties.requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  mysql-secondary = containerProviderConstructors.mysql {
    instanceSuffix = "-secondary";
    port = 3307;
    properties.requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  disnix-service = {
    pkg = constructors.disnix-service {
      inherit dbus-daemon;
      containerProviders = [ tomcat-primary tomcat-secondary mysql-primary mysql-secondary ];
    };

    requiresUniqueIdsFor = [ "gids" ];
  };
}

In the above processes model, we made the following changes:

  • We have configured two Apache Tomcat instances: tomcat-primary and tomcat-secondary. Both instances can co-exist because they have been configured in such a way that they listen to unique TCP ports and have a unique instance name composed from the instanceSuffix.
  • We have configured two MySQL instances: mysql-primary and mysql-secondary. Similar to Apache Tomcat, they can both co-exist because they listen to unique TCP ports (e.g. 3306 and 3307) and have a unique instance name.
  • Both the primary and secondary instances of the above services are propagated to the disnix-service (with the containerProviders parameter) making it possible for a client to discover them.

After deploying the above processes model, we can run the following command to discover the machine's configuration:


$ disnix-capture-infra infra-bootstrap.nix
{
  "test1" = {
    properties = {
      "hostname" = "192.168.2.1";
      "system" = "x86_64-linux";
    };
    containers = {
      echo = {
      };
      fileset = {
      };
      process = {
      };
      supervisord-program = {
        "supervisordTargetDir" = "/etc/supervisor/conf.d";
      };
      wrapper = {
      };
      tomcat-webapplication-primary = {
        "tomcatPort" = "8080";
        "catalinaBaseDir" = "/var/tomcat-primary";
      };
      tomcat-webapplication-secondary = {
        "tomcatPort" = "8081";
        "catalinaBaseDir" = "/var/tomcat-secondary";
      };
      mysql-database-primary = {
        "mysqlPort" = "3306";
        "mysqlUsername" = "root";
        "mysqlPassword" = "";
        "mysqlSocket" = "/var/run/mysqld-primary/mysqld.sock";
      };
      mysql-database-secondary = {
        "mysqlPort" = "3307";
        "mysqlUsername" = "root";
        "mysqlPassword" = "";
        "mysqlSocket" = "/var/run/mysqld-secondary/mysqld.sock";
      };
    };
    "system" = "x86_64-linux";
  };
}

As may be observed, the infrastructure model contains two Apache Tomcat instances and two MySQL instances.

With the following distribution model (distribution.nix), we can divide each database and web application over the two container instances:


{infrastructure}:

{
  GeolocationService = {
    targets = [
      { target = infrastructure.test1;
        container = "tomcat-webapplication-primary";
      }
    ];
  };
  RoomService = {
    targets = [
      { target = infrastructure.test1;
        container = "tomcat-webapplication-secondary";
      }
    ];
  };
  StaffService = {
    targets = [
      { target = infrastructure.test1;
        container = "tomcat-webapplication-primary";
      }
    ];
  };
  StaffTracker = {
    targets = [
      { target = infrastructure.test1;
        container = "tomcat-webapplication-secondary";
      }
    ];
  };
  ZipcodeService = {
    targets = [
      { target = infrastructure.test1;
        container = "tomcat-webapplication-primary";
      }
    ];
  };
  rooms = {
    targets = [
      { target = infrastructure.test1;
        container = "mysql-database-primary";
      }
    ];
  };
  staff = {
    targets = [
      { target = infrastructure.test1;
        container = "mysql-database-secondary";
      }
    ];
  };
  zipcodes = {
    targets = [
      { target = infrastructure.test1;
        container = "mysql-database-primary";
      }
    ];
  };
}

Compared to the previous distribution model, the above model uses a more verbose notation for mapping services.

As explained in an earlier blog post, in deployments in which only a single container is deployed, services are automapped to the container that has the same name as the service's type. When multiple instances exist, we need to manually specify the container where the service needs to be deployed to.
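
For example, in the earlier deployment with only one Apache Tomcat and one MySQL container, the staff database could simply be mapped as follows (shown as an illustration), and Disnix would automatically map it to the single container matching the service's type (mysql-database):


{infrastructure}:

{
  staff = [ infrastructure.test1 ];
}

With two MySQL container instances on the machine, this shorthand no longer suffices, so the verbose targets notation shown above is required.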

After deploying the system with the following command:


$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

we will get a running system with the following deployment architecture:


Using the Disnix web service for executing remote deployment operations


By default, Disnix uses SSH to communicate with target machines in the network. Disnix has a modular architecture and is also capable of communicating with target machines by other means, for example via NixOps, the backdoor client, D-Bus, or by directly executing tasks on a local machine.

There is also an external package, DisnixWebService, that remotely exposes all deployment operations as a web service with a SOAP API.

To use the DisnixWebService, we must deploy a Java servlet container (such as Apache Tomcat) with the DisnixWebService application, configured in such a way that it can connect to the disnix-service over the D-Bus system bus.

The following processes model is an extension of the non-multi containers Staff Tracker example, with an Apache Tomcat service that bundles the DisnixWebService:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, spoolDir ? "${stateDir}/spool"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  ids = if builtins.pathExists ./ids-tomcat-mysql.nix then (import ./ids-tomcat-mysql.nix).ids else {};

  constructors = import ../../services-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
  };

  containerProviderConstructors = import ../../service-containers-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
  };
in
rec {
  sshd = {
    pkg = constructors.sshd {
      extraSSHDConfig = ''
        UsePAM yes
      '';
    };

    requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  dbus-daemon = {
    pkg = constructors.dbus-daemon {
      services = [ disnix-service ];
    };

    requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  tomcat = containerProviderConstructors.disnixAppservingTomcat {
    commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
    webapps = [
      pkgs.tomcat9.webapps # Include the Tomcat example and management applications
    ];
    enableAJP = true;
    inherit dbus-daemon;

    properties.requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  apache = {
    pkg = constructors.basicAuthReverseProxyApache {
      dependency = tomcat;
      serverAdmin = "admin@localhost";
      targetProtocol = "ajp";
      portPropertyName = "ajpPort";

      authName = "DisnixWebService";
      authUserFile = pkgs.stdenv.mkDerivation {
        name = "htpasswd";
        buildInputs = [ pkgs.apacheHttpd ];
        buildCommand = ''
          htpasswd -cb ./htpasswd admin secret
          mv htpasswd $out
        '';
      };
      requireUser = "admin";
    };

    requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  mysql = containerProviderConstructors.mysql {
    properties.requiresUniqueIdsFor = [ "uids" "gids" ];
  };

  disnix-service = {
    pkg = constructors.disnix-service {
      inherit dbus-daemon;
      containerProviders = [ tomcat mysql ];
      authorizedUsers = [ tomcat.name ];
      dysnomiaProperties = {
        targetEPR = "http://$(hostname)/DisnixWebService/services/DisnixWebService";
      };
    };

    requiresUniqueIdsFor = [ "gids" ];
  };
}

The above processes model contains the following changes:

  • The Apache Tomcat process instance is constructed with the containerProviderConstructors.disnixAppservingTomcat constructor function, which automatically deploys the DisnixWebService and provides the required configuration settings so that it can communicate with the disnix-service over the D-Bus system bus.

    Because the DisnixWebService requires the presence of the D-Bus system daemon, it is configured as a dependency for Apache Tomcat, ensuring that it is started before Apache Tomcat.
  • Connecting to the Apache Tomcat server, including the DisnixWebService, requires no authentication. To secure the web applications and the DisnixWebService, I have configured an Apache reverse proxy that forwards connections to Apache Tomcat using the AJP protocol.

    Moreover, the reverse proxy protects incoming requests by using HTTP basic authentication requiring a username and password.

We can use the following bootstrap infrastructure model to discover the machine's configuration:


{
  test1.properties.targetEPR = "http://192.168.2.1/DisnixWebService/services/DisnixWebService";
}

The difference between this bootstrap infrastructure model and the previous one is that it uses a different connection property (targetEPR) that refers to the URL of the DisnixWebService.

By default, Disnix uses the disnix-ssh-client to communicate with target machines. To use a different client, we must set the following environment variables:


$ export DISNIX_CLIENT_INTERFACE=disnix-soap-client
$ export DISNIX_TARGET_PROPERTY=targetEPR

The above environment variables instruct Disnix to use the disnix-soap-client executable and the targetEPR property from the infrastructure model as a connection string.

To authenticate ourselves, we must set the following environment variables with a username and password:


$ export DISNIX_SOAP_CLIENT_USERNAME=admin
$ export DISNIX_SOAP_CLIENT_PASSWORD=secret

The following command makes it possible to discover the machine's configuration using the disnix-soap-client and DisnixWebService:


$ disnix-capture-infra infra-bootstrap.nix
{
  "test1" = {
    properties = {
      "hostname" = "192.168.2.1";
      "system" = "x86_64-linux";
      "targetEPR" = "http://192.168.2.1/DisnixWebService/services/DisnixWebService";
    };
    containers = {
      echo = {
      };
      fileset = {
      };
      process = {
      };
      supervisord-program = {
        "supervisordTargetDir" = "/etc/supervisor/conf.d";
      };
      wrapper = {
      };
      tomcat-webapplication = {
        "tomcatPort" = "8080";
        "catalinaBaseDir" = "/var/tomcat";
        "ajpPort" = "8009";
      };
      mysql-database = {
        "mysqlPort" = "3306";
        "mysqlUsername" = "root";
        "mysqlPassword" = "";
        "mysqlSocket" = "/var/run/mysqld/mysqld.sock";
      };
    };
    "system" = "x86_64-linux";
  };
}

After capturing the full infrastructure model, we can deploy the system with disnix-env if desired, using the disnix-soap-client to carry out all necessary remote deployment operations.
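
For example, with the environment variables shown above still set, the same command-line instruction that we used earlier now carries out all remote deployment operations over the SOAP API:


$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix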

Miscellaneous: using Docker containers as light-weight virtual machines


As explained earlier in this blog post, the Nix process management framework is only a partial infrastructure deployment solution -- you still need to somehow obtain physical or virtual machines with a software distribution running the Nix package manager.

In a blog post written some time ago, I have explained that Docker containers are not virtual machines or even light-weight virtual machines.

In my previous blog post, I have shown that we can also deploy mutable Docker multi-process containers in which process instances can be upgraded without stopping the container.

The deployment workflow for upgrading mutable containers is very machine-like -- NixOS has a similar workflow that consists of updating the machine configuration (/etc/nixos/configuration.nix) and running a single command-line instruction to upgrade the machine (nixos-rebuild switch).

We can actually start using containers as VMs by adding another ingredient to the mix -- we can also assign static IP addresses to Docker containers.

With the following Nix expression, we can create a Docker image for a mutable container, using any of the processes models shown previously as the "machine's configuration":


let
  pkgs = import <nixpkgs> {};

  createMutableMultiProcessImage = import ../nix-processmgmt/nixproc/create-image-from-steps/create-mutable-multi-process-image-universal.nix {
    inherit pkgs;
  };
in
createMutableMultiProcessImage {
  name = "disnix";
  tag = "test";
  contents = [ pkgs.mc pkgs.disnix ];
  exprFile = ./processes.nix;
  interactive = true;
  manpages = true;
  processManager = "supervisord";
}

The exprFile in the above Nix expression refers to a previously shown processes model, and processManager refers to the desired process manager to use, such as supervisord.

With the following command, we can build the image with Nix and load it into Docker:


$ nix-build
$ docker load -i result

With the following command, we can create a network to which our containers (with IP addresses) should belong:


$ docker network create --subnet=192.168.2.0/24 disnixnetwork

The above command creates a subnet with the prefix 192.168.2.0/24, leaving an 8-bit block for host IP addresses.

We can create and start a Docker container named: containervm using our previously built image, and assign it an IP address:


$ docker run --network disnixnetwork --ip 192.168.2.1 \
  --name containervm disnix:test

By default, Disnix uses SSH to connect to remote machines. With the following commands we can create a public-private key pair and copy the public key to the container:


$ ssh-keygen -t ed25519 -f id_test -N ""

$ docker exec containervm mkdir -m0700 -p /root/.ssh
$ docker cp id_test.pub containervm:/root/.ssh/authorized_keys
$ docker exec containervm chmod 600 /root/.ssh/authorized_keys
$ docker exec containervm chown root:root /root/.ssh/authorized_keys

On the coordinator machine, which carries out the deployment, we must add the private key to the SSH agent and configure the disnix-ssh-client to connect to the disnix-service:


$ ssh-add id_test
$ export DISNIX_REMOTE_CLIENT=disnix-client

By executing all these steps, containervm can be (mostly) used as if it were a virtual machine, including connecting to it with an IP address over SSH.
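
For example, assuming that the OpenSSH server in the container permits root logins with the key pair we have just installed, we can connect to the container as if it were a remote machine:


$ ssh root@192.168.2.1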

Conclusion


In this blog post, I have described how the Nix process management framework can be used as a partial infrastructure deployment solution for Disnix. It can be used both for deploying the disnix-service (to facilitate multi-user installations) as well as deploying container providers: services that manage the life-cycles of services deployed by Disnix.

Moreover, the Nix process management framework makes it possible to do these deployments on all kinds of software distributions that can use the Nix package manager, including NixOS, conventional Linux distributions and other operating systems, such as macOS and FreeBSD.

If I had developed this solution a couple of years ago, it would probably have saved me many hours of preparation work for my first demo in my NixCon 2015 talk, in which I wanted to demonstrate that it is possible to deploy services to a heterogeneous network that consists of a NixOS, Ubuntu and Windows machine. Back then, I had to do all the infrastructure deployment tasks manually.

I also have to admit (although this statement is mostly based on my personal preferences, not facts) that I find the functional style that the framework uses far more intuitive than the NixOS module system for certain service configuration aspects, especially for configuring container services and exposing them with Disnix and Dysnomia:

  • Because every process instance is constructed from a constructor function that makes all instance parameters explicit, you are guarded against common configuration errors such as undeclared dependencies.

    For example, the DisnixWebService-enabled Apache Tomcat service requires access to the dbus-daemon service providing the system bus. Not having this service in the processes model causes a missing function parameter error.
  • Function parameters in the processes model make it more clear that a process depends on another process and what that relationship may be. For example, with the containerProviders parameter it becomes IMO really clear that the disnix-service uses them as potential deployment targets for services deployed by Disnix.

    In comparison, the implementations of the Disnix and Dysnomia NixOS modules are far more complicated and monolithic -- the Dysnomia module has to figure out all potential container services deployed as part of a NixOS configuration and their properties, convert them to Dysnomia configuration files, and adjust the systemd configuration of the disnix-service for proper activation ordering.

    The wants parameter (used for activation ordering) is just a list of strings, not knowing whether it contains valid references to services that have been deployed already.

Availability


The constructor functions for the services as well as the deployment examples described in this blog post can be found in the Nix process management services repository.

Future work


Slowly more and more of my personal use cases are getting supported by the Nix process management framework.

Moreover, the services repository is steadily growing. To ensure that all the services that I have packaged so far do not break, I really need to focus my work on a service test solution.

A test framework for the Nix process management framework

As already explained in many previous blog posts, the Nix process management framework adds new ideas to earlier service management concepts explored in Nixpkgs and NixOS:

  • It makes it possible to deploy services on any operating system that can work with the Nix package manager, including conventional Linux distributions, macOS and FreeBSD. It also works on NixOS, but NixOS is not a requirement.
  • It allows you to construct multiple instances of the same service, by using constructor functions that identify conflicting configuration parameters. These constructor functions can be invoked in such a way that these configuration properties no longer conflict.
  • We can target multiple process managers from the same high-level deployment specifications. These high-level specifications are automatically translated to parameters for a target-specific configuration function for a specific process manager.

    It is also possible to override or augment the generated parameters, to work with configuration properties that are not universally supported.
  • There is a configuration option that conveniently allows you to disable user changes, making it possible to deploy services as an unprivileged user.

Although the above features are interesting, one particular challenge is that the framework cannot guarantee that all possible variations will work after writing a high-level process configuration. The framework facilitates code reuse, but it is not a write once, run anywhere approach.

To make it possible to validate multiple service variants, I have developed a test framework that is built on top of the NixOS test driver that makes it possible to deploy and test a network of NixOS QEMU virtual machines with very minimal storage and RAM overhead.

In this blog post, I will describe how the test framework can be used.

Automating tests


Before developing the test framework, I was mostly testing all my packaged services manually. Because a manual test process is tedious and time consuming, I did not have any test coverage for anything but the most trivial example services. As a result, I frequently ran into many configuration breakages.

Typically, when I want to test a process instance, or a system that is composed of multiple collaborative processes, I perform the following steps:

  • First, I need to deploy the system for a specific process manager and configuration profile, e.g. for a privileged or unprivileged user, in an isolated environment, such as a virtual machine or container.
  • Then I need to wait for all process instances to become available. Readiness checks are critical and typically more complicated than expected -- for most services, there is a time window between a successful invocation of a process and its availability to carry out its primary task, such as accepting network connections. Executing tests before a service is ready, typically results in errors.

    Although there are process managers that can generally deal with this problem (e.g. systemd has the sd_notify protocol, and s6 has its own readiness protocol and an sd_notify wrapper), the lack of a standardized protocol and its limited adoption still require me to manually implement readiness checks.

    (As a sidenote: the only readiness check protocol that is standardized is for traditional System V services that daemonize on their own. The calling parent process should terminate almost immediately, but still wait until the spawned daemon child process notifies it to be ready.

    As described in an earlier blog post, this notification aspect is more complicated to implement than I thought. Moreover, not all traditional System V daemons follow this protocol.)
  • When all process instances are ready, I can check whether they properly carry out their tasks, and whether the integration of these processes work as expected.

An example


I have developed a Nix function: testService that automates the above process using the NixOS test driver -- I can use this function to create a test suite for systems that are made out of running processes, such as the webapps example described in my previous blog posts about the Nix process management framework.

The example system consists of a number of webapp processes with an embedded HTTP server returning HTML pages displaying their identities. Nginx reverse proxies forward incoming connections to the appropriate webapp processes by using their corresponding virtual host header values:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, libDir ? "${stateDir}/lib"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  sharedConstructors = import ../../../examples/services-agnostic/constructors/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir cacheDir libDir tmpDir forceDisableUserChange processManager;
  };

  constructors = import ../../../examples/webapps-agnostic/constructors/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager;
    webappMode = null;
  };
in
rec {
  webapp1 = rec {
    port = 5000;
    dnsName = "webapp1.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
  };

  webapp2 = rec {
    port = 5001;
    dnsName = "webapp2.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
  };

  webapp3 = rec {
    port = 5002;
    dnsName = "webapp3.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "3";
    };
  };

  webapp4 = rec {
    port = 5003;
    dnsName = "webapp4.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "4";
    };
  };

  nginx = rec {
    port = if forceDisableUserChange then 8080 else 80;
    webapps = [ webapp1 webapp2 webapp3 webapp4 ];

    pkg = sharedConstructors.nginxReverseProxyHostBased {
      inherit port webapps;
    } {};
  };

  webapp5 = rec {
    port = 5004;
    dnsName = "webapp5.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "5";
    };
  };

  webapp6 = rec {
    port = 5005;
    dnsName = "webapp6.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "6";
    };
  };

  nginx2 = rec {
    port = if forceDisableUserChange then 8081 else 81;
    webapps = [ webapp5 webapp6 ];

    pkg = sharedConstructors.nginxReverseProxyHostBased {
      inherit port webapps;
      instanceSuffix = "2";
    } {};
  };
}

The processes model shown above (processes-advanced.nix) defines the following process instances:

  • There are six webapp process instances, each running an embedded HTTP service, returning HTML pages with their identities. The dnsName property specifies the DNS domain name value that should be used as a virtual host header to make the forwarding from the reverse proxies work.
  • There are two nginx reverse proxy instances. The former: nginx forwards incoming connections to the first four webapp instances. The latter: nginx2 forwards incoming connections to webapp5 and webapp6.

With the following command, I can connect to webapp2 through the first nginx reverse proxy:


$ curl -H 'Host: webapp2.local' http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5001
</body>
</html>

Creating a test suite


I can create a test suite for the web application system as follows:


{ pkgs, testService, processManagers, profiles }:

testService {
  exprFile = ./processes.nix;

  readiness = {instanceName, instance, ...}:
    ''
      machine.wait_for_open_port(${toString instance.port})
    '';

  tests = {instanceName, instance, ...}:
    pkgs.lib.optionalString (instanceName == "nginx" || instanceName == "nginx2")
      (pkgs.lib.concatMapStrings (webapp: ''
        machine.succeed(
            "curl --fail -H 'Host: ${webapp.dnsName}' http://localhost:${toString instance.port} | grep ': ${toString webapp.port}'"
        )
      '') instance.webapps);

  inherit processManagers profiles;
}

The Nix expression above invokes testService with the following parameters:

  • processManagers refers to a list of names of all the process managers that should be tested.
  • profiles refers to a list of configuration profiles that should be tested. Currently, it supports privileged for privileged deployments, and unprivileged for unprivileged deployments in an unprivileged user's home directory, without changing user permissions.
  • The exprFile parameter refers to the processes model of the system: processes-advanced.nix shown earlier.
  • The readiness parameter refers to a function that does a readiness check for each process instance. In the above example, it checks whether each service is actually listening on the required TCP port.
  • The tests parameter refers to a function that executes tests for each process instance. In the above example, it ignores all but the nginx instances, because explicitly testing a webapp instance is a redundant operation.

    For each nginx instance, it checks whether all webapp instances can be reached from it, by running the curl command.

The readiness and tests functions take the following parameters: instanceName identifies the process instance in the processes model, and instance refers to the attribute set containing its configuration.

Furthermore, they can refer to global process model configuration parameters:

  • stateDir: The directory in which state files are stored (typically /var for privileged deployments).
  • runtimeDir: The directory in which runtime files are stored (typically /var/run for privileged deployments).
  • forceDisableUserChange: Indicates whether to disable user changes (for unprivileged deployments) or not.

In addition to writing tests that work on instance level, it is also possible to write tests on system level, with the following parameters (not shown in the example):

  • initialTests: instructions that run right after deploying the system, but before the readiness checks, and instance-level tests.
  • postTests: instructions that run after the instance-level tests.

The above functions also accept the same global configuration parameters, as well as processes, which refers to the entire processes model.

We can also configure other properties useful for testing:

  • systemPackages: installs additional packages into the system profile of the test virtual machine.
  • nixosConfig: defines a NixOS module with configuration properties that will be added to the NixOS configuration of the test machine.
  • extraParams: propagates additional parameters to the processes model.
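
To illustrate these parameters, the following hypothetical sketch extends the earlier test suite with system-level test instructions and extra configuration (the concrete test commands and settings below are only examples, not part of the real test suite):


testService {
  exprFile = ./processes.nix;

  # The readiness and tests functions shown earlier would be configured here as well

  # Instructions that run right after deploying the system,
  # before the readiness checks and instance-level tests
  initialTests = {processes, ...}:
    ''
      machine.wait_for_unit("multi-user.target")
    '';

  # Instructions that run after the instance-level tests
  postTests = {processes, ...}:
    ''
      machine.succeed("curl --fail -H 'Host: webapp1.local' http://localhost:${toString processes.nginx.port}")
    '';

  # Extra packages for the system profile of the test virtual machine
  systemPackages = [ pkgs.curl ];

  # Extra NixOS configuration properties for the test machine
  nixosConfig = {
    networking.firewall.enable = false;
  };

  inherit processManagers profiles;
}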

Composing test functions


The Nix expression above is not self-contained. It is a function definition that needs to be invoked with all required parameters including all the process managers and profiles that we want to test for.

We can compose tests in the following Nix expression:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, processManagers ? [ "supervisord" "sysvinit" "systemd" "disnix" "s6-rc" ]
, profiles ? [ "privileged" "unprivileged" ]
}:

let
  testService = import ../../nixproc/test-driver/universal.nix {
    inherit system;
  };
in
{

  nginx-reverse-proxy-hostbased = import ./nginx-reverse-proxy-hostbased {
    inherit pkgs processManagers profiles testService;
  };

  docker = import ./docker {
    inherit pkgs processManagers profiles testService;
  };

  ...
}

The above partial Nix expression (default.nix) invokes the function defined in the previous Nix expression that resides in the nginx-reverse-proxy-hostbased directory and propagates all required parameters. It also composes other test cases, such as docker.

The parameters of the composition expression allow you to globally configure all the desired service variants:

  • processManagers allows you to select the process managers you want to test for.
  • profiles allows you to select the configuration profiles.

With the following command, we can test our system as a privileged user, using systemd as a process manager:


$ nix-build -A nginx-reverse-proxy-hostbased.privileged.systemd

We can also run the same test, but then as an unprivileged user:


$ nix-build -A nginx-reverse-proxy-hostbased.unprivileged.systemd

In addition to systemd, any configured process manager can be used that works in NixOS. The following command runs a privileged test of the same service for sysvinit:


$ nix-build -A nginx-reverse-proxy-hostbased.privileged.sysvinit

Results


With the test driver in place, I have managed to expand my repository of example services, provided test coverage for them and fixed quite a few bugs in the framework caused by regressions.

Below is a screenshot of Hydra, the Nix-based continuous integration service, showing an overview of test results for all kinds of variants of a service:


So far, the following services work multi-instance, with multiple process managers, and (optionally) as an unprivileged user:

  • Apache HTTP server. In the services repository, there are multiple constructors for deploying an Apache HTTP server: to deploy static web applications or dynamic web applications with PHP, and to use it as a reverse proxy (via HTTP and AJP) with HTTP basic authentication optionally enabled.
  • Apache Tomcat.
  • Nginx. For Nginx we also have multiple constructors. One to deploy a configuration for serving static web apps, and two for setting up reverse proxies using paths or virtual hosts to forward incoming requests to the appropriate services.

    The reverse proxy constructors can also generate configurations that will cache the responses of incoming requests.
  • MySQL/MariaDB.
  • PostgreSQL.
  • InfluxDB.
  • MongoDB.
  • OpenSSH.
  • svnserve.
  • xinetd.
  • fcron. By default, the fcron user and group are hardwired into the executable. To facilitate unprivileged user deployments, we automatically create a package build override to propagate the --with-run-non-privileged configuration flag so that it can run as an unprivileged user (see the sketch after this list). Similarly, for multiple instances we create an override to use a different user and group that does not conflict with the primary instance.
  • supervisord
  • s6-svscan
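
To give an impression of the kind of fcron build override described above, a minimal sketch in plain Nix (the framework generates something along these lines automatically; the exact implementation may differ):


fcron = pkgs.fcron.overrideAttrs (oldAttrs: {
  # Allow fcron to run as an unprivileged user
  configureFlags = (oldAttrs.configureFlags or []) ++ [ "--with-run-non-privileged" ];
});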

The following service also works with multiple instances and multiple process managers, but not as an unprivileged user:


The following services work with multiple process managers, but not multi-instance or as an unprivileged user:

  • D-Bus
  • Disnix
  • nix-daemon
  • Hydra

In theory, the above services could be adjusted to work as an unprivileged user, but doing so is not very useful -- for example, the nix-daemon's purpose is to facilitate multi-user package deployments. As an unprivileged user, you only want to facilitate package deployments for yourself.

Moreover, the multi-instance aspect is IMO also not very useful to explore for these services. For example, I cannot think of a useful scenario in which two Hydra instances run next to each other.

Discussion


The test framework described in this blog post is an important feature addition to the Nix process management framework -- it allowed me to package more services and fix quite a few bugs caused by regressions.

I can now finally show that it is doable to package services and make them work under nearly all possible conditions that the framework supports (e.g. multiple instances, multiple process managers, and unprivileged user installations).

The only limitation of the test framework is that it is not operating system agnostic -- the NixOS test driver (that serves as its foundation) only works, as its name implies, with NixOS, which itself is a Linux distribution. As a result, we cannot automatically test bsdrc scripts, launchd daemons, and cygrunsrv services.

In theory, it is also possible to make a more generalized test driver that works with multiple operating systems. The NixOS test driver is a combination of ideas (e.g. a shared Nix store between the host and guest system, an API to control QEMU, and an API to manage services). We could also dissect these ideas and run them on conventional QEMU VMs running different operating systems (with the Nix package manager).

Although making a more generalized test driver is interesting, it is beyond the scope of the Nix process management framework (which is about managing process instances, not entire systems).

Another drawback is that while it is possible to test all possible service variants on Linux, it may be very expensive to do so.

However, full process manager coverage is often not required to get a reasonable level of confidence. For many services, it typically suffices to implement the following strategy:

  • Pick two process managers: one that prefers foreground processes (e.g. supervisord) and one that prefers daemons (e.g. sysvinit). This is the most significant difference (from a configuration perspective) between all these different process managers.
  • If a service supports multiple configuration variants, and multiple instances, then create a processes model that concurrently deploys all these variants.

Implementing the above strategy only requires you to test four variants, providing a high degree of certainty that it will work with all other process managers as well.
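
For example, applied to the web application system shown earlier, this strategy amounts to building the following four test variants:


$ nix-build -A nginx-reverse-proxy-hostbased.privileged.supervisord
$ nix-build -A nginx-reverse-proxy-hostbased.unprivileged.supervisord
$ nix-build -A nginx-reverse-proxy-hostbased.privileged.sysvinit
$ nix-build -A nginx-reverse-proxy-hostbased.unprivileged.sysvinit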

Future work


Most of the interesting functionality required to work with the Nix process management framework is now implemented. I still need to implement more changes to make it more robust, and "dogfood" it on as many of my own use cases as possible.

Moreover, the docker backend still requires a bit more work to make it more usable.

Eventually, I will be thinking of an RFC that will upstream the interesting bits of the framework into Nixpkgs.

Availability


The Nix process management framework repository as well as the example services repository can be obtained from my GitHub page.

An unconventional method for creating backups and exchanging files


I have written many blog posts about software deployment and configuration management. For example, a couple of years ago, I have discussed a very basic configuration management process for small organizations, in which I explained that one of the worst things that could happen is that a machine breaks down and everything that it provides gets lost.

Fortunately, good configuration management practices and deployment tools (such as Nix) can help you to restore a machine's configuration with relative ease.

Another problem is managing a machine's data, which in many ways is even more important and complicated -- software packages can typically be obtained from a variety of sources, but data is typically unique (and therefore more valuable).

Even if a machine stays operational, the data that it stores can still be at risk -- it may get deleted by accident, or corrupted (for example, by the user, or a hardware problem).

It also does not matter whether a machine is used for business (for example, storing data for information systems) or personal use (for example, documents, pictures, and audio files). In both cases, data is valuable, and as a result, needs to be protected from loss and corruption.

In addition to recovery, the availability of data is often also very important -- many users (including me) typically own multiple devices (e.g. a desktop PC, laptop and phone) and typically want access to the same data from multiple places.

Because of the importance of data, I sometimes get questions from non-technical users that want to know how I manage my personal data (such as documents, images and audio files) and what tools I would recommend.

Similar to most computer users, I too have faced my own share of reliability problems -- of all the desktop computers I owned, I ended up with a completely broken hard drive three times, and a completely broken laptop once. Furthermore, I have also worked with all kinds of external media (e.g. floppy disks, CD-ROMs etc.) each having their own share of reliability problems.

To cope with data availability and loss, I came up with a custom script that I have been conveniently using to create backups and synchronize my data between the machines that I use.

In this blog post, I will explain how this script works.

About storage media


To cope with the potential loss of data, I have always made it a habit to transfer data to external media. I have worked with a variety of them, each having their advantages and disadvantages:

  • In the old days, I used floppy disks. Most people who are (at the time reading this blog post) in their early twenties or younger, may probably have no clue what I am talking about (for those people perhaps the 'Save icon' used in many desktop applications looks familiar).

    Roughly 25 years ago, floppy disks were a common means to exchange data between computers.

    Although they were common, they had many drawbacks. Probably the biggest drawback was their limited storage capacity -- I used to own 5.25 inch disks that (on PCs) were capable of storing ~360 KiB (if both sides are used), and the more sturdy 3.5 inch disks providing double density (720 KiB) and high density capacity (1.44 MiB).

    Furthermore, floppy disks were also quite slow and could be easily damaged, for example, by touching the magnetic surface.
  • When I switched from the Commodore Amiga to the PC, I also used tapes for a while in addition to floppy disks. They provided a substantial amount of storage capacity (~500 MiB in 1996). As of 2019 (and this probably still applies today), tapes are still considered very cheap and reliable media for archival of data.

    What I found impractical about tapes is that they are difficult to use as random access memory -- data on a tape is stored sequentially. As a consequence, it is typically very slow to find files or to "update" existing files. Typically, a backup tool needs to scan the tape from the beginning to the end or maintain a database with known storage locations.

    Many of my personal files (such as documents) are regularly updated and older versions do not have to be retained. Instead, they should be removed to clear up storage space. With tapes this is very difficult to do.
  • When writable CD/DVDs became affordable, I used them as a backup media for a while. Similar to tapes, they also have substantial storage capacity. Furthermore, they are very fast and convenient to read.

    A similar disadvantage is that they are not a very convenient medium for updating files. Although it is possible to write multi-session discs, in which files can be added, overwritten, or made invisible (essentially a "soft delete"), it remained inconvenient because you cannot clear up the storage space that a deleted file used to occupy.

    I also learned the hard way that writable discs (and in particular rewritable discs) are not very reliable for long term storage -- I have discarded many old writable discs (10 years or older) that can no longer be read.

Nowadays, I use a variety of USB storage devices (such as memory sticks and hard drives) as backup media. They are relatively cheap, fast, have more than enough storage capacity, and I can use them as random access memory -- it is no problem at all to update and delete existing data.

To cope with the potential breakage of USB storage media, I always make sure that I have at least two copies of my important data.

About data availability


As already explained in the introduction, I have multiple devices on which I want the same data to be available. For example, on both my desktop PC and company laptop, I want to have access to my music collection and research papers.

A possible solution is to use a shared storage medium, such as a network drive. The advantage of this approach is that there is a single source of truth and I only need to maintain a single data collection -- when I add a new document it will immediately be available to both devices.

Although a network drive may be a possible solution, it is not a good fit for my use cases -- I typically use laptops for traveling. When I am not at home, I can no longer access my data stored on the network drive.

Another solution is to transfer all required files to the hard drive on my laptop. Doing a bulk transfer for the first time is typically not a big problem (in particular, if you use orthodox file managers), but keeping collections of files up-to-date between machines is in my experience quite tedious to do by hand.

Automating data synchronization


For both backing up and synchronizing files to other machines I need to regularly compare and update files in directories. In the former case, I need to sync data between local directories, and for the latter I need to sync data between directories on remote machines.

Each time I want to make updates to my files, I want to inspect what has changed and see which files require updating before actually transferring anything, so that I do not end up wasting time or risk modifying the wrong files.

Initially, I started to investigate how to implement a synchronization tool myself, but quite quickly I realized that there is already a tool available that is quite suitable for the job: rsync.

rsync is designed to efficiently transfer and synchronize files between drives and between machines across networks, by comparing the modification times and sizes of files.

What I consider a drawback is that rsync is not optimized to conveniently automate my personal workflow -- to accomplish what I want, I need to memorize all the relevant rsync command-line options and run multiple command-line instructions.
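
For example, to back up a documents folder with plain rsync, one would have to remember an invocation like the one below and run it twice -- first as a dry run to review the changes, and then for real. This is an illustration of the manual workflow, not necessarily the exact options that my script uses:


# The --dry-run pass only shows what would be transferred (options illustrative)
$ rsync -av --itemize-changes --dry-run /home/sander/Documents/ /media/MyBackupDrive/Documents/
$ rsync -av --itemize-changes /home/sander/Documents/ /media/MyBackupDrive/Documents/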

To alleviate this problem, I have created a custom script that evolved into a tool that I have named: gitlike-rsync.

Usage


gitlike-rsync is a tool that facilitates synchronization of file collections between directories on local or remote machines, using rsync and a workflow that is similar to managing Git projects.

Making backups


For example, if we have a data directory that we want to back up to another partition (for example, one that refers to an external USB drive), we can open the directory:


$ cd /home/sander/Documents

and configure a destination directory, such as a directory on a backup drive (/media/MyBackupDrive/Documents):


$ gitlike-rsync destination-add /media/MyBackupDrive/Documents

By running the following command-line instruction, we can create a backup of the Documents folder:


$ gitlike-rsync push
sending incremental file list
.d..tp..... ./
>f+++++++++ bye.txt
>f+++++++++ hello.txt

sent 112 bytes received 25 bytes 274.00 bytes/sec
total size is 10 speedup is 0.07 (DRY RUN)
Do you want to proceed (y/N)? y
sending incremental file list
.d..tp..... ./
>f+++++++++ bye.txt
4 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=1/3)
>f+++++++++ hello.txt
6 100% 5.86kB/s 0:00:00 (xfr#2, to-chk=0/3)

sent 202 bytes received 57 bytes 518.00 bytes/sec
total size is 10 speedup is 0.04

The output above shows me the following:

  • When no additional command-line parameters have been provided, the script will first do a dry run and show the user what it intends to do. In the above example, it shows me that it wants to transfer the contents of the Documents folder that consists of only two files: hello.txt and bye.txt.
  • After providing my confirmation, the files in the destination directory are updated -- in this case, on the backup drive mounted at /media/MyBackupDrive. A sketch of this two-phase workflow follows below.
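
The behavior can be approximated with a small shell script. The following is a minimal sketch, assuming hard-coded source and destination directories -- my own illustration, not the actual implementation of gitlike-rsync:


#!/bin/sh
# Minimal sketch of the dry-run-then-confirm workflow. The source and
# destination paths are hard-coded assumptions for illustration; this is
# not the real gitlike-rsync implementation.
src="/home/sander/Documents/"
dest="/media/MyBackupDrive/Documents/"

# First pass: a dry run with itemized changes, so the user can review them
rsync -av --itemize-changes --dry-run "$src" "$dest"

printf 'Do you want to proceed (y/N)? '
read answer

if [ "$answer" = "y" ]
then
    # Second pass: perform the actual transfer
    rsync -av --itemize-changes --progress "$src" "$dest"
fi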

I can conveniently make updates in my documents folder and update my backups.

For example, I can add a new file named greeting.txt to the Documents folder, and run the push command again:


$ gitlike-rsync push
sending incremental file list
.d..t...... ./
>f+++++++++ greeting.txt

sent 129 bytes received 22 bytes 302.00 bytes/sec
total size is 19 speedup is 0.13 (DRY RUN)
Do you want to proceed (y/N)? y
sending incremental file list
.d..t...... ./
>f+++++++++ greeting.txt
9 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=1/4)

sent 182 bytes received 38 bytes 440.00 bytes/sec
total size is 19 speedup is 0.09

In the above output, only the greeting.txt file is transferred to the backup partition, leaving the other files untouched, because they have not changed.

Restoring files from a backup


In addition to the push command, gitlike-rsync also supports a pull command that syncs data from the configured destination folders. The pull command can be used as a means to restore data from a backup partition.

For example, if I accidentally delete a file from the Documents folder:


$ rm hello.txt

and run the pull command:


$ gitlike-rsync pull
sending incremental file list
.d..t...... ./
>f+++++++++ hello.txt

sent 137 bytes received 22 bytes 318.00 bytes/sec
total size is 19 speedup is 0.12 (DRY RUN)
Do you want to proceed (y/N)? y
sending incremental file list
.d..t...... ./
>f+++++++++ hello.txt
6 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/4)

sent 183 bytes received 38 bytes 442.00 bytes/sec
total size is 19 speedup is 0.09

the script is able to detect that hello.txt was removed and restore it from the backup partition.

Synchronizing files between machines in a network


In addition to local directories, which are useful for backups, the gitlike-rsync script can also be used in a similar way to exchange files between machines, such as my desktop PC and office laptop.

With the following command-line instruction, I can automatically clone the Documents folder from my desktop PC to the Documents folder on my office laptop:


$ gitlike-rsync clone sander@desktop-pc:/home/sander/Documents

The above command connects to my desktop PC over SSH and retrieves the content of the Documents/ folder. It will also automatically configure the destination directory to synchronize with the Documents folder on the desktop PC.
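
In terms of plain rsync, such a clone roughly corresponds to an initial transfer over SSH -- shown here purely to illustrate the concept, not as the script's literal implementation:


# Copy the remote Documents folder to the local machine over SSH (illustrative)
$ rsync -avz sander@desktop-pc:/home/sander/Documents/ /home/sander/Documents/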

When new documents have been added on the desktop PC, I just have to run the following command on my office laptop to update my local copy:


$ gitlike-rsync pull

I can also modify the contents of the Documents folder on my office laptop and synchronize the changed files to my desktop PC with a push:


$ gitlike-rsync push

About versioning


As explained in the beginning of this blog post, in addition to the recovery of failing machines and equipment, another important reason to create backups is to protect yourself against accidental modifications.

Although gitlike-rsync can detect and display file changes, it does not do versioning of any kind. This feature is deliberately left unimplemented, for good reasons.

For most of my personal files (e.g. images, audio, video) I do not need any versioning. As soon as they are organized, they are not supposed to be changed.

However, for certain kinds of files I do need versioning, such as software development projects. Whenever I need versioning, my answer is very simple: I use the "ordinary" Git, even for projects that are private and not supposed to be shared on a public hosting service, such as GitHub.

As seasoned Git users probably already know, you can turn any local directory into a Git repository by running:


$ git init

The above command creates a local .git folder that tracks and stores changes locally.

When cloning a repository from a public hosting service, such as GitHub, a remote named origin is automatically configured, making it possible to push changes to and pull changes from GitHub.

It is also possible to synchronize Git changes between arbitrary computers using a private SSH connection. I can, for example, configure a remote for a private repository, as follows:


$ git remote add origin sander@desktop-pc:/home/sander/Development/private-project

The above command configures the Git project that is stored in the /home/sander/Development/private-project directory on my desktop PC as a remote.

I can pull changes from the remote repository, by running:


$ git pull origin

and push locally stored changes, by running:


$ git push origin

As you may probably have already noticed, the above workflow is very similar to exchanging documents, shown earlier in this blog post.

What about backing up private Git repositories? To do this, I typically create tarballs of the Git project directories and sync them to my backup media with gitlike-rsync. The presence of the .git folder suffices to retain a project's history.
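
For example, I can archive a private project into the Documents folder, so that it is included in the next backup (the project name and paths below are just an illustration):


# Create a compressed tarball of the project (including its .git folder)
$ cd /home/sander/Development
$ tar -czf /home/sander/Documents/private-project.tar.gz private-project
# Then back it up along with the rest of the documents
$ cd /home/sander/Documents
$ gitlike-rsync push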

Conclusion


In this blog post, I have described gitlike-rsync, a simple opinionated wrapper script for exchanging files between local directories (for backups) and remote directories (for data exchange between machines).

As its name implies, it heavily builds on top of rsync for efficient data exchange, and on the concepts of Git as an inspiration for the workflow.

I have been conveniently using this script for over ten years, and it works extremely well for my own use cases, on a variety of operating systems (Linux, Windows, macOS and FreeBSD).

My solution is obviously not rocket science -- my contribution is only the workflow automation. The "true credits" should go to the developers of rsync and Git.

I also have to thank the COVID-19 crisis that allowed me to finally find the time to polish the script, document it and give it a name. In the Netherlands, as of today, there are still many restrictions, but the situation is slowly getting better.

Availability


I have added the gitlike-rsync script described in this blog post to my custom-scripts repository that can be obtained from my GitHub page.
