
Third annual blog reflection

Today it has been three years since I started this blog, so I thought this would be a good opportunity to reflect on last year's writings.

Obtaining my PhD degree


I think the most memorable thing that happened to me this year is that I finally obtained my PhD degree. From that moment on, I can finally call myself Dr. Sander, or actually (if I take my other degree into account): Dr. Ir. Sander (or in fact: Dr. Ir. Ing. Sander if I take all of them into account, but I believe the last degree has been superseded by the middle one, but I'm not sure :-) ).

Anyway, after obtaining my PhD degree I don't feel that much different, apart from feeling relieved that it's done. It took me quite some effort to get my PhD dissertation finished and all the preparations done for the defense. Besides my thesis, I also had to defend my propositions, most of which were not supposed to be directly related to my research subject.

Programming in JavaScript


Since I switched jobs, I have also been doing a lot of JavaScript programming. Every programming language and runtime environment has its weird/obscure problems and challenges, but in my opinion JavaScript is a very special case.

As a former teaching assistant for the concepts of programming languages course, I remain interested in discovering important lessons that help me prevent turning code into a mess and in dealing with the challenges that a particular programming language poses. So far, I have investigated object-oriented programming through prototypes, and two perspectives on dealing with the asynchronous programming problems that come with most JavaScript environments.

Besides programming challenges, I also have to perform deployment tasks for JavaScript programs. People who happen to know me know that I prefer Nix and Nix-related solutions. I have developed NiJS, an internal DSL for Nix in JavaScript, to make my life a bit easier in that regard.

Continuous integration and testing


Another technical aspect I have been working on is setting up a continuous integration facility by using Hydra: the Nix-based continuous integration server. I wrote a couple of blog posts describing its features, how to set it up and how to secure it.

I also made a couple of improvements to the Android and iOS Nix build functions, so that I can use Hydra to continuously build mobile apps.

Nix/NixOS development


Besides Hydra, I have been involved with various other parts of the Nix project as well. One of the more interesting things I did is developing a Nix function that can be used to compose FHS-compatible chroot environments. This function is particularly useful to run binary-only software in NixOS that cannot be patched, such as Steam.

I also wrote two blog posts to explain the user environment and development environment concepts.

Fun programming


Besides my PhD defense and all other activities, there was a bit of room to do some fun programming as well. I have improved my Amiga video emulation library (part of my IFF file format experiments project) a bit by implementing support for super hires resolutions and redefining its architecture.

Moreover, I have updated all the packages to use the new version of this library.

Research


After obtaining my PhD degree, I'm basically relieved of my research/publication duties. However, there is one research-related thing that caught my attention two months ago.

The journal paper titled 'Disnix: A toolset for distributed deployment' that got accepted in April last year is finally going to be published in Volume 79 of 'Science of Computer Programming'. Unbelievable!

Although it sounds like good news that another paper of mine is getting published, the thing that disturbs me is that the publication process took an insanely long time! I wrote the first version of this paper for the WASDeTT-3 workshop that was held in October 2010, which technically means that I started the work for the paper several months prior to that.

In February 2011, I adapted/extended the workshop paper and submitted its first journal draft. Now, in January 2014, it finally gets published, which means that it took almost 3 years to get published (if you take the workshop paper into account as well, it's actually closer to 3.5 years!).

In some academic assessment standards, journal papers have more value than conference papers. Although this journal paper should therefore increase my value as a researcher, it's actually crazy if you think about it too much. The first reason is that I wrote the first version of this paper before I started this blog. Meanwhile, I have written 54 blog articles, two tech reports, published two papers at conferences, and finished my PhD dissertation.

The other reason is that peer reviewing and publishing should help the authors and the research discipline in general. To me, this does not look like much help. Meanwhile, in the current development version of Disnix, some aspects of its architecture have evolved considerably compared to what is described in the paper, so it is of little use to anyone else in the research community anymore.

The only value the paper still provides lies in the general ideas and the way Disnix manifests itself externally.

Although the paper is not completely valueless, and I'm happy it gets published, it also feels weird that I no longer depend on it.

Blog posts


As with my previous annual reflections, I will also publish the top 10 of my most frequently read blog posts:

  1. On Nix and GNU Guix. This is a critical blog post that also ended up first in last year's top 10. I think this blog post will remain at the top position for the time being, since it attracted an insane amount of visitors.
  2. An alternative explanation of the Nix package manager. My alternative explanation of Nix, which I wrote to clarify things. It was also second in last year's top 10.
  3. Setting up a Hydra build cluster for continuous integration and testing (part 1). Apparently, Hydra and some general principles about continuous integration have attracted quite some visitors. However, the follow-up blog posts I wrote about Hydra don't seem to be that interesting to outsiders.
  4. Using Nix while doing development. I wrote this blog post 2 days ago, and it attracted quite some visitors. I have noticed that setting up development environments is an attractive feature for Nix users.
  5. Second computer. This is an old blog post about my good ol' Amiga. It was also in all previous top 10s and I think it will remain like that for the time being. The Amiga rocks!
  6. An evaluation and comparison of GoboLinux. Another blog article that has remained popular from the beginning. It's still a pity that GoboLinux has not been updated and sticks to its 014.01 release, which dates from 2008.
  7. Composing FHS-compatible chroot environments with Nix (or deploying Steam in NixOS). This is something I have developed to be able to run Steam in NixOS. It seems to have attracted quite some users, which does not come as a surprise. NixOS users want to play Half-Life!
  8. Software deployment complexity. An old blog post about software deployment complexity in general. Still remains popular.
  9. Deploying iOS applications with the Nix package manager. A blog post that I wrote last year describing how we can use the Nix package manager to build apps for the iPhone/iPad. For a long time the Android variant of this blog post was more popular, but recently this blog article surpassed it. I have no clue why.
  10. Porting software to AmigaOS (unconventional style). People still seem to like one of my craziest experiments.

Conclusion


I already have three more blog posts in draft/planning stages and more ideas that I'd like to explore, so expect more to come. The remaining thing I'd like to say is:

HAPPY NEW YEAR!!!


Building Appcelerator Titanium apps with Nix

Last month, I have been working on quite a lot of things. One of them was improving the Nix function that builds Titanium SDK applications. In fact, it has been in Nixpkgs for quite a while already, but I have never written about it on my blog, apart from a brief reference in an earlier blog post about Hydra.

The reason that I have decided to write about this function is that the process of getting Titanium applications deployable with Nix is quite painful (although I have managed to do it) and I want to report on my experiences so that these issues can hopefully be resolved in the future.

Although I have a strong opinion on certain aspects of Titanium, this blog post is not meant to discuss the development aspects of the Titanium framework. Instead, the focus is on getting the builds of Titanium apps automated.

What is Titanium SDK?


Titanium is an application framework developed by Appcelerator, whose purpose is to enable rapid development of mobile apps for multiple platforms. Currently, Titanium supports iOS, Android, Tizen, Blackberry and mobile web applications.

With Titanium, developers use JavaScript as an implementation language. The JavaScript code is packaged along with the produced app bundles, deployed to an emulator or device and interpreted there. For example, on Android Google's V8 JavaScript runtime is used, and on iOS Apple's JavaScriptCore is used.

Besides using JavaScript code, Titanium also provides an API supporting database access and (fairly) cross platform GUI widgets that have a (sort of) native look on each platform.

Titanium is not a write once, run anywhere approach when it comes to cross-platform support, but it claims that 60-90% of the app code can be reused among platforms.

Finally, the Titanium Studio software distribution is proprietary software, but most of its underlying components (including the Titanium SDK) are free and open-source software available under the Apache Software License. As far as I can see, the Nix function that I wrote does not depend on any proprietary components, besides the Java Development Kit.

Packaging the Titanium CLI


The first thing that needs to be done to automate Titanium builds is being able to build stuff from the command-line. Appcelerator provides a command-line utility (CLI) that is specifically designed for this purpose and is provided as a Node.js package that can be installed through the NPM package manager.

Packaging NPM stuff in Nix is actually quite straightforward and probably the easiest part of getting the builds of Titanium apps automated. Simply adding titanium to the list of node packages (pkgs/top-level/node-packages.json) in Nixpkgs and running npm2nix, a utility developed by Shea Levy that automatically generates Nix expressions for any node package and all its dependencies, did the job for me.
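
To illustrate what consuming the result could look like, here is a minimal sketch of a derivation that puts the CLI in a build environment. It assumes that the generated package ends up as the Nixpkgs attribute nodePackages.titanium (the usual convention for npm2nix-generated packages); the derivation itself is hypothetical and merely checks that the titanium executable is available.

# Hypothetical sanity check: nodePackages.titanium is assumed to be the
# attribute that npm2nix generates for the Titanium CLI.
with import <nixpkgs> {};

stdenv.mkDerivation {
  name = "titanium-cli-check";
  buildInputs = [ nodePackages.titanium ];
  buildCommand = ''
    # Record where the titanium executable ends up, to prove that the CLI
    # is on the PATH of the build environment.
    type -p titanium > $out
  '';
}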

Packaging the Titanium SDK


The next step is packaging the Titanium SDK, which contains API libraries, templates and build script plugins for each target platform. The CLI supports multiple SDK versions at the same time and requires at least one version of the SDK to be installed.

I've obtained an SDK version from Appcelerator's continuous builds page. Since the SDK distributions are ZIP files containing binaries, I have to use the patching/wrapping tricks I have described in a few earlier blog posts again.

The Nix expression I wrote for the SDK basically unzips the 3.1.4 distribution, copies the contents into the Nix store and makes the following changes (a condensed, hypothetical sketch that ties these steps together is shown after the list):

  • The SDK distribution contains a collection of Python scripts that execute build and debugging tasks. However, to be able to run them in NixOS, the shebangs must be changed so that the Python interpreter can be found:


    find . -name \*.py | while read i
    do
    sed -i -e "s|#!/usr/bin/env python|#!${python}/bin/python|" $i
    done
  • The SDK contains a subdirectory (mobilesdk/3.1.4.v20130926144546) with a version number and timestamp in it. However, the timestamp is a bit inconvenient, because the Titanium CLI explicitly checks for SDK folders that correspond to a Titanium SDK version number in a Titanium project file (tiapp.xml). Therefore, I strip it out of the directory name to make my life easier:


    $ cd mobilesdk/*
    $ mv 3.1.4.v20130926144546 3.1.4.GA
  • The Android builder script (mobilesdk/*/android/builder.py) packages certain files into an APK bundle (which is technically a ZIP file).

    However, the script throws an exception if it encounters files with timestamps below January 1, 1980, which are not supported by the ZIP file format. This is a problem, because Nix automatically resets timestamps of deployed packages to one second after January 1, 1970 (a.k.a. UNIX-time: 1) to make builds more deterministic. To remedy the issue, I had to modify several pieces of the builder script.

    What I basically did to fix this was to search for invocations of ZipFile.write() that add a file from the filesystem to a zip archive, such as:


    apk_zip.write(os.path.join(lib_source_dir, 'libtiverify.so'), lib_dest_dir + 'libtiverify.so')

    I refactored such invocations into a code fragment using a file stream:


    info = zipfile.ZipInfo(lib_dest_dir + 'libtiverify.so')
    info.compress_type = zipfile.ZIP_DEFLATED
    info.create_system = 3
    tf = open(os.path.join(lib_source_dir, 'libtiverify.so'), 'rb')
    apk_zip.writestr(info, tf.read())
    tf.close()

    The above code fragment ignores the timestamp of the files to be packaged and uses the current time instead, thus fixing the issue with files that reside in the Nix store.
  • There were two ELF executables (titanium_prep.{linux32,linux64}) in the distribution. To be able to run them under NixOS, I had to patch them so that the dynamic linker can be found:


    $ patchelf --set-interpreter ${stdenv.gcc.libc}/lib/ld-linux-x86-64.so.2 \
    titanium_prep.linux64
  • The Android builder script (mobilesdk/*/android/builder.py) requires the sqlite3 python module and the Java Development Kit. Since dependencies do not reside in standard locations in Nix, I had to wrap the builder script to allow it to find them:


    mv builder.py .builder.py
    cat > builder.py <<EOF
    #!${python}/bin/python

    import os, sys

    os.environ['PYTHONPATH'] = '$(echo ${python.modules.sqlite3}/lib/python*/site-packages)'
    os.environ['JAVA_HOME'] = '${jdk}/lib/openjdk'

    os.execv('$(pwd)/.builder.py', sys.argv)
    EOF

    Although the Nixpkgs collection has a standard function (wrapProgram) to easily wrap executables, I could not use it, because this function turns any executable into a shell script. The Titanium CLI expects this builder script to be a Python script and will fail if there is shell code around it.
  • The iOS builder script (mobilesdk/osx/*/iphone/builder.py) invokes ditto to do a recursive copy of a directory hierarchy. However, this executable cannot be found in a Nix builder environment, since the PATH environment variable is set to only the dependencies that are specified. The following command fixes it:


    $ sed -i -e "s|ditto|/usr/bin/ditto|g" \
    $out/mobilesdk/osx/*/iphone/builder.py
  • When building IPA files for iOS devices, the Titanium CLI invokes xcodebuild, which in turn invokes the Titanium CLI again. However, it does not seem to propagate all parameters properly, such as the path to the CLI's configuration file. The following modification allows me to set an environment variable called NIX_TITANIUM_WORKAROUND that provides additional parameters to work around it:


    $ sed -i -e "s|--xcode|--xcode '+process.env['NIX_TITANIUM_WORKAROUND']+'|" \
    $out/mobilesdk/osx/*/iphone/cli/commands/_build.js
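
To give an impression of how these modifications could be combined, the sketch below shows roughly what such an SDK derivation might look like. It is not the actual Nixpkgs expression: the URL and hash are placeholders, only the first two patching steps are spelled out, and the remaining ones would be added to the installPhase in the same fashion.

{ stdenv, fetchurl, unzip, python }:

stdenv.mkDerivation {
  name = "titanium-sdk-3.1.4";

  # Placeholder URL and hash: the real ZIP distribution comes from
  # Appcelerator's continuous builds page.
  src = fetchurl {
    url = http://builds.example.org/mobilesdk-3.1.4.v20130926144546-linux.zip;
    sha256 = "0000000000000000000000000000000000000000000000000000";
  };

  buildInputs = [ unzip ];

  installPhase = ''
    mkdir -p $out
    cp -av * $out
    cd $out

    # Fix the shebangs of the Python build scripts
    find . -name \*.py | while read i
    do
        sed -i -e "s|#!/usr/bin/env python|#!${python}/bin/python|" $i
    done

    # Strip the timestamp from the SDK version directory
    cd mobilesdk/*
    mv 3.1.4.v20130926144546 3.1.4.GA

    # The remaining steps (patching builder.py's ZIP handling, running
    # patchelf on the titanium_prep binaries and wrapping builder.py)
    # would follow here, as described in the list above.
  '';
}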

Building Titanium Apps


Besides getting the Titanium CLI and SDK packaged in Nix, we must also be able to build Titanium apps. Apps can be built for various target platforms and come in several variants.

For some unknown reason, the Titanium CLI (in contrast to the old Python build script) forces people to log in with their Appcelerator account before any build task can be executed. However, I discovered that after logging in, a file is written into the ~/.titanium folder indicating that the system has logged in. I can simulate logins by creating this file myself:


export HOME=$TMPDIR

mkdir -p $HOME/.titanium
cat > $HOME/.titanium/auth_session.json <<EOF
{ "loggedIn": true }
EOF

We also have to tell the Titanium CLI where the Titanium SDK can be found. The following command-line instruction updates the config to provide the path to the SDK that we have just packaged:


$ echo "{}" > $TMPDIR/config.json
$ titanium --config-file $TMPDIR/config.json --no-colors \
config sdk.defaultInstallLocation ${titaniumsdk}

I have also noticed that if the SDK version specified in a Titanium project file (tiapp.xml) does not match the version of the installed SDK, the Titanium CLI halts with an exception. Of course, the version number in a project file can be adapted, but in my opinion, it's more flexible to just be able to take any version. The following instruction replaces the version inside tiapp.xml with something else:


$ sed -i -e "s|<sdk-version>[0-9a-zA-Z\.]*</sdk-version>|<sdk-version>${tiVersion}</sdk-version>|" tiapp.xml

Building Android apps from Titanium projects


For Android builds, we must tell the Titanium CLI where to find the Android SDK. The following command-line instruction adds its location to the config file:


$ titanium config --config-file $TMPDIR/config.json --no-colors \
android.sdkPath ${androidsdkComposition}/libexec/android-sdk-*

The variable: androidsdkComposition refers to an Android SDK plugin composition provided by the Android SDK Nix expressions I have developed earlier.
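
For illustration, such a composition could be defined along the following lines. This is a hypothetical sketch: the parameter names are recalled from the androidenv blog posts and may not match the actual androidenv.androidsdk signature, so consult pkgs/development/mobile/androidenv for the authoritative interface.

# Hypothetical sketch of an Android SDK composition; the parameter names
# below are assumptions and may differ from the real androidenv.androidsdk
# function.
with import <nixpkgs> {};

androidenv.androidsdk {
  platformVersions = [ "18" ];
  abiVersions = [ "armeabi-v7a" "x86" ];
  useGoogleAPIs = false;
}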

After performing the previous operation, the following command-line instruction can be used to build a debug version of an Android app:


$ titanium build --config-file $TMPDIR/config.json --no-colors --force \
--platform android --target emulator --build-only --output $out

If the above command succeeds, an APK bundle called app.apk is placed in the Nix store output folder. This bundle contains all the project's JavaScript code and is signed with a developer key.

The following command produces a release version of the APK (meant for submission to the Play Store) in the Nix store output folder, with a given key store, key alias and key store password:


$ titanium build --config-file $TMPDIR/config.json --no-colors --force \
--platform android --target dist-playstore --keystore ${androidKeyStore} \
--alias ${androidKeyAlias} --password ${androidKeyStorePassword} \
--output-dir $out

Before the JavaScript files are packaged along with the APK file, they are first passed through Google's Closure Compiler, which performs some static checking, removes dead code, and minifies all the source files.

Building iOS apps from Titanium projects


Apart from Android, we can also build iOS apps from Titanium projects.

I have discovered that while building for iOS, the Titanium CLI invokes xcodebuild which in turn invokes the Titanium CLI again. However, it does not propagate the --config-file parameter, causing it to fail. The earlier hack that I made in the SDK expression with the environment variable can be used to circumvent this:


export NIX_TITANIUM_WORKAROUND="--config-file $TMPDIR/config.json"

After applying the workaround, building an app for the iPhone simulator is straightforward:


$ cp -av * $out
$ cd $out

$ titanium build --config-file $TMPDIR/config.json --force --no-colors \
--platform ios --target simulator --build-only \
--device-family universal --output-dir $out

After running the above command, the simulator executable is placed into the output Nix store folder. It turns out that the JavaScript files of the project folder are symlinked into the folder of the executable. However, after the build has completed these symlink references will become invalid, because the temp folder has been deleted. To allow the app to find these JavaScript files, I simply copy them along with the executable into the Nix store.

Finally, the most complicated task is producing IPA bundles to deploy an app to a device for testing or to the App Store for distribution.

Like native iOS apps, they must be signed with a certificate and mobile provisioning profile. I used the same trick described in an earlier blog post on building iOS apps with Nix to generate a temporary keychain in the user's home directory for this:


export HOME=/Users/$(whoami)
export keychainName=$(basename $out)

security create-keychain -p "" $keychainName
security default-keychain -s $keychainName
security unlock-keychain -p "" $keychainName
security import ${iosCertificate} -k $keychainName -P "${iosCertificatePassword}" -A

provisioningId=$(grep UUID -A1 -a ${iosMobileProvisioningProfile} | grep -o "[-A-Z0-9]\{36\}")

if [ ! -f "$HOME/Library/MobileDevice/Provisioning Profiles/$provisioningId.mobileprovision" ]
then
mkdir -p "$HOME/Library/MobileDevice/Provisioning Profiles"
cp ${iosMobileProvisioningProfile} \
"$HOME/Library/MobileDevice/Provisioning Profiles/$provisioningId.mobileprovision"
fi

I also discovered that builds fail because some file (the facebook module) from the SDK cannot be read (Nix makes deployed packages read-only). I circumvented this issue by making a copy of the SDK in my temp folder, fixing the file permissions, and configuring the Titanium CLI to use the copied SDK instance:


cp -av ${titaniumsdk} $TMPDIR/titaniumsdk

find $TMPDIR/titaniumsdk | while read i
do
chmod 755 "$i"
done

titanium --config-file $TMPDIR/config.json --no-colors \
config sdk.defaultInstallLocation $TMPDIR/titaniumsdk

Because I cannot use the temp folder as a home directory, I also have to simulate a login again:


$ mkdir -p $HOME/.titanium
$ cat > $HOME/.titanium/auth_session.json <<EOF
{ "loggedIn": true }
EOF

Finally, I can build an IPA by running:


$ titanium build --config-file $TMPDIR/config.json --force --no-colors \
--platform ios --target dist-adhoc --pp-uuid $provisioningId \
--distribution-name "${iosCertificateName}" \
--keychain $HOME/Library/Keychains/$keychainName \
--device-family universal --output-dir $out

The above command-line invocation minifies the JavaScript code, builds an IPA file with a given certificate, mobile provisioning profile and authentication credentials, and puts the result in the Nix store.

Example: KitchenSink


I have encapsulated all the build commands shown in the previous sections into a Nix function called: titaniumenv.buildApp {}. To test the usefulness of this function, I took KitchenSink, an example app provided by Appcelerator to demonstrate Titanium's abilities. The app can be deployed to all target platforms that the SDK supports.

To package KitchenSink, I wrote the following expression:


{ titaniumenv, fetchgit
, target, androidPlatformVersions ? [ "11" ], release ? false
}:

titaniumenv.buildApp {
  name = "KitchenSink-${target}-${if release then "release" else "debug"}";
  src = fetchgit {
    url = https://github.com/appcelerator/KitchenSink.git;
    rev = "d9f39950c0137a1dd67c925ef9e8046a9f0644ff";
    sha256 = "0aj42ac262hw9n9blzhfibg61kkbp3wky69rp2yhd11vwjlcq1qc";
  };
  tiVersion = "3.1.4.GA";

  inherit target androidPlatformVersions release;

  androidKeyStore = ./keystore;
  androidKeyAlias = "myfirstapp";
  androidKeyStorePassword = "mykeystore";
}

The above function fetches the KitchenSink example from GitHub and builds it for a given target, such as iphone or android. It supports building a debug version for an emulator/simulator, or a release version for a device or for the Play Store/App Store.

By invoking the above function as follows, a debug version of the app for Android is produced:


import ./kitchensink {
  inherit (pkgs) fetchgit titaniumenv;
  target = "android";
  release = false;
}
The following function invocation produces an iOS executable that can be run in the iPhone simulator:


import ./kitchensink {
  inherit (pkgs) fetchgit titaniumenv;
  target = "iphone";
  release = false;
}

As may be observed, building KitchenSink through Nix is a straightforward process for most targets. However, the target producing an IPA version of KitchenSink that we can deploy to a real device is a bit complicated to use, because of some restrictions made by Apple.

Since all apps that are deployed to a real device have to be signed and the mobile provisioning profile should match the app's app id, this is sort of a problem. Luckily, I can do a comparable renaming trick to the one I described earlier in a blog post about improving the testability of iOS apps. Simply executing the following commands in the KitchenSink folder was sufficient:


sed -i -e "s|com.appcelerator.kitchensink|${newBundleId}|" tiapp.xml
sed -i -e "s|com.appcelerator.kitchensink|${newBundleId}|" manifest

The above commands change the com.appcelerator.kitchensink app id into any other specified string. If this app id is changed to the corresponding id in a mobile provisioning profile, then you should be able to deploy KitchenSink to a real device.

I have added the above renaming procedure to the KitchenSink expression. The following example invocation of the earlier Nix function shows how we can rename the app's id to com.example.kitchensink and how to use a certificate and mobile provisioning profile for an existing app:


import ./kitchensink {
  inherit (pkgs) stdenv fetchgit titaniumenv;
  target = "iphone";
  release = true;
  rename = true;
  newBundleId = "com.example.kitchensink";
  iosMobileProvisioningProfile = ./profile.mobileprovision;
  iosCertificate = ./certificate.p12;
  iosCertificateName = "Cool Company";
  iosCertificatePassword = "secret";
}


By using the above expressions KitchenSink can be built for both Android and iOS. The left picture above shows what it looks like on iOS, the right picture shows what it looks like on Android.

Discussion


With the Titanium build function described in this blog post, I can automatically build Titanium apps for both iOS and Android using the Nix package manager, although it was quite painful to get it done and it remains tedious to maintain.

What bothers me the most about this process is the fact that Appcelerator has crafted their own custom build tool with lots of complexity (in terms of code size), flaws (e.g. not propagating the CLI's arguments properly from xcodebuild) and weird issues (e.g. an odd way of detecting the presence of the JDK, and invoking the highly complicated legacy Python scripts), while there are already many more mature build solutions available that can do the same job.

A quick inspection of the Titanium CLI's git repository shows me that it consists of 8174 lines of code. However, not all of the build functionality resides there. Some common pieces, such as the JDK and Android detection code, reside in the node-appc project. Moreover, the build steps are performed by plugin scripts that are distributed with the SDK.

A minor annoyance is that the new Node.js-based Titanium CLI requires Oracle's Java Development Kit to make Android builds work, while the old Python-based build script worked fine with OpenJDK. I have no idea yet how to fix this. Since we cannot provide a Nix expression that automatically downloads Oracle's JDK (due to license restrictions), Nix users are forced to manually download and import it into the Nix store first, before any of the Titanium stuff can be built.

So how did I manage to figure all this mess out?

Besides knowing that I have to patch executables, fix shebangs and wrap certain executables, the strace command on Linux helps me out a lot (since it shows me things like files that cannot be opened), as does the fact that Python and Node.js show me error traces with line numbers when something goes wrong, so that I can easily debug what's going on.

However, since I also have to do builds on Mac OS X for iOS devices, I observed that there is no strace to ease my pain on that particular operating system. Fortunately, I discovered a similar tool called dtruss, which provides me with similar data regarding system calls.

There is one minor annoyance with dtruss -- it requires super-user privileges to work. Fortunately, thanks to this MacWorld article, I can fix this by setting the setuid bit on the dtrace executable:


$ sudo chmod u+s /usr/sbin/dtrace

Now I can conveniently use dtruss in unprivileged build environments on Mac OS X to investigate what's going on.

Availability


The Titanium build environment as well as the KitchenSink example are part of Nixpkgs.

The top-level expression for the KitchenSink example as well as the build operations described earlier is located in pkgs/development/mobile/titaniumenv/examples/default.nix. To build a debug version of KitchenSink for Android, you can run:


$ nix-build -A kitchensink_android_debug

The release version can be built by running:


$ nix-build -A kitchensink_android_release

The iPhone simulator version can be built by running:


$ nix-build -A kitchensink_ios_development
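
Based on the function invocations shown earlier, the composition expression exposing these attributes could look roughly as follows. This is a sketch, not the literal contents of examples/default.nix; the attribute names simply follow the nix-build invocations above.

# Sketch of a composition expression exposing the build attributes used above;
# the function header and exact structure of the real file may differ.
{ pkgs ? import <nixpkgs> {} }:

{
  kitchensink_android_debug = import ./kitchensink {
    inherit (pkgs) fetchgit titaniumenv;
    target = "android";
    release = false;
  };

  kitchensink_android_release = import ./kitchensink {
    inherit (pkgs) fetchgit titaniumenv;
    target = "android";
    release = true;
  };

  kitchensink_ios_development = import ./kitchensink {
    inherit (pkgs) fetchgit titaniumenv;
    target = "iphone";
    release = false;
  };
}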

Building an IPA is slightly more complicated. You have to provide a certificate and mobile provisioning profile, and some renaming trick settings as parameters to make it work (which should of course match what's inside the mobile provisioning profile that is actually used):


$ nix-build --arg rename true \
--argstr newBundleId com.example.kitchensink \
--arg iosMobileProvisioningProfile ./profile.mobileprovision \
--arg iosCertificate ./certificate.p12 \
--argstr iosCertificateName "Cool Company" \
--argstr iosCertificatePassword secret \
-A kitchensink_ipa

There are also a couple of emulator jobs to easily spawn an Android emulator or iPhone simulator instance.

Currently, iOS and Android are the only target platforms supported. I did not investigate Blackberry, Tizen or Mobile web applications.

Reproducing Android app deployments (or playing Angry Birds on NixOS)

Some time ago, I did a couple of fun experiments with my Android phone and the Android SDK. Moreover, I have developed a function that can be used to automate Android builds with Nix.

Not so long ago, somebody asked me if it would be possible to run arbitrary Android apps in NixOS. I realised that this was exactly the goal of my fun experiments. Therefore, I think it would be interesting to report about it.

Obtaining Android apps from a device


Besides development versions of apps that can be built with the Android SDK and deployed to a device or emulator through a USB connection, the major source of acquiring Android apps is the Google Playstore.

Although most devices (such as my phone and tablet) bundle the Google Playstore app as part of their software distributions, the system images that come with the Android SDK do not seem to have the Google Playstore app included.

Despite the fact that emulator system images do not have the Google Playstore app installed, we can still get most of the apps we want deployed to an emulator instance. What I typically do is install an app on my phone with the Google Playstore, then download it from my phone and install it in an emulator instance.

If I attach my phone to the computer and enable USB debugging on my device, I can run the following command to open a shell session:


$ adb -d shell

While navigating through the filesystem, I discovered that my phone stores apps in two locations. The system apps are stored in /system/app. All other apps reside in /data/app. One of the annoying things about the latter folder is that root access on my phone is restricted and I'm not allowed to read its contents:


$ cd /data/app
$ ls
opendir failed, Permission denied

Later I discovered that Android distributions use a tool called pm to deploy Android packages. Running the following command-line instruction gives me an overview of all the installed packages and the locations where they reside on the filesystem:


$ pm list packages -f
package:/system/app/GoogleSearchWidget.apk=android.googleSearch.googleSearchWidget
package:/data/app/com.example.my.first.app-1.apk=com.example.my.first.app
package:/system/app/KeyChain.apk=com.android.keychain
package:/data/app/com.appcelerator.kitchensink-1.apk=com.appcelerator.kitchensink
package:/system/app/Shell.apk=com.android.shell
package:/data/app/com.capcom.smurfsandroid-1.apk=com.capcom.smurfsandroid
package:/data/app/com.rovio.angrybirds-2.apk=com.rovio.angrybirds
package:/data/app/com.rovio.BadPiggies-1.apk=com.rovio.BadPiggies
package:/data/app/com.android.chrome-2.apk=com.android.chrome
...

As can be seen, the package manager shows me the location of all installed apps, including those that reside in the folder that I could not inspect. Moreover, downloading the actual APK files through the Android debugger does not seem to be restricted either. For example, I can run the following Android debugger instruction to obtain the Angry Birds APK that I have installed on my phone:


$ adb -d pull /data/app/com.rovio.angrybirds-2.apk
5688 KB/s (45874906 bytes in 7.875s)

Running arbitrary APKs in the emulator


In my earlier blog posts on automating Android builds with Nix, I have described how I implemented a Nix function (called androidenv.emulateApp { }) that generates scripts spawning emulator instances in which a development app is automatically deployed and started.

I have adapted this function to make it more convenient to deploy existing APKs and to make it more suitable for running apps for other purposes than development:

  • The original script stores the state files of the emulator instance in a temp folder, which gets discarded afterwards. For test automation this is quite useful in most cases. However, we don't want to lose our savegames while playing games. Therefore, I added a parameter called avdHomeDir allowing someone to store the state files in a non-volatile location on the filesystem, such as the user's home directory. If this parameter is not provided, the script keeps using a temp directory.
  • Since we want to keep the state of the emulator instance around, there is also no need to create it every time we launch the emulator. I have adapted the script in such a way that it only creates the AVD if it does not exist. Running the following instruction seems to be sufficient to check whether the AVD exists:


    $ android list avd | grep "Name: device"

  • The same thing applies to the app that gets deployed to the emulator instance. It's only supposed to be deployed if it is not installed yet. Running the following command-line instruction did the trick for me:


    $ adb -s emulator-5554 shell pm list packages | \
    grep package:com.rovio.angrybirds
    package:com.rovio.angrybirds

    It shows me the name of the package if it is installed already.

Automatically starting apps in the emulator


As described in my earlier blog post, the script that launches the emulator can also automatically start the app. To do this, we need the Java package identifier of the app and the name of the start activity. While developing apps, these properties can be found in the manifest file that is part of the development repository. However, it's a bit trickier to obtain these attributes if you only have a binary APK.

I have discovered that the aapt tool (that comes with the Android SDK) is quite useful to find what I need. While running the following command-line instruction with the Angry Birds APK, I discovered the following:


$ aapt l -a com.rovio.angrybirds-2.apk
...
A: package="com.rovio.angrybirds" (Raw: "com.rovio.angrybirds")

E: application (line=47)
A: android:label(0x01010001)="Angry Birds" (Raw: "Angry Birds")
A: android:icon(0x01010002)=@0x7f020001
A: android:debuggable(0x0101000f)=(type 0x12)0x0
A: android:hardwareAccelerated(0x010102d3)=(type 0x12)0x0
E: activity (line=48)
A: android:theme(0x01010000)=@0x1030007
A: android:name(0x01010003)="com.rovio.fusion.App" (Raw: "com.rovio.fusion.App")
A: android:launchMode(0x0101001d)=(type 0x10)0x2
A: android:screenOrientation(0x0101001e)=(type 0x10)0x0
A: android:configChanges(0x0101001f)=(type 0x11)0x4a0
E: intent-filter (line=49)
E: action (line=50)
A: android:name(0x01010003)="android.intent.action.MAIN" (Raw: "android.intent.action.MAIN")
E: category (line=51)
A: android:name(0x01010003)="android.intent.category.LAUNCHER" (Raw: "android.intent.category.LAUNCHER")

Somewhere at the end of the output, the package name is shown (com.rovio.angrybirds) and the app's activities. The activity that supports android.intent.action.MAIN intent is actually the one we are looking for. According to the information that we have collected, the start activity that we have to call is named com.rovio.fusion.App.

Writing a Nix expression


Now that we have retrieved the Angry Birds APK and discovered the attributes to automatically start it, we can automate the process that sets up an emulator instance. I wrote the following Nix expression to do this:


with import <nixpkgs> {};

androidenv.emulateApp {
  name = "angrybirds";
  app = ./com.rovio.angrybirds-2.apk;
  platformVersion = "18";
  useGoogleAPIs = false;
  enableGPU = true;
  abiVersion = "x86";

  package = "com.rovio.angrybirds";
  activity = "com.rovio.fusion.App";

  avdHomeDir = "$HOME/.angrybirds";
}

The above Nix expression sets the following parameters:
  • The name parameter is simply used to make the Nix store path more readable.
  • The app parameter points to the Angry Birds APK that I just downloaded from my phone. It gets automatically installed in the spawned emulator instance.
  • platformVersion refers to the API level of the system image that the emulator runs. API level 18 corresponds to Android version 4.3.
  • If we need Google specific functionality (such as Google Maps) we need a Google API-enabled system image. Angry Birds does not seem to require it.
  • To allow games to run smoothly, it's better to enable hardware GPU emulation/acceleration through the enableGPU parameter.
  • The abiVersion sets the CPU architecture of the emulator. Most apps are actually developed for armeabi-v7a, which is usually the safest or the only option that works (unless the app does not use any native code or supports other desired architectures). Angry Birds also supports x86, which can be emulated much faster.
  • The package and activity parameters are used to automatically start the app.
  • We use the avdHomeDir parameter to persistently store the state of the emulator in the .angrybirds folder of my home directory, so that the progress is retained.

I can build the earlier Nix expression with the following command:


$ nix-build angrybirds.nix

And then play Angry Birds, by running:


./result/bin/run-test-emulator

The above script starts the emulator, installs Angry Birds, and starts it. This is the result (to rotate the screen I used the 7 and 9 keys on the numpad):


Isn't it awesome? ;)

Transferring state


I also discovered how to transfer the state of apps (such as settings and savegames) from a device to an emulator instance and vice versa. For some games, you can obtain this state through Android's backup functionality. The following instruction makes a backup of the state of a particular app on my phone:


$ adb -d backup com.rovio.angrybirds -f state
Now unlock your device and confirm the backup operation.

When running the above instruction, you'll be asked to confirm the backup and to optionally provide some details to encrypt it.

With the following instruction, I can restore the captured state in the emulator:


$ adb -s emulator-5554 restore com.rovio.angrybirds -f state
Now unlock your device and confirm the backup operation.

While running the latter operation, you'll also be asked for confirmation.

Conclusion


In this blog post, I have described how we can automatically deploy existing Android APKs in an emulator instance using the Nix package manager. I have used it to play Angry Birds and a couple of other Android games in NixOS.

There are a few caveats that you have to keep in mind:

  • I have observed that quite a few apps, especially games, have native dependencies. Most of these games only seem to work on ARM-based systems. Although x86 images are much faster to emulate, you will not benefit from the speed boost they may give you if this CPU architecture is not supported.
  • Some apps use Google API specific functionality. Unfortunately, the Android SDK does not provide non-ARM based system images that support them. In a previous blog post, I have developed a Nix expression that can be used to create x86 Google API enabled system images from the ARM-based images, although it may be a bit tricky to set them up.
  • Some apps may install additional files besides the APK when they are installed through the Google Playstore. In such cases, running adb logcat and inspecting the error messages in the logs helped me out a few times.

Availability


The androidenv.emulateApp { } function is part of Nixpkgs.

It's also important to point out that the Nixpkgs repository does NOT contain any prepackaged Android games or apps. You have to obtain and deploy these apps yourself!

Implementing consistent layouts for websites

Recently, I have wiped the dust off an old dormant project and I have decided to put it on GitHub, since I have found some use for it again. It is a personal project I started a long time ago.

Background


I got the inspiration for this project while working on my bachelor's thesis internship project at IBM in 2005. I was developing an application usage analyzer system which included a web front-end implementing their intranet layout. I observed that it was a bit tedious to get it implemented properly. Moreover, I noticed that I had to repeat the same patterns over and over again for each page.

I saw some "tricks" that other people used to cope with these issues, but I considered all of them workarounds -- they were basically a bunch of includes in combination with a bit of iteration to make it work, but they looked overly complicated and had all kinds of issues.

Some time before my internship, I learned about the Model-view-controller architectural pattern and I was looking into applying this pattern to the web front-end I was developing.

After some searching on the web using the MVC and Java Enterprise Edition keywords (the latter being the underlying technology used to implement the system), I stumbled upon a JavaWorld article titled 'Understanding JavaServer Pages Model 2 architecture'. Although the article was specifically about the Model 2 architecture, I considered the Model 1 variant -- also described in the same article -- good enough for what I needed.

I observed that every page of an intranet application looks quite similar to the others. For example, they had the same kinds of sections, the same style, the same colors, etc. The only major differences were the selected menu item in the menu section and the contents (such as text) being displayed.

I created a model of the intranet layout that basically encodes the structure of the menu section that is being displayed on all pages of the web application. Each item in the menu redirects the user to the same page which -- based on the selected menu option -- displays different contents and a different "active" link. To cite my bachelor's thesis (which was written in Dutch):

De menu instantie bevat dus de structuur van het menu en de JSP zorgt ervoor dat het menu in de juiste opmaak wordt weergegeven. Deze aanpak is gebaseerd is op het Model 1 [model1] architectuur:

which I could translate into something like:

Hence, the menu instance contains the structure of the menu and the JSP is responsible for properly displaying the menu structure. This approach is based on the Model 1 [model1] architecture.

(As a sidenote: the website I am referring to calls "JSP Model 1" an architecture, which I blindly adopted in my thesis. These days, I would not call MVC an architecture, but rather an architectural pattern!)

I was quite satisfied with my implementation of the web front-end and some of my coworkers liked the fact that I was capable of implementing the intranet layout completely on my own and to be able to create and modify pages so easily.

Creating a library


After my internship, I was not too satisfied with the web development work I had done prior to it. I had developed several websites and web applications that I still maintained, but all of them were implemented in an ad-hoc way -- one web application had a specific aspect implemented in a better way than the others. Moreover, I kept reimplementing similar patterns over and over again, including layout elements. I also did not reuse code effectively, apart from a bit of copying and pasting.

From that moment on, I wanted everything I developed to have the same (and the best possible) quality and to reuse as much code as possible, so that every project would benefit from it.

I started a new library project from scratch. In fact, it was two library projects for two different programming languages. Initially, I started implementing a Java Servlet/JSP version, since I became familiar with it during my internships at IBM and I considered it to be good and interesting technology to use.

However, all my past projects were implemented in PHP, and most of the web applications I maintained were hosted at shared webhosting providers that only support PHP. As a result, I also developed a PHP version, which became the version that I actually used most of the time.

I could not use any code from my internship. Apart from the fact that it was IBM's property, it was also too specific to IBM intranet layouts. Moreover, I needed something even more general and flexible, so that I could encode all the layouts that I had implemented myself in the past. However, I kept in mind the ideas of the Model-1 and Model-2 architectural patterns that I had discovered.

Moreover, I also studied some usability heuristics (provided by the Nielsen-Norman Group) which I tried to implement in the library:

  • Visibility of system status. I tried to support this aspect by ensuring that the selected links in the menu section are explicitly marked as such, so that users always know where they are in the navigation structure.
  • The "Consistency and standards" aspect was supported by the fact that every page has the same kinds of sections with the same purposes. For example, the menu sections have the same behavior, as does the result of clicking on a link.
  • I tried to support "Error prevention" by automatically hiding menu links that are not accessible.

I kept evolving and improving the libraries until early 2009. The last thing I did with them was implementing my own personal homepage, which is still up and running today.

Usage


So how can these libraries be used? First, a model has to be created which captures common layout properties and the sub pages of which the application consists. In PHP, a simple application model could be defined as follows:

<?php
$application = new Application(
    /* Title */
    "Simple test website",

    /* CSS stylesheets */
    array("default.css"),

    /* Sections */
    array(
        "header" => new StaticSection("header.inc.php"),
        "menu" => new MenuSection(0),
        "submenu" => new MenuSection(1),
        "contents" => new ContentsSection(true)
    ),

    /* Pages */
    new StaticContentPage("Home", new Contents("home.inc.php"), array(
        "page1" => new StaticContentPage("Page 1", new Contents("page1.inc.php"), array(
            "page11" => new StaticContentPage("Subpage 1.1",
                new Contents("page1/subpage11.inc.php")),
            "page12" => new StaticContentPage("Subpage 1.2",
                new Contents("page1/subpage12.inc.php")),
            "page13" => new StaticContentPage("Subpage 1.3",
                new Contents("page1/subpage13.inc.php")))),
        ...
    )))
);

The above code fragment specifies the following:

  • The title of the entire web application is: "Simple test website", which will be visible in the title bar of the browser window for every sub page.
  • Every sub page of the application uses a common stylesheet: default.css
  • Every sub page has the same kinds of sections:
    • The header section always displays the same (static) content, whose code resides in a separate PHP include (header.inc.php)
    • The menu section displays a menu navigation section displaying links reachable from the entry page.
    • The submenu section displays a menu navigation section displaying links reachable from the pages in the previous menu section.
    • The contents section displays the actual dynamic contents (usually text) that makes the page unique based on the link that has been selected in one of the menu sections.
  • The remainder of the code defines the sub pages of which the web application consists. Sub pages are organised in a tree-like structure. The first object is the entry page; the entry page has zero or more sub pages. Each sub page may have sub pages of its own, and so on.

    Every sub page provides its own contents to be displayed in the contents section that has been defined earlier. Moreover, the menu sections automatically display links to the sub pages that are reachable from the current page being displayed.

By calling the following view function with the application model as a parameter, we can display any of its sub pages:
displayRequestedPage($application);
?>

The above function generates a basic HTML page. The title of the page is composed of the application's title and the selected page title. Moreover, the sections are translated to div elements having an id attribute set to their corresponding array key. Each of these divs contains the contents of the include operations. The sub page selection is done by taking the last few path components of the URL that come after the script component.

If I create a "fancy" stylesheet, a bit of basic artwork and some actual contents for each include, something like this could appear on your screen:


Although the HTML generated by displayRequestedPage() is usually sufficient, I could also implement a custom variant if I want to do more advanced stuff. I have decomposed most of its aspects into sub functions that can be easily invoked from a custom function that does something different.

I also created a Java version of the same thing, which predates the PHP version. In the Java version, the model would look like this:

package test;

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
import io.github.svanderburg.layout.model.*;
import io.github.svanderburg.layout.model.page.*;
import io.github.svanderburg.layout.model.page.content.*;
import io.github.svanderburg.layout.model.section.*;

public class IndexServlet extends io.github.svanderburg.layout.view.IndexServlet
{
    private static final long serialVersionUID = 6641153504105482668L;

    private static final Application application = new Application(
        /* Title */
        "Test website",

        /* CSS stylesheets */
        new String[] { "default.css" },

        /* Pages */
        new StaticContentPage("Home", new Contents("home.jsp"))
            .addSubPage("page1", new StaticContentPage("Page 1", new Contents("page1.jsp"))
                .addSubPage("subpage11", new StaticContentPage("Subpage 1.1",
                    new Contents("page1/subpage11.jsp")))
                .addSubPage("subpage12", new StaticContentPage("Subpage 1.2",
                    new Contents("page1/subpage12.jsp")))
                .addSubPage("subpage13", new StaticContentPage("Subpage 1.3",
                    new Contents("page1/subpage13.jsp"))))
            ...
        )
        /* Sections */
        .addSection("header", new StaticSection("header.jsp"))
        .addSection("menu", new MenuSection(0))
        .addSection("submenu", new MenuSection(1))
        .addSection("contents", new ContentsSection(true));

    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
        throws ServletException, IOException
    {
        dispatchLayoutView(application, req, resp);
    }

    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
        throws ServletException, IOException
    {
        dispatchLayoutView(application, req, resp);
    }
}

As may be observed, since Java is a statically typed language, more code is needed to express the same thing. Furthermore, Java has no associative arrays in its language, so I decided to use fluent interfaces instead.

Moreover, the model is also embedded in a Java Servlet, that dispatches the requests to a JSP page (WEB-INF/index.jsp) that represents the view. This JSP page could be implemented as follows:

<%@ page language="java" contentType="text/html; charset=UTF-8"
pageEncoding="UTF-8" import="io.github.svanderburg.layout.model.*"
import="io.github.svanderburg.layout.model.page.*,test.*"%>
<%
Application app = (Application)request.getAttribute("app");
Page currentPage = (Page)request.getAttribute("currentPage");
%>
<%@ taglib uri="http://svanderburg.github.io" prefix="layout" %>
<layout:index app="<%= app %>" currentPage="<%= currentPage %>" />

The above page takes the application model and the current page (determined by the URL used to call it) as request parameters. It invokes the index taglib (instead of a function, as in PHP) to compose an HTML page from them. Moreover, I have also encoded sub parts of the index page as reusable taglibs.

Other features


Besides the simple usage scenario shown earlier, the libraries support a collection of other interesting features, such as:

  • Multiple content section support
  • Per-page style and script includes
  • Error pages
  • Security handling
  • Controller sections to handle GET or POST parameters. In Java, you can invoke Java Servlets to do this, making the new library technically compliant with the JSP Model-2 architectural pattern.
  • Using path components as parameters
  • Internationalised sub pages

Conclusion


In this blog post, I have described an old dormant project that I revived and released. I always had the intention to release it as free/open-source software in the past, but never actually did it until now.

These days, some people do not really consider me a "web guy". I was very active in this domain a long time ago, but I (sort of) put that interest into the background, although I am still very much involved with web application development today (in addition to software deployment techniques and several other interests).

This interesting oatmeal comic clearly illustrates one of the major reasons why I have put my web technology interests into the background. This talk about web technology from Zed Shaw has an overlap with my other major reason.

Today, I am not so interested anymore in making web sites for people or to make this library a killer feature, but I don't mind sharing code. The only thing I care about at this moment is to use it to please myself.

Availability


The Java (java-sblayout) as well as the PHP (php-sblayout) versions of the libraries can be obtained from my GitHub page and used under the terms and conditions of the Apache Software License version 2.0.

Structured asynchronous programming (Asynchronous programming with JavaScript part 3)

A while ago, I explained that JavaScript execution environments, such as a web browser or Node.js, do not support multitasking. Such environments have a single event loop and when JavaScript code is being executed, nothing else can be done. As a result, it might (temporarily or indefinitely) block the browser or prevent a server from handling incoming connections.

In order to execute multiple tasks concurrently, events are typically generated (such as ticks or timeouts), the execution of the program is stopped so that the event loop can process events, and eventually execution is resumed by invoking the callback function attached to an event. This model works as long as implementers properly "cooperate".

One of its undesired side effects is that code is much harder to structure due to the extensive use of callback functions. Many solutions have been developed to cope with this. In my previous blog posts I have covered the async library and promises as possible solutions.

However, after reading a few articles on the web, some discussion, and some thinking, I came to the observation that asynchronous programming -- that is, programming in environments in which executions have to be voluntarily interrupted and resumed between statements and, as a consequence, cannot immediately deliver their results within the same code block -- is an entirely different programming world.

To me, one of the most challenging parts of programming (regardless of what languages and tools are being used) is being able to decompose and translate problems into units that can be programmed using concepts of a programming language.

In an asynchronous programming world, you have to unlearn most of the concepts that are common in the synchronous programming world (to which JavaScript essentially belongs, in my opinion) and replace them with different ones.

Are callbacks our new generation's "GOTO statement"?


When I think about unlearning programming language concepts, a classic (and very famous) example that comes to mind is the "GOTO statement". In fact, a few other programmers using JavaScript claim that the usage of callbacks in JavaScript (and other programming languages as well) is our new generation's "GOTO statement".

In his famous essay titled "A case against the GO TO statement" (published as "Go To Statement Considered Harmful" in the March 1968 issue of the "Communications of the ACM"), Edsger Dijkstra said the following about it:

I became convinced that the go to statement should be abolished from all "higher level" programming languages (i.e. everything except -perhaps- plain machine code)

As a consequence, nearly every modern programming language used these days lacks the GOTO statement, and people generally consider it a bad practice to use it. But I have the impression that most of us seem to have forgotten why.

To re-explain Dijkstra's essay a bit in my own words: it was mainly about getting programs correctly implemented by construction. He briefly refers to three mental aids programmers can use (which he explains in more detail in his manuscript titled: "Notes on Structured Programming") namely: enumeration, mathematical induction, and abstraction:

  • The first mental aid: enumeration, is useful to determine the correctness of a code block executing sequential and conditional (e.g. if-then-else or switch) statements.

    Basically, it is about stepping through each statement sequentially and reasoning for each step whether some invariant holds. You could address each step independently with what he describes as "a single textual index".
  • The second mental aid: mathematical induction, comes in handy when working with (recursive) procedures and loops (e.g. while and doWhile loops).

    In his manuscript, he shows that the validity of a particular invariant can be proved by looking at the basis (the first step of an iteration) first and then generalizing the proof to all successive steps.

    For these kinds of proofs, a single textual index no longer suffices to address each step. However, using an additional dynamic index that represents each successive procedure call or iteration step still allows one to uniquely address them. The previous index and this second (dynamic) index constitute something that he calls "an independent coordinate system".
  • Finally, abstraction (i.e. encapsulating common operations into a procedure) is useful in many ways to me. One of the things Dijkstra said about this is that somebody basically just has to think about "what it does", disregarding "how it works".

The advantage of "an independent coordinate system" is that the value of a variable can be interpreted only with respect to the progress of the process. According to Dijkstra, using the "GOTO statement" makes it quite hard (though not impossible) to define a meaningful set of such coordinates, which makes it harder to reason about correctness and to keep your program from becoming a mess.

So what are these coordinates really about, you may wonder? Initially, they sounded a bit abstract to me, but after some thinking, I noticed that the way execution/error traces are presented in commonly used programming languages these days (e.g. when capturing an exception or using a debugger) uses a coordinate system like that IMHO.

These traces have coordinates with two dimensions -- the first dimension is the name of the text file and the corresponding line number that we are currently at (assuming that each line contains a single statement). The second dimension is the stack of function invocations, each showing its location in the corresponding text file. It also makes sense to me that adding the effects of GOTOs (even when marking each of them with an individual number) to such traces is not helpful, because there could be so many of them that these traces become unreadable.
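As an illustration of such a two-dimensional coordinate system (my own example, not taken from Dijkstra's text), consider the following Node.js fragment and the (abbreviated, hypothetical) trace it produces:

function inner() {
    throw new Error("boom");
}

function outer() {
    inner();
}

outer();

/*
 * Running this fragment produces a trace that looks roughly like:
 *
 *   Error: boom
 *       at inner (example.js:2:11)  <- textual index: file + line
 *       at outer (example.js:6:5)   <- dynamic index: the call stack
 *       at Object.<anonymous> (example.js:9:1)
 */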

However, when using structured programming concepts (as described in his manuscript), such as sequential decomposition, alteration (e.g. if-then-else and switch), and repetition (e.g. while-do and repeat-until), the first two mental aids can be effectively used to prove validity, mainly because the structure of the program at runtime stays quite close to its static representation.

JavaScript language constructs


Like many other conventional programming languages that are in use these days, the JavaScript programming language supports structured programming language concepts, as well as a couple of other concepts, such as functional programming and object oriented programming through prototypes. Moreover, JavaScript lacks the goto statement.

JavaScript was originally "designed" to work in a synchronous world, which makes me wonder: what are the effects of using JavaScript's language concepts in an asynchronous world? And are the implications of these effects similar to the effects of using GOTO statements?

Function definitions


The most basic thing one can do in a language such as JavaScript is executing statements, such as variable assignments or function invocations. This is already something that changes when moving from a synchronous world to an asynchronous world. For example, take the following trivial synchronous function definition that simply prints some text on the console:


function printOnConsole(value) {
    console.log(value);
}

When moving to an asynchronous world, we may want to interrupt the execution of the function (yes I know it is not a very meaningful example for this particular case, but anyway):


function printOnConsole(value, callback) {
    process.nextTick(function() {
        console.log(value);
        callback();
    });
}

Because we generate a tick event first when calling the function and then stop the execution, the function returns immediately without doing its work. The callback, which is invoked later, will do it instead.

As a consequence, we do not know when the execution is finished by merely looking at when a function returns. Instead, a callback function (provided as a function parameter) can be used that gets invoked once the work has been done. This is the reason why JavaScript functions in an asynchronous world use callbacks.
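For example, a caller of the asynchronous printOnConsole() shown above can be notified of completion as follows (a small usage sketch of my own):

printOnConsole("Hello world!", function() {
    console.log("printing has finished");
});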

As a sidenote: I have seen some people claiming that merely changing the function interface to have a callback, makes their code asynchronous. This is absolutely not true. Code becomes asynchronous if it interrupts and resumes its execution. The callback interface is simply a consequence of providing an equivalent for the return statement that has lost its relevance in an asynchronous world.

The same thing holds for functions that return values, such as the following function that translates a single numerical digit into a word:


function generateWord(num) {
    var words = [ "zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine" ];
    return words[num];
}

In an asynchronous world, we have to use a callback to pass the result to the caller:


function generateWord(num, callback) {
    var words;
    process.nextTick(function() {
        words = [ "zero", "one", "two", "three", "four", "five",
            "six", "seven", "eight", "nine" ];
        callback(words[num]);
    });
}

Sequential decomposition


The fact that function interfaces have become different and function invocations have to be done differently, affects all other programming language concepts in JavaScript.

Let's take the simplest structured programming concept: the sequence. Consider the following synchronous code fragment executing a collection of statements in sequential order:


var a = 1;
var b = a + 1;
var number = generateWord(b);
printOnConsole(number); // two

To me it looks straight forward to use enumerative reasoning to conclude that the output shown in the console will be "two".

As explained earlier, in an asynchronous world, we have to pass callback functions as parameters to know when a function has finished its work. As a consequence, each successive statement has to be executed within the corresponding callback. If we do this in a dumb way, we probably end up writing:


var a = 1;
var b = a + 1;

generateWord(b, function(result) {
    var number = result;
    printOnConsole(number, function() {

    }); // two
});

As can be observed in the above code fragment, we end up one indentation level deeper every time we invoke a function, turning the code fragment into pyramid code.

Pyramid code is nasty in many ways. For example, it affects maintenance, because it becomes harder to change the order of two statements. It also becomes hard to add a statement, say, at the beginning of the code block, because that requires us to refactor all the successive statements. Finally, the code becomes a bit harder to read because of the nesting and indentation.

However, it also makes me wonder whether pyramid code is a "new GOTO"? I would say no, because I think we still have not lost our ability to address statements through a "single textual index" and the ability to use enumerative reasoning.

We could also say that invoking a callback function for each function invocation introduces the second, dynamic index, but on the other hand, we know that a given callback is only called by the same caller, so we can discard that second index.

My conclusion is that we still have enumerative reasoning abilities when implementing a sequence. However, the overhead of each enumeration step is (in my opinion) bigger, because we have to take the indentation and callback nesting into account.

Fortunately, I can create an abstraction to clean up this pyramid code:


function runStatement(stmts, index, callback, result) {
    if(index >= stmts.length) {
        if(typeof callback == "function")
            callback(result);
    } else {
        stmts[index](function(result) {
            runStatement(stmts, index + 1, callback, result);
        }, result);
    }
}

function sequence(stmts, callback) {
    runStatement(stmts, 0, callback, undefined);
}

The above function, sequence(), takes an array of functions, each requiring a callback as a parameter. Each function represents a statement. Moreover, since the abstraction is an asynchronous function itself, we also have to use a callback parameter to notify the caller when it has finished. I can refactor the earlier asynchronous code fragment into the following:


var a;
var b;
var number;

slasp.sequence([
    function(callback) {
        a = 1;
        callback();
    },

    function(callback) {
        b = a + 1;
        callback();
    },

    function(callback) {
        generateWord(b, callback);
    },

    function(callback, result) {
        number = result;
        printOnConsole(number); // two
    }
]);

By using the sequence() function, we have eliminated all pyramid code, because we can indent the statements on the same level. Moreover, we can also maintain it better, because we do not have to fix the indentation and callback nesting each time we insert or move a statement.

Alteration


The usage of alteration constructs is also slightly different in an asynchronous world. Consider the following example that basically checks whether some variable contains my first name and lets the user know whether this is the case or not:


function checkMe(name) {
    return name == "Sander";
}

var name = "Sander";

if(checkMe(name)) {
    printOnConsole("It's me!");
    printOnConsole("Isn't it awesome?");
} else {
    printOnConsole("It's someone else!");
}

(As you may have noticed, I intentionally captured the conditional expression in a function; soon it will become clear why).

Again, I think that it will be straight forward to use enumerative reasoning to conclude that the output will be:


It's me!
Isn't it awesome?

When moving to an asynchronous world (which changes the signature of checkMe() to have a callback), things become a bit more complicated:


function checkMe(name, callback) {
    process.nextTick(function() {
        callback(name == "Sander");
    });
}

var name = "Sander";

checkMe(name, function(result) {
    if(result) {
        printOnConsole("It's me!", function() {
            printOnConsole("Isn't it awesome?");
        });
    } else {
        printOnConsole("It's someone else!");
    }
});

We can no longer evaluate the conditional expression within the if-clause. Instead, we have to evaluate it earlier, then use the callback to retrieve the result, and use that result in the if statement's condition.

Although it is a bit inconvenient not being able to directly evaluate a conditional expression, I still do not think this affects the ability to use enumeration, for similar reasons as with the sequential decomposition. The above code fragment basically just adds an additional sequential step, nothing more. So in my opinion, we still have not encountered a new GOTO.

Fortunately, I can also create an abstraction for the above pattern:


function when(conditionFun, thenFun, elseFun, callback) {
    sequence([
        function(callback) {
            conditionFun(callback);
        },

        function(callback, result) {
            if(result) {
                thenFun(callback);
            } else {
                if(typeof elseFun == "function")
                    elseFun(callback);
                else
                    callback();
            }
        }
    ], callback);
}

and use this function to express the if-statement as follows:


slasp.when(function(callback) {
    checkMe(name, callback);
}, function(callback) {
    slasp.sequence([
        function(callback) {
            printOnConsole("It's me!", callback);
        },

        function(callback) {
            printOnConsole("Isn't it awesome?", callback);
        }
    ], callback);
}, function(callback) {
    printOnConsole("It's someone else!", callback);
});

Now I can embed a conditional expression in my artificial when statement.

The same thing applies to the other alteration construct in JavaScript: the switch statement -- you also cannot evaluate a conditional expression directly if it involves an asynchronous function invocation. However, I can also make an abstraction (which I have called circuit) to cope with that, as sketched below.
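As a rough sketch of how that could look (based on the summary table at the end of this post; determineColor() is a hypothetical asynchronous function), a switch over an asynchronously computed value might be written as:

slasp.circuit(function(callback) {
    determineColor(callback); /* hypothetical asynchronous function */
}, function(result, callback) {
    switch(result) {
        case "red":
            printOnConsole("Stop!", callback);
            break;
        case "green":
            printOnConsole("Go!", callback);
            break;
        default:
            callback();
    }
});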

Repetition


How are the repetition constructs (e.g. while and do-while) affected in an asynchronous world? Consider the following example implementing a while loop:


function checkTreshold() {
    return (approx.toString().substring(0, 7) != "3.14159");
}

var approx = 0;
var denominator = 1;
var sign = 1;

while(checkTreshold()) {
    approx += 4 * sign / denominator;
    printOnConsole("Current approximation is: "+approx);

    denominator += 2;
    sign *= -1;
}

The synchronous code fragment shown above implements the Gregory-Leibniz formula to approximate pi up to 5 decimal places. To reason about its correctness, we have to use both enumeration and mathematical induction. First, we reason that the first two components of the series are correct; then we can use induction to reason that each successive component of the series is correct, e.g. that they have an alternating sign and a denominator that increases by 2 for each successive step.

If we move to an asynchronous world, we have a couple of problems, beyond those that are described earlier. First, repetition blocks the event loop for an unknown amount of time so we must interrupt it. Second, if we interrupt a loop, we cannot resume it with a callback. Therefore, we must write our asynchronous equivalent of the previous code as follows:


function checkTreshold(callback) {
    process.nextTick(function() {
        callback(approx.toString().substring(0, 7) != "3.14159");
    });
}

var approx = 0;
var denominator = 1;
var sign = 1;

(function iteration(callback) {
    checkTreshold(function(result) {
        if(result) {
            approx += 4 * sign / denominator;
            printOnConsole("Current approximation is: "+approx, function() {
                denominator += 2;
                sign *= -1;
                setImmediate(function() {
                    iteration(callback);
                });
            });
        }
    });
})();

In the above code fragment, I have refactored the code into a recursive algorithm. Moreover, for each iteration step, I use setImmediate() to generate an event (I cannot use process.nextTick() in Node.js because it skips processing certain kinds of events) and I suspend the execution. The corresponding callback starts the next iteration step.

So is this the new GOTO? I would still say no! Even though we were forced to discard the while construct and use recursion instead, we can still use mathematical induction to reason about its correctness, although certain statements are wrapped in callbacks, which makes things a bit uglier and harder to maintain.

Luckily, I can also capture the above pattern in an abstraction:


function whilst(conditionFun, statementFun, callback) {
    when(conditionFun, function() {
        sequence([
            statementFun,

            function() {
                setImmediate(function() {
                    whilst(conditionFun, statementFun, callback);
                });
            }
        ], callback);
    }, callback);
}

The above function (called: whilst) takes three functions as parameters: the first parameter takes a function returning (through a callback) a boolean that represents the conditional expression, the second parameter takes a function that has to be executed for each iteration, and the third parameter is a callback that gets invoked if the repetition has finished.

Using the whilst() function, I can rewrite the earlier example as follows:


var approx = 0;
var denominator = 1;
var sign = 1;

slasp.whilst(checkTreshold, function(callback) {
    slasp.sequence([
        function(callback) {
            approx += 4 * sign / denominator;
            callback();
        },

        function(callback) {
            printOnConsole("Current approximation is: "+approx, callback);
        },

        function(callback) {
            denominator += 2;
            callback();
        },

        function(callback) {
            sign *= -1;
            callback();
        }
    ], callback);
});

The same thing that we have encountered also holds for the other repetition constructs in JavaScript. doWhile is almost the same, but we have to evaluate the conditional expression at the end of each iteration step. We can refactor a for and for-in loop as a while loop, thus the same applies to these constructs as well. For all these constructs I have developed corresponding asynchronous abstractions: doWhilst, from and fromEach.
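To give an impression (a sketch of my own that mirrors the summary table at the end of this post, assuming the same callback conventions as the whilst() example above), a simple counting for loop could be expressed with the from abstraction as follows:

var i;

slasp.from(function(callback) {
    i = 0;           /* start: initialise the counter */
    callback();
}, function(callback) {
    callback(i < 5); /* condition: keep iterating while i is smaller than 5 */
}, function(callback) {
    i++;             /* step: increase the counter */
    callback();
}, function(callback) {
    printOnConsole("Current value is: "+i, callback); /* loop body */
});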

Exceptions


With all the work done so far, I could already conclude that moving from a synchronous to an asynchronous world (using callbacks) results in a couple of nasty issues, but these issues are definitely not the new GOTO. However, a common extension to structured programming is the use of exceptions, which JavaScript also supports.

What if we expand our earlier example with the generateWord() function to throw an exception if a parameter is given that is not a single positive digit?


function generateWord(num) {
    if(num < 0 || num > 9) {
        throw "Cannot convert "+num+" into a word";
    } else {
        var words = [ "zero", "one", "two", "three", "four", "five",
            "six", "seven", "eight", "nine" ];
        return words[num];
    }
}

try {
    var word = generateWord(1);
    printOnConsole("We have a: "+word);
    word = generateWord(10);
    printOnConsole("We have a: "+word);
} catch(err) {
    printOnConsole("Some exception occurred: "+err);
} finally {
    printOnConsole("Bye bye!");
}

The above code also catches a possible exception and always prints "Bye bye!" on the console, regardless of the outcome.

The problem with exceptions in an asynchronous world is basically the same as with the return statement. We cannot just catch an exception, because it may not have been thrown yet. So instead of throwing and catching exceptions, we must simulate them. This is commonly done in Node.js by introducing another callback parameter called err (the first parameter of the callback) that is not null if some error has been thrown.

Changing the above function definition to throw errors using this callback parameter is straight forward:


function generateWord(num, callback) {
    var words;
    process.nextTick(function() {
        if(num < 0 || num > 9) {
            callback("Cannot convert "+num+" into a word");
        } else {
            words = [ "zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine" ];
            callback(null, words[num]);
        }
    });
}

However, simulating the effects of a throw and the catch and finally clauses is not straight forward. I am not going too much into the details (and it's probably best to just briefly skim over the next code fragment), but this is basically what I ended up writing (which is still partially incomplete):


generateWord(1, function(err, result) {
    if(err) {
        printOnConsole("Some exception occurred: "+err, function(err) {
            if(err) {
                // ...
            } else {
                printOnConsole("Bye bye!");
            }
        });
    } else {
        var word = result;
        printOnConsole("We have a: "+word, function(err) {
            if(err) {
                printOnConsole("Some exception occurred: "+err, function(err) {
                    if(err) {
                        // ...
                    } else {
                        printOnConsole("Bye bye!");
                    }
                });
            } else {
                generateWord(10, function(err, result) {
                    if(err) {
                        printOnConsole("Some exception occurred: "+err, function(err) {
                            if(err) {
                                // ...
                            } else {
                                printOnConsole("Bye bye!");
                            }
                        });
                    } else {
                        word = result;
                        printOnConsole("We have a: "+word, function(err) {
                            if(err) {
                                printOnConsole("Some exception occurred: "+err, function(err) {
                                    if(err) {
                                        // ...
                                    } else {
                                        printOnConsole("Bye bye!");
                                    }
                                });
                            } else {
                                // ...
                            }
                        });
                    }
                });
            }
        });
    }
});

As you may notice, now the code clearly blows up and you also see lots of repetition because of the fact that we need to simulate the effects of the throw and finally clauses.

To create an abstraction to cope with exceptions, we must adapt all the abstraction functions that I have shown previously to evaluate the err callback parameters. If the err parameter is set to something, we must stop the execution and propagate the err parameter to its callback.
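For example, the runStatement() function shown earlier could be adapted along the following lines (a sketch of the idea, not the actual slasp implementation): every statement now passes an err parameter first, and as soon as it is set, the remaining statements are skipped and the error is propagated to the sequence's callback:

function runStatement(stmts, index, callback, result) {
    if(index >= stmts.length) {
        if(typeof callback == "function")
            callback(null, result); /* all statements succeeded */
    } else {
        stmts[index](function(err, result) {
            if(err) {
                if(typeof callback == "function")
                    callback(err); /* stop and propagate the error */
            } else {
                runStatement(stmts, index + 1, callback, result);
            }
        }, result);
    }
}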

Moreover, I can also define a function abstraction named: attempt, to simulate a try-catch-finally block:


function attempt(statementFun, captureFun, lastlyFun) {
    statementFun(function(err) {
        if(err) {
            if(typeof lastlyFun != "function")
                lastlyFun = function() {};

            captureFun(err, lastlyFun);
        } else {
            if(typeof lastlyFun == "function")
                lastlyFun();
        }
    });
}

and I can rewrite the mess shown earlier as follows:


slasp.attempt(function(callback) {
    slasp.sequence([
        function(callback) {
            generateWord(1, callback);
        },

        function(callback, result) {
            word = result;
            printOnConsole("We have a: "+word, callback);
        },

        function(callback) {
            generateWord(10, callback);
        },

        function(callback, result) {
            word = result;
            printOnConsole("We have a: "+word);
        }

    ], callback);
}, function(err, callback) {
    printOnConsole("Some exception occurred: "+err, callback);
}, function() {
    printOnConsole("Bye bye!");
});

Objects


Another extension in JavaScript is the ability to construct objects having prototypes. In JavaScript, constructors and object methods are functions as well. I think the same applies to these kinds of functions as to regular ones -- they cannot return values immediately, because they may not have finished their execution yet.

Consider the following example:


function Rectangle(width, height) {
    this.width = width;
    this.height = height;
}

Rectangle.prototype.calculateArea = function() {
    return this.width * this.height;
};

var r = new Rectangle(2, 2);

printOnConsole("Area is: "+r.calculateArea());

The above code fragment simulates a Rectangle class, constructs a rectangle having a width and height of 2, and calculates and displays its area.

When moving to an asynchronous world, we have to take into account all things we did previously. I ended up writing:


function Rectangle(self, width, height, callback) {
    process.nextTick(function() {
        self.width = width;
        self.height = height;
        callback(null);
    });
}

Rectangle.prototype.calculateArea = function(callback) {
    var self = this;
    process.nextTick(function() {
        callback(null, self.width * self.height);
    });
};

function RectangleCons(width, height, callback) {
    function F() {};
    F.prototype = Rectangle.prototype;
    var self = new F();
    Rectangle(self, width, height, function(err) {
        if(err === null)
            callback(null, self);
        else
            callback(err); /* propagate the error to the caller */
    });
}

RectangleCons(2, 2, function(err, result) {
    var r = result;
    r.calculateArea(function(err, result) {
        printOnConsole("Area is: "+result);
    });
});

As can be observed, all functions -- except for the constructor -- have an interface including a callback.

The reason that I had to do something different for the constructor is that functions that are called in conjunction with new cannot propagate this back to the caller without including weird internal properties. Therefore, I had to create a "constructor wrapper" (named: RectangleCons) that first constructs an empty object with the right prototype. After the empty object has been constructed, I invoke the real constructor doing the initialisation work.

Furthermore, the this keyword only works properly within the scope of the constructor function. Therefore, I had to use a helper variable called self to make the properties of this available in the scope of the callbacks.

Writing a "wrapper constructor" is something we ideally do not want to write ourselves. Therefore, I created an abstraction for this:


function novel() {
    var args = Array.prototype.slice.call(arguments, 0);

    var constructorFun = args.shift();
    function F() {};
    F.prototype = constructorFun.prototype;
    F.prototype.constructor = constructorFun;

    var self = new F();
    args.unshift(self);

    var callback = args[args.length - 1];
    args[args.length - 1] = function(err, result) {
        if(err)
            callback(err);
        else
            callback(null, self);
    };

    constructorFun.apply(null, args);
}

And using this abstraction, I can rewrite the code as follows:


function Rectangle(self, width, height, callback) {
    process.nextTick(function() {
        self.width = width;
        self.height = height;
        callback(null);
    });
}

Rectangle.prototype.calculateArea = function(callback) {
    var self = this;
    process.nextTick(function() {
        callback(null, self.width * self.height);
    });
};

slasp.novel(Rectangle, 2, 2, function(err, result) {
    var r = result;
    r.calculateArea(function(err, result) {
        printOnConsole("Area is: "+result);
    });
});

When using novel() instead of new, we can conveniently construct objects asynchronously.

As a sidenote: if you want to use simulated class inheritance, you can still use my inherit() function (described in an earlier blog post) that takes two constructor functions as parameters. It should also work with "asynchronous" constructors.

Discussion


In this blog post, I have shown that in an asynchronous world, functions have to be defined and used differently. As a consequence, most of JavaScript's language constructs are either unusable or have to be used in a different way. So basically, we have to forget about most common concepts that we normally intend to use in a synchronous world, and learn different ones.

The following overview summarizes the synchronous programming language concepts and their asynchronous counterparts for which I have directly and indirectly derived patterns or abstractions:

  • Function interface
    Synchronous:  function f(a) { ... }
    Asynchronous: function f(a, callback) { ... }
  • Return statement
    Synchronous:  return val;
    Asynchronous: callback(null, val);
  • Sequence
    Synchronous:  a; b; ...
    Asynchronous: slasp.sequence([ function(callback) { a(callback); }, function(callback) { b(callback); } ... ]);
  • if-then-else
    Synchronous:  if(condFun()) thenFun(); else elseFun();
    Asynchronous: slasp.when(condFun, thenFun, elseFun);
  • switch
    Synchronous:  switch(condFun()) { case "a": funA(); break; case "b": funB(); break; ... }
    Asynchronous: slasp.circuit(condFun, function(result, callback) { switch(result) { case "a": funA(callback); break; case "b": funB(callback); break; ... } });
  • Recursion
    Synchronous:  function fun() { fun(); }
    Asynchronous: function fun(callback) { setImmediate(function() { fun(callback); }); }
  • while
    Synchronous:  while(condFun()) { stmtFun(); }
    Asynchronous: slasp.whilst(condFun, stmtFun);
  • doWhile
    Synchronous:  do { stmtFun(); } while(condFun());
    Asynchronous: slasp.doWhilst(stmtFun, condFun);
  • for
    Synchronous:  for(startFun(); condFun(); stepFun()) { stmtFun(); }
    Asynchronous: slasp.from(startFun, condFun, stepFun, stmtFun);
  • for-in
    Synchronous:  for(var a in arrFun()) { stmtFun(); }
    Asynchronous: slasp.fromEach(arrFun, function(a, callback) { stmtFun(callback); });
  • throw
    Synchronous:  throw err;
    Asynchronous: callback(err);
  • try-catch-finally
    Synchronous:  try { funA(); } catch(err) { funErr(); } finally { funFinally(); }
    Asynchronous: slasp.attempt(funA, function(err, callback) { funErr(callback); }, funFinally);
  • constructor
    Synchronous:  function Cons(a) { this.a = a; }
    Asynchronous: function Cons(self, a, callback) { self.a = a; callback(null); }
  • new
    Synchronous:  new Cons(a);
    Asynchronous: slasp.novel(Cons, a, callback);

To answer the question whether callbacks are the new GOTO: my conclusion is that they are not the new GOTO. Although they have drawbacks, such as the fact that code becomes harder to read, maintain and adapt, they do not affect our ability to use enumeration or mathematical induction.

However, if we start using exceptions, then things become way more difficult. Then developing abstractions is unavoidable, but this has nothing to do with callbacks. Simulating exception behaviour in general makes things complicated, which is fueled by the nasty side effects of callbacks.

Another funny observation is that it has become quite common to use JavaScript for asynchronous programming. Since it was developed for synchronous programming, most of its constructs are of little use in an asynchronous setting. Fortunately, we can cope with that by implementing useful abstractions ourselves (or through third party libraries), but it would be better IMHO if a programming language had all the relevant facilities that are suitable for the domain in which it is going to be used.

Conclusion


In this blog post, I have explained that moving from a synchronous to an asynchronous world requires forgetting certain programming language concepts and using different asynchronous equivalents.

I have made a JavaScript library out of the abstractions in this blog post (yep, that is yet another abstraction library!), because I think they might come in handy at some point. It is named slasp (SugarLess Asynchronous Structured Programming), because it implements abstractions that are close to the bare bones of JavaScript. It provides no sugar, such as borrowing abstractions from functional programming languages and so on, which most other libraries do.

The library can be obtained from my GitHub page and through NPM and used under the terms and conditions of the MIT license.

Asynchronous package management with NiJS

Last week, I implemented some additional features in NiJS: an internal DSL for Nix in JavaScript. One of its new features is an alternative formalism for writing package specifications, along with some use cases.

Synchronous package definitions


Traditionally, a package in NiJS can be specified in JavaScript as follows:


var nijs = require('nijs');

exports.pkg = function(args) {
    return args.stdenv().mkDerivation ({
        name : "file-5.11",

        src : args.fetchurl()({
            url : new nijs.NixURL("ftp://ftp.astron.com/pub/file/file-5.11.tar.gz"),
            sha256 : "c70ae29a28c0585f541d5916fc3248c3e91baa481f63d7ccec53d1534cbcc9b7"
        }),

        buildInputs : [ args.zlib() ],

        meta : {
            description : "A program that shows the type of files",
            homepage : new nijs.NixURL("http://darwinsys.com/file")
        }
    });
};

The above CommonJS module exports a function which specifies a build recipe for a package named file, that uses zlib as a dependency and executes the standard GNU Autotools build procedure (i.e. ./configure; make; make install) to build it.

The above module specifies how to build a package, but not which versions or variants of the dependencies that should be used. The following CommonJS module specifies how to compose packages:


var pkgs = {

    stdenv : function() {
        return require('./pkgs/stdenv.js').pkg;
    },

    fetchurl : function() {
        return require('./pkgs/fetchurl').pkg({
            stdenv : pkgs.stdenv
        });
    },

    zlib : function() {
        return require('./pkgs/zlib.js').pkg({
            stdenv : pkgs.stdenv,
            fetchurl : pkgs.fetchurl
        });
    },

    file : function() {
        return require('./pkgs/file.js').pkg({
            stdenv : pkgs.stdenv,
            fetchurl : pkgs.fetchurl,
            zlib : pkgs.zlib
        });
    }
}

exports.pkgs = pkgs;

As can be seen, the above module includes the previous package specification and provides all its required parameters (such as a variant of the zlib library that we need). Moreover, all its dependencies are composed in the above module as well.

Asynchronous package definitions


The previous modules are synchronous package definitions, meaning that while they are being evaluated nothing else can be done. In the latest version of NiJS, we can also write asynchronous package definitions:


var nijs = require('nijs');
var slasp = require('slasp');

exports.pkg = function(args, callback) {
    var src;

    slasp.sequence([
        function(callback) {
            args.fetchurl()({
                url : new nijs.NixURL("ftp://ftp.astron.com/pub/file/file-5.11.tar.gz"),
                sha256 : "c70ae29a28c0585f541d5916fc3248c3e91baa481f63d7ccec53d1534cbcc9b7"
            }, callback);
        },

        function(callback, _src) {
            src = _src;
            args.zlib(callback);
        },

        function(callback, zlib) {
            args.stdenv().mkDerivation ({
                name : "file-5.11",
                src : src,
                buildInputs : [ zlib ],

                meta : {
                    description : "A program that shows the type of files",
                    homepage : new nijs.NixURL("http://darwinsys.com/file")
                }
            }, callback);
        }
    ], callback);
};

The above module defines exactly the same package as shown earlier, but defines it asynchronously. For example, it does not return a value, but uses a callback function to pass the evaluation result back to the caller. I have used the slasp library to flatten its structure, making it more readable and maintainable.

Moreover, because packages implement an asynchronous function interface, we also have to define the composition module in a slightly different way:


var pkgs = {

    stdenv : function(callback) {
        return require('./pkgs-async/stdenv.js').pkg;
    },

    fetchurl : function(callback) {
        return require('./pkgs-async/fetchurl').pkg({
            stdenv : pkgs.stdenv
        }, callback);
    },

    zlib : function(callback) {
        return require('./pkgs-async/zlib.js').pkg({
            stdenv : pkgs.stdenv,
            fetchurl : pkgs.fetchurl
        }, callback);
    },

    file : function(callback) {
        return require('./pkgs-async/file.js').pkg({
            stdenv : pkgs.stdenv,
            fetchurl : pkgs.fetchurl,
            zlib : pkgs.zlib
        }, callback);
    }
}

exports.pkgs = pkgs;

Again, this composition module has the same meaning as the one showed earlier, but each object member implements an asynchronous function interface having a callback.

So why are these asynchronous package specifications useful? In NiJS, there are two use cases for them. The first use case is to compile them to Nix expressions and build them with the Nix package manager (which can also be done with synchronous package definitions):


$ nijs-build pkgs-async.js -A file --async
/nix/store/c7zy6w6ls3mfmr9mvzz3jjaarikrwwrz-file-5.11

The only minor difference is that in order to use asynchronous package definitions, we have to pass the --async parameter to the nijs-build command so that they are properly recognized.

The second (and new!) use case is to execute the functions directly with NiJS. For example, we can also use the same composition module to do the following:


$ nijs-execute pkgs-async.js -A file
/home/sander/.nijs/store/file-5.11

When executing the above command, the Nix package manager is not used at all. Instead, NiJS directly executes the build function implementing the corresponding package and all its dependencies. All resulting artifacts are stored in a so-called NiJS store, which resides in the user's home directory, e.g.: /home/sander/.nijs/store.

The latter command does not depend on Nix at all, making it possible for NiJS to act as an independent package manager, yet having the most important features that Nix also has.

Implementation


The implementation of nijs-execute is straight forward. Every package directly or indirectly invokes the same function that actually executes a build operation: args.stdenv().mkDerivation(args, callback).

The original implementation for nijs-build (that compiles a JavaScript composition module to a Nix expression) looks as follows:


var nijs = require('nijs');

exports.pkg = {
    mkDerivation : function(args, callback) {
        callback(null, new nijs.NixExpression("pkgs.stdenv.mkDerivation "
            +nijs.jsToNix(args)));
    }
};

To make nijs-execute work, we can simply replace the above implementation with the following:


var nijs = require('nijs');

exports.pkg = {
    mkDerivation : function(args, callback) {
        nijs.evaluateDerivation(args, callback);
    }
};

We replace the generated Nix expression that invokes Nixpkgs' stdenv.mkDerivation {} by a direct invocation of nijs.evaluateDerivation(), which executes a build directly.

The evaluateDerivation() function translates the first parameter object (representing build parameters) to environment variables. Each key corresponds to an environment variable and each value is translated as follows (a small sketch of these rules is shown after the list):

  • A null value is translated to an empty string.
  • true translates to "1" and false translates to an empty string.
  • A string, number, or xml object is translated to a string literally.
  • Objects that are instances of the NixFile and NixURL prototypes are also translated to strings literally.
  • Objects that are instances of the NixInlineJS prototype are converted into a separate builder script, which gets executed by the default builder.
  • Objects that are instances of the NixRecursiveAttrSet prototype and arbitrary objects are considered derivations that need to be evaluated separately.
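As a rough illustration of the scalar rules above (a sketch of my own, not the actual NiJS implementation; the prototype cases are omitted because they depend on NiJS internals), such a translation could look like:

function valueToEnvValue(value) {
    if(value === null || value === false) {
        return "";               /* null and false become an empty string */
    } else if(value === true) {
        return "1";              /* true becomes "1" */
    } else if(typeof value == "string" || typeof value == "number") {
        return value.toString(); /* strings and numbers are taken literally */
    } else {
        throw "Objects are handled separately";
    }
}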

Furthermore, evaluateDerivation() invokes a generic builder script with features similar to the one in Nixpkgs:

  • All environment variables are cleared or set to dummy values, such as HOME=/homeless-shelter.
  • It supports the execution of phases. By default, it runs the following phases: unpack, patch, configure, build, install and can be extended with custom ones.
  • By default, it executes a GNU Autotools build procedure: ./configure; make; make install with configurable settings (that have a default fallback value).
  • It can also take custom build commands so that a custom build procedure can be performed
  • It supports build hooks so that the appropriate environment variables are set when providing a buildInputs parameter. By default, the builder automatically sets PATH, C_INCLUDE_PATH and LIBRARY_PATH environment variables. Build hooks can be used to support other languages and environments' settings, such as Python (e.g. PYTHONPATH) and Node.js (e.g. NODE_PATH)

Discussion


Now that NiJS has the ability to act as an independent package manager in addition to serving as an internal DSL, we can deprecate Nix and its sub projects soon and use Nix (for the time being) as a fallback for things that NiJS does not support yet.

NiJS has the following advantages over Nix and its sub projects:

  • I have discovered that the Nix expression language is complicated and difficult to learn. Like Haskell, it has a solid theoretical foundation and powerful features (such as laziness), but it's too hard to learn for developers without an academic background.

    Moreover, I had some difficulties accepting JavaScript in the past, but after discovering how to deal with prototypes and asynchronous programming, I started to appreciate it and really love it now.

    JavaScript has all the functional programming abilities that we need, so why should we implement our own language to accomplish the same? Furthermore, many people have proven that JavaScript is the future and we can attract more users if we use a language that more people are familiar with.
  • NiJS also prevents some confusion with a future Linux distribution that is going to be built around it. For most people, it is too hard to make a distinction between Nix and NixOS.

    With NiJS this is not a problem -- NiJS is supposed to be pronounced in Dutch as: "Nice". The future Linux distribution that will be built around it will be called: "NiJSOS", which should be pronounced as "Nice O-S" in Dutch. This is much easier to remember.
  • The same thing holds for Disnix -- Nix sounds like "Nothing" in Dutch and Disnix sounds like: "This is nothing!". This strange similarity has prevented me from properly spreading the word to the masses. However, "DisniJS" sounds like "This is nice!" which (obviously) sounds much better and is much easier to remember.
  • NiJS also makes continuous integration more scalable than Nix. We can finally get rid of all the annoying Perl code (and the Template Toolkit) in Hydra and reimplement it in Node.js using all its powerful frameworks. Since in Node.js all I/O operations are non-blocking, we can make Hydra even faster and more scalable.

Conclusion


In this blog post, I have shown that we can also specify packages asynchronously in NiJS. Asynchronous package specifications can be built directly with NiJS, without requiring them to be compiled to Nix expressions that must be built with Nix.

Since NiJS has become an independent package manager and JavaScript is the future, we can deprecate Nix (and its sub projects) soon, because NiJS has significant advantages over Nix.

NiJS can be downloaded from my GitHub page and from NPM. NiJS can also bootstrap itself :-)

Moreover, soon I will create a website, set up mailing lists, create an IRC channel, and define the other sub projects that can be built on top of it.

Follow up


UPDATE: It seems that this blog post has attracted quite a bit of attention today. For example, there has been some discussion about it on the Nix mailing list as well as the GNU Guix mailing list. Apparently, I also made a few people upset :-)

Moreover, a lot of readers probably did not notice the publishing date! So let me make it clear:

IT'S APRIL FOOLS' DAY!!!!!!!!!!!!!!!

The second thing you may probably wonder is: what exactly is this "joke" supposed to mean?

In fact, NiJS is not a fake package -- it actually does exist, can be installed through Nix and NPM, and is really capable of doing the stuff described in this blog post (as well as the two previous ones).

However, the intention to make NiJS a replacement for Nix was a joke! As a matter of fact, I am a proponent of external DSLs and Nix already does what I need!

Furthermore, only 1% of NiJS' features are actually used by me. For the rest, the whole package is just simply a toy, which I created to explore the abilities of internal DSLs and to explore some "what if" scenarios, no matter how silly they would look :-)

Although NiJS can build packages without reliance on Nix, its mechanisms are extremely primitive! The new feature described in this blog post was basically a silly experiment to develop a JavaScript specification that can be both compiled (to Nix) and interpreted (executed by NiJS directly)!

Moreover, the last few years I have heard a lot of funny, silly, and stupid things, about all kinds of aspects related to Nix, NixOS, Disnix and Node.js which I kept in mind. I (sort of) integrated these things into a story and used a bit of sarcasm as a glue! What these things exactly are is an open exercise for the reader :-).

Rendering 8-bit palettized surfaces in SDL 2.0 applications

Recently, I have ported SDL_ILBM to SDL 2.0, since it has been out there for a while now. SDL 2.0 has many improvements over SDL 1.2 and is typically a better fit for modern hardware.

The majority of the porting activities were quite straight forward, thanks to the SDL 1.2 to 2.0 migration guide. What impacted SDL_ILBM were the following:

  • SDL 2.0 supports multiple windows, so we have to create (and discard) them in a slightly different way in the viewer app.
  • There is a more advanced graphics rendering pipeline in SDL 2.0 that uses hardware acceleration (e.g. through OpenGL or Direct3D) where possible.

Properly supporting the latter aspect puzzled me a bit, because the migration guide did not clearly describe what the best way is to continuously render 8-bit palettized surfaces for every frame.

SDL_ILBM generates these kinds of surfaces for the majority of ILBM images by default (images using the HAM viewport mode are notable exceptions, because they typically contain more than 256 distinct colors). I need to update these surfaces for every frame to support animated cycle ranges.

In this blog post, I will describe what I did to support this, because I couldn't find any hands-on information about this elsewhere and I think this information might be helpful to others as well.

Rendering surfaces using SDL 1.2


In the old implementation using SDL 1.2, the viewer application basically did the following: it first parses an IFF file, then extracts the ILBM images from it and finally composes SDL_Surface instances from the ILBM images that are shown to the user. Then the application enters a main loop which responds to the user's input and continuously updates what is being displayed in the window.

Expressing this in very simplified code, it looks as follows. First, we obtain an SDL_Surface that represents an ILBM image that we want to display:


SDL_Surface *pictureSurface;

When using the default options, it produces an 8-bit palettized surface, unless you try to open an image using the HAM viewport setting.

Then we construct a window that displays the image. One of the options of the viewer is to make the dimensions of the window equal to the dimensions of the image:


SDL_Surface *windowSurface = SDL_SetVideoMode(pictureSurface->w,
    pictureSurface->h,
    32,
    SDL_HWSURFACE | SDL_DOUBLEBUF);

Besides constructing a window, SDL_SetVideoMode() also returns an SDL surface representing the graphics that are displayed in the window. The SDL_HWSURFACE parameter is used to tell SDL that the pixel data should reside in hardware memory (video RAM) instead of software memory (ordinary RAM), which the picture surface uses.

Eventually we reach the main loop of the program that responds to user events (e.g. keyboard presses), updates the pixels in the picture surface when cycling mode is turned on and flips the logical and physical screen buffers so that the changes become visible:


while(TRUE)
{
    /* Process events */

    /* Modify the pixels of pictureSurface */

    /* Blit picture surface on the window */
    SDL_BlitSurface(pictureSurface, NULL, windowSurface, NULL);

    /* Update the screen */
    SDL_Flip(windowSurface);
}

After changing the palette and/or the pixels in the picture surface, we can simply use SDL_BlitSurface() to make the modified surface visible in the window surface. This function also converts the pixels into the format of the target surface automatically, which means that it should work for both 8-bit palettized surfaces as well as RGBA surfaces.

Rendering surfaces using SDL 2.0


Since SDL 2.0 has a more advanced graphics rendering pipeline, there are more steps that we need to perform. Constructing the same window with the same dimensions in SDL 2.0 must be done as follows:


SDL_Window *sdlWindow = SDL_CreateWindow("ILBM picture",
    SDL_WINDOWPOS_UNDEFINED,
    SDL_WINDOWPOS_UNDEFINED,
    pictureSurface->w, pictureSurface->h,
    0);

Besides a window, we must also construct a renderer instance, that renders textures on the window:


SDL_Renderer *renderer = SDL_CreateRenderer(sdlWindow, -1, 0);

The renderer is capable of automatically scaling a texture to the window's dimensions if needed. The following instructions configure the renderer's dimensions and instruct it to use linear scaling:


SDL_SetHint(SDL_HINT_RENDER_SCALE_QUALITY, "linear");
SDL_RenderSetLogicalSize(renderer,
    pictureSurface->w, pictureSurface->h);

In SDL 2.0, a texture is basically a pixel surface that resides in hardware memory while an 'ordinary' surface resides in software memory. In SDL 1.2 both were SDL surfaces with a different flag parameter.

The previous steps were quite easy. However, while I was trying to port the main loop to SDL 2.0 I was a bit puzzled. In order to show something in the window, we must ensure that the pixels are in hardware memory (i.e. in the texture). However, we do not have direct access to a texture's pixels. One of the solutions that the migration guide suggests is to convert a surface (that resides in software memory) to a texture:


SDL_Texture *texture = SDL_CreateTextureFromSurface(renderer, surface);

The above function invocation creates a texture out of the surface and performs all necessary steps to do it, such as allocating memory for the texture and converting chunky pixels to RGBA pixels.

Although this function seems to do everything we need, it has two drawbacks. First, it allocates memory for the texture which we have to free ourselves over and over again, while we know that the texture always has the same size. Second, the resulting texture is a static texture and its pixels can only be modified through SDL_UpdateTexture(), which is also a slow operation. Therefore, it is not recommended to run it to render every frame.

A faster alternative (according to the migration guide) is to use a streaming texture:


SDL_Texture *texture = SDL_CreateTexture(renderer,
    SDL_PIXELFORMAT_RGBA8888,
    SDL_TEXTUREACCESS_STREAMING,
    pictureSurface->w,
    pictureSurface->h);

However, we cannot construct textures that store their pixels in chunky format, so we have to convert them from the surface's format to the texture's format. After studying the SDL documentation a bit, I stumbled upon SDL_ConvertPixels(), but that did not seem to work with the picture surface, because the function cannot convert surfaces with an indexed palette.

I ended up implementing the main loop as follows:


/* Construct a surface that's in a format close to the texture */
SDL_Surface *windowSurface = SDL_CreateRGBSurface(0,
    pictureSurface->w, pictureSurface->h,
    32, 0, 0, 0, 0);

void *pixels;
int pitch;

while(TRUE)
{
    /* Process events */

    /* Modify the pixels of pictureSurface */

    /*
     * Blit 8-bit palette surface onto the window surface that's
     * closer to the texture's format
     */
    SDL_BlitSurface(pictureSurface, NULL, windowSurface, NULL);

    /* Modify the texture's pixels */
    SDL_LockTexture(texture, NULL, &pixels, &pitch);
    SDL_ConvertPixels(windowSurface->w, windowSurface->h,
        windowSurface->format->format,
        windowSurface->pixels, windowSurface->pitch,
        SDL_PIXELFORMAT_RGBA8888,
        pixels, pitch);
    SDL_UnlockTexture(texture);

    /* Make the modified texture visible by rendering it */
    SDL_RenderCopy(renderer, texture, NULL, NULL);
    SDL_RenderPresent(renderer); /* present the rendered result on the screen */
}

I introduced another SDL surface (windowSurface) that uses a format closer to the texture's format (RGBA pixels) so that we can do the actual conversion with SDL_ConvertPixels(). After modifying the pixels in the 8-bit palettized surface (pictureSurface), we blit it to the window surface, which automatically converts the pixels to RGBA format. Then we use the window surface to convert the pixels to the texture's format, and finally we render the texture to make it visible to end users.

This seems to do the trick for me and this is the result:


Moreover, if I enable cycling mode the bird and bunny also seem to animate smoothly.

Conclusion


In this blog post I have described my challenges of porting SDL_ILBM from SDL 1.2 to SDL 2.0. To be able to modify the pixels and the palette of an 8-bit palettized surface for every frame, I used a second surface that has a format closer to the texture's format, allowing me to easily convert and transfer those pixels to a streaming texture.

This approach may also be useful for porting classic 8-bit graphics games to SDL 2.0.

Moreover, besides SDL_ILBM, I also ported SDL_8SVX to SDL 2.0, which did not require any modifications in the code. Both packages can be obtained from my GitHub page.

Porting GNU Autotools projects to Visual C++ projects (unconventional style)

Not so long ago, I ported my IFF experiments' sub projects to SDL 2.0. Besides moving to SDL 2.0, I did some additional development work. One of the interesting things that I did was porting these sub projects to Visual C++ to make them work natively under Windows.

The main reason why I did this, is because I have received numerous feature requests for Visual C++ support in the past. Finally, I found a bit of time to actually do it. :-)

In general, the porting process was straight forward, but nonetheless I encountered a few annoyances. Moreover, since I have a research background in software deployment, I also want to make this deployment process manageable using my favourite deployment tools.

In this blog post, I will describe the porting steps I have done using my own "unconventional" style.

Generating projects


The first thing I did was turning relevant units of each package into Visual C++ projects so that they can be actually built with Visual C++. Typically, a Visual Studio project produces one single binary artifact, such as a library or executable.


Visual Studio 2013 has a very useful feature that composes solutions out of existing source directories automatically. It can be invoked by opening Visual Studio, and selecting 'File' -> 'New' -> 'Project From Existing Code...' from the menu bar.

For example, for the libiff package, I created three projects. One library project that builds libiff and two projects that produce executables: iffpp and iffjoin. To compose a project out of the src/libiff sub directory, I provided the following settings to the import wizard:

  • What type of project would you like to create? Visual C++
  • Project file location: D:\cygwin64\home\Sander\Development\libiff\src\libiff
  • Project name: libiff
  • How do you want to build the project? Use Visual Studio
  • Project type: Dynamically linked library (DLL) project

For the remaining projects (the command-line utilities), I used Console application project as a project type.

Configuring project dependencies


After generating solutions for all relevant units of each package, I have configured their dependencies so that they can be built correctly.

The first aspect is to get the dependencies right between projects in a package. For example, the libiff package builds two executables: iffpp and iffjoin which both have a common dependency on the libiff shared library. The solutions I just generated from the sub directories, have no knowledge about any of its dependencies yet.


To set the inter-project dependencies, I composed a new empty solution, through 'File' -> 'New' -> 'Project' and then by picking 'Blank Solution' under 'Other project types'. In this blank solution, I have added all the projects of the package that I have generated previously.

After adding the projects to the solution, I can configure their dependency relationships. This is done by right clicking on each project and selecting 'Build dependencies' -> 'Project dependencies...'. I used this dialog to make libiff a project dependency of iffpp and iffjoin.

After setting up a common solution for all the package's sub projects, I can delete the individual solution files for each project, since they are no longer needed.

Configuring library dependencies


Besides getting the dependencies among projects right, executables must also be linked against dynamic libraries that are either produced by other projects in the solution or by other means (e.g. in a different solution or prebuilt). In order to configure these dependencies, we have to change the project settings:

  • To link an executable to a dynamic library, we must right click on the corresponding project, select 'Properties', and pick the option 'Configuration Properties' -> 'Linker' -> 'Input'. In the configuration screen, we must add the name of the export library (a *.LIB file) to the 'Additional Dependencies' field.
  • We must also specify where the export libraries can be found. The library search directories can be configured by selecting 'Configuration Properties' -> 'Linker' -> 'General' in the properties screen and adapting the 'Additional Library Directories' field by adding its corresponding path.
  • Projects using shared libraries typically also have to find their required header files. The paths to these files can be configured by picking the option 'Configuration Properties' -> 'C/C++' -> 'General'. The required paths must be added to the 'Additional Include Directories' property.

It is a bit tricky to specify some of these paths in a "portable" way. Fortunately, there were a couple of very useful macros that I was able to use, such as $(OutDir) to refer to the output directory of the solution.

To refer to external libraries that do not reside in the solution (such as SDL 2.0), I defined my own custom properties and manually added them to the Visual C++ project files. For example, to allow a project to find SDL 2.0, I added the following lines to SDL_ILBM's project file:

<PropertyGroup>
  <SDL2IncludePath>..\..\..\SDL2-2.0.3\include</SDL2IncludePath>
  <SDL2LibPath>..\..\..\SDL2-2.0.3\lib\x86</SDL2LibPath>
</PropertyGroup>

I can refer to the properties with the following macros: $(SDL2IncludePath), $(SDL2LibPath) from the 'Additional Includes' and 'Additional Library Directories' fields. Moreover, these properties can also be overridden by the build infrastructure, which is very useful as we will see later.

Building export libraries


Another thing I observed is that in Visual C++, you need export libraries (*.LIB files) in order to be able to link a dynamic library to something else. However, if no exports are defined in a library project, then this file is not generated.

The most common way to export functions is probably by annotating headers and class definitions, but I don't find this very elegant for my project, since it requires me to adapt the source code and add non-standard pieces to it.

Another way is creating a Module-Definition File (*.DEF file) and adding it to the project that builds a library. A module definition file can be added to a project by right clicking on it, picking 'Add new item...', and selecting 'Visual C++' -> 'Code' -> 'Module-Definition File (.def)'.


Creating this module definition file is straightforward. I investigated all header files of the library to see which functions need to be accessible. Then I created a module definition file that looks as follows:

LIBRARY    libiff
EXPORTS
IFF_readFd @1
IFF_read @2
IFF_writeFd @3
IFF_write @4
IFF_free @5
IFF_check @6
IFF_print @7
IFF_compare @8

The above file basically lists the names of all publicly accessible functions, each with a unique numeric id. These steps were enough for me to get an export library built.

Porting the command-line interfaces


Another porting issue was getting the command-line interfaces to work. This is actually the only "non-standard" piece of code of the IFF sub projects and depends on getopt() or getopt_long(), which are not part of Visual C++'s standard runtime. Furthermore, getopt-style parameters are a bit weird for Windows command line utilities.

I have decided to create a replacement command-line interface for Windows that follows Windows console application conventions for command-line parameters. For example, on Unix-like platforms we can request the help text of iffpp as follows:

$ iffpp -h

On Windows, the same help text can be requested as follows:

$ iffpp /?

As can be observed, we use Windows-style command-line parameters.

Automating build processes


The last aspect I had to take care of is getting the entire build process of all the sub projects automated. People who know me know that I have a preference for Nix-related tools, for various reasons (check the link for the exact reasons).

Like .NET software, Visual C++ projects can be built from the command-line with MSBuild. Fortunately, I have already created a Nix function that invokes MSBuild to compile C# projects some time ago.

I have not used Nix on Cygwin for a while, and Nix's Cygwin support seemed to be broken, so I had to revive it. Fortunately, the changes were relatively minor. Moreover, the .NET build function still seemed to work after reviving Cygwin support.

To support building Visual C++ projects, I basically had to make two changes to the function that I have used for Visual C# projects. First, I had to set the following environment variable:

$ export SYSTEMDRIVE="C:"

Without this environment variable set, the compiler complains that paths are not well formed.

The second thing is to make the parameters to MSBuild configurable through an environment variable named msBuildOpts:

$ MSBuild.exe ... $msBuildOpts

The reason I have added this feature is that I want to make the properties that refer to external libraries (such as SDL 2.0, through the $(SDL2IncludePath) and $(SDL2LibPath) macros) configurable, so that they can refer to dependencies that reside in the Nix store.

With these changes, I can write a Nix expression for any IFF file format experiment project. I used the following partial expression to build SDL_ILBM with Visual C++:

with import <nixpkgs> {};

let
  SDL2devel = stdenv.mkDerivation {
    name = "SDL2-devel-2.0.3";
    src = fetchurl {
      url = http://www.libsdl.org/release/SDL2-devel-2.0.3-VC.zip;
      sha256 = "0q6fs678i59xycjlw7blp949dl0p2f1y914prpbs1cspz98x3pld";
    };
    buildInputs = [ unzip ];
    installPhase = ''
      mkdir -p $out
      mv * $out
    '';
    dontStrip = true;
  };
in
dotnetenv.buildSolution {
  name = "SDL_ILBM";
  src = ./.;
  baseDir = "src";
  slnFile = "SDL_ILBM.sln";
  preBuild = ''
    ...
    export msBuildOpts="$msBuildOpts /p:SDL2IncludePath=\"$(cygpath --windows ${SDL2devel}/include)\""
    export msBuildOpts="$msBuildOpts /p:SDL2LibPath=\"$(cygpath --windows ${SDL2devel}/lib/x86)\""
  '';
}

The above expression builds the SDL_ILBM solution that resides in the src/ folder of the package. It uses the msBuildOpts variable to override the properties that we have defined earlier, passing the paths of external dependencies, such as SDL 2.0, to the build. It uses the cygpath command to translate UNIX paths into Windows paths so that they can be used with MSBuild.

By running the following command-line instruction:

$ nix-build sdlilbm.nix

SDL_ILBM, including all its dependencies, is automatically downloaded and built by Nix and stored in the Nix store.

Conclusion


By performing all the steps described in this blog post, I was able to port all my IFF file format experiment sub projects to Visual C++, which I can also automatically build with the Nix package manager to make the deployment process convenient and repeatable.

The following screenshots show some of the interesting results:


Availability


The updated IFF projects can be obtained from my GitHub page. Moreover, if you want to use Nix to build Visual C++ projects on Cygwin, then you need to use my personal forks of Nix and Nixpkgs, which contain Cygwin-specific fixes. I may push these changes upstream if others are interested in them and I consider them stable enough.


Backing up Nix (and Hydra) builds

One of the worst things that may happen to any computer user is that filesystems get corrupted or that storage media, such as hard drives, break down. As a consequence, valuable data might get lost.

Likewise, this could happen to machines storing Nix package builds, such as a Hydra continuous build machine that exposes builds through its web interface to end users.

Reproducible deployment


One of the key features of the Nix package manager and its related sub projects is reproducible deployment -- using Nix expressions (which are basically recipes that describe how components are built from source code and its dependencies), we can construct all static components of which a system consists (such as software packages and configuration files).

Moreover, Nix ensures that all dependencies are present and correct, and removes many side effects while performing a build. As a result, producing the same configuration with the same set of expressions on a different machine should yield a (nearly) bit-identical configuration.

So if we keep a backup of the Nix expressions stored elsewhere, such as a remote Git repository, we should (in theory) have enough materials to reproduce a previously deployed system configuration.

However, there are still a few inconveniences if you actually have to do this:

  • It takes time to rebuild and redownload everything. Some packages and system configurations might consist of hundreds or thousands of components, taking many hours to complete.
  • The source tarballs may not be available from their original download locations anymore. I have encountered these situations quite a few times when I was trying to reproduce very old configurations. Some suppliers may decide to remove old releases after a while, or to move them to different remote locations, which requires me to search for them and to adapt very old Nix expressions, which I preferably don't want to do.
  • We also have to restore state which cannot be done by the Nix package manager. For example, if the Hydra database gets lost, we have to configure all projects, jobsets, user accounts and releases from scratch again, which is tedious and time consuming.

Getting the dependencies of packages


To alleviate the first two inconveniences, we must also back up the actual Nix packages belonging to a configuration, including all their dependencies.

Since all packages deployed by the Nix package manager reside in a single Nix store folder (typically /nix/store), which may also contain junk and irrelevant stuff, we have to somehow select the packages that we consider relevant.

Binary deployments


In Nix, there are various ways to query specific dependencies of a package. When running the following query on the Nix store path of a build result, such as Disnix, we can fetch all its runtime dependencies:


$ nix-store --query --requisites /nix/store/sh8025fhmz1wq27663bakmq915a2pf79-disnix-0.3pre1234
/nix/store/31kl46d8l4271f64q074bzi313hjmdmv-linux-headers-3.7.1
/nix/store/94n64qy99ja0vgbkf675nyk39g9b978n-glibc-2.19
...
/nix/store/hjbzw7s8wbvrf7mjjfkm1ah6fhnmyhzw-libxml2-2.9.1
/nix/store/hk8wdzs9s52iw9gnxbi1n9npdnvvibma-libxslt-1.1.28
/nix/store/kjlv4klmrarn87ffc5sjslcjfs75ci7a-getopt-1.1.4
/nix/store/sh8025fhmz1wq27663bakmq915a2pf79-disnix-0.3pre1234

What the above command does is list the transitive Nix store path references that a package contains. In the above example, these paths correspond to the runtime dependencies of Disnix: they are referenced from bash scripts as well as the RPATH fields of the ELF binaries, and the executables cannot run properly if any of them is missing.

According to the nix-store manual page, the above closure refers to a binary deployment of a package, since it contains everything required to run it.
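
As a side note, the direct (non-transitive) references of a store path can be listed in a similar way with the --references query. This is a small sketch using the same example store path:

$ nix-store --query --references /nix/store/sh8025fhmz1wq27663bakmq915a2pf79-disnix-0.3pre1234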

Source deployments


We can also run the same query on a store derivation file. While evaluating Nix expressions to build packages -- including their build-time dependencies -- a store derivation file is generated each time the derivation { } function is invoked.

Every Nix expression that builds something indirectly calls this function. The purpose of a derivation is composing environments in which builds are executed.

For example, if we run the previous query on a store derivation file:


$ nix-store --query --requisites /nix/store/3icf7dxf3inky441ps1dl22aijhimbxl-disnix-0.3pre1234.drv
...
/nix/store/4bj56z61q6qk69657bi0iqlmia7np5vc-bootstrap-tools.cpio.bz2.drv
...
/nix/store/4hlq4yvvszqjrwsc18awdvb0ppbcv920-libxml2-2.9.1.tar.gz.drv
/nix/store/g32zn0z6cz824vbj20k00qvj7i4arqy4-setup-hook.sh
/nix/store/n3l0x63zazksbdyp11s3yqa2kdng8ipb-libxml2-2.9.1.drv
/nix/store/nqc9vd5kmgihpp93pqlb245j71yghih4-libxslt-1.1.28.tar.gz.drv
/nix/store/zmkc3jcma77gy94ndza2f1y1rw670dzh-libxslt-1.1.28.drv
...
/nix/store/614h56k0dy8wjkncp0mdk5w69qp08mdp-disnix-tarball-0.3pre1234.drv
/nix/store/3icf7dxf3inky441ps1dl22aijhimbxl-disnix-0.3pre1234.drv

Then all transitive references to the store derivation files are shown, which correspond to all build-time dependencies of Disnix. According to the nix-store manual page, the above closure refers to a source deployment of a package, since the store derivations are low-level specifications that allow someone to build a package from source, including all its build-time dependencies.

Cached deployments


The previous query only returns the store derivation files. These files still need to be realised in order to get a build, which may take some time. We can also query all store derivation files and their corresponding build outputs by running:


$ nix-store --query --requisites --include-outputs \
/nix/store/3icf7dxf3inky441ps1dl22aijhimbxl-disnix-0.3pre1234.drv
...
/nix/store/zmkc3jcma77gy94ndza2f1y1rw670dzh-libxslt-1.1.28.drv
...
/nix/store/hk8wdzs9s52iw9gnxbi1n9npdnvvibma-libxslt-1.1.28
...
/nix/store/3icf7dxf3inky441ps1dl22aijhimbxl-disnix-0.3pre1234.drv

The above command only includes the realised store paths that have been built before. By adding the --force-realise parameter to the previous command-line instruction, we can force all outputs of the derivations to be built.
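
Concretely, that variant of the command would look something like this (a sketch using the same example .drv as above):

$ nix-store --query --requisites --include-outputs --force-realise \
    /nix/store/3icf7dxf3inky441ps1dl22aijhimbxl-disnix-0.3pre1234.drv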

According to the nix-store manual page, the above closure refers to a cached deployment of a package.

Backing up Nix components


Besides querying the relevant Nix store components that we intend to back up, we also have to store them elsewhere. In most cases, we cannot just simply copy the Nix store paths to another location and copy them back into the Nix store at some later point:

  • Some backup locations may use more primitive filesystems than Linux (and other UNIX-like systems) support. For example, we require filesystem features, such as symlinks and read, write and execute permission bits, that a filesystem such as FAT32 lacks.
  • We also require the necessary meta-information to allow a path to be imported into the Nix store, such as its set of references to other paths.

For these reasons, it is advisable to use nix-store --export, which serializes a collection of Nix store paths into a single file, including their meta-information. For example, the following command-line instruction serializes a cached deployment closure of Disnix:


$ nix-store --export $(nix-store --query --requisites --include-outputs \
/nix/store/3icf7dxf3inky441ps1dl22aijhimbxl-disnix-0.3pre1234.drv) > disnix-cached.closure

The resulting closure file (disnix-cached.closure) can easily be stored on many kinds of media, such as an external hard drive using a FAT32 filesystem. We can import the closure file into another Nix store by running:


$ nix-store --import < disnix-cached.closure

The above command imports Disnix, including all its dependencies, into the Nix store. If any dependencies are already in the Nix store, then they are skipped. If any dependency appears to be missing, it returns an error. All these properties can be verified, because the serialization contains all the required meta-information.

Storing backups of a collection of Nix components efficiently


In principle, the export and import nix-store operations should be sufficient to make reliable backups of any Nix package. However, the approach I described has two drawbacks:

  • For each package, we serialize the entire closure of dependencies. Although this approach is reliable, it is also inefficient if we want to backup multiple packages at the same time. Typically, many packages share the same common set of dependencies. As a consequence, each backup contains many redundant packages wasting a lot of precious disk space.
  • If we change a package's source code, such as Disnix, and rebuild it, we have to re-export the entire closure again, while many of its dependencies remain the same. This makes the backup process take considerably longer than necessary.

To fix these inefficiencies, we need an approach that stores serializations of each Nix store path individually, so that we can check which paths have been backed up already and which still need to be serialized. Although we could implement such an approach ourselves, there is already a Nix utility that does something similar, namely: nix-push.

Normally, this command is used to optimize the build times of source builds by making binary substitutes available that can be downloaded instead, but it turns out to be quite practical for making backups as well.

If I run the following instruction on a collection of Nix store paths:


$ nix-push --dest /home/sander/cache /nix/store/4h4mb7lb5c0g390bd33k658dgzahkjn7-disnix-0.3pre1234

A binary cache is created in the /home/sander/cache directory from the closure of the Disnix package. The resulting binary cache has the following structure:


$ ls /home/sander/cache
03qpb8b4j4kc1w3fvwg9f8igc4skfsgj9rqb3maql9pi0nh6aj47.nar.xz
053yi53qigf113xsw7n0lg6fsvd2j1mapl6byiaf9vy80a821irk.nar.xz
05vfk68jlgj9yqd9nh1kak4rig379s09.narinfo
06sx7428fasd5bpcq5jlczx258xhfkaqqk84dx2i0z7di53j1sfa.nar.xz
...
11wcp606w07yh8afgnidqvpd1q3vyha7ns6dhgdi2354j84xysy9.nar.xz
...
4h4mb7lb5c0g390bd33k658dgzahkjn7.narinfo
...

For each Nix store path of the closure, an xz compressed NAR file is generated (it is also possible to use bzip2 or no compression) that contains a serialization of an individual Nix store path (without meta-information) and a narinfo file that contains its corresponding meta-information. The prefix of the NAR file corresponds to its output hash while the prefix of the narinfo file corresponds to the hash component of the Nix store path. The latter file contains a reference to the former NAR file.

If, for example, we change Disnix and run the same nix-push command again, then only the paths that have not been serialized are processed, while the existing ones remain untouched, saving disk space and backup time.

We can also run nix-push on a store derivation file. If a store derivation file is provided, a binary cache is generated from the cached deployment closure.

Restoring a package from a binary cache can be done as follows:


$ nix-store --option binary-caches file:///home/sander/cache \
--realise /nix/store/3icf7dxf3inky441ps1dl22aijhimbxl-disnix-0.3pre1234

Simply realizing a Nix store path while providing the location to the binary cache as a parameter causes it to download the substitute into the Nix store, including all its dependencies.

Creating releases on Hydra for backup purposes


How can this approach be applied to Hydra builds? Since Hydra stores many generations of builds (unless they are garbage collected), I typically make a selection of the ones that I consider important enough by adding them to a release.

Releases on Hydra are created as follows. First, you have to be logged in and you must select a project from the project overview page, such as Disnix:


Clicking on a project will redirect you to a page that shows you the corresponding jobsets. By unfolding the actions tab, you can create a release for that particular project:


Then a screen will be opened that allows you to define a release name and description:


After the release has been created, you can add builds to it. Builds can be added by opening the jobs page and selecting build results, such as build.x86_64-linux:


After clicking on a job, we can add it to a release by unfolding the 'Actions' tab and selecting 'Add to release':


The following dialog allows us to add the build to our recently created disnix-0.3 release:


When we open the 'Releases' tab of the project page and we select the disnix-0.3 release, we can see that the build has been added:


Manually adding individual builds is a bit tedious if you have many of them. Hydra has the ability to add all jobs of an evaluation to a release in one click. The only prerequisite is that each build must tell Hydra (through a file that resides in $out/nix-support/hydra-release-name of the build result) to which release it should belong.
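
For example, a build could declare this by writing a release name to that file in its install phase. This is a minimal sketch; the release name shown is just an example:

mkdir -p $out/nix-support
echo "disnix-0.3" > $out/nix-support/hydra-release-name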

For me, adapting builds is a bit inconvenient, and I also don't need the ability to add builds to arbitrary releases. Instead, I have created a script that adds all builds of an evaluation to a single pre-created release, which does not require me to adapt anything.

For example running:


$ hydra-release-eval config.json 3 "disnix-0.3" "Disnix 0.3"

Automatically creates a release with name: disnix-0.3 and description: "Disnix 0.3", and adds all the successful builds of evaluation 3 to it.

Exporting Hydra releases


To backup Hydra releases, I have created a Perl script that takes a JSON configuration file as parameter that looks as follows:


{
  "dbiConnection": "dbi:Pg:dbname=hydra;host=localhost;user=hydra;",

  "outDir": "/home/sander/hydrabackup",

  "releases": [
    {
      "project": "Disnix",
      "name": "disnix-0.3",
      "method": "binary"
    }
  ]
}

The configuration file defines an object with three members:

  • dbiConnection contains the Perl DBI connection string that connects to Hydra's PostgreSQL database instance.
  • outDir refers to a path in which the binary cache and other backup files will be stored. This path could refer to (for example) the mount point of another partition or network drive.
  • releases is an array of objects defining which releases must be exported. The method field determines the deployment type of the closure that needs to be serialized, which can be either a binary or cache deployment.

By running the following command, I can backup the releases:


$ hydra-backup config.json

The above command creates two folders: /home/sander/hydrabackup/cache contains the binary cache generated by nix-push from the corresponding store derivation files or outputs of each job. The /home/sander/hydrabackup/releases folder contains text files with the actual paths belonging to the closures of each release.

The backup approach (using a binary cache) also allows me to update the releases and to efficiently make new backups. For example, by changing the disnix-0.3 release and running the same command again, only new paths are being exported.

One of the things that may happen after updating releases is that some NAR and narinfo files become obsolete. I have also created a script that takes care of removing them automatically. What it basically does is compare the releases' closure files with the contents of the binary cache and remove the files that are not defined in any of the closure files. It can be invoked as follows:


$ hydra-collect-backup-garbage config.json

Restoring Hydra releases on a different machine can be done by copying the /home/sander/hydrabackup folder to a different machine and by running:


$ hydra-restore config.json

Backing up the Hydra database


In addition to releases, we may want to keep the Hydra database so that we don't have to reconfigure all projects, jobsets, releases and user accounts after a crash. A dump of the database can be created by running:


$ pg_dump hydra | xz > /home/sander/hydrabackup/hydra-20140722.pgsql.xz

And we can restore it by running the following command:


$ xzcat /home/sander/hydrabackup/hydra-20140722.pgsql.xz | psql hydra
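
If the hydra database does not exist yet on the target machine, it may first have to be created, for example with PostgreSQL's createdb utility. This is an assumption about the target setup rather than a step covered by the scripts above:

$ createdb hydra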

Conclusion


In this blog post, I have described an approach that allows someone to fully back up Nix (and Hydra) builds. Although it may feel great to have the ability to do so, it also comes at a price -- closures consume a lot of disk space, since every closure contains all transitive dependencies that are required to run or build it. In some upgrade scenarios, none of the dependencies can be shared, which is quite costly.

In many cases it would be more beneficial to only back up the Nix expressions and the Hydra database, and redo the builds with the latest versions of the dependencies, unless there is really a good reason to exactly reproduce an older configuration.

Furthermore, I am not the only person who has investigated Hydra backups. The Hydra distribution includes a backup script named: hydra-s3-backup-collect-garbage that automatically stores relevant artifacts in an Amazon S3 bucket. However, I have no clue how to use it and what its capabilities are. Moreover, I am an old-fashioned guy who still wants to store backups on physical media rather than in the cloud. :)

The scripts described in this blog post can be obtained from my GitHub page. If other people consider any of these scripts useful, I might reserve some time to investigate whether they can be included in the Hydra distribution package.

Managing private Nix packages outside the Nixpkgs tree

In a couple of older blog posts, I have explained the basic concepts of the Nix package manager as well as how to write package "build recipes" (better known as Nix expressions) for it.

Although Nix expressions may look unconventional, the basic idea behind specifying packages in the Nix world is simple: you define a function that describes how to build a package from source code and its dependencies, and you invoke the function with the desired variants of the dependencies as parameters to build it. In Nixpkgs, a collection of more than 2500 (mostly free and open source) packages that can be deployed with Nix, all packages are basically specified like this.

However, there might still be some practical issues. In some cases, you may just want to experiment with Nix or package private software not meant for distribution. In such cases, you typically want to store them outside the Nixpkgs tree.

Although the Nix manual describes how things are packaged in Nixpkgs, it does not (clearly) describe how to define and compose packages while keeping them separate from Nixpkgs.

Since it is not officially documented anywhere and I'm getting (too) many questions about this from beginners, I have decided to write something about it.

Specifying a single private package


In situations in which I want to quickly try or test one simple package, I typically write a Nix expression that looks as follows:


with import <nixpkgs> {};

stdenv.mkDerivation {
  name = "mc-4.8.12";

  src = fetchurl {
    url = http://www.midnight-commander.org/downloads/mc-4.8.12.tar.bz2;
    sha256 = "15lkwcis0labshq9k8c2fqdwv8az2c87qpdqwp5p31s8gb1gqm0h";
  };

  buildInputs = [ pkgconfig perl glib gpm slang zip unzip file gettext
    libX11 libICE e2fsprogs ];

  meta = {
    description = "File Manager and User Shell for the GNU Project";
    homepage = http://www.midnight-commander.org;
    license = "GPLv2+";
    maintainers = [ stdenv.lib.maintainers.sander ];
  };
}

The above expression is a Nix expression that builds Midnight Commander, one of my favorite UNIX utilities (in particular the editor that comes with it :-) ).

In the above Nix expression, there is no distinction between a function definition and invocation. Instead, I directly invoke stdenv.mkDerivation {} to build Midnight Commander from source and its dependencies. I obtain the dependencies from Nixpkgs by importing the composition attribute set into the lexical scope of the expression through with import <nixpkgs> {};.

I can put the above file (named: mc.nix) in a folder outside the Nixpkgs tree, such as my home directory, and build it as follows:


$ nix-build mc.nix
/nix/store/svm98wmbf01dswlfcvvxfqqzckbhp5n5-mc-4.8.12

Or install it in my profile by running:


$ nix-env -f mc.nix -i mc

The dependencies (that are provided by Nixpkgs) can be found thanks to the NIX_PATH environment variable that contains a setting for nixpkgs. On NixOS, this environment variable has already been set. On other Linux distributions or non-NixOS installations, this variable must be manually configured to contain the location of Nixpkgs. An example could be:


$ export NIX_PATH=nixpkgs=/home/sander/nixpkgs

The above setting specifies that a copy of Nixpkgs resides in my home directory.

Maintaining a collection of private packages


It may also happen that you want to package a few of the dependencies of a private package while keeping them out of Nixpkgs, or that you simply want to maintain a collection of private packages. In such cases, I basically define every package as a function, which is no different from the way it is done in Nixpkgs and described in the Nix manual:


{ stdenv, fetchurl, pkgconfig, glib, gpm, file, e2fsprogs
, libX11, libICE, perl, zip, unzip, gettext, slang
}:

stdenv.mkDerivation rec {
  name = "mc-4.8.12";

  src = fetchurl {
    url = http://www.midnight-commander.org/downloads/mc-4.8.12.tar.bz2;
    sha256 = "15lkwcis0labshq9k8c2fqdwv8az2c87qpdqwp5p31s8gb1gqm0h";
  };

  buildInputs = [ pkgconfig perl glib gpm slang zip unzip file gettext
    libX11 libICE e2fsprogs ];

  meta = {
    description = "File Manager and User Shell for the GNU Project";
    homepage = http://www.midnight-commander.org;
    license = "GPLv2+";
    maintainers = [ stdenv.lib.maintainers.sander ];
  };
}

However, to compose the package (i.e. calling the function with the arguments that are used as dependencies), I have to create a private composition expression instead of adapting pkgs/top-level/all-packages.nix in Nixpkgs.

A private composition expression could be defined as follows:


{ system ? builtins.currentSystem }:

let
  pkgs = import <nixpkgs> { inherit system; };
in
rec {
  pkgconfig = import ./pkgs/pkgconfig {
    inherit (pkgs) stdenv fetchurl automake;
  };

  gpm = import ./pkgs/gpm {
    inherit (pkgs) stdenv fetchurl flex bison ncurses;
  };

  mc = import ./pkgs/mc {
    # Use custom pkgconfig and gpm packages as dependencies
    inherit pkgconfig gpm;
    # The remaining dependencies come from Nixpkgs
    inherit (pkgs) stdenv fetchurl glib file perl;
    inherit (pkgs) zip unzip gettext slang e2fsprogs;
    inherit (pkgs.xlibs) libX11 libICE;
  };
}

The above file (named: custom-packages.nix) invokes the earlier Midnight Commander expression (defining a function) with its required parameters.

Two of its dependencies are also composed in the same expression, namely: pkgconfig and gpm that are also stored outside the Nixpkgs tree. The remaining dependencies of Midnight Commander are provided by Nixpkgs.

Using the above Nix expression file and by running the following command-line instruction:


$ nix-build custom-packages.nix -A mc
/nix/store/svm98wmbf01dswlfcvvxfqqzckbhp5n5-mc-4.8.12

I can build the package using my private composition of packages.

I can also install it into my Nix profile by running:


$ nix-env -f custom-packages.nix -iA mc

Because the composition expression is also a function taking system as a parameter (which defaults to the same system architecture as the host system), I can also build Midnight Commander for a different system architecture, such as a 32-bit Intel Linux system:


$ nix-build custom-packages.nix -A mc --argstr system i686-linux

Simplifying the private composition expression


The private composition expression shown earlier passes all required function arguments to each package definition, which basically requires anyone to write function arguments twice: first to define them and later to provide them.

In 95% of the cases, the function parameters are typically packages defined in the same composition attribute set having the same attribute names as the function parameters.

In Nixpkgs, there is a utility function named callPackage {} that simplifies things considerably -- it automatically passes all requirements to the function by taking the attributes with the same name from the composition expression. So there is no need to write: inherit gpm ...; anymore.

We can also define our own private callPackage {} function that does this for our private composition expression:


{ system ? builtins.currentSystem }:

let
  pkgs = import <nixpkgs> { inherit system; };

  callPackage = pkgs.lib.callPackageWith (pkgs // pkgs.xlibs // self);

  self = rec {
    pkgconfig = callPackage ./pkgs/pkgconfig { };

    gpm = callPackage ./pkgs/gpm { };

    mc = callPackage ./pkgs/mc { };
  };
in
self

The above expression is a simplified version of our earlier composition expression (named: custom-packages.nix) that uses callPackage {} to automatically pass all required dependencies to the functions that build a package.

callPackage itself is composed from the pkgs.lib.callPackageWith function. The first parameter (pkgs // pkgs.xlibs // self) defines the auto-arguments. In this particular case, I have specified that the automatic function arguments come from self (our private composition) first, then from the xlibs sub attribute set from Nixpkgs, and then from the main composition attribute set of Nixpkgs.

With the above expression, we accomplish exactly the same thing as in the previous expression, but with fewer lines of code. We can also build the Midnight Commander exactly the same way as we did earlier:


$ nix-build custom-packages.nix -A mc
/nix/store/svm98wmbf01dswlfcvvxfqqzckbhp5n5-mc-4.8.12

Conclusion


In this blog post, I have described how I typically maintain a single package or a collection of packages outside the Nixpkgs tree. More information on how to package things in Nix can be found in the Nix manual and the Nixpkgs manual.

Wireless ad-hoc distributions of iOS applications with Hydra

In a number of earlier blog posts, I have shown Hydra, a Nix-based continuous integration server, and Nix functions allowing someone to automatically build mobile applications for Android and iOS with the Nix package manager (and Hydra).

Apart from being able to continuously build new versions of mobile applications, Hydra offers another interesting benefit -- we can use a web browser on an Android device, such as a phone or tablet (or even an emulator instance) to open the Hydra web interface, and conveniently install any Android app by simply clicking on the resulting hyperlink to an APK bundle.

It is also possible to automatically deliver iOS apps in a similar way. However, accomplishing this with Hydra turns out to be quite tedious and complicated. In this blog post, I will explain what I did to make this possible.

Wireless adhoc distributions of iOS apps


According to the following webpage: http://gknops.github.io/adHocGenerate, two requirements have to be met in order to provide wireless adhoc releases of iOS apps.

First, we must compose a plist file containing a collection of meta attributes of the app to be distributed. For example:


<plist version="1.0">
<dict>
  <key>items</key>
  <array>
    <dict>
      <key>assets</key>
      <array>
        <dict>
          <key>kind</key>
          <string>software-package</string>
          <key>url</key>
          <string>http://192.168.1.101/Renamed.ipa</string>
        </dict>
      </array>
      <key>metadata</key>
      <dict>
        <key>bundle-identifier</key>
        <string>com.myrenamedcompany.renamedapp</string>
        <key>bundle-version</key>
        <string>1.0</string>
        <key>kind</key>
        <string>software</string>
        <key>title</key>
        <string>Renamed</string>
      </dict>
    </dict>
  </array>
</dict>
</plist>

The above plist file defines a software package with bundle id: com.myrenamedcompany.renamedapp, version: 1.0 and name: Renamed. The corresponding IPA bundle is retrieved from the following URL: http://192.168.1.101/Renamed.ipa.

The second thing that we have to do is opening a specialized URL in the browser of an iOS device that refers to the plist file that we have defined earlier:


itms-services://?action=download-manifest&url=http://192.168.1.101/distribution.plist

If the plist file properly reflects the app's properties and the signing of the IPA file is done right (i.e. the device is authorized to install the app), then it should automatically be installed on the device after the user has accepted the confirmation request.

Generating a plist file and link page in a Nix/Hydra build


At first sight, integrating wireless adhoc distribution support in Nix (and Hydra) builds seemed easy to me -- I just generate the required plist file and an HTML page containing the specialized link URL (that gets clicked automatically by some JavaScript code) and expose these files as Hydra build products, so that they are accessible from Hydra's web interface.

Unfortunately, it turned out it is actually a bit more complicated than I thought -- the URLs to the plist and IPA files must be absolute. An absolute path to an IPA file served by Hydra may look as follows:


http://192.168.1.101/build/35256/download/1/Renamed.ipa

Two components of the URL cause a bit of inconvenience. First, we must know the hostname of the Hydra server. If I made this value a build property, then the build would become dependent on Hydra's hostname, forcing us to rebuild the app if it changes for some reason.

Second, the URL contains a unique build id assigned by Hydra that we do not know while performing the build. We have to obtain this value by some other means.

Solution: using page indirection


To solve this problem, I used a very hacky solution introducing an extra layer of indirection -- I have adapted the Nix function that builds iOS applications to generate an HTML file as a Hydra build product from the following template:

<!DOCTYPE html>

<html>
  <head>
    <title>Install IPA</title>
  </head>

  <body>
    <a id="forwardlink" href="@INSTALL_URL@">
      Go to the install page or wait a second
    </a>

    <script type="text/javascript">
      setTimeout(function() {
        var link = document.getElementById('forwardlink');

        if(document.createEvent) {
          var eventObj = document.createEvent('MouseEvents');
          eventObj.initEvent('click', true, false);
          link.dispatchEvent(eventObj);
        } else if(document.createEventObject) {
          link.fireEvent('onclick');
        }
      }, 1000);
    </script>
  </body>
</html>

What the above page does is show a hyperlink that redirects the user to another page. Some JavaScript code automatically clicks on the link after one second. After clicking on the link, the user gets forwarded to another web page that is responsible for providing the installation link. We use this obscure page indirection trick to allow the next page to extract some relevant Hydra properties from the referrer URL.

The build script substitutes the @INSTALL_URL@ template property by a relative (or absolute) path that may look as follows:


/installipa.php?bundle=com.myrenamedcompany.renamedapp&version=1.0&title=Renamed

Besides forwarding the user to another page, we also pass the relevant build properties that we need to generate a plist file as GET parameters. Furthermore, the generated HTML build product's URL has nearly the same structure as the URL of an IPA file:


http://192.168.1.101/build/35256/download/2/Renamed.html

The fact that the build product URL of the redirection page has nearly the same structure makes it quite easy for us to extract the remaining properties (the hostname and build id) we need to generate the plist file.

The PHP page that we link to (/installipa.php) is responsible for generating a web page with the specialized itms-services:// URL that triggers an installation. With the following PHP code we can extract the hostname, app name and build id from the referrer URL:


$url_components = parse_url($_SERVER["HTTP_REFERER"]);
$hostname = $url_components["host"];
$app_path = dirname(dirname($url_components["path"]));
$app_name = basename($url_components["path"], ".html");

We can determine the protocol that is being used as follows:


if($_SERVER["HTTPS"] == "")
    $protocol = "http://";
else
    $protocol = "https://";

And compose the absolute IPA URL out of the previous variables:


$ipa_url = $protocol.$hostname.$app_path."/1/".$app_name.".ipa";

Then we display a hyperlink with the specialized installation URL that is generated as follows:


<?php
$plistURL = $protocol.$hostname."/distribution.plist.php".$plistParams;
?>
<a href="itms-services://?action=download-manifest&amp;url=<?php print($plistURL); ?>">
Click this link to install the IPA
</a>

The plist file that the itms-services:// URL refers to is another PHP script that generates the plist dynamically from a number of GET parameters. The GET parameters are composed as follows:


$plistParams = urlencode("?ipa_url=".$ipa_url.
    "&bundleId=".$_REQUEST["bundleId"].
    "&version=".$_REQUEST["version"].
    "&title=".$_REQUEST["title"]);

By applying the same JavaScript trick shown earlier, we can also automatically click on the installation link to save the user some work.

Adapting Hydra's configuration to use the IPA installation script


To allow users to actually do wireless adhoc installations, the two PHP scripts described earlier must be deployed to the Hydra build coordinator machine. If NixOS is used to deploy the Hydra coordinator machine, then it is simply a matter of adding a few additional configuration properties to the HTTP reverse proxy service section of its NixOS configuration file:


services.httpd = {
  enable = true;
  adminAddr = "admin@example.com";
  hostName = "hydra.example.com";
  extraModules = [
    { name = "php5"; path = "${pkgs.php}/modules/libphp5.so"; }
  ];
  documentRoot = pkgs.stdenv.mkDerivation {
    name = "distribution-proxy";
    src = pkgs.fetchgit {
      url = https://github.com/svanderburg/nix-xcodeenvtests.git;
      rev = "0ba187cc83941bf16c691094480f0632b8116e48";
      sha256 = "4f440e4f3c7b58c40b86e2c8c18608606b64bf341aed233519e9023fff1ceb01";
    };
    buildCommand = ''
      mkdir -p $out
      cp $src/src/distribution-proxy/*.php $out
    '';
  };

  extraConfig = ''
    <proxy>
      Order deny,allow
      Allow from all
    </proxy>

    ProxyPass /installipa.php !
    ProxyPass /distribution.plist.php !

    ProxyRequests Off
    ProxyPreserveHost On
    ProxyPass / http://localhost:3000/ retry=5 disablereuse=on
    ProxyPassReverse / http://localhost:3000/
  '';
};

What I did in the above reverse proxy server configuration snippet is configure the documentRoot to refer to a folder containing the two PHP scripts shown earlier. The scripts are retrieved from a Git repository. Before I configure the reverse proxy, I declare that two request URLs, namely the PHP scripts, should not be forwarded to Hydra's Catalyst server.

Usage


After setting up a Hydra instance that hosts these two PHP scripts, we can build an iOS app (such as our trivial example testcase) that includes an HTML forwarding page that allows us to automatically install the app on an iOS device. This can be done with the following expression:


{xcodeenv}:

xcodeenv.buildApp {
  name = "Renamed";
  src = ../../src/Renamed;
  release = true;

  certificateFile = /Users/sander/mycertificate.p12;
  certificatePassword = "secret";
  codeSignIdentity = "iPhone Distribution: My Cool Company";
  provisioningProfile = /Users/sander/provisioningprofile.mobileprovision;
  generateIPA = true;

  enableWirelessDistribution = true;
  installURL = "/installipa.php";
  bundleId = "com.mycoolcompany.renamed";
  version = "1.0";
  title = "Renamed";
}

Setting the enableWirelessDistribution parameter to true makes the build function generate the HTML page as a build product. The installURL, bundleId, version and title parameters are used for the page forwarding and the plist file generation.

Result


By setting up a Hydra jobset using the above function, we can open the Hydra web application in a web browser on an iOS device and navigate to an iOS build:


Clicking on the 'Documentation of type install' build product does our page forwarding trick. After 2 seconds a confirmation dialog should appear:


After clicking on the 'Install' button, the app gets installed and appears in the program menu:


And finally we can run it! Isn't it great?

Concluding remarks


In this blog post I have described a hacky method using page indirection that makes it possible to use Hydra to do wireless adhoc distributions of iOS apps.

Unfortunately, I also discovered that for devices running iOS 7.1 and onwards, an HTTPS connection to the plist and IPA files is required, with a valid, trustable cross-signed certificate, making things even more tedious and complicated.

The hacky PHP scripts described in this blog post are part of the Nix xcode test package that can be obtained from my GitHub page.

It is also quite funny to realise that all these steps are not required at all for Android apps. Simply making APK files available for download is enough.

Deploying NPM packages with the Nix package manager

I have encountered several people saying that the Nix package manager is a nice tool, but they do not want to depend on it to build software. Instead, they say that they want to keep using the build tools they are familiar with.

To clear up some confusion: Nix's purpose is not to replace any build tools, but to complement them by composing isolated environments in which these build tools are executed.

Isolated environments


Isolated environments composed by Nix have the following traits:

  • All environment variables are initially cleared or set to dummy values.
  • Environment variables are modified in such a way that only the declared dependencies can be found, e.g. by adding the full path of these packages (residing in separate directories) to PATH, PERL5LIB, CLASSPATH etc.
  • Processes can only write to a designated temp folder and output folders in the Nix store. Write access to any other folder is restricted.
  • After the build has finished, the output files in the Nix store are made read-only and their timestamps are reset to 1 (UNIX time).
  • The environment can optionally be composed in a chroot environment in which no undeclared dependencies and non-package related arbitrary files on the filesystem can be accidentally accessed, no network activity is possible and other processes cannot interfere.

In these environments, you can execute many kinds of build tools, such as GNU Autotools, GNU Make, CMake, Apache Ant, SCons, Perl's MakeMaker and Python's setuptools, typically with few problems. In Nixpkgs, a collection of more than 2500 mostly free and open-source packages, we run many kinds of build tools inside isolated environments composed by Nix.

Moreover, besides running build tools, we can also do other stuff in isolated environments, such as running unit tests, or spawning virtual machine instances in which system integration tests are performed.

So what are the benefits of using such an approach as opposed to running build tools directly in an ad-hoc way? The main benefit is that package deployment (and even entire system configurations and networks of services and machines) become much more reliable and reproducible. Moreover, we can also run multiple builds safely in parallel improving the efficiency of deployment processes.

The only requirement is that a software project follows some simple rules, so that builds do not fail because of the restrictions that these isolated environments impose. A while ago, I wrote a blog post on techniques and lessons to improve software deployment that gives some more details on this. Moreover, if you follow these rules, you should still be able to build your software project with your favourite build tools outside Nix.

(As a side note: Nix can actually also be used as a build tool, but this application area is still experimental and not frequently used. More info on this can be found in Chapter 10 of Eelco Dolstra's PhD thesis that can be obtained from his publications page).

Dependency management


The fact that many build tools can be complemented by Nix probably sounds good, but there is one particular class of build tools that is problematic to use with Nix -- namely, build tools that also do dependency management in addition to build management. These kinds of tools conflict with the Nix package manager, because the build tool typically insists on taking over Nix's responsibilities as a dependency manager.

Moreover, Nix's facilities typically prevent such tools from consulting external resources. If we allowed them to do their own dependency management tasks (which is actually possible by hacking around Nix's deployment model), then the corresponding hash codes inside the Nix store paths (which are derived from all build-time dependencies) would no longer be guaranteed to accurately represent the same build results, limiting reliable and reproducible deployment. The fact that other dependency managers use weaker, nominal version specifications mainly contributes to that.

Furthermore, regardless of which package manager is used, you can no longer rely solely on the package management system's dependency manager to deploy a system; you also depend on extra tools and additional distribution channels, which is generally considered tedious by software distribution packagers and end users.

NPM package manager


A prominent example of a tool doing both build and dependency management is the Node.js Package Manager (NPM), which is the primary means within the Node.js community to build and distribute software packages. It can be used for a variety of Node.js related deployment tasks.

The most common deployment task is probably installing the NPM package dependencies of a development project. What developers typically do is enter the project's working directory and run:

$ npm install

This installs all its dependencies (which are obtained from the NPM registry, external URLs and Git repositories) in a special purpose folder named node_modules/ in the project workspace so that the project can be run.

You can also globally install NPM packages from the NPM registry (such as command-line utilities), by running:

$ npm install -g nijs

The above command installs an NPM package named NiJS globally, including all its dependencies. After the installation has completed, you should be able to run the following instruction on the command-line:

$ nijs-build --help

NPM related deployment tasks are driven by a specification called package.json that is included in every NPM package or the root folder of a development project. For example, NiJS' package.json file looks as follows:

{
  "name" : "nijs",
  "version" : "0.0.18",
  "description" : "An internal DSL for the Nix package manager in JavaScript",
  "repository" : {
    "type" : "git",
    "url" : "https://github.com/svanderburg/nijs.git"
  },
  "author" : "Sander van der Burg",
  "license" : "MIT",
  "bin" : {
    "nijs-build" : "./bin/nijs-build.js",
    "nijs-execute" : "./bin/nijs-execute.js"
  },
  "main" : "./lib/nijs",
  "dependencies" : {
    "optparse" : ">= 1.0.3",
    "slasp": "0.0.4"
  }
}

The above package.json file defines a package configuration object having the following properties:

  • The name and version attributes define the name of the package and its corresponding version number. These two attributes are mandatory and if they are undefined, NPM deployment fails. Moreover, version numbers are required to follow the semver standard. One of semver's requirements is that the version attribute should consist of three version components.
  • The description, repository, author and license attributes are simply just meta information. They are not used during the execution of deployment steps.
  • The bin attribute defines which executable files it should deploy and to which CommonJS modules in the package they map.
  • The main attribute refers to the module that is the primary entry point to the package if it is included through require().
  • The dependencies parameter specifies the dependencies that this package has on other NPM packages. This package depends on a library called optparse that must be of version 1.0.3 or higher and a library named slasp which must be exactly of version 0.0.4. More information on how NPM handles dependencies is explained in the next section.

Since the above package is a pure JavaScript package (which most NPM packages are), no build steps are needed. However, if a package does need to perform build steps, e.g. compiling CoffeeScript to JavaScript or building bindings to native code, then a collection of scripts can be specified, which are run at various times in the lifecycle of a package, e.g. before and after the installation steps. These scripts can (for example) execute the CoffeeScript compiler, or invoke Gyp to compile bindings to native code.

Replacing NPM's dependency management


So how can we deploy NPM packages in an isolated environment composed by Nix? In other words: how can we "complement" NPM with Nix?

To accomplish this, we must substitute NPM's dependency manager, which conflicts with the Nix package manager, with something that does the dependency management the "Nix way", while retaining the NPM semantics and keeping its build facilities.

Luckily, we can easily do that by just running NPM inside a Nix expression and "fooling" it not to install any dependencies itself, by providing copies of these dependencies in the right locations ourselves.

For example, to make deployment of NiJS work, we can simply extract the tarball's contents, copy the result into the Nix store, enter the output folder, and copy its dependencies into the node_modules directory ourselves:

mkdir -p node_modules
cp -r ${optparse} node_modules
cp -r ${slasp} node_modules

(The above antiquoted expressions, such as ${optparse}, refer to the results of Nix expressions that build the corresponding dependencies).

Finally, we should be able to run NPM inside a Nix expression as follows:

$ npm --registry http://www.example.com --nodedir=${nodeSources} install

When running the above command-line instruction after the copy commands, NPM notices that all the required dependencies of NiJS are already present and simply proceeds without doing anything.

We also provide a couple of additional parameters to npm install:

  • The --registry parameter prevents the NPM registry from being consulted if any dependency appears to be missing, which is undesirable. We want deployment of NPM package dependencies to be Nix's responsibility, and making the build fail when dependency specifications are incomplete is exactly what we need to be sure that we correctly specify all required dependencies.
  • The --nodedir parameter specifies where the Node.js source code can be found, which is used to build NPM packages that have bindings to native code. nodeSources is a directory containing the unpacked Node.js source code:

    nodeSources = runCommand "node-sources" {} ''
      tar --no-same-owner --no-same-permissions -xf ${nodejs.src}
      mv node-* $out
    '';

  • When running NPM in a source code directory (as shown earlier), all development dependencies are installed as well, which is often not required. By providing the --production parameter, we can deploy the package in production mode, skipping the development dependencies.

    Unfortunately, there is one small problem that could occur with some packages defining a prepublish script -- NPM tries to execute this script while a development dependency might be missing causing the deployment to fail. To remedy this problem, I also provide the --ignore-scripts parameter to npm install and I only run the install scripts afterwards, through:

    $ npm run install --registry http://www.example.com --nodedir=${nodeSources}

Translating NPM's dependencies


The main challenge of deploying NPM packages with Nix is implementing a Nix equivalent for NPM's dependency manager.

Dependency classes


Currently, an NPM package configuration could declare the following kinds of dependencies which we somehow have to fit in Nix's deployment model:

  • The dependencies attribute specifies which dependencies must be installed along with the package to run it. As we have seen earlier, simply copying the package of the right version into the node_modules folder in the Nix expression suffices.
  • The devDependencies attribute specifies additional dependencies that are installed in development mode. For example, when running: npm install inside the folder of a development project, the development dependencies are installed as well. Also, simply copying them suffices to allow deployment in a Nix expression to work.
  • The peerDependencies attribute might suggest another class of dependencies that are installed along with the package, because of the following sentence in the package.json specification:

    The host package is automatically installed if needed.

    After experimenting with a basic package configuration containing only one peer dependency, I discovered that peer dependencies are basically used as a checking mechanism to see whether no incompatible versions are accidentally installed. In a Nix expression, we don't have to do any additional work to support this and we can leave the check up to NPM that we run inside the Nix expression.
  • bundledDependencies affects the publishing process of the package to the NPM registry. The bundled dependencies refer to a subset of the declared dependencies that are statically bundled along with the package when it's published to the NPM registry.

    When downloading and unpacking a package from the NPM registry that has bundled dependencies, a node_modules folder exists that contains these dependencies, including all their dependencies.

    To support bundled dependencies in Nix, we must first check whether a dependency already exists in the node_modules folder. If this is the case, we should leave it as it is, instead of providing the dependency ourselves (see the sketch after this list).
  • optionalDependencies are also installed along with a package, but do not cause the deployment to fail if any error occurs. In Nix, optional dependencies can be supported by using the same copying trick as regular dependencies. However, accepting failures (especially non-deterministic ones), is not something the Nix deployment model supports. Therefore, I did not derive any equivalent for it.
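
To illustrate the bundled dependency check mentioned above, a build script could only copy a dependency when it is not already present. This is a rough sketch reusing the optparse dependency from the earlier copying example:

# Only provide the dependency ourselves if it was not bundled with the package
if [ ! -d node_modules/optparse ]
then
    cp -r ${optparse} node_modules
fi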

Version specifications


There are various ways to refer to a specific version of a dependency. Currently, NPM supports the following kinds of version specifications:

  • Exact version numbers (that comply with the semver standard), e.g. 1.0.1
  • Version ranges complying with the semver standard, e.g. >= 1.0.3, 5.0.0 - 7.2.3
  • Wildcards complying with the semver standard, e.g. any version: * or any 1.0 version: 1.0.x
  • The latest keyword referring to the latest stable version and the unstable keyword referring to the latest unstable version.
  • HTTP/HTTPS URLs referring to a TGZ file being an NPM package, e.g. http://localhost/nijs-0.0.18.tgz.
  • Git URLs referring to Git repositories containing an NPM package, e.g. https://github.com/svanderburg/nijs.git.
  • GitHub identifiers, referring to an NPM package hosted at GitHub, e.g. svanderburg/nijs
  • Local paths, e.g. /home/sander/nijs

As described earlier, we can't leave fetching the dependencies up to NPM, but Nix has to do this instead. For most version specifications (the only exception being local paths) we can't simply write a function that takes a version specifier as input and fetches it:

  • Packages with exact version numbers and version ranges are fetched from the NPM registry. In Nix, we have to translate these into fetchurl {} invocations, which require a URL and an output hash value as parameters, allowing us to check the result to make builds reliable and reproducible.

    Luckily, we can retrieve the URL to the NPM package's TGZ file and its corresponding SHA1 hash by fetching the package's metadata from the NPM registry, by running:
    $ npm info nijs@0.0.18
    { name: 'nijs',
      description: 'An internal DSL for the Nix package manager in JavaScript',
      'dist-tags': { latest: '0.0.18' },
      ...
      dist:
       { shasum: 'bfdf140350d2bb3edae6b094dbc31035d6c7bec8',
         tarball: 'http://registry.npmjs.org/nijs/-/nijs-0.0.18.tgz' },
      ...
    }

    We can translate the above metadata into the following Nix function invocation:

    fetchurl {
      name = "nijs-0.0.18.tgz";
      url = http://registry.npmjs.org/nijs/-/nijs-0.0.18.tgz;
      sha1 = "bfdf140350d2bb3edae6b094dbc31035d6c7bec8";
    }

  • Version ranges are in principle unsupported in Nix in the sense that you cannot write a function that takes a version range specifier and simply downloads the latest version of the package that conforms to it, since it conflicts with Nix's reproducibility properties.

    If we allowed version ranges to be downloaded, then the hash code inside a Nix store path would not necessarily refer to the same build result anymore. For example, running the same download tomorrow might give a different result, because the package has been updated.

    For example, the following path:

    /nix/store/j631r0ak98156v1xkx22n4fsl3zbmzi8-node-slasp-0.0.x

    Might refer to slasp version 0.0.4 today and to version 0.0.5 tomorrow, while the hash code remains identical. This is incompatible with Nix's deployment model.

    To still support deployment of packages having dependencies on version ranges of packages, we basically have to "snapshot" a dependency version by running:

    $ npm info nijs@0.0.x

    and create a fetchurl {} invocation from the particular version that is returned. The disadvantage of this approach is that, if we want to keep our versions up to date, we have to repeat this step every time a package has been updated.

  • The same thing applies to wildcard version specifiers. However, there is another caveat -- if we encounter a wildcard version specifier, we cannot always assume that the latest conforming version can be taken, because NPM also supports shared dependencies.

    If a shared dependency conforms to a wildcard specifier, then the dependency is not downloaded, but the shared dependency is used instead, which may not necessarily be the latest version. Otherwise, the latest conforming version is downloaded. Shared dependencies are explained in the next section.
  • Also for 'latest' and 'unstable' we must apply the snapshot trick. However, we must do something else as well. If NPM encounters version specifiers like these, it will always try to consult the NPM registry to check which version they correspond to, which is undesirable. To prevent that, we must substitute these version specifiers in the package.json file by '*'.
  • For HTTP/HTTPS and Git/GitHub URLs, we must manually compose fetchurl {} and fetchgit {} function invocations, and we must compute their output hashes in advance. The nix-prefetch-url and nix-prefetch-git utilities are particularly useful for this. Moreover, we also have to substitute these URLs by '*' in the package.json before we run NPM inside a Nix expression, to prevent it from consulting external resources. A small sketch follows below.
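
For example, a Git URL dependency could be translated into a fetchgit {} invocation along the following lines. This is only a sketch: the revision and output hash are hypothetical placeholders that would normally be obtained with nix-prefetch-git.

fetchgit {
  url = "https://github.com/svanderburg/nijs.git";
  rev = "0000000000000000000000000000000000000000"; # hypothetical revision
  sha256 = "..."; # hypothetical output hash
}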

Private, shared and cyclic dependencies


Like the Nix package manager, NPM has the ability to support multiple versions of packages simultaneously -- not only the NPM package we intend to deploy, but also all its dependencies (which are also NPM packages) can have their own node_modules/ folder containing a package's private dependencies.

Isolation works for CommonJS modules, because when a module inside a package tries to include another package, e.g. through:

var slasp = require('slasp');

then first the node_modules/ folder of the package is consulted and the module is loaded from that folder if it exists. Furthermore, the CommonJS module system uses the resolved absolute paths of the modules, not just their names, to make a distinction between module variants. As a consequence, if the resolved path to a module with the same name is different, it's considered a different module by the module loader and thus does not conflict with others.

If a module cannot be found in the private node_modules/ folder, the module loading system recursively looks for node_modules/ folders in the parent directories, e.g.:

./nijs/node_modules/slasp/node_modules
./nijs/node_modules
./node_modules

This is how package sharing is accomplished in NPM.

NPM's policy regarding dependencies is basically that each package stores all its dependencies privately unless a dependency can be found in any of the parent directories that conforms to the version specification declared in the package. In such cases, the private dependency is omitted and a shared one will be used instead.

Also, because a dependency is installed only once, it's possible to define cyclic dependencies. Although it's generally known that cyclic dependencies are a bad practice, they are actually used by some NPM packages, such as es6-iterator.

The npm help install manual page says the following about cycles:

To avoid this situation, npm flat-out refuses to install any name@version that is already present anywhere in the tree of package folder ancestors. A more correct, but more complex, solution would be to symlink the existing version into the new location. If this ever affects a real use-case, it will be investigated.

In Nix, private and shared dependencies are handled differently. Packages can be "private" because they are stored in separate folders in the Nix store, whose paths are made unique because they contain hash codes derived from all their build-time dependencies.

Sharing is accomplished when a package refers to the same Nix store path with the same hash code. In Nix these mechanisms are more powerful, because they are not restricted to specific component types.

Nix does not support cyclic dependencies and lacks the ability to refer to a parent if a package is a dependency of another package.

To simulate NPM's way of sharing packages (and means of breaking dependency cycles) in Nix, I ended up writing the function that deploys NPM packages (named: buildNodePackage {}) roughly as follows:

{stdenv, nodejs, ...}:
{name, version, src, dependencies, ...}:
{providedDependencies}:

let
  requiredDependencies = ...;
  shimmedDependencies = ...;
in
stdenv.mkDerivation {
  name = "node-${name}-${version}";
  inherit src;
  ...
  buildInputs = [ nodejs ... ] ++ requiredDependencies;
  buildCommand = ''
    # Move extracted package into the Nix store
    mkdir -p $out/lib/node_modules/${name}
    mv * $out/lib/node_modules/${name}
    cd $out/lib/node_modules/${name}
    ...

    mkdir -p node_modules
    # Copy the required dependencies
    # Generate shims for the provided dependencies
    ...

    # Perform the build by running npm install
    npm --registry http://www.example.com --nodedir=${nodeSources} install
    ...

    # Remove the shims
  '';
}

The above expression defines a nested function with the following structure:

  • The first (outermost) function's parameters refer to the build time dependencies used for the deployment of any NPM package, such as the Nix standard environment that contains a basic UNIX toolset (stdenv) and Node.js (nodejs).
  • The second function's parameters refer to a specific NPM package's deployment parameters, such as the name of the package, the version, a reference to the source code (e.g. local path, URL or Git repository) and its dependencies.
  • The third (innermost) function's parameter (providedDependencies) is used by a package to propagate the identifiers of the already provided shared dependencies to a dependency that's being included, so that they are not deployed again. This is required to simulate NPM's shared dependency mechanism and to escape the infinite recursion that cyclic dependencies would otherwise cause.
  • From the dependencies and providedDependencies parameters, we can determine the required dependencies that we actually need to include privately to deploy the package. requiredDependencies are the dependencies minus the providedDependencies. The actual computation is quite tricky:
    • The dependencies parameter could be encoded as follows:
      {
        optparse = {
          ">= 1.0.3" = {
            version = "1.0.5";
            pkg = registry."optparse-1.0.5";
          };
        };
        ...
      }

      The dependencies parameter refers to an attribute set in which each attribute name represents a package name. Each member of this attribute set represents a dependency specification, which maps a version specification onto the latest snapshot (version and package) taken of it.
    • The providedDependencies parameter could be encoded as follows:
      {
        optparse."1.0.5" = true;
        ...
      }

      The providedDependencies parameter is an attribute set composed of package names and versions. If a package is in this attribute set, then it means it has been provided by any of the parents and should not be included again.
    • We use the semver utility to see whether any of the provided dependencies map to any of the version specifications in dependencies. For example, for optparse this means that we run:

      $ semver -r '>= 1.0.3' 1.0.5
      $ echo $?
      0

      The above command exits with a zero exit status, meaning that there is a shared dependency providing it and we should not deploy optparse privately again. As a result, it's not added to the required dependencies.
    • The above procedure is basically encapsulated in a derivation that generates a Nix expression with the list of required dependencies, which gets imported again -- a "trick" that I also used in NiJS and the Dynamic Disnix framework. A small sketch of this trick is shown after this list.

      The reason why we execute this procedure in a separate derivation is that, if we did the same thing in the builder environment of the NPM package, we would always refer to all possible dependencies, which prevents us from escaping any potential infinite recursion.
  • The required dependencies are copied into the private node_modules/ as follows:

    mkdir -p node_modules
    cp -r ${optparse propagatedProvidedDependencies} node_modules
    cp -r ${slasp propagatedProvidedDependencies} node_modules

    Now the innermost function parameter comes in handy -- to each dependency, we propagate the already provided dependencies, our own dependencies, and the package itself, to properly simulate NPM's way of sharing and breaking any potential cycles.

    As a sidenote: to ensure that dependencies are always correctly addressed, we must copy the dependencies. In older implementations, we used to create symlinks, which works fine for private dependencies, but not for shared dependencies.

    If a shared dependency is addressed, the module system looks relative to its own full resolved path, not to the symlink. Because the resolved path is completely different, the shared dependency cannot be found.
  • For the packages that are not considered required dependencies, we must generate shims to allow the deployment to still succeed. While these dependencies are provided by the includers at runtime, they are not visible in the Nix builder environment at build time and, as a consequence, the deployment would otherwise fail.

    Generating shims is quite easy. Simply generating a directory with a minimal package.json file only containing the name and version is enough. For example, the following suffices to fool NPM that the shared dependency optparse version 1.0.5 is actually present:

    mkdir node_modules/optparse
    cat > node_modules/optparse/package.json <<EOF
    {
      "name": "optparse",
      "version": "1.0.5"
    }
    EOF

  • Then we run npm install to execute the NPM build steps, which should succeed if all dependencies are correctly specified.
  • Finally, we must remove the generated shims, since they do not have any relevant meaning anymore.
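
As promised above, the following fragment is a highly simplified sketch of the "import a generated expression" trick. The names and the hard-coded result are hypothetical, stdenv is assumed to be in scope, and this is an illustration of the mechanism only, not the actual buildNodePackage {} implementation:

let
  # Hypothetical derivation that would normally run the semver checks and
  # write the resulting list of required dependency names to its output
  requiredDependenciesExpr = stdenv.mkDerivation {
    name = "required-dependencies.nix";
    buildCommand = ''
      echo '[ "optparse" ]' > $out
    '';
  };
in
import requiredDependenciesExpr # evaluates to: [ "optparse" ]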

Manually writing a Nix expression to deploy NPM packages


The buildNodePackage {} function described earlier can be used to manually write Nix expressions that deploy NPM packages:

with import <nixpkgs> {};

let
  buildNodePackage = import ./build-node-package.nix {
    inherit (pkgs) stdenv nodejs;
  };

  registry = {
    "optparse-1.0.5" = buildNodePackage {
      ...
    };
    "slasp-0.0.4" = buildNodePackage {
      ...
    };
    "nijs-0.0.18" = buildNodePackage {
      name = "nijs";
      version = "0.0.18";
      src = ./.;
      dependencies = {
        optparse = {
          ">= 1.0.3" = {
            version = "1.0.5";
            pkg = registry."optparse-1.0.5";
          };
        };
        slasp = {
          "0.0.4" = {
            version = "0.0.4";
            pkg = registry."slasp-0.0.4";
          };
        };
      };
    };
  };
in
registry

The partial Nix expression (shown above) can be used to deploy the NiJS NPM package through Nix.

Moreover, it also provides NiJS' dependencies, which are built by the same function abstraction. By using the above expression and the following command-line instruction:

$ nix-build -A '"nijs-0.0.18"'
/nix/store/y5r0raja6d8xlaav1mhw8jjxvx7bap85-node-nijs-0.0.18

NiJS is deployed by the Nix package manager including its dependencies.

Generating Nix packages from NPM package configurations


The buildNodePackage {} function shown earlier makes it possible to deploy NPM packages with Nix. However, its biggest drawback is that we have to manually write expressions for the package we want to deploy, including all its dependencies. Moreover, since version ranges are unsupported, we must manually check for updates and update the corresponding expressions every time, which is laborious and tedious.

To solve this problem, a tool has been developed named npm2nix that can automatically generate Nix expressions from NPM package.json specifications and collection specifications. It supports several kinds of use cases.

Deploying a Node.js development project


Running the following command generates a collection of Nix expressions from a package.json file of a development project:

$ npm2nix

The above command generates three files: registry.nix, containing Nix expressions for all package dependencies and the package itself; node-env.nix, containing the build logic; and default.nix, a composition expression allowing users to deploy the package.

By running the following Nix command with these expressions, the project can be built:

$ nix-build -A build

Generating a tarball from a Node.js development project


The earlier generated expressions can also be used to generate a tarball from the project:

$ nix-build -A tarball

The above command-line instruction (that basically runs npm pack) produces a tarball that is placed in the following location:

$ ls result/tarballs/npm2nix-6.0.0.tgz

The above tarball can be distributed to others and installed with NPM by running:

$ npm install npm2nix-6.0.0.tgz

Deploying a development environment of a Node.js development project


The following command-line instruction uses the earlier generated expressions to deploy all the dependencies and opens a development environment:

$ nix-shell -A build

Within this shell session, files can be modified and run without any hassle. For example, the following command should work without any trouble:

$ node bin/npm2nix.js --help

Deploying a collection of NPM packages from the NPM registry


You can also deploy existing NPM packages from the NPM registry, which is driven by a JSON specification that looks as follows:

[
  "async",
  "underscore",
  "slasp",
  { "mocha": "1.21.x" },
  { "nijs": "0.0.18" },
  { "npm2nix": "git://github.com/NixOS/npm2nix.git" }
]

The above specification is basically an array of objects. For each element that is a string, the latest version is obtained from the NPM registry. To obtain a specific version of a package, an object must be defined in which the keys are the names of the packages and the values are their version specifications. Any version specification that NPM supports can be used.

Nix expressions can be generated from this JSON specification as follows:

$ npm2nix -i node-packages.json

And using the generated Nix expressions, we can install NiJS through Nix as follows:

$ nix-env -f default.nix -iA '"nijs-0.0.18"'

Concluding remarks


In this lengthy blog post (which was quite a project!) I have outlined some differences between NPM and Nix, sketched an approach that can be used to deploy NPM packages with Nix, and described a generator: npm2nix that automates this approach.


The reason why I wrote this stuff down is that the original npm2nix developer has relinquished his maintainership and I became co-maintainer. Since the NixOS sprint in Ljubljana I've been working on reengineering npm2nix and solving the problem with cyclic dependencies and version mismatches with shared dependencies. Because the problem is quite complicated, I think it would be good to have something documented that describes the problems and my thoughts.

As part of the reengineering process, I ported npm2nix from CoffeeScript to JavaScript, used some abstraction facilities to tidy up the pyramid code (caused by nesting of callbacks), and modularized the codebase a bit further.

I am using NiJS for the generation of Nix expressions, and I modified it to have most Nix language concepts supported (albeit some of them can only be written in an abstract syntax). Moreover, the expressions generated by NiJS are now also pretty printed so that the generated code is still (mostly) readable.

The reengineered npm2nix can be obtained from the reengineering branch of my private GitHub fork and is currently in testing phase. Once it is considered stable enough, it will replace the old implementation.

Acknowledgements


The majority of npm2nix is not my work. Foremost, I'd like to thank Shea Levy, who is the original developer/author of npm2nix. He was maintaining it since 2012 and figured out most of NPM's internals, mappings of NPM concepts to Nix and how to use NPM specific modules (such as the NPM registry client) to obtain metadata from the NPM registry. Most of the stuff in the reengineered version is ported directly from the old implementation done by him.

Also I'd like to thank the other co-maintainers: Rok Garbas and Rob Vermaas for their useful input during the NixOS sprint in Ljubljana.

Finally, although the feedback period is open for only a short time, I've already received some very useful comments on #nixos and the Nix mailing list by various developers that I would like to thank.

Related work


NPM is not the only tool that does build and dependency management. Another famous (or perhaps notorious!) tool I found myself struggling with in the past was Apache Maven, which is quite popular in the Java world.

Furthermore, converters from other kinds of packages to Nix also exist. Other converters I am currently aware of are: cabal2nix, python2nix, go2nix, and bower2nix.

Deploying iOS applications with the Nix package manager revisited

Previously, I have written a couple of blog posts about iOS application deployment. For example, I have developed a Nix function that can be used to build apps for the iOS simulator and real iOS devices, made some testability improvements, and implemented a dirty trick to make wireless ad-hoc distributions of iOS apps possible with Hydra, the Nix-based continuous integration server.

Recently, I made some major changes to the Nix build function which I will describe in this blog post.

Supporting multiple Xcode versions


Xcode version 6.0 and beyond do not support iOS SDK versions below 8.0. Sometimes, it might still be desirable to build apps against older SDKs, such as 7.0. To be able to do that, we must also install older Xcode versions alongside newer versions.

As with recent Xcode versions, we must install older Xcode versions manually first and use a Nix proxy function to use them. DMG files for older Xcode versions can be obtained from Apple's developer portal.

When installing a second Xcode DMG, you typically get a warning that looks as follows:


The installer attempts to put Xcode in its standard location (/Applications/Xcode.app), but if you click on 'Keep Both' then it is installed in a different path, such as /Applications/Xcode 2.app.

I modified the proxy function (described in the first blog post) in such a way that the version number and path to Xcode are configurable:

{ stdenv
, version ? "6.0.1"
, xcodeBaseDir ? "/Applications/Xcode.app"
}:

stdenv.mkDerivation {
  name = "xcode-wrapper-"+version;
  buildCommand = ''
    mkdir -p $out/bin
    cd $out/bin
    ln -s /usr/bin/xcode-select
    ln -s /usr/bin/security
    ln -s /usr/bin/codesign
    ln -s "${xcodeBaseDir}/Contents/Developer/usr/bin/xcodebuild"
    ln -s "${xcodeBaseDir}/Contents/Developer/usr/bin/xcrun"
    ln -s "${xcodeBaseDir}/Contents/Developer/Applications/iOS Simulator.app/\
Contents/MacOS/iOS Simulator"

    cd ..
    ln -s "${xcodeBaseDir}/Contents/Developer/Platforms/\
iPhoneSimulator.platform/Developer/SDKs"

    # Check if we have the xcodebuild version that we want
    if [ -z "$($out/bin/xcodebuild -version | grep -x 'Xcode ${version}')" ]
    then
        echo "We require xcodebuild version: ${version}"
        exit 1
    fi
  '';
}

As can be seen in the expression, two parameters have been added to the function definition. Moreover, only tools that a particular installation of Xcode does not provide are referenced from /usr/bin. The rest of the executables are linked to the specified Xcode installation.

We can configure an alternative Xcode version by modifying the composition expression shown in the first blog post:

rec {
  stdenv = ...;

  xcodeenv = import ./xcode-wrapper.nix {
    version = "5.0.2";
    xcodeBaseDir = "/Applications/Xcode 2.app";
    inherit stdenv;
  };

  helloworld = import ./pkgs/helloworld {
    inherit xcodeenv;
  };

  ...
}

As may be observed, we pass a different Xcode version number and path as parameters to the Xcode wrapper which correspond to an alternative Xcode 5.0.2 installation.

The app can be built with Nix as follows:

$ nix-build default.nix -A helloworld
/nix/store/0nlz31xb1q219qrlmimxssqyallvqdyx-HelloWorld
$ cd result
$ ls
HelloWorld.app HelloWorld.app.dSYM

Simulating iOS apps


Previously, I also developed a Nix function that generates build scripts that automatically spawn iOS simulator instances in which apps are deployed, which is quite useful for testing purposes.

Unfortunately, things have changed considerably in the new Xcode 6 and the old method no longer works.

I created a new kind of script that is based on details described in the following Stack overflow article: http://stackoverflow.com/questions/26031601/xcode-6-launch-simulator-from-command-line.

First, simulator instances must be created through Xcode. This can be done by starting Xcode and opening Window -> Devices in the Xcode menu:


A new simulator instance can be added by clicking on the '+' button on the bottom left in the window:


In the above example, I create a new instance with a name 'iPhone 6' that simulates an iPhone 6 running iOS 8.0.

After creating the instance, it should appear in the device list:


Furthermore, each simulator instance has a unique device identifier (UDID). In this particular example, the UDID is: 0AD5FC1C-A360-4D05-9D6A-FD719C46A149

We can launch the simulator instance we just created from the command-line as follows:


$ open -a "$(readlink "${xcodewrapper}/bin/iOS Simulator")" --args \
-CurrentDeviceUDID 0AD5FC1C-A360-4D05-9D6A-FD719C46A149

We can provide the UDID of the simulator instance as a parameter to automatically launch it. If we don't know the UDID of a simulator instance, we can obtain a list from the command line by running:


$ xcrun simctl list
== Device Types ==
iPhone 4s (com.apple.CoreSimulator.SimDeviceType.iPhone-4s)
iPhone 5 (com.apple.CoreSimulator.SimDeviceType.iPhone-5)
iPhone 5s (com.apple.CoreSimulator.SimDeviceType.iPhone-5s)
iPhone 6 Plus (com.apple.CoreSimulator.SimDeviceType.iPhone-6-Plus)
iPhone 6 (com.apple.CoreSimulator.SimDeviceType.iPhone-6)
iPad 2 (com.apple.CoreSimulator.SimDeviceType.iPad-2)
iPad Retina (com.apple.CoreSimulator.SimDeviceType.iPad-Retina)
iPad Air (com.apple.CoreSimulator.SimDeviceType.iPad-Air)
Resizable iPhone (com.apple.CoreSimulator.SimDeviceType.Resizable-iPhone)
Resizable iPad (com.apple.CoreSimulator.SimDeviceType.Resizable-iPad)
== Runtimes ==
iOS 7.0 (7.0.3 - 11B507) (com.apple.CoreSimulator.SimRuntime.iOS-7-0)
iOS 7.1 (7.1 - 11D167) (com.apple.CoreSimulator.SimRuntime.iOS-7-1)
iOS 8.0 (8.0 - 12A365) (com.apple.CoreSimulator.SimRuntime.iOS-8-0)
== Devices ==
-- iOS 7.0 --
-- iOS 7.1 --
-- iOS 8.0 --
    iPhone 4s (868D3066-A7A2-4FD1-AF6A-25A90F480A30) (Shutdown)
    iPhone 5 (7C672CBE-5A08-481A-A5EF-2EA834E3FCD4) (Shutdown)
    iPhone 6 (0AD5FC1C-A360-4D05-9D6A-FD719C46A149) (Shutdown)
    Resizable iPhone (E95FC563-8748-4547-BD2C-B6333401B381) (Shutdown)

We can also install an app into the simulator instance from the command-line. However, to be able to install any app produced by Nix, we must first copy the app to a temp directory and restore write permissions:


$ appTmpDir=$(mktemp -d -t appTmpDir)
$ cp -r "$(echo ${app}/*.app)" $appTmpDir
$ chmod -R 755 "$(echo $appTmpDir/*.app)"

We need to do this because Nix makes a package immutable after it has been built, by removing the write permission bits. After restoring the permissions, we can install it in the simulator by running:


$ xcrun simctl install 0AD5FC1C-A360-4D05-9D6A-FD719C46A149 \
"$(echo $appTmpDir/*.app)"

And launch the app in the simulator with the following command:


$ xcrun simctl launch 0AD5FC1C-A360-4D05-9D6A-FD719C46A149 \
MyCompany.HelloWorld

Like the old simulator function, I have encapsulated the earlier described steps in a Nix function that generates a script spawning the simulator instance automatically. The example app can be deployed by writing the following expression:


{xcodeenv, helloworld}:

xcodeenv.simulateApp {
  name = "HelloWorld";
  bundleId = "MyCompany.HelloWorld";
  app = helloworld;
}
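
The above expression can be added to the composition expression (analogous to the helloworld attribute shown earlier) so that it can be built with the nix-build command used below. A sketch, in which the path is a hypothetical example:

simulate_helloworld = import ./pkgs/simulate-helloworld {
  inherit xcodeenv helloworld;
};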

By running the following command-line instructions, we can automatically deploy an app in a simulator instance:


$ nix-build -A simulate_helloworld
/nix/store/jldajknmycjwvf3s6n71x9ikzwnvgjqs-simulate-HelloWorld
$ ./result/bin/run-test-simulator 0AD5FC1C-A360-4D05-9D6A-FD719C46A149

And this is what the result looks like:


The UDID parameter passed to the script is not required. If a UDID has been provided, it deploys the app to that particular simulator instance. If the UDID parameter is omitted, it displays a list of simulator instances and asks the user to select one.

Conclusion


In this blog post, I have described an addition to the Nix function that builds iOS applications to support multiple versions of Xcode. Furthermore, I have implemented a new simulator spawning script that works with Xcode 6.

The example case can be obtained from my GitHub page.

Dr. Sander



As a follow-up story on the previous two blog posts, I can tell you that I defended my PhD thesis last Monday, which went successfully. Now I can officially call myself a Doctor.

It was quite a tense, busy and interesting day. I'm not really used to days in which you are the centre of attention for most of the time. In the morning, our foreign guests, Sam Malek and Roberto di Cosmo (who are members of my committee), arrived to tell a bit about their ongoing research. We had some sort of a small software deployment workshop with quite some interesting ideas and discussions. Having them visit us gave me some renewed excitement about ongoing research and some interesting future ideas, since software deployment (as I have explained) is typically a misunderstood and underappreciated research subject.

After our small workshop, I had to quickly pick up our suits and then return to the university. In the early afternoon, I gave a layman's talk to my family and friends in which I tried to explain the background and the goal of my PhD thesis. Then the actual defence started, lasting exactly one hour (no more, no less), in which the committee members asked me questions about my thesis and the accompanying propositions.

At Delft University of Technology, as at any other Dutch university, there is a lot of ceremonial stuff involved in a PhD defence. Professors have to be dressed up in gowns. My paranimphs (the two people sitting in front of me who help me and take over my defence if I pass out) and I had to be dressed up in suits. I had to address the committee members formally depending on their roles, such as: "hooggeleerde opponent", and the committee members had to formally address me with "waarde promovendus".

I received some interesting questions during my defence round. To have an idea what these questions were, Vadim Zaytsev has made 140 character transcriptions of each question and answer on Twitter.

Although I was experiencing some nervousness at the beginning of the day, as I didn't know what exactly was going to happen and what kind of questions I would receive, in the end it was fun and I liked it very much. After the defence I received my PhD diploma:


For people that want to know what my thesis is exactly about:

On the improvement of software deployment processes and some definitions

Some time ago, I wrote a blog post about techniques and lessons to improve software deployment processes. The take-home message of this blog post was that in order to improve deployment processes, you must automate everything from the very beginning in a software development process and properly decompose the process into sub units to make the process more manageable and efficient.

In this blog post, I'd like to dive a bit deeper into the latter aspect by exploring some definitions of "decomposition units" in the literature and by deriving mappings between them.

Software projects


The first "definition" that I want to mention is the software project, for which I (interestingly enough) could not find anything in the literature. The reason why I start with this term is that software deployment issues often already appear in the early stages of a software development process.

The term "software project" is something which is hard to define formally IMHO. To me they typically manifest themselves as directories of files that I can divide into the following categories:

  • Executable code. Files typically containing code implementing a program that performs computation and manipulates data.
  • Resources/data. Files not implementing anything that is executed, which are used or referenced by the program, such as images, configuration files, video, audio, HTML pages, etc.
  • Build configuration files. Configuration files used by a build system that transform or change the files belonging to the earlier two categories.

    For example, executable code is often implemented in higher level programming languages and must be compiled to object code so that the program can be executed. Also many kinds of other processing steps can be executed, such as scaling images to lower resolutions, obfuscating/minifying code, running a style checker, bundling object code and resources etc.

Sometimes it is hard to draw a strict line between executable code and data files. For example, it may be possible that a data artifact (e.g. an HTML page) includes executable code (e.g. embedded JavaScript), and the other way around, such as assembly code containing strings in its code sections for efficiency.

Software projects can often be conveniently created by an Integrated Development Environment (IDE) that typically provides useful templates and automatically fills in many boilerplate settings. However, for small projects, people frequently create software projects manually, for example, by manually creating a directory of source files with a Makefile.

It is probably obvious to notice that dealing with software deployment complexity requires automation, and that files belonging to the third category (build configuration files) must be provided. Yet, I have seen quite a few projects in the past in which nothing is automated and people still rely on manually executing build tasks in an IDE, which is often tedious, time consuming and error prone.

Software modules


An automated build process of a software project provides a basic and typically faster means of (re)producing releases of a software product and is often less error prone than a manual build process.

However, besides build process automation there could still be many other issues. For example, if a software project has a monolithic build structure in which nothing can be built separately, deployment times become unnecessarily long and their configurations often have a huge maintenance complexity. Also, upgrading an existing deployment is typically difficult, expensive and unreliable.

To improve the efficiency of build processes, we need to decompose them into units that can be built separately. An important prerequisite to accomplish build decomposition is functional separation of important aspects of a software project.

A relatively simple concept supporting functional separation is the software module. According to Clemens Szyperski's "Component Software" book, a software module is a unit that has the following characteristics:

  • A module implements an ADT (Abstract Data Type).
  • It encapsulates multiple entities, often classes, but sometimes other kinds of entities, such as functions.
  • It has no concept of instantiation, in other words: there is one and only one instance of a module.

Several programming languages have a notion of modules, such as Modula-2, Ada, C# and Java (since version 9). Sometimes the module concept is named differently in these languages. For example, in Ada modules are called packages and in C# they are called assemblies.

Not all programming languages support modular programming. Sometimes external facilities must be used, such as CommonJS in JavaScript. Moreover, modules can also be "simulated" in various ways, such as with static classes or singleton objects.

Encapsulating functionality into modules also typically imposes a certain filesystem structure for organizing the source code files. In some contexts, a module must correspond to a single file (e.g. in CommonJS) and in others to directories of files following a certain convention (e.g. in Java the names of directories should correspond to the package names, and the names of regular files to the name of the enclosing type in the code). Sometimes files belonging to a module can also be bundled into a single archive, such as a Zip container (e.g. a JAR file) or library file (e.g. *.dll or *.so files).

Refactoring a monolithic codebase into modules in a meaningful way is all but trivial. According to the paper "On the criteria to be used in decomposing systems into modules" written by David Parnas, it is a good practice to minimize coupling between modules (i.e. the dependencies between modules should be minimized) and maximize cohesion within modules (i.e. strongly related things should belong to the same module).

Software components


The biggest benefit of modularization is that parts of the code can be effectively reused. Reuse of software assets can be improved even further by turning modules (that typically work on code level) into software components that work on system level. Clemens Szyperski's "Component Software" book says the following about them:
The characteristic properties of a component are that it:

  • is a unit of independent deployment
  • is a unit of third-party composition
  • has no (externally) observable state

The above characteristics have several implications:

  • Independent deployment means that a component is well separated from the environment and other components, never deployed partially and third parties should not require access to its construction details.
  • To allow third-party composition a component must be sufficiently self contained and have clear specifications of what it provides and what it requires. In other words, they interact with the environment with well defined interfaces.
  • No externally observable state means that no distinction can be made between multiple copies of components.

So in what way are components different from modules? From my point of view, modularization is a prerequisite for componentization and some modules may already qualify themselves as minimal components.

However, notable differences between modules and components are that the former are allowed to have observable state (e.g. having global variables that are imperatively modified) and dependencies on implementations rather than interfaces.

Furthermore, to implement software components standardized component models are frequently used, such as CORBA, COM, EJB, or web services (e.g. SOAP, WSDL, UDDI) that provide various kinds of facilities, such as (some sort of) a platform independent interface, lookup and discovery. Modules typically use the interface facilities provided by a programming language.

Build-Level Components


Does functional separation of a monolithic codebase into modules and/or components also improve deployment? According to Merijn de Jonge's IEEE TSE paper titled: "Build-Level components" this is not necessarily true.

For example, it may still be possible that source code files implementing modules or components on a functional level are scattered across directories of source code files. Between the directories in a codebase, many references may exist (strong coupling) and directories often contain too many files (weak cohesion).

According to the paper, strong coupling and weak cohesion on the build level have the following disadvantages:
  1. potentially reusable code, contained in some of the entangled modules, cannot easily be made available for reuse;
  2. the fixed nature of directory hierarchies makes it hard to add or to remove functionality;
  3. the build system will easily break when the directory structure changes, or when files are removed or renamed.

In the paper, the author shows that Component-Based Software Engineering (CBSE) principles can be applied to the build level as well. Build-Level components can be formed by directories of source files and serve as a unit of composition. Access occurs via build, configuration, and requires interfaces:

  • The build interface defines which build operations to execute. In a GNU Autotools project following the GNU Coding Standards (used in the paper), these operations correspond to a number of standardized make targets, e.g. make all, make install, make dist.
  • The configuration interface defines which variability points and parameters can be enabled or disabled. In a GNU Autotools project, this interface corresponds to the --enable-foo and --disable-foo parameters passed to the configure script -- each enable or disable parameter defines a certain feature that can be enabled or disabled.
  • The requires interface can be used to bind dependencies to components. In a GNU Autotools project, this interface corresponds to the --with-foo and --without-foo parameters passed to the configure script, which take the paths to the corresponding dependencies as parameters, allowing the configure script to find them.

Although the paper only uses GNU Autotools-based projects for implementation purposes, build-level components are not restricted to any particular build technology -- the only thing that matters is that the operations for these three interfaces are standardized so that any component can be configured, composed, and built uniformly.
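
To give an impression of how these three interfaces could be driven uniformly, the following Nix expression is a sketch (with hypothetical package, feature and dependency names, not taken from the paper) that deploys a GNU Autotools-based build-level component:

{stdenv, fetchurl, libbar}:

stdenv.mkDerivation {
  name = "foo-1.0";
  src = fetchurl {
    url = http://example.org/foo-1.0.tar.gz; # hypothetical download location
    sha256 = "...";
  };

  # Configuration interface: enable an optional feature.
  # Requires interface: bind the libbar dependency to the component.
  configureFlags = [ "--enable-baz" "--with-libbar=${libbar}" ];

  # Build interface: the generic builder runs the standardized
  # ./configure, make and make install steps with the flags above.
}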

The paper describes a collection of smells and some refactor patterns that need to be applied to turn directories of source files into build level components. The rules mentioned in the paper are the following:
  1. Components with directory granularity
  2. Circular dependencies should be prevented
  3. Software building via standardized build interface
  4. Compile-time variability binding via standardized configuration interface
  5. Late binding of dependencies via require interface
  6. Build process definition per component
  7. Configuration process definition per component
  8. Component deployment with build level-packages
  9. Automated component composition

Software packages


As described in the previous sections, functional separation is a prerequisite to compose build level components. One important aspect of build-level components is that build processes of modules and components are separated. But how does build separation affect the overall deployment process (to which the build phase also belongs)?

Many deployment processes are typically carried out by tools called package managers. Package managers install units that are called software packages. According to the paper: "Package Upgrades in FOSS Distributions: Details and Challenges" written by Di Cosmo et al (HotSWUp 2008), a software package can be defined as follows:
Packages are abstractions defining the granularity at which users can act (add, remove, upgrade, etc.) on available software.

According to the paper a package is typically a bundle of 3 parts:

  • Set of files. Contains all kinds of files that must be copied somewhere to the host system to make the software work, such as scripts, binaries, resources etc.
  • Set of valued meta-information. Contains various kinds of meta attributes, such as the name of the package, the version, a description and its license. Most importantly, it contains information about the inter-package relationships, which includes a set of dependencies on other packages and a set of conflicts with other packages. Package managers typically install a package's required dependencies automatically and refuse to install it if a conflict has been encountered.
  • Executable configuration scripts (also known as maintainer scripts). These are basically scripts that imperatively "glue" files from the package to files already residing on the system. For example, after a certain package has been installed, some configuration files of the host system are automatically adapted so that it can be used properly.

Getting a software project packaged typically involves defining the meta data (including the dependencies/conflicts on external packages), bundling the build process (for source package managers) or the resulting build artifacts (for binary package managers), and composing maintainer scripts taking care of the remaining bits to make the package work (although I would personally not recommend using these kinds of scripts).

This process already works for big monolithic software projects. However, it has several drawbacks for these kinds of projects. Since it needs to deploy a big project as a whole, deployment is typically an expensive process. Not only does a fresh installation of a package take time, but so does upgrading, since it has to replace an existing installation as a whole instead of the affected areas only.

Moreover, upgrading is also quite dangerous. Many package managers typically replace and remove files belonging to a package that reside in global locations on the filesystem, such as /usr/bin, /usr/lib (on Linux) or C:\WINDOWS\SYSTEM32 (on Windows). If an upgrade process gets interrupted, the system might reach an inconsistent state for which it might be difficult (or impossible) to do a rollback. The bigger a project is the more severe the potential damage becomes.

Packaging smaller units of a software project (e.g. a build-level component) is typically more work, but also has great benefits. It allows certain, smaller pieces of a software project to be replaced separately, significantly increasing the efficiency and reliability of upgrades. Moreover, the dependencies of software components and build-level components have already been identified and only need to be translated to the corresponding packages that provide them.

Nix packages


I typically use the Nix package manager (and related tools) for deployment activities. It borrows concepts from purely functional programming languages to make deployment reliable, reproducible and efficient.

In what way do packages deployed by Nix conform to the definition of software package shown earlier?

Deployment in Nix is driven by build recipes (called Nix expressions) that build packages, including all their dependencies, from source. Every package build (indirectly) invokes the derivation {} function that composes an isolated environment in which builds are executed in such a way that only the declared dependencies can be found and nothing else can influence the build. The function arguments include package metadata, such as a description, license and maintainer, and the package dependencies.
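
For example, a Nix expression for an ordinary package roughly follows the structure below. This is a sketch with hypothetical names (mkDerivation is a convenience wrapper around the derivation {} function):

{stdenv, fetchurl, zlib}:

stdenv.mkDerivation {
  name = "example-1.0";
  src = fetchurl {
    url = http://example.org/example-1.0.tar.gz; # hypothetical location
    sha256 = "...";
  };
  buildInputs = [ zlib ]; # the declared dependencies; nothing else is visible
  meta = {
    description = "A hypothetical example package";
    license = "MIT";
  };
}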

References to dependencies in Nix are exact, meaning that they bind to specific builds of other Nix packages. Conventional package managers, software components and build-level components typically use nominal version specifications consisting of the names and version numbers of the packages, which are less strict. Mapping nominal dependencies to exact dependencies is not always trivial. For example, nominal version ranges are unsupported in Nix and must be snapshotted. The earlier blog post that describes how to deploy NPM packages with Nix has more details about this.

Another notable trait of Nix is that it has no notion of conflicts. In Nix, any package can coexist with another because they are all stored in isolated directories. However, declared conflicts may also indicate runtime conflicts between two packages. These kinds of issues need to be solved by other means.

Finally, Nix packages have no configuration (or maintainer) scripts, because they imperatively modify the system's state which conflicts with its underlying purely functional deployment model. Many things that configuration scripts typically do are accomplished in a different way if Nix is used for deployment. For example, configuration files are not adapted, but generated in a Nix expression and deployed as a Nix package. Service activation is typically done by generating a job description file (e.g. init script or systemd job) that starts and stops it.
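
For instance, rather than letting a maintainer script patch a configuration file on the target system, a configuration file can be generated as a Nix package of its own. A minimal sketch with hypothetical settings:

{writeTextFile}:

writeTextFile {
  name = "example.conf";
  text = ''
    # Hypothetical configuration settings for some service
    port = 8080
    logdir = /var/log/example
  '';
}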

NixOS configurations, Disnix configurations, Hydra jobsets


If something is packaged in a Nix expression you could easily broaden the application area of deployment:

  • With a few small modifications (mainly encapsulating several packages into a jobset), a Nix package can be turned into a Hydra jobset, so that a project can be integrated and tested continuously.
  • A package can be referenced from a NixOS module that, for example, automatically starts and stops a package on startup and shutdown. NixOS can be used to deploy entire system configurations from a single declarative specification in which the module is enabled.
  • A collection of NixOS configurations can also be deployed in a network of physical or virtual machines through NixOps.
  • A package can be turned into a service by adding a specification of inter-dependencies (services that may reside on other machines in a network). These services can be used to compose a Disnix configuration that deploys services to machines in a network.

Summary


I can summarize all the terms described in this blog post and the activities that need to be performed to implement them in the following chart:


Concluding remarks


In this blog post, I have described some terminology and potential mappings between them with the purpose of defining a reengineering process that makes deployment processes more manageable and efficient.

The terms and mappings used in this blog post are quite abstract. However, if we make a number of concrete technology choices, e.g. a programming language (Java), component technology (web services), package manager (Nix), we can define a more concrete process allowing someone to make considerable improvements.

Moreover, the terms described in this blog post are idealistic. In practice, most units that are called modules or components do not fully qualify themselves as such, while it is still possible to package and deploy them individually. Perhaps, it would also be useful to make "weaker" definitions of some of the terms described in this blog post and to look for their corresponding minimum requirements.

Finally, we can also look into more refactor/reengineering patterns for the other terms and possible automation of them.


Fourth annual blog reflection

Today it's exactly four years ago that I started this blog, so again it's an interesting opportunity to reflect over last year's writings.

Software deployment


As usual, the majority of blog posts written this year were software deployment related. In the mobile application area, I have developed a Nix function allowing someone to build Titanium apps for iOS and Android, I revised the Nix iOS build function to use the new simulator facilities of Xcode 6, did some nice tricks to get existing APKs deployed in the Android emulator, and I described an approach allowing someone to do wireless ad-hoc distributions of iOS apps with Hydra, the Nix-based continuous integration server.

A couple of other deployment blog posts were JavaScript related. I have extended NiJS with support for asynchronous package specifications, which can be used both for compilation to Nix expressions and for standalone execution by NiJS directly. I advertised the improved version as the NiJS package manager and successor of the Nix package manager on April Fools' Day. I received lots of hilarious comments that day! Some of them included thoughts and comments that I could not possibly think of!

The other JavaScript related deployment blog post was about my reengineering effort of npm2nix that generates Nix expressions from NPM package specifications. The original author/maintainer relinquished his maintainership, and I became a co-maintainer of it.

I also did some other deployment stuff such as investigating how Nix and Hydra builds can be backed up and describing how packages can be managed outside the Nixpkgs tree.

Finally, I have managed to get a more theoretical blog post finished earlier today in which I explore some terminology and mappings between them to improve software deployment processes.

IFF file format experiments


I also spent a bit of time on my fun project involving IFF file formats. I have ported the ILBM viewer and 8SVX player applications from SDL 1.2 to 2.0. I was a bit puzzled by one particular aspect -- namely: how to continuously render 8-bit palettized surfaces, so I have decided to write a blog post about it.

Another interesting thing I did is porting the project to Visual C++ so that it can be run on Windows natively. I wrote a blog post about a porting strategy and an improvement to the Nix build function that can be used to build Visual Studio projects.

Research


Although I have left academia there is still something interesting to report about research this year. In the past we have worked on a dynamic build analysis approach to discover license constraints (also covered in Chapter 10 of my PhD thesis). Unfortunately, all the paper submission attempts we did were rejected and eventually we gave up publishing it.

However, in April this year, one of our peers decided to give it another shot and got Shane McIntosh on board. Shane McIntosh and I put a considerable amount of effort into improving the paper, which we titled: "Tracing software build processes to uncover license compliance inconsistencies". We submitted the improved paper to ASE 2014. Good news is: the paper got accepted! I'm glad to find out that someone can show me that I can be wrong sometimes! :-)

Miscellaneous stuff


I also spent some time on reviving an old dormant project that helps me to consistently organise website layouts, because I had found some use for it, and on releasing it as free and open source software on GitHub.

Another blog post I'm very proud of is about structured asynchronous programming in JavaScript. From my experience with Node.js I observed that to make server applications work smoothly, you must "forget" about certain synchronous programming constructs and replace them by asynchronous alternatives. Besides the blog post, I also wrote a library implementing the abstractions.

Blog posts


As with my previous annual reflections, I will also publish the top 10 of my most frequently read blog posts:

  1. On Nix and GNU Guix. As with the previous two annual reflections, this blog post remains on top and will probably stay at that position for a long time.
  2. An alternative explanation of the Nix package manager. Also this blog post's position remains unchanged since the last two reflections.
  3. Composing FHS-compatible chroot environments with Nix (or deploying Steam in NixOS). This blog post has moved to the third position and that's probably because of the many ongoing discussions on the Internet about Nix and the FHS, and the discussion whether NixOS can run Steam.
  4. Using Nix while doing development. This post also gained a bit of more popularity since last year, but I have no idea why.
  5. Setting up a Hydra build cluster for continuous integration and testing (part 1). A blog post about Hydra from an end user perspective that still remains popular.
  6. Setting up a multi-user Nix installation on non-NixOS systems. This blog post is also over one year old and has entered the all time top 10. This clearly indicates that the instructions in the Nix manual are still unclear and this feature is wanted.
  7. Asynchronous programming with JavaScript. Another older blog post that got some exposure on some discussion sites and entered the all time top 10 as a consequence.
  8. Second computer. Still shows that the good ol' Amiga remains popular! This blog post has been in the all-time top 10 since the first annual blog reflection.
  9. Yet another blog post about Object Oriented Programming and JavaScript. Yet another older blog post that was suddenly referenced by a Stackoverflow article. As a consequence, it entered the all time top 10.
  10. Wireless ad-hoc distributions of iOS applications with Hydra. This is the only blog article I wrote this year that ended up in the all-time top 10. Why it is so popular is a mystery to me. :-)

Conclusion


I'm still not out of ideas and there will be more stuff to report about next year, so stay tuned! The remaining thing I'd like to say is:

HAPPY NEW YEAR!!!

Agile software development: my experiences

In a couple of older blog posts, I've reported about my experiences with companies, such as the people I talked to at the LAC conference and my job searching month. One of the things that I have noticed is that nearly all of them were doing "Agile software development", or at least they claim to do so.


At the LAC conference, Agile seemed to be one of the hottest buzzwords and every company had its own success story in which they explained how much Agile methodologies have improved their business and the quality of the systems that they deliver.

The most popular Agile software development methodology nowadays is probably Scrum. All the companies that I visited in my job searching month claimed that they had implemented it in their organisation. In fact, I haven't seen any company recently that is intentionally not using Scrum or any other Agile software development methodology.

Although many companies claim to be Agile, I still have the impression that the quality of software systems and the ability to deliver software in time and within the budget haven't improved that much in general, although there are some exceptions, of course.

What is Agile?


I'm not an expert in Agile software development. One of the first things I wanted to discover is what "Agile" actually means. My feeling says that only a few people have an exact idea, especially non-native English speakers, such as people living in my country -- the Netherlands. To me, it looks like most of the developers with a Dutch mother tongue use this buzzword as if it's something as common as ordering a hamburger in a restaurant without consciously thinking about its meaning.

According to the Merriam Webster dictionary, Agile means:
1: marked by ready ability to move with quick easy grace <an agile dancer>
2: having a quick resourceful and adaptable character <an agile mind>

The above definition is a bit abstract, but contains a number of interesting keywords. To me it looks like if some person or object is Agile, then it has a combination of the following characteristics: quick, easy, resourceful, and adaptable.

Why do we want/need to be Agile in software development?


It's generally known that many software development projects partially or completely fail for many reasons, such as:
  • The resulting system is not delivered in time or cannot be delivered at all.
  • The resulting system does not do what the customer expects, a.k.a. a mismatch of expectations. Apart from customers, this also happens internally in a development team -- developers may implement something totally different from what a designer has intended.
  • There is a significant lack of quality, such as in performance or security.
  • The overall project costs (way) too much.

These issues are caused by many factors, such as:

  • Wrong estimations. It is difficult (or sometimes impossible) to estimate how much time something will take to implement. For example, I have encountered a few cases in my past career in which something took double or even ten times the amount of time that was originally estimated.
  • Lack of clarity. Sometimes a developer thinks he has a good understanding of what a particular feature should look like, but after implementing it, it turns out that many requirements were incorrectly interpreted (or sometimes even overlooked), requiring many revisions and extra development time.
  • Interaction problems among team members. For example, one particular developer cannot complete his task because of a dependency on another developer's task which has not been completed yet. Also, there could be a mismatch of expectations among team members. For example, a missing feature that has been overlooked by one developer blocks another developer.
  • Changing requirements/conditions. In highly competitive environments, it may be possible that a competitor implements missing features that a certain class of customers want, making it impossible to sell your product. Another example could be Apple changing its submission requirements for the Apple App Store, making it impossible to distribute an application to iPhone/iPad users unless the new requirements have been met.
  • Unpredictable incidents and problems. For example, a team member gets sick and is unavailable for a while. The weather conditions are so bad (e.g. lots of snowfall) that people can't make it to the office. A production server breaks down and needs to be replaced by a new instance forcing the organisation to invest money to buy a new one and time to get it configured.
  • Lack of resources. There is not enough manpower to do the job. Specialized knowledge is missing. A server application requires much more system resources than expected, e.g. more RAM, more disk space etc.

Ideally, in a software development project, these problems should be prevented. However, since this ideal is hard to achieve, it is also highly desirable to be able to respond to them as quickly as possible without too much effort, to prevent the corresponding problems from growing out of hand. That is why being Agile (e.g. quick, easy, resourceful, and adaptable) in software development is often not only wanted, but also necessary, in my opinion.

Agile manifesto


The "definition of Agile" has been "translated" to software development by a group of practitioners, into something that is known as the Agile manifesto. This manifesto states the following:

We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.

The Agile manifesto looks very interesting, but when I compare it to the definition of Agile provided by the Merriam-Webster dictionary, I don't see any of its characterizing keywords (such as adaptable and easy) in the text at all, which looks quite funny to me. The only piece that has some kind of connection is "Responding to change" (which relates to adaptable), but that is pretty much all it has in common with the dictionary definition.

Interpretation


This observation makes me wonder: How is the Agile manifesto going to help us to become more agile in software development and more importantly, how should we interpret it?

Because it states that the items on the left have more value than the items on the right, I have seen many people considering the right items not to be relevant at all. As a consequence, I have seen the following things happen in practice:

  • Not thinking about a process. For example, in one of my past projects, it was common to create Git branches for all kinds of loosely related tasks in an unstructured manner, because that integrated nicely with the issue tracker system. Furthermore, merging was also done at unpredictable moments. As a consequence, it often came with painful merge conflicts that were hard to resolve, making the integration process tedious and much more time consuming than necessary.
  • Not documenting anything at all. This is not about writing down every detail per se, but rather about documenting a system from a high level perspective to make the basics clear to everyone involved in a development process, such as a requirements document.

    I have been involved in quite a few projects in which we just started implementing something without writing anything down at all and "trusted" that it would eventually turn out right. So far, it has always taken us many more iterations than if most of the simple, basic details had been clear from the beginning. For example, some basic domain knowledge that may sound obvious, may turn out not to be that obvious at all.
  • Not having an agreement with the customer. Some of the companies I worked for did integration with third party (e.g. customer's) systems. What, for example, if you are developing the front-end and some error occurs because of a bug in the customer's system? Who's going to get blamed? Typically, it's you unless you can prove otherwise. Moreover, unclear communication may also result in wrong expectations typically extending the development time.
  • Not having a plan at all. Of course being flexible with regard to changes is good, but sometimes you also have to stick yourself to something, because it might completely alter the current project's objectives otherwise. A right balance must be found between these two, or you might end up in a situation like this. It also happened to me a few times when I was developing web sites as a teenager.

From my perspective, the Agile manifesto does not say that the emphasis should lie on the left items only. In fact, I think the right items are also still important. However, in situations where things are unclear or when pressure arises, then the item on the left should take precedence. I'm not sure if this is something the authors of the manifesto have intended to communicate though.

For example, while developing a certain aspect of a system, it would still make sense to me to write its corresponding requirements down so that everyone involved knows about them. However, writing every possible detail down often does not make sense, because such details are typically not known yet or subject to change anyway. In these kinds of situations, it would be better to proceed working on an implementation, validate that with the stakeholders and refine the requirements later.

The same, in my opinion, applies to customer collaboration. An agreement should be made, but of course, there are always unforeseen things that both parties did not know of. In such situations it is good to be flexible, but it should not come at any price.

Why agile?


What is exactly Agile about finding a right balance between these items? I think in ideal situations, having a formalized process that exactly describes how to work, documentation that captures everything, a solid contract that does not have to be changed and a plan that you know will work is the quickest and easiest path to get software implemented.

However, since unpredictable and unforeseen things always happen, these might get in your way and you have to be flexible. In such cases, you must be adaptable by giving the items on the left precedence. I don't see, however, what's resourceful about all of this. :-)

So is this manifesto covering enough to consider software development "Agile" if it is done properly? Not everybody agrees! For example, there is also the More Agile Manifesto that covers organisations, not teams. Kent Beck, one of the signatories of the Agile manifesto, wrote an evolved version. Zed Shaw considers it all to be nonsense and simply says that people should do programming and nothing should get in their way.

I'm not really a strong believer in anything. I want to focus on facts, rather than on idealism.

Scrum


As I have explained earlier, nearly all the companies that I visited during my job searching month as well as my current employer have implemented (or claim to have implemented) Scrum in their organisation. According to the Scrum guide, Scrum is actually not a methodology, but rather a process framework.

In a process implementing Scrum, development is iterative and divided into so-called sprints (that typically take 2-4 weeks). At the end of each sprint an increment is delivered that is considered "done". Each sprint has the following activities:

  • The Sprint planning is held at the beginning of each sprint in which the team discusses how and when to implement certain items from the product backlog.
  • Daily scrum is a short meeting held at the beginning of every development day in which team members briefly discuss the progress made the day before, the work that needs to be done in the next 24 hours and any potential problems.
  • The Sprint review activity is held at the end of the sprint in which the team and stakeholders review and reflect on what is "done" and the increment is demonstrated. Furthermore, future goals are set during this meeting.
  • Finally, the Sprint retrospective meeting is held in which team members discuss what can be improved with regards to people, relationships, process, and tools in future sprints.

In a Scrum process, two kinds of "lists" are used. The product backlog contains a list of items that need to be implemented to complete the product. The sprint backlog contains a list of items reflecting the work that needs to be done to deliver the increment.

Teams typically consist of 3-9 persons. Scrum only defines three kinds of team member roles, namely the product owner (responsible for maintaining and validating the product backlog), the Scrum master (who guards the process and takes away anything that blocks developers) and developers.

The Scrum guide makes no distinction between specific developer roles, because (ideally) every team member should be able to take over each other's work if needed. Moreover, teams are self-organizing meaning that it's up to the developers themselves (and nobody else) to decide who does what and how things are done.

Why agile?


I have encountered quite a few people saying "Hey, we're doing Scrum in our company, so we're Agile!", because they appear to have some sort of a process reflecting the above listed traits. This makes me wonder: How is Scrum going to help and what is so agile about it?

In my opinion, most of its aspects facilitate transparency (such as the four activities) to prevent certain things from going wrong or too much time from being wasted because of misunderstandings. It also facilitates reflection with the purpose to adapt and optimize the development process in future sprints.

Since Scrum only loosely defines a process, the activities defined by it (sort of) make sense to me, but it also deliberately leaves some things open. As I have mentioned earlier, a completely predictable process would be the quickest and easiest way to do software development, but since that ideal is hard to achieve because of unpredictable/unforeseen events, we need some flexibility too. We must find a balance, and that is (sort of) what Scrum does by providing a framework that still gives an adopter some degree of freedom.

A few things that came into my mind with regards to a process implementation are:

  • How to specify "items" on the product and sprint backlogs? Although the Scrum guide does not say anything on how to do this, I have seen many people using a so-called "user-story format" in which they describe items in a formalism like "As a <user role> I want to <do some activity / see something etc.>".

    From my point of view, a user story (sort of) reflects a functional or non-functional requirement or a combination of both. However, it is typically only an abstract definition of something that might not cover all relevant details. Moreover, it can also be easily misinterpreted.

    Some people have told me that writing more formal requirements (e.g. by adhering to a standard, such as the IEEE 830-1998 standard for software requirement specifications) is way too formal, too time consuming and "unagile".

    IMHO, I think it really depends on the context. In some projects, the resulting product has only simple requirements (that do not even have to be met fully) and in others more difficult ones. In the latter case, I think it pays off to think about requirements more thoroughly than having to revise the product many times. Of course, a right balance must be found between specifying and implementing.
  • When is something considered "done"? I believe this is one of the biggest ongoing discussions within the Scrum community, because the Scrum guide intentionally leaves the meaning of this definition open to the implementer.

    Some questions that I sometimes think about are: Should something be demonstrated to stakeholders? Should it also be tested thoroughly (e.g. all automated test cases must pass and the coverage should be acceptable)? Can we simply run a prototype on a development machine or does the increment have to be deployed to a production environment?

    All these questions cannot be uniformly answered. If the sprint goal is a prototype, then the meaning of this definition is probably different than for a mission-critical product. Furthermore, accomplishing all the corresponding tasks to consider something done might be more complicated than expected, e.g. software deployment is often a more difficult problem than people think.
  • How to effectively divide work among team members and how to compose teams? If, for example, people have to work on a huge monolithic code base, then it is typically difficult to compose two teams working on it simultaneously, because they might apply conflicting changes that slow things down and may break the system. This could also happen between individual team members. To counter this, modularization of a big codebase helps, but accomplishing this is far from trivial.
  • According to the Scrum guide, each developer is considered equal, but how can we ensure that one developer is capable of taking over another developer's work? That person needs to have the right skills and domain specific knowledge. For the latter aspect it is also important to have something documented, I guess.
  • How to respond to unpredictable events during a sprint? Should it be cancelled? Should the scope be altered?

In practice, I have not seen that many people consciously thinking about the implementation of certain aspects in a Scrum process at all. They are either too concerned with the measurable aspects of a process (e.g. does the burndown chart, which reflects the amount of work remaining, look OK?), or with the tools that are being used (e.g. should we add another user story?).

IMHO, Scrum solves and facilitates certain things that help you to be Agile. But whether you are actually Agile is a much broader and more difficult question to answer. Moreover, this question also needs to be continuously evaluated.

Conclusion


In this blog post, I have written about my experiences with Agile software development. I'm by no means an expert or a believer in any Agile methodology.

In my opinion, what being Agile actually means and how to accomplish it is a difficult question to answer and must be continuously evaluated. There is no catch-all solution.

References


I gained most of my inspiration for this blog post from the video log of my former colleague Rini van Solingen, named "Groeten uit Delft", which covers many Scrum and Agile related topics. I used to work for the same research group (SERG) at Delft University of Technology.

A sales pitch explanation of NixOS

Exactly one week ago, I visited FOSDEM for the seventh time. In this year's edition, we had a NixOS stand to promote NixOS and its related sub projects. Promoting NixOS is a bit of a challenge, because properly explaining its underlying concepts (the Nix package manager) and their benefits is often not that straightforward.


Explaining Nix


Earlier, I have written two recipes explaining the Nix package manager, each having its pros and cons. The first recipe basically explains Nix from a system administrator's perspective -- it starts by explaining what the disadvantages of conventional approaches are and then what Nix does differently: namely storing packages in isolation in separate directories in the Nix store, using hash codes as prefixes. Usually when I show this to people, there is always a justification process involved, because these hash codes look weird and counter-intuitive. Sometimes it still works out despite the confusion, sometimes it does not.
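To give an impression of what this looks like on disk: every package resides in its own isolated directory whose name is prefixed by a hash of all its build inputs, so that, for example, two versions of the same package can safely coexist. The paths below are purely illustrative -- the hashes are placeholders, not real ones:

/nix/store/<hash1>-firefox-27.0
/nix/store/<hash2>-firefox-26.0
/nix/store/<hash3>-glibc-2.18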

The other recipe explains Nix from a programming language perspective, since Nix borrows its underlying concepts from purely functional programming languages. In this explanation recipe, I first explain in what way purely functional programming languages differ from conventional programming languages. Then I draw an analogy to package managers. I find this the better explanation recipe, because the means used to make Nix purely functional (e.g. using hash codes) make sense in this context. The only drawback is that a large subset of the people using package managers are not programmers and typically do not understand nor appreciate the programming language aspect.

To summarize: advertising the Nix concepts is tough. While I was at the NixOS stand, I had to convince people passing by in just a few minutes that it is worth giving NixOS (or any of its sub projects) a try. In the following section, I will transcribe my "sales pitch explanation" of NixOS.

The pitch


NixOS is a Linux distribution built around the Nix package manager solving package and configuration management problems in its own unique way. When installing systems running Linux distributions by conventional means, it is common to do activities, such as installing the distribution itself, then installing additional custom packages, modifying configuration files and so on, which is often a tedious, time consuming and error prone process.

In NixOS the deployment of an entire system is fully automated. Deployment is driven by a declarative configuration file capturing the desired properties of a system, such as the hard drive partitions, services to run (such as OpenSSH or the Apache webserver), the desktop (e.g. KDE, GNOME or Xfce) and end-user packages (e.g. Emacs and Mozilla Firefox). With just one single command-line instruction, an entire system configuration can be deployed. By adapting the declarative configuration file and running the same command-line instruction again, an existing configuration can be upgraded.
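To give an impression of such a declarative configuration file (typically /etc/nixos/configuration.nix), the following is a minimal sketch -- the selection of services and packages is purely illustrative and the exact option names may differ between NixOS versions:

{ pkgs, ... }:

{
  boot.loader.grub.device = "/dev/sda";

  fileSystems = [
    { mountPoint = "/";
      label = "root";
    }
  ];

  services.openssh.enable = true;
  services.httpd.enable = true;
  services.httpd.adminAddr = "admin@example.com";

  environment.systemPackages = [
    pkgs.emacs
    pkgs.firefox
  ];
}

Deploying or upgrading a machine with such a configuration then boils down to running the single command-line instruction: nixos-rebuild switch.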

NixOS has a couple of other nice properties as well. Upgrading is always safe, so there is no reason to be worried that an interruption will break a system. Moreover, older system configurations are retained by default, and if an upgrade, for example, makes a system unbootable, you can always switch back to any available older configuration. Also, configurations can be reproduced on any system by simply providing the declarative configuration file to someone else.
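For example, rolling back to the previous configuration is a single command-line instruction (older generations also remain selectable from the boot menu). A brief sketch of the relevant commands:

$ nix-env -p /nix/var/nix/profiles/system --list-generations
$ nixos-rebuild switch --rollback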

Several tools in the Nix project extend this deployment approach to other areas: NixOps can be used to deploy a network of NixOS machines in the cloud, Hydra is the Nix-based continuous integration server, and Disnix deploys services into a network of machines. Furthermore, the Nix package manager -- that serves as the basis for all of these tools -- can also be used on any Linux distribution and a few other operating systems as well, such as Mac OS X.

Concluding remarks


The above pitch does not reveal much about the technical aspects, but simply focuses on the key aspects -- fully automated deployment and a number of powerful quality properties. This often leads to more questions from people passing by, but I consider that a good thing.

This year's FOSDEM was a very nice experience. I'd like to thank all the fellow Nixers who did all the organisation work for the stand. As a matter of fact, apart from doing some promotion work at the stand I was not involved in any of its organizational aspects. Besides having a stand to promote our project, Nicolas Pierron gave a talk about NixOS in the distributions devroom. I also enjoyed Larry Wall's talk about Perl 6 very much:


I'm looking forward to see what next year's FOSDEM will bring us!

On NixOps, Disnix, service deployment and infrastructure deployment

I have written many software deployment related blog posts covering tools that are part of the Nix project. However, there is one tool that I have not elaborated on so far, namely: NixOps, which has become quite popular these days.

Although NixOps is quite popular, its availability also leads to a bit of confusion with another tool. Some Nix users, in particular newbies, are suffering from this. The purpose of this blog post is to clear up that confusion.

NixOps


NixOps is something that is advertised as the "NixOS-based cloud deployment tool". It basically expands NixOS' (a Linux distribution built around the Nix package manager) approach of deploying a complete system configuration from a declarative specification to networks of machines and instantiates and provisions the required machines (e.g. in an IaaS cloud environment, such as Amazon EC2) automatically if requested.

A NixOps deployment process is driven by one or more network models that encapsulate multiple (partial) NixOS configurations. In a standard NixOps workflow, network models are typically split into a logical network model capturing settings that are machine independent and a physical network model capturing machine specific properties.

For example, the following code fragment is a logical network model consisting of three machines capturing the configuration properties of a Trac deployment, a web-based issue tracker system:

{
  network.description = "Trac deployment";

  storage =
    { pkgs, ... }:

    { services.nfs.server.enable = true;
      services.nfs.server.exports = ''
        /repos 192.168.1.0/255.255.255.0(rw,no_root_squash)
      '';
      services.nfs.server.createMountPoints = true;
    };

  postgresql =
    { pkgs, ... }:

    { services.postgresql.enable = true;
      services.postgresql.package = pkgs.postgresql;
      services.postgresql.enableTCPIP = true;
      services.postgresql.authentication = ''
        local all all trust
        host all all 127.0.0.1/32 trust
        host all all ::1/128 trust
        host all all 192.168.1.0/24 trust
      '';
    };

  webserver =
    { pkgs, ... }:

    { fileSystems = [
        { mountPoint = "/repos";
          device = "storage:/repos";
          fsType = "nfs";
        }
      ];
      services.httpd.enable = true;
      services.httpd.adminAddr = "root@localhost";
      services.httpd.extraSubservices = [
        { serviceType = "trac"; }
      ];
      environment.systemPackages = [
        pkgs.pythonPackages.trac
        pkgs.subversion
      ];
    };
}

The three machines in the above example have the following purpose:

  • The first machine, named storage, is responsible for storing Subversion source code repositories in the folder /repos and makes the corresponding folder available as an NFS mount.
  • The second machine, named postgresql, runs a PostgreSQL database server storing the tickets.
  • The third machine, named webserver, runs the Apache HTTP server hosting the Trac web application front-end. Moreover, it mounts the /repos folder as a network file system connecting to the storage machine so that the Trac web application can view the Subversion repositories stored inside it.

The above specification can be considered a logical network model, because it captures the configuration we want to deploy, without any machine specific characteristics. Regardless of what kind of machines we intend to deploy to, we want these services to be available.

However, a NixOS configuration cannot be deployed without any machine specific settings. These remaining settings can be specified by writing a second model, the physical network model, capturing them:

{
  storage =
    { pkgs, ... }:

    { boot.loader.grub.version = 2;
      boot.loader.grub.device = "/dev/sda";

      fileSystems = [
        { mountPoint = "/";
          label = "root";
        }
      ];

      swapDevices = [
        { label = "swap"; }
      ];

      networking.hostName = "storage";
    };

  postgresql = ...

  webserver = ...
}

The above partial network model specifies the following physical characteristics for the storage machine:

  • GRUB version 2 should be used as the bootloader and should be installed on the MBR of the hard drive: /dev/sda.
  • The hard drive partition with label: root should be mounted as root partition.
  • The hard drive partition with label: swap should be mounted as swap partition.
  • The hostname of the system should be: 'storage'

By invoking NixOps with the two network models shown earlier as parameters, we can create a NixOps deployment -- an environment containing a set of machines that belong together:

$ nixops create ./network-logical.nix ./network-physical.nix -d test

The above command creates a deployment named: test. We can run the following command to actually deploy the system configurations:

$ nixops deploy -d test

The above command invokes the Nix package manager to build all the machine configurations, then transfers their corresponding package closures to the target machines and finally activates the NixOS configurations. The end result is a collection of machines running the new configuration, provided all previous steps have succeeded.
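Conceptually, the work performed for a single machine is roughly comparable to the following manual steps -- a simplified sketch under the assumption that machine.nix is a complete NixOS configuration for the storage machine; NixOps itself takes care of many more details, such as state bookkeeping and SSH key management:

$ nix-build '<nixpkgs/nixos>' -A system -I nixos-config=./machine.nix
$ nix-copy-closure --to root@storage $(readlink -f ./result)
$ ssh root@storage $(readlink -f ./result)/bin/switch-to-configuration switch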

If we adapt any of the network models, and run the deploy command again, the system is upgraded. In case of an upgrade, only the packages that have been changed are built and transferred, making this phase as efficient as possible.

We can also replace the physical network model shown earlier with the following model:

{
  storage = {
    deployment.targetEnv = "virtualbox";
    deployment.virtualbox.memorySize = 1024;
  };

  postgresql = ...

  webserver = ...
}

The above physical network configuration states that the storage machine is a VirtualBox Virtual Machine (VM) requiring 1024 MiB of RAM.

When we instantiate a new deployment with the above physical network model and deploy it:

$ nixops create ./network-logical.nix ./network-vbox.nix -d vbox
$ nixops deploy -d vbox

NixOps does an extra step before doing the actual deployment of the system configurations -- it first instantiates the VMs by consulting VirtualBox and populates them with a basic NixOS disk image.

Similarly, we can also create a physical network model like this:

let
  region = "us-east-1";
  accessKeyId = "ABCD..."; # symbolic name looked up in ~/.ec2-keys

  ec2 =
    { resources, ... }:

    { deployment.targetEnv = "ec2";
      deployment.ec2.accessKeyId = accessKeyId;
      deployment.ec2.region = region;
      deployment.ec2.instanceType = "m1.medium";
      deployment.ec2.keyPair = resources.ec2KeyPairs.my-key-pair;
      deployment.ec2.securityGroups = [ "my-security-group" ];
    };
in
{
  storage = ec2;

  postgresql = ec2;

  webserver = ec2;

  resources.ec2KeyPairs.my-key-pair = {
    inherit region accessKeyId;
  };
}

The above physical network configuration states that the storage machine is a virtual machine residing in the Amazon EC2 cloud.

Running the following commands:

$ nixops create ./network-logical.nix ./network-ec2.nix -d ec2
$ nixops deploy -d ec2

automatically instantiates the virtual machines in EC2, populates them with basic NixOS AMI images and finally deploys the machines to run our desired Trac deployment.

(In order to make EC2 deployment work, you need to create the security group (e.g. my-security-group) through the Amazon EC2 console first and you must set the AWS_SECRET_ACCESS_KEY environment variable to contain the secret access key that you actually need to connect to the Amazon services).

Besides physical machines, VirtualBox, and Amazon EC2, NixOps also supports the Google Compute Engine (GCE) and Hetzner. Moreover, preliminary Azure support is also available in the development version of NixOps.

With NixOps you can also do multi-cloud deployment -- it is not required to deploy all VMs in the same IaaS environment. For example, you could also deploy the first machine to Amazon EC2, the second to Hetzner and the third to a physical machine.

In addition to deploying system configurations, NixOps can be used to perform many other kinds of system administration tasks that work on machine level.
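A few examples of such operations, taken from NixOps' command-line interface (the exact set of subcommands may differ per NixOps version):

$ nixops info -d test
$ nixops check -d test
$ nixops ssh -d test storage
$ nixops destroy -d test

The first two commands show and check the state of the machines in the test deployment, the third opens an interactive shell on the storage machine, and the last one deprovisions the machines (e.g. terminating the EC2 instances).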

Disnix


Readers who happen to know me a bit, may probably notice that many NixOps features are quite similar to things I did in the past -- while I was working for Delft University of Technology as a PhD student, I was investigating distributed software deployment techniques and developed a tool named: Disnix that also performs distributed deployment tasks using the Nix package manager as underlying technology.

I have received quite a few questions from people asking me things such as: "What is the difference between Disnix and NixOps?", "Is NixOps the successor of/a replacement for Disnix?", "Are Disnix and NixOps in competition with each other?"

The short answer is: while both tools perform distributed deployment tasks and use the Nix package manager as underlying (local) deployment technology, they are designed for different purposes and address different concerns. Furthermore, they can also be effectively used together to automate deployment processes for certain kinds of systems.

In the next sections I will try to clarify the differences and explain how they can be used together.

Infrastructure and service deployment


NixOps does something that I call infrastructure deployment -- it manages configurations that work on machine level and deploys entire system configurations as a whole.

What Disnix does is service deployment -- Disnix is primarily designed for deploying service-oriented systems. What "service-oriented system" is exactly supposed to mean has always been an open debate, but a definition I have seen in the literature is "systems composed of platform-independent entities that can be loosely coupled and automatically discovered" etc.

Disnix expects a system to be decomposed into distributable units (called services in Disnix terminology) that can be built and deployed independently to machines in a network. These components can be web services (systems composed of web services typically qualify themselves as service-oriented systems), but this is not a strict requirement. Services in a Disnix-context can be any unit that can be deployed independently, such as web services, UNIX processes, web applications, and databases. Even entire NixOS configurations can be considered a "service" by Disnix, since they are also a unit of deployment, although they are quite big.

Whereas NixOps builds, transfers and activates entire Linux system configurations, Disnix builds, transfers and activates individual services on machines in a network and manages/controls them and their dependencies individually. Moreover, the target machines to which Disnix deploys are neither required to run NixOS nor Linux. They can run any operating system and system distribution capable of running the Nix package manager.

Being able to deploy services to heterogeneous networks of machines is useful for service-oriented systems. Although services might manifest themselves as platform independent entities (e.g. because of their interfaces), they still have an underlying implementation that might be bound to technology that only works on a certain operating system. Furthermore, you might also want to experiment with the portability of certain services among operating systems, or effectively use a heterogeneous network of operating systems by exploiting their unique selling points for the appropriate services (e.g. a Linux, Windows, OpenBSD hybrid).

For example, although I have mainly used Disnix to deploy services to Linux machines, I also once did an experiment with deploying services implemented with .NET technology to Windows machines running Windows specific system services (e.g. IIS and SQL server), because our industry partner in our research project was interested in this.

To be able to do service deployment with Disnix one important requirement must be met -- Disnix expects preconfigured machines to be present running a so-called "Disnix service" providing remote access to deployment operations and some other system services. For example, to allow Disnix to deploy Java web applications, a predeployed Servlet container, such as Apache Tomcat, must already be present on the target machine. Also other container services, such as a DBMS, may be required.

Disnix does not automate the deployment of machine configurations (that include the Disnix service and containers), requiring people to deploy a network of machines by other means first and to write an infrastructure model that reflects the machine configurations accordingly.
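To give an impression, an infrastructure model is a Nix expression describing these preconfigured machines and the container services they provide. The following is a hypothetical sketch -- the machine names and properties are made up for illustration and depend on what is actually deployed on each machine:

{
  test1 = {
    hostname = "test1.example.org";
    tomcatPort = 8080;
  };

  test2 = {
    hostname = "test2.example.org";
    mysqlPort = 3306;
    mysqlUsername = "root";
    mysqlPassword = "secret";
  };
}

Together with a services model and a distribution model (mapping services to machines), a deployment can then be carried out by running: disnix-env -s services.nix -i infrastructure.nix -d distribution.nix.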

Combining service and infrastructure deployment


To be able to deploy a service-oriented system into a network of machines using Disnix, we must first deploy a collection of machines running the required system services. In other words: infrastructure deployment is a prerequisite for doing service deployment.

Currently, there are two Disnix extensions that can be used to integrate service deployment and infrastructure deployment:

  • DisnixOS is an extension that complements Disnix with NixOS' deployment features to do infrastructure deployment. With this extension you can do tasks such as deploying a network of machines with NixOps first and then do service deployment inside the deployed network with Disnix.

    Moreover, with DisnixOS you can also spawn a network of NixOS VMs using the NixOS test driver and run automated tests inside them.

    A major difference from a user perspective between Disnix and DisnixOS is that the latter works with network models (i.e. networked NixOS configurations used by NixOps and the NixOS test driver) instead of infrastructure models and does the conversion between these models automatically.

    A drawback of DisnixOS is that service deployment is effectively tied to NixOS, which is a Linux distribution. DisnixOS is not very helpful if a service-oriented system must be deployed in a heterogeneous network running multiple kinds of operating systems.
  • Dynamic Disnix. With this extension, each machine in the network is supposed to publish its configuration and a discovery service running on the coordinator machine generates an infrastructure model from the supplied settings. For each event in the network, e.g. a crashing machine, a newly added machine or a machine upgrade, a new infrastructure model is generated that can be used to trigger a redeployment.

    The Dynamic Disnix approach is more powerful and not tied to NixOS specifically. Any infrastructure deployment approach (e.g. Norton Ghost for Windows machines) that includes the deployment of the Disnix service and container services can be used. Unfortunately, the Dynamic Disnix framework is still a rough prototype and needs to become more mature.

Is service deployment useful?


Some people have asked me: "Is service deployment really needed?", since it is also possible to deploy services as part of a machine's configuration.

In my opinion it depends on the kinds of systems that you want to deploy and what problems you want to solve. For some kinds of distributed systems, Disnix is not really helpful. For example, if you want to deploy a cluster of DBMSes that are specifically tuned for the underlying hardware, you cannot really make a decomposition into "distributable units" that can be deployed independently. The same applies to filesystem services, as shown in the Trac example -- doing an NFS mount is a deployment operation, but not really an independent unit of deployment.

As a sidenote: that does not imply that you cannot do such things with Disnix. With Disnix you could still encapsulate an entire (or partial) machine specific configuration as a service and deploy that, or do a network mount by deploying a script that performs the mount, but that defeats its purpose.

At the same time, service-oriented systems can also be deployed on infrastructure level, but this sometimes leads to a number of inconveniences and issues. Let me illustrate that by giving an example:


The above picture reflects the architecture of one of the toy systems (Staff Tracker Java version) I have created for Disnix for demonstration purposes. The architecture consists of three layers:

  • Each component in the upper layer is a MySQL database storing certain kinds of information.
  • The middle layer encapsulates web services (implemented as Java web applications) responsible for retrieving and modifying data stored in the databases. An exception is the GeolocationService, which retrieves data by other means.
  • The bottom layer contains a Java web application that is consulted by end-users.

Each component in the above architecture is a distributable service and each arrow denotes a dependency relationship between components, which manifests itself as an HTTP or TCP connection. Because components are distributable, we could, for example, deploy all of them to one single machine, but we can also run each of them on a separate machine.

If we want to deploy the example system on infrastructure level, we may end up composing a networked machine configuration that looks as follows:


The above picture shows a deployment scenario in which the services are divided over two machines in a network:

  • The MySQL databases are hosted inside a MySQL DBMS running on the first machine.
  • The web application front-end and one of the web services granting access to the databases are deployed inside the Apache Tomcat Servlet container running on the first machine.
  • The remaining web services are deployed inside an Apache Tomcat container running on the second machine.

When we use NixOps to deploy the above machine configurations, the entire machine configurations are deployed and activated as a whole, which has a number of implications. For example, the containers have indirectly become dependent on each other, as can be seen in the picture below in which I have translated the dependencies from service level to container level:


In principle, Apache Tomcat does not depend on MySQL, so under normal circumstances, these containers can be activated in any order. However, because we host a Java web application that requires a database, suddenly the order in which these services are activated does matter. If we activate them in the wrong order, then the web service (and also indirectly the web application front-end) will not work. (In extreme cases: if a system has been poorly developed, it may even crash and need to be manually reactivated!)

Moreover, there is another implication -- the web application front-end also depends on services that are deployed to the second machine, and two of these services require access to databases deployed to the first machine. On container level, you can clearly see that this situation leads to two machines having a cyclic dependency on each other. That means that you cannot solve activation problems by translating service-level dependencies to machine-level dependencies.

As a matter of fact: NixOps allows cyclic dependencies between machines and activates their configurations in arbitrary order, and is thus incapable of dealing with temporary or permanent inconsistency issues (because of broken dependencies) while deploying a system as shown in the example.

Another practical issue with deploying such a system on infrastructure level is that it is tedious to do redeployment, for example when a requirement changes. You need to adapt machine configurations as a whole -- you cannot easily specify a different distribution scenario for services to machines.
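With Disnix, by contrast, changing the distribution of services to machines mostly amounts to editing the distribution model and redeploying. A hypothetical fragment for the example system could look as follows -- apart from GeolocationService and the StaffTracker front-end, the service names are made up for illustration:

{infrastructure}:

{
  staff = [ infrastructure.test1 ];
  StaffService = [ infrastructure.test2 ];
  GeolocationService = [ infrastructure.test2 ];
  StaffTracker = [ infrastructure.test1 ];
}

Moving a service to another machine is then simply a matter of changing its target reference in this model and running the deployment again.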

As a final note, in some organisations (including a company that I have worked for in the past) it is common practice that infrastructure and service deployment are separated. For example, one department is responsible for providing machines and system services, while another department (typically a development group) is responsible for building the services and deploying them to the provided machines.

Conclusion


In this blog post, I have described NixOps and elaborated on the differences between NixOps and Disnix -- the former tool does infrastructure deployment, while the latter does service deployment. Infrastructure deployment is a prerequisite of doing service deployment and both tools can actually be combined to automate both concerns.

Service deployment is particularly useful for distributed systems that can be decomposed into "distributable units" (such as service-oriented systems), but not all kinds of distributed systems.

Moreover, NixOps is a tool that has been specifically designed to deploy NixOS configurations, while Disnix can deploy services to machines running any operating system capable of running the Nix package manager.

Finally, I have been trying to answer three questions, which I mentioned somewhere in the middle of this blog post. There is another question I have intentionally avoided that obviously needs an answer as well! I will elaborate more on this in the next blog post.

References


More information about Disnix, service and infrastructure deployment, and in particular: integrating deployment concerns can be found in my PhD thesis.

Interestingly enough, during my PhD thesis defence there was also a question about the difference between service and infrastructure deployment. This blog post is a more elaborate version of the answer I gave earlier. :-)

Improving the testability of the Nix iOS build facilities

In the previous blog post, I have described some new features of the Nix Android build environment. Apart from Android, the Nix iOS build environment has a few improvements as well, which are mostly related to testing.

Simulating multiple iOS versions


In the previous blog post about the Nix iOS build function, I have described how to install various iPhone simulator SDK versions by picking the preferences option from the 'Xcode' menu bar and selecting the 'Downloads' tab in the window, as shown in the screenshot below:


However, I had not yet implemented any facilities in the simulator function that take these SDK versions into account.

Picking a specific SDK version can be done quite easily, by providing the -currentSDKRoot parameter to the iPhone simulator, such as:


$ "iPhone Simulator" -SimulateApplication build/HelloWorld \
-SimulateDevice 'iPhone' -currentSDKRoot \
"/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform\
/Developer/SDKs/iPhoneSimulator6.1.sdk"

In the above case, we have configured the simulator to use the iPhone simulator 6.1 SDK. By changing this parameter, we can use a different version, such as version 5.1.

I have encapsulated the above command-line invocation in the xcodeenv.simulateApp {} Nix function and the paths to the iPhone simulator SDKs in the xcodewrapper. The SDK version can be configured by providing the sdkVersion parameter (which defaults to 6.1):


{xcodeenv, helloworld, device}:

xcodeenv.simulateApp {
  name = "HelloWorld";
  app = helloworld;
  inherit device;
  sdkVersion = "6.1";
}

This parameter makes it possible to spawn a simulator instance running a specific iOS version.

The facilities described earlier are only for simulating apps. To build an app for a specific iOS revision, the iOS target version must be changed inside the Xcode project settings.

Building signed Apps for release


Apart from the simulator, we may also want to deploy apps to real devices, such as an iPhone, iPad or iPod, either directly or through the App store. In order to do that, we require permission from Apple by signing the apps that we have just built.

As described in the previous blog post, we require a certificate that contains a developer's or company's identity and a mobile provisioning file describing which (groups of) apps of a certain company/developer can be run on which (groups of) devices. These files must be obtained through Apple's Dev Center.

Because of these restrictions, it's hard to test the trivial "Hello World" example case I have developed in the previous blog post on a real device, since it contains a dummy app and company name.

To alleviate these problems, I have created a script that "renames" the example app into a different app, so that an existing developer or company certificate and a mobile provisioning profile of a different app can be used.

By setting the rename parameter to true when calling the composition expression of the example case, we can automatically generate jobs building IPAs and xcarchives for the "renamed" app:


import ./nix-xcodeenvtests/deployment {
  rename = true;
  newName = "MyVeryCoolApp";
  newId = "coolapp";
  newCompanyName = "My Cool Company";
  newDomain = "com.coolcompany";
  ipaCertificateFile = ./adhoc.p12;
  ipaCodeSignIdentity = "iPhone Distribution: My Cool Company";
  ipaCertificatePassword = "";
  ipaProvisioningProfile = ./adhoc.mobileprovision;
  xcArchiveCertificateFile = ./distribution.p12;
  xcArchiveCodeSignIdentity = "iPhone Distribution: My Cool Company";
  xcArchiveCertificatePassword = "";
  xcArchiveProvisioningProfile = ./distribution.mobileprovision;
}

In the above expression, we changed the name of the app into: "MyVeryCoolApp" and the company name into: "My Cool Company". Moreover, we provide a certificate and mobile provisioning file for the IPA job (which can be deployed to a device directly) and the xcarchive job (which can be used to deploy to the App store).

By running the following command-line instruction:


$ nix-build -A renamedPkgs.testapp_ipa

we can build an IPA allowing us to deploy the test app on a real device. Of course, to do this yourself, the company name and app names should correspond to the ones defined in your certificate and mobile provisioning profile.

Moreover, as with the Android environment in the previous blog post, we can also use Hydra to provide parameters to the composition expression and to retrieve all the resulting artifacts:


Conclusion


In this blog post, I have described some improvements to the Nix iOS facilities. We are now also capable of configuring the SDK versions for the iPhone simulator, and we can build the trivial testcase to run on a real device in addition to the simulator. This is useful for testing the Nix iOS build function.
