RFC: Continuous OS Deployment
Summary
Operating System Continuous Deployment (OSCD) refers to the process of seamlessly upgrading my NixOS machines to the latest and greatest release with as little overhead and manual fiddling as possible. This RFC proposes an OSCD overhaul that achieves a greater level of automation and reproducibility in the OS upgrade process via a few added GitHub CD hooks, the new CLI tool anix-upgrade, and some development process changes.
Motivation
NixOS upgrades on my machines are done using the nixos-rebuild command, which builds the system configuration according to the (symlinked) file /etc/nixos/configuration.nix
, which points to a mutable root configuration file in ~/sources/anixpkgs/
.
This model lends itself best to rapid prototyping and test, as changes to the configuration can be immediately tested without having to commit any code. However, release deployment strategies are left to be more ad hoc and manual (and thus error-prone) under this model, as well. For example, to tag and deploy a new release off of master, the following manual steps must be taken:
- The desired release commit is tagged, either locally or via the GitHub releases page.
- The
dependencies.nix
file for all NixOS configurations is modified to refer to the new tag, manifesting as a new commit on the head of master. ~/sources/anixpkgs/
is checked out to the commit from step 2 (not step 1, ironically). Any local dev changes need to be stashed.- A
nixos-rebuild
command is run.
This RFC calls for a mature, stable OSCD pipeline that removes the need for the manual steps listed above while maintaining flexibility for rapid (and decoupled) on-machine development and testing.
Driving requirements
System-level requirements
- [R1] OSCD shall be totally decoupled from development work; there shall be no possibility of one accidentally polluting the other.
- [R2] OSCD shall be atomic such that an upgrade cannot be corrupted via a mismanagement or erroneous execution of steps.
- [R3] OS release tagging shall be wholly executable within an
anixpkgs
pull request, and shall be entirely automated except for a manual specification of the level of release (e.g., major, minor, patch) that the pull request corresponds to. - [R4] An OS upgrade on-machine shall take no more than one step to complete.
- [R5] The same upgrade mechanism from [SR4] shall enable rapid prototyping of active development branches.
Software-level requirements
- [R1.1] Development within
anixpkgs
shall happen in a separate location from the symlinked~/sources/anixpkgs
directory, which shall be reserved for OS upgrades. - [R1.2]
~/sources/anixpkgs
shall be read-only. - [R2.1] The release tagging process shall execute all required steps automatically, prompted by a single initiatory step, within the remote CD pipeline.
- [R2.2] All automated steps for [R2.1] shall kick off only after a pull request merge into master, which is push-protected.
- [R2.3] The OS upgrade process shall consist of an atomic source preparation step followed by an atomic rebuild step, both chained together automatically.
- [R3.1] The tagging process from [R2.1] shall be initiated via adding labels to a pull request. No further action should be required.
- [R4.1] OS upgrades shall be offered by a CLI tool that, with no arguments specified, will upgrade the system to the most recent
anixpkgs
release off of master. - [R4.2] The OS upgrade CLI tool shall perform rebuild
switch
by default, but allow for rebuildboot
to be manually specified instead. - [R5.1] The OS upgrade CLI tool shall allow (via provided arguments) for an upgrade to any particular tag, branch, or commit within the remote
anixpkgs
repository. - [R5.2] The OS upgrade CLI tool shall allow for builds with packages local to the specified build target, and not necessarily tied to that target's prescribed release version.
Detailed design
The OSCD requirements are addressed with three components: an updated deployment
pipeline within anixpkgs
GitHub actions, an OS upgrade CLI tool, and a new policy for anixpkgs
development and testing.
Deployment pipeline
Requirements [R2.1-2,3.1] are proposed to be fulfilled by adding three GitHub actions jobs with write permissions on master to the deployment workflow. Each of these three jobs will be responsible for either tagging a new major release, minor release, or patch release, and will be triggered by a corresponding pull request label on a merged pull request only.
Each of these jobs will consist of the following steps:
- Only execute if a merged pull request had the appropriate release label.
- Checkout
anixpkgs
master. - Run a version increment script that increments the release version according to the release label.
- Commit the semantic version changes and tag that commit with the new semantic version string.
- Push the commit and corresponding tag to master.
The version increment script will simply take as an input the release increment type (major, minor, or patch) and increment accordingly:
- (Major)
x.y.z
->x+1.0.0
- (Minor)
x.y.z
->x.y+1.0
- (Patch)
x.y.z
->x.y.z+1
In the event of a pull request getting merged with multiple labels, the execution order of the release tag jobs will be major
-> minor
-> patch
, such that each successive tag will be visible in the resulting semantic version. If the order were reversed, for example, a patch release increment would be obscured by a minor release increment, which would zero out the patch field.
OS upgrade CLI tool: anix-upgrade
One a new release has been properly and automatically tagged on anixpkgs
remote, a CLI tool called anix-upgrade is proposed to upgrade the system to the latest tag in an automated and incorruptible (i.e., no need nor opportunity to manually modify code or any configurations during the process) fashion and fulfill [R1.2,2.3,4.1-2,5.1-2].
As implied by [R2.3], anix-upgrade
will do two things:
- Prepare a read-only (and Git-less) version of
anixpkgs
in the~/sources
directory according to the exact version specified via arguments through the CLI. - Rebuild the system configuration according to the updated source/configuration.
(1) may be naturally accomplished via a Nix derivation, which by definition must consist of a read-only (and reproducible) output with no tolerance for side effects from things like .git
directories. The CLI tool will allow for (1) to be constructed from:
- No argument at all, which will assume that the target source corresponds to the current head of
anixpkgs
master. This will be the way to perform standard, non-dev-prototyping upgrades. - An alternative release tag string or a development branch name (assumed to be at the head of that branch) or a commit hash.
- An optional specification that the OS packages should be built from the local packages in that specific version of the
anixpkgs
source tree, and not necessarily from the prescribed release version of the packages.
Because Nix derivations cannot clone Git repositories, the Nix built-in tools fetchGit
and fetchTarball
must be used to fetch the exact right version of the source. In my experience, fetchGit
does not consistently fetch the expected "head-of" commit when only a branch name is specified, so in this implementation fetchTarball
is to be used for any version of (1) that does not specify the full commit hash. This is possible because of GitHub's feature where it allows for URL-based fetching of zipped source archives from tag names and branch names (reliably delivering the head-of commit in the latter case).
(2) is accomplished by aliasing to the nixos-rebuild
command (exposing an optional flag to require a reboot for changes to take effect as per [R4.2]) and only executing once (1) has successfully completed.
Development policy changes
All-in-all, the above changes serve to reduce the cognitive load associated with releasing and deploying a new version of anixpkgs
once development is complete. In the same vein, the above changes are best served by fulfilling [R1.1] as well, which moves anixpkgs
development out of the ~/sources
directory and into the ~/dev
directory, consistent with development on individual repositories. While mostly a clerical change, there are a couple of implications:
- The machines docs should be updated to modify instructions for setting up a new machine from scratch. The setup process may now be slightly more complicated on a one-time basis.
- The
~/.devrc
file on every development machine (see devshell) should be modified to set thepkgs_dir
variable to the~/dev/
location and not the~/sources/
one. This is to preserve the ability to mutate attribute-based sources with the same level of flexibility as before.
Drawbacks
The principal drawback of this design is the new requirement to commit OS prototype changes to GitHub when prototyping new OS versions via the specification of a remote tag, branch, or commit with the OS upgrade tool.
Alternatives
To address the slight OS prototyping drawback, an additional option may be added to anix-upgrade
to point to a local anixpkgs
source tree, which would symlink to the mutable source tree rather than an immutable nix derivation that pulls from GitHub.
Unresolved questions
- Must OS upgrades always be manually triggered via
anix-upgrade
, or might it be fruitful to execute automatic upgrades in the future? - How may OSCD fruitfully apply to machine closures that import
anixpkgs
closures but specify their own configurations?