As someone who loves NixOS and runs it on my daily-driver laptop, I can't see running NixOS in production.
We're running 100% Kubernetes, including for databases and other stateful workloads. Kubernetes implements the author's pattern just fine: any OS state is defined within the container image, and any application state lives in a Persistent Volume. Unfortunately, NixOS doesn't yet have a good story for service management (Disnix isn't nearly as featureful as the Kubernetes scheduler and doesn't see nearly the same activity or community buy-in as Nix / NixOS), let alone for ensuring that networked storage is re-attached to whichever node runs the service with the same reliability that Kubernetes offers.
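To make that split concrete, here's a minimal sketch (workload names are hypothetical) of how Kubernetes separates the two kinds of state: the image pins all OS/application bits, and a PersistentVolumeClaim holds the mutable data that follows the pod across nodes:

```yaml
# OS state: baked into the image (immutable, versioned).
# Application state: lives in the PersistentVolumeClaim (survives rescheduling).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres                     # hypothetical example workload
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels: { app: postgres }
  template:
    metadata:
      labels: { app: postgres }
    spec:
      containers:
        - name: postgres
          image: postgres:16         # all OS state pinned here
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:              # Kubernetes re-attaches this volume
    - metadata:                      # to whichever node runs the pod
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```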
IMO the way forward for Nix / NixOS in production is to:
a) develop a container runtime that would allow a Kubernetes node to run pods that specify Nix expressions directly in the image field, instead of the current workaround of creating Docker containers from Nix expressions and dealing with the overhead of external registries
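Nothing like (a) exists today, but a hypothetical sketch of what it could look like, with a Nix reference replacing the registry reference in the image field (the `nix:` scheme and RuntimeClass name are invented for illustration):

```yaml
# Hypothetical: a CRI implementation that understands Nix references.
# Today the image field must point at an OCI registry; here it names a
# flake attribute that the node's Nix daemon would realise locally.
apiVersion: v1
kind: Pod
metadata:
  name: hello
spec:
  runtimeClassName: nix             # hypothetical RuntimeClass for the Nix runtime
  containers:
    - name: hello
      image: "nix:nixpkgs#hello"    # hypothetical: resolved via the Nix store,
                                    # substituted from a binary cache if available
```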
b) improve the experience of running Kubernetes on NixOS so that installation approaches the ease offered by managed Kubernetes providers.
I'm familiar with Nixery and I think it's a really cool project. It's extremely close to what I'd like but it's not quite it - it requires either relying on nixery.dev to be online (unacceptable for production considering there are no availability guarantees) or running my own instance (which essentially means that I'm maintaining a type of registry).
Why is there a need for an image registry at all? Part of the beauty of Nix is that it benefits from remote binary caches without requiring them. Why not have a container runtime that, instead of downloading image layers, fetches from a Nix binary cache if possible and builds from source if not (with the caveat that production nodes should basically never be building from source)?
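For reference, that fall-back behaviour is already how Nix itself treats binary caches — they are an optimisation layered over local building, configurable in nix.conf:

```
# /etc/nix/nix.conf -- binary caches are an optimisation, not a requirement
substituters = https://cache.nixos.org
fallback = true   # build locally when no cache can supply the store path
```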
(Also Nixery is GCE-only and we're on AWS but leave that aside).
> which essentially means that I'm maintaining a type of registry
Mhm, there's no state that can't be thrown away and recreated, so I'd argue the overhead of running it is much lower than a full-blown registry.
> Why not have a container runtime that, instead of downloading image layers, instead fetches from a Nix binary cache
It depends on where you want to do this - Kubernetes for example has lots of opinions about images and how they're downloaded, so just replacing the runtime wouldn't be enough.
Nixery is an incremental step towards the end-goal, but there's a lot of mindset shifting that needs to happen first I think.
Not OP, but a registry works anywhere, whereas a custom runtime doesn't (managed node groups in EKS, GKE, AKS, etc.).
Additionally, once you use a custom runtime, now you have to deal with multiple runtimes in your cluster. You can no longer easily just run pods, you have to ensure they run on the nodes with the runtime for the images you want.
If using k8s (self-hosted or PaaS), keep an eye on https://github.com/xtruder/kubenix as it'll blow your mind. No YAML, an infrastructure testing framework, deployment, etc., all using Nix.
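I haven't verified the exact option paths against the current kubenix API, so treat this as an illustrative sketch of the shape — Kubernetes resources expressed as Nix attribute sets instead of YAML:

```nix
# Illustrative only: option paths approximate the kubenix module style.
{
  kubernetes.resources.deployments.nginx = {
    spec = {
      replicas = 2;
      template.spec.containers.nginx.image = "nginx:1.25";
    };
  };
}
```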
I'm familiar with that flag; I tried to use it to set up a local development environment on my NixOS laptop. It forces you to use easyCerts and Flannel, neither of which you should use as a default in production on AWS. Disabling them to gain more control over rotating certificates and to use AWS VPC CNI networking takes you far away from a managed experience.
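For context, the NixOS module in question looks roughly like this (option names from the `services.kubernetes` module; defaults may vary across releases):

```nix
# configuration.nix -- the all-in-one local cluster setup being discussed
{
  services.kubernetes = {
    roles = [ "master" "node" ];
    masterAddress = "localhost";
    easyCerts = true;      # self-signed PKI; fine locally, not for production
    # Flannel is wired in by the module; swapping in the AWS VPC CNI means
    # opting out of much of this integration.
  };
}
```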
Additionally, the channel ecosystem as it exists today does not allow you to choose your minor version of Kubernetes, which is another issue if you want to keep your underlying system up to date but also want to make sure that you're controlling when you adopt a new minor version so that you can deal with deprecations as necessary.
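A common workaround (not specific to Kubernetes) is to pin nixpkgs to an exact revision instead of following a channel, so package versions only change when you bump the pin. `<rev>` and `<hash>` below are placeholders:

```nix
# Pin an exact nixpkgs revision instead of following a channel, so the
# packaged Kubernetes version changes only when you update the pin.
let
  nixpkgs = fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/<rev>.tar.gz";
    sha256 = "<hash>";
  };
in
import nixpkgs { }
```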
Ah yeah, that was a toy example to highlight how services are defined/enabled, as k8s is notoriously hard, involved, complex, and intense to set up.
"Services.<service>.enable" is very similar to freebsd and /usr/local/etc except with standardised language to configure every daemon.
Kubernetes isn't building the image, really, it's just passing the Nix expression directly to the container runtime that Nix would provide. This is more or less how Nix works already, as the Nix tooling takes Nix expressions and builds derivations which are stored in the Nix store.