It's a base image with binaries from Debian deb packages, plus necessary bits like ca-certificates, and absolutely nothing else, while still being glibc-based (unlike Alpine base images).
What do you do when you need to debug an issue and the container contains no utils?
I expect someone will leave a comment saying "But you shouldn't be entering containers, you should be using Ansible/Kubernetes". Yes, that is how I manage changes, but sometimes you just have to log in and see what is going on with htop etc.
I would start bash in the container, but also layer the rest of a standard Ubuntu image into the filesystem, just for my tools, without affecting the application in the running container.
I'm pretty sure that's possible with the current Linux kernel mount namespace/overlayfs infrastructure used by Docker - all that's needed is for a command-line tool to support it.
The new ephemeral container support in kubernetes lets you do essentially that. You bring the filesystem from another container image into the PID/network namespace of a running container in a pod.
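As a sketch of that workflow, `kubectl debug` can attach an ephemeral container to a running pod (the pod name, container name, and toolbox image below are just examples):

```shell
# Attach an ephemeral busybox container to the pod "my-pod", sharing the
# process namespace of its "my-app" container, and drop into a shell:
kubectl debug -it my-pod --image=busybox --target=my-app -- sh
```

With `--target`, the debug container sees the target container's processes, so tools like `ps` and `top` work against the real workload.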
Lol, it's fun to imagine going back in time to explain what you just said to my 2004 sysadmin self, back when I used to build servers, colo them, and physically maintain them.
Which version of bash would you expect to run in this example?
(1) If it's the bash version from the standard ubuntu image, you will need to specify where to mount your application's filesystem inside the ubuntu filesystem.
(2) If it's the bash version from your application, then it's the other way around: you will need to specify where to mount the ubuntu filesystem inside your container.
Option (1) seems more practical. My point is that you will need to specify a mountpoint either way, and your commands will need to take this mountpoint into account.
You can debug such containers by running another debugging container that joins their namespaces. The most frequently used namespaces are pid and network; with those two joined to the target container's, you can see its processes and binaries as well as its network traffic.
For Docker and k8s, there are two helpful tools which implement what I described with a simple and intuitive UI:
A good pattern is to build an image specifically containing troubleshooting tools which can be run and attached to a problem container's namespace. That gives you a standard set of tools without having to bake them into every image.
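The namespace-joining pattern above can be sketched with plain `docker run` flags (the target container name `myapp` is hypothetical; `nicolaka/netshoot` is one real example of a prebuilt troubleshooting image):

```shell
# Run a toolbox container inside the pid and network namespaces of the
# running "myapp" container; its processes and sockets become visible:
docker run -it --rm \
  --pid=container:myapp \
  --net=container:myapp \
  nicolaka/netshoot
```

Inside that shell, `ps`, `netstat`, `tcpdump`, and friends all operate on the target container, even though the target image ships none of them.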
FYI, you don't need any special permissions to strace in a docker container - you just need to disable the default seccomp profile (docker run --security-opt seccomp=unconfined), which blocks use of many unusual-in-production syscalls including ptrace: https://docs.docker.com/engine/security/seccomp/
One common workaround floating around the internets is to use --cap-add SYS_PTRACE. This has the side effect of permitting the ptrace syscall, but it also gives you the ability to ptrace processes owned by other users etc. That's more than you need and it's kind of dangerous in a production-ish container.
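A minimal sketch of the narrower approach (assuming strace is installed into the image at debug time, since the stock ubuntu image doesn't ship it):

```shell
# Disable only the default seccomp profile so ptrace is permitted,
# instead of granting the broader SYS_PTRACE capability:
docker run --rm -it --security-opt seccomp=unconfined ubuntu \
  bash -c 'apt-get update -qq && apt-get install -y -qq strace && strace -f ls /'
```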
I might be thinking of a different scenario (and I'm generally using Singularity rather than Docker). I want to start my container under 'strace' and see everything. This is not generally possible in the obvious way, as there's a setuid-root binary in the process tree that blocks further strace'ing.
(One can still attach after everything's running, but that's not always good enough.)
For simple shell access, use the :debug variant of distroless images which include a shell.
For more complex troubleshooting, I think other people have recommended many approaches. I haven't needed to do such troubleshooting, but if I did I would mount an image with the necessary binaries into the container. This is where distroless comes in handy: I can mount a Debian image and not worry about ABI compatibility.
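For the simple shell case, the `:debug` tag is documented to ship a busybox shell that you can reach by overriding the entrypoint (the distroless base image is shown here as an example):

```shell
# The :debug variant includes busybox; override the entrypoint to get a shell:
docker run -it --entrypoint=sh gcr.io/distroless/base:debug
```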
Answer to "How do you debug an issue in a running container?" is "you read logs or you don't."
Generally you build a special debug image that has busybox or whatever. In the case of distroless, the debug image has busybox and everything that comes with it.
Also, what are you trying to see with top/htop? In an ideal world you will see a single process, PID 1, which is your entry point. There shouldn't be more than one process running in the container.
You can get a container's resource consumption without logging into it, just like you can list its running processes without getting into the container.
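For example, from the host side (container name `myapp` is hypothetical):

```shell
# Resource consumption and process list, no shell in the container needed:
docker stats --no-stream myapp
docker top myapp
```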
There is nothing else you can do without dragging in a whole lot of dependencies:
- Anything java related will require a JDK
- Debugging any native code will require a whole debugger
- Debugging python/ruby will either work or will require dev dependencies
Sidenote:
Who the fuck uses ansible to debug containers?
I went to a presentation on Sysdig thinking that would be some kind of solution. Not really; not unless you want to hunt down or write syscall filters (or find some online) or pay for the Enterprise version.
I just wish there was a way to do the basics:
1. Look at files within my running container (maybe even modify them, without needing vim or nano installed inside it).
2. Ping/ICMP something from within the container (again, without ping being in the container itself)
3. DNS lookups from within the container
4. Connect to a port on an IP or DNS name from within the container
5. Inspect the contents of a dead container that won't start without having to commit it first.
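Several of these basics are actually doable from the host today; a sketch, with the container name `myapp` and the file path as placeholder examples:

```shell
# (1) Copy a file out of a running container, edit on the host, copy it back:
docker cp myapp:/etc/app/config.yml ./config.yml
docker cp ./config.yml myapp:/etc/app/config.yml

# (2) Ping from inside the container's network namespace, using the host's
# ping binary, via nsenter (requires root on the host):
sudo nsenter --target "$(docker inspect -f '{{.State.Pid}}' myapp)" \
  --net ping -c 1 1.1.1.1

# (5) Inspect a dead container's filesystem without committing it first:
docker export myapp | tar -tvf - | less
```

Points 3 and 4 work the same way as point 2: `nsenter --net` runs any host binary (dig, nc, curl) inside the container's network namespace.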
I did a post a while back on how I feel about debugging within containers, and I should probably write another one because I don't think I cover those 5 things:
For point one you can grab the running container's tag and then add a layer on top with any tools you need.
You obviously won’t get the same operational state but if you want to poke around a container you’ve built and see what’s in it, you can just extend it.
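A minimal sketch of that "extend it" approach, building a debug variant from stdin (the image name `myapp` and the tool list are examples):

```shell
# Build a throwaway debug image layered on top of the image under inspection:
docker build -t myapp:debug - <<'EOF'
FROM myapp:latest
RUN apt-get update && \
    apt-get install -y --no-install-recommends procps curl less && \
    rm -rf /var/lib/apt/lists/*
EOF
docker run -it --rm myapp:debug bash
```

This assumes a Debian/Ubuntu-based image; swap the package manager commands for whatever the base actually uses.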
I'd love to have 50 upvotes to give to this particular comment. With Docker being so fashionable, too many applications which have strictly no business being containerized are shoehorned into containers (e.g. Confluence).
My rule of thumb is: as soon as I have to `docker exec` into a running container because something's wrong, this container needs to be stopped and a VM should be used instead.
Also, I beg you not to use Alpine. It is horrible from a security perspective. The Alpine team doesn't have enough security staff to upgrade packages when vulnerabilities are found. Most popular Linux distro vendors publish OVAL data [1] which can be used to find and fix vulnerable packages. Not so with Alpine [2].
Alpine uses musl instead of glibc. That works well for most things, but not all things. For example, there are some JVM implementations that don't like musl unless you do quite a lot of work.
I never understood the whole Alpine thing. I would say most sysadmins doing Docker stuff don't understand what a libc is, or its variants, well enough to grasp the consequences of their choices. Hell, I work in embedded and have worked with uClibc and musl, and have been bitten quite hard. Calls not working as expected are loads of fun. All for a few MB, when it could be done with a proper small base image like the one suggested above.
The standard benefits of using something mainstream vs. something niche - 100x more scenarios tested in the wild.
In my limited experience, some things don't compile targeting musl. If everything works, Alpine is superb. Otherwise, it may be fairly difficult (and mostly unjustified) to fix.
I use Python. Any time you need to compile source to build a pip package and the upstream package developer has decided they don't support Alpine's libc implementation (aka musl), you will have a big problem, unless you can control your dependencies and include as few pip packages that require compilation as possible, or can find binary builds.
speaking of dive... there's a bit of overlap between docker-slim and dive. docker-slim produces similar container image reports (with more details about the layers, but no diffs, for now... the layer diff support is coming)
Yeah, and all of these could be copied from another container. I've slimmed down lots of sidecar containers and, on top of that, used UPX [0] on the binaries as well. Premature optimization is the root of all evil, true, but sometimes - for example, for Prometheus exporters you need to run on a bunch of nodes as sidecars - it totally makes sense to go the extra mile.
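For reference, the UPX step is a one-liner (the binary name is a placeholder; this works best on statically linked Go/C binaries):

```shell
# Compress the binary in place before COPYing it into the image;
# --best/--lzma trade compression time for a smaller result:
upx --best --lzma ./my-exporter
```

Note the trade-off: UPX-packed binaries decompress themselves at startup, and some antivirus/scanning tools flag them, so measure before shipping.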
Well, the distroless base is 3x Alpine. Which is a clickbaity way of saying it's 10 MB bigger.
The distroless Node.js image is... 10 MB bigger than the same thing on Alpine.
The main purpose of distroless is a smaller attack surface rather than size. Without a package manager, mutating the container is a PITA. Not having `ps` or `cat` makes it hard to read secrets that were injected into the container one way or another.
Example images I built with the base image:
- C binary, <10MB https://hub.docker.com/r/yegle/stubby-dns
- Python binary, <50MB https://hub.docker.com/r/yegle/fava
- Go binary, 5MB https://hub.docker.com/r/yegle/dns-over-https
Another trick I use is https://github.com/wagoodman/dive to find the deltas between layers and manually remove them in my Dockerfile.
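Basic usage is just pointing dive at a tag; it can also gate CI on image efficiency (the image name and threshold here are examples):

```shell
# Interactively browse layers and per-layer file changes:
dive myapp:latest

# Non-interactive CI mode: fail the build if wasted space exceeds a threshold:
CI=true dive myapp:latest --highestWastedBytes 10MB
```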