
Three weeks before our SOC2 audit, we hired an outside firm to do a dry run so we wouldn't be surprised by the real thing. The engineer they sent spent maybe twenty minutes on our actual application code. Then she pulled one of our production images, ran docker history --no-trunc against it, and within about ninety seconds had an AWS secret key sitting in her terminal, baked into a layer from a Dockerfile change we'd made eight months earlier and assumed was long gone.

That key had already been rotated. It wasn't a live credential anymore, which is the only reason this story ends with an awkward Slack thread instead of an incident report. But it told us something uncomfortable: nobody had looked at what was actually sitting inside our images since the day we wrote the Dockerfiles, and we'd been treating Docker security as a checkbox (scan for CVEs, call it done) instead of an ongoing practice.

Why This Keeps Slipping Through

Most teams I've worked with treat container security as something a vulnerability scanner handles in CI. Run Trivy or Grype, fail the build if anything exceeds a severity threshold, and move on. That approach catches outdated packages with known CVEs, which is important, but it misses nearly all the issues I've actually seen exploited in real incidents: secrets baked into layers, containers running as root by default, and the particularly dangerous habit of mounting the Docker socket into a container as a shortcut. None of those show up in a CVE scan, because they're not vulnerabilities in a package. They're decisions someone made in a Dockerfile or a deployment manifest, usually under deadline pressure, that nobody then revisited.

Mistake One: Secrets That Refuse to Die

The leaked key came from an ARG and ENV pair we'd used to pull a private package during a build, years before BuildKit secret mounts were standard practice.
The Dockerfile had since been rewritten to drop that approach entirely, but the old image was still sitting in the registry and, worse, our current build still had a cached layer from a similar pattern in a different image. A RUN unset SECRET_KEY on a later line does nothing for the image: the value is committed into the image's history the moment it's set, and docker history will happily print it back out with no special tooling required.

The fix that actually matters is never letting the secret enter a layer in the first place:

    # syntax=docker/dockerfile:1.4
    FROM node:20-slim
    # The token is readable at /run/secrets/npm_token only while this RUN executes
    RUN --mount=type=secret,id=npm_token \
        NPM_TOKEN=$(cat /run/secrets/npm_token) \
        npm install --omit=dev

Built with docker build --secret id=npm_token,src=./npm_token.txt, the token is mounted into a tmpfs only for the duration of that RUN instruction and never touches a layer. We re-audited every image we had with docker history --no-trunc afterward and found three more instances of the same pattern, none as dramatic as an AWS key.

Mistake Two: Root by Default, Because Nobody Said Otherwise

If a Dockerfile doesn't set a USER (and the base image doesn't set one either), the container runs as root, and most of ours didn't. We'd reasoned, somewhat lazily, that root inside a container isn't root on the host, so the risk felt abstract. It stops feeling abstract the moment any other misconfiguration (a writable host mount, a kernel exploit, or an overly broad capability set) lets a process escape the container boundary, because at that point, without user-namespace remapping, root inside maps directly to root outside.

We started adding a dedicated user almost everywhere:

    RUN addgroup --system app && adduser --system --ingroup app app
    USER app

    # Pair with capability dropping at run time:
    # docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE ...

In hindsight, running as non-root is necessary, but it's less of a headline fix than people treat it as. It reduces the blast radius; it doesn't eliminate it. The bigger find was still ahead of us.
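
Both of the first two mistakes leave traces in the Dockerfile itself, so a rough static check can flag them before an image is ever built. What follows is a minimal sketch, not what we actually ran: the function name audit_dockerfile and the SECRET_NAME pattern are hypothetical, the parser ignores line continuations, and it only looks at the first variable name on each ARG or ENV line. It does not replace running docker history --no-trunc against images that already exist.

```python
import re

# Hypothetical pattern: name fragments that often indicate a credential.
SECRET_NAME = re.compile(r"SECRET|TOKEN|PASSWORD|PASSWD|API_?KEY|ACCESS_KEY", re.IGNORECASE)


def audit_dockerfile(text: str) -> list[str]:
    """Return findings for one Dockerfile's contents (simplified line-by-line parse)."""
    findings = []
    final_user = None  # USER in effect for the last build stage, if any

    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        instruction = parts[0].upper()
        rest = parts[1] if len(parts) > 1 else ""

        if instruction == "FROM":
            # Each stage starts from its base image's user, which this sketch cannot see.
            final_user = None
        elif instruction == "USER":
            final_user = rest.strip()
        elif instruction in ("ARG", "ENV"):
            # Only the first variable name on the line is checked, never the value.
            names = rest.replace("=", " ").split()
            if names and SECRET_NAME.search(names[0]):
                findings.append(f"line {lineno}: {instruction} {names[0]} looks like a secret")

    if final_user is None:
        findings.append("no USER in final stage: runs as root unless the base image sets a user")
    elif final_user.split(":")[0] in ("root", "0"):
        findings.append("final stage sets USER to root")
    return findings


if __name__ == "__main__":
    sample = "FROM node:20-slim\nARG NPM_TOKEN\nENV NPM_TOKEN=$NPM_TOKEN\nRUN npm install\n"
    for finding in audit_dockerfile(sample):
        print(finding)
```

A check like this is deliberately noisy: a variable named TOKEN_URL would be flagged too. Treat each finding as a prompt to look at the line, not as a verdict.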

Mistake Three: The Socket That Hands Out the Keys to the Host

Our CI runner mounted /var/run/docker.sock into the container so that build jobs could trigger other Docker builds without setting up a separate Docker-in-Docker environment. It felt convenient, and at the time, convenience won. What I hadn't internalized until the audit spelled it out plainly: mounting the host's Docker socket into a container is functionally equivalent to giving that container root on the host. Anyone who can exec into it, or any compromised dependency running inside it, can spin up a new privileged container that bind-mounts the host filesystem and walk right out of any isolation we thought we had.

We'd briefly considered Docker-in-Docker with --privileged as the alternative and rejected it for the same underlying reason: it's the same blast radius wearing a different hat. What we landed on instead was moving image builds to Kaniko, which builds images from a Dockerfile without needing a Docker daemon or socket access at all, running inside its own constrained pod with no path back to the host. It was slower to set up than either the socket mount or DinD, and our build times got marginally worse, which the team grumbled about for a sprint. I'd take that trade every time.

Mistake Four: --privileged as a Permissions Band-Aid

Buried in an old deployment manifest, a service had been launched with --privileged because someone hit a device-access permission error two years earlier, reached for a quick fix, and never came back to scope the permissions properly. --privileged disables almost every security boundary Docker offers. It's not a slightly elevated permission; it leaves effectively zero isolation. The honest fix took an afternoon: identify the specific device the service needed and grant only that with --device, instead of opening everything.
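
Mistakes Three and Four both show up in how a container is launched rather than in how it's built, so the place to look is wherever your run commands or manifests live. As a rough sketch, assuming the launch configuration can be expressed as a docker run command line, a check might look like this. The function name audit_run_command and the image names are hypothetical, and a real fleet would need the same idea applied to Compose files or Kubernetes manifests.

```python
import shlex

DOCKER_SOCKET = "/var/run/docker.sock"
TRUTHY = {"1", "t", "true"}  # values Docker's boolean flags treat as "on"


def audit_run_command(command: str) -> list[str]:
    """Flag --privileged and Docker socket mounts in one `docker run` command line."""
    findings = []
    for arg in shlex.split(command):
        flag, _, value = arg.partition("=")
        # Matches both the bare flag and the explicit --privileged=true form.
        if flag == "--privileged" and (value == "" or value.lower() in TRUTHY):
            findings.append("--privileged set: grant the specific device or capability instead")
        # Covers -v, --volume and --mount, since each puts the path in a single argument.
        if DOCKER_SOCKET in arg:
            findings.append(f"{DOCKER_SOCKET} mounted: treat as host root access")
    return findings


if __name__ == "__main__":
    cmd = "docker run --privileged -v /var/run/docker.sock:/var/run/docker.sock ci-runner:latest"
    for finding in audit_run_command(cmd):
        print(finding)
```

The value of a check like this is less the detection than the conversation it forces: every hit needs an owner who can say which device or capability the service actually requires.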

When Locking Everything Down Isn't the Right Call

Not every one of these is worth fixing everywhere, and I'd push back on anyone who tells you otherwise. A short-lived, fully isolated sandbox container on a host nobody else touches, spun up to run a one-off data migration script and torn down an hour later, probably doesn't need the same scrutiny as a long-running, internet-facing service. The cost of hardening has to be weighed against actual exposure, and spending a week locking down a throwaway tool while your customer-facing API still runs as root is solving the wrong problem first.

Key Takeaways

- Secrets baked into any layer are recoverable forever via docker history, even after later commands try to unset them. Use BuildKit secret mounts instead.
- Default to a non-root USER, but don't mistake it for complete protection; pair it with capability dropping.
- Never mount the Docker socket into an application or CI container; treat that as equivalent to granting host root.
- Audit every --privileged flag in your fleet; most were added under deadline pressure and never revisited.
- Prioritize hardening by actual exposure: internet-facing and long-running services first, short-lived isolated tooling last.

Closing Thought

The thing that stuck with me from that audit wasn't any single misconfiguration; it was how confident we'd felt going in, because our CI pipeline had a green checkmark next to “security scan passed.” That checkmark was accurate but largely irrelevant. I wonder how many teams are sitting on the same false confidence right now, looking at a passing vulnerability scan and never once running docker history on their images. What's actually in your layers?


