Case Study

K8s Hardening Sweep — 20 issues, public write-up

Client

Confidential

Year

2025

Services

Public artifact · zero client data · field notes

The Challenge

Most K8s 'production-ready' content reads like marketing or like a vendor's onboarding flow. The hardening checklist a working operator actually runs through after getting bitten is scattered across blog posts, Slack threads, and post-mortems. No public artifact captures the failure-mode knowledge an audit should produce.

Our Approach

The write-up captures 20 distinct issues I've debugged on my own cluster and on engagements, including: NetworkPolicy port mismatches (the policy port must be the pod port, NOT the Service port); Multi-Attach errors on RWO PVCs during rolling updates; Traefik v3 customResponseHeaders being silently dropped; seccomp profiles that kill Stalwart at boot; k3s certificate rotation that doesn't invalidate old client certs; and SELinux 'spc_t', an anti-pattern that looks like hardening. Each entry gives the symptom, the root cause, the fix, and the underlying spec citation.
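As an illustration of the first item, the NetworkPolicy port mismatch can be expressed as a small check. This is a minimal sketch, not the audit tooling itself: the function name and the sample manifests are hypothetical, and it assumes numeric ports only (no named ports, no endPort ranges, protocol ignored). It relies on two facts from the Kubernetes API: a Service's targetPort defaults to its port when unset, and an ingress rule with no ports list allows every port.

```python
def policy_port_mismatches(service, policy):
    """Return (service_port, pod_port) pairs that the policy's ingress rules do not allow.

    Hypothetical helper: both arguments are manifests already parsed into dicts.
    """
    allowed = set()
    for rule in policy["spec"].get("ingress", []):
        ports = rule.get("ports")
        if not ports:
            # An ingress rule without a ports list matches all ports.
            return []
        for entry in ports:
            allowed.add(entry["port"])

    problems = []
    for svc_port in service["spec"]["ports"]:
        # Traffic reaches the pod on targetPort, which defaults to port.
        pod_port = svc_port.get("targetPort", svc_port["port"])
        if pod_port not in allowed:
            problems.append((svc_port["port"], pod_port))
    return problems


# Sample manifests (made up for illustration).
service = {"spec": {"ports": [{"port": 80, "targetPort": 8080}]}}

# Broken: the policy allows the Service port, so pod traffic on 8080 is dropped.
broken_policy = {"spec": {"ingress": [{"ports": [{"protocol": "TCP", "port": 80}]}]}}

# Fixed: the policy allows the pod (container) port.
fixed_policy = {"spec": {"ingress": [{"ports": [{"protocol": "TCP", "port": 8080}]}]}}

print(policy_port_mismatches(service, broken_policy))  # [(80, 8080)]
print(policy_port_mismatches(service, fixed_policy))   # []
```

The check compares against targetPort because the policy is evaluated on traffic arriving at the pod, after the Service has already translated the port.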

The Results

The write-up is a public artifact: no client data, no NDA. It demonstrates the depth of audit you can expect when synkraft does Tier 1 on your cluster; most reviews surface 3 to 7 of these issues in the first 60 minutes. It's also why Tier 1 deliverables stand on their own: a written audit at this granularity can be put out to bid with any qualified vendor.