Kubernetes Penetration Testing: A Practical Guide
Kubernetes consolidates compute, identity, network, and secrets into one programmable surface. That power is also the problem. A single overprivileged service account, a hostPath pod, an unauthenticated kubelet, or an exposed metadata service, and a tester reaches cluster-admin in minutes. This guide covers the practical workflow for scoping, executing, and reporting on Kubernetes penetration tests across managed (EKS, AKS, GKE) and self-hosted clusters. It complements the broader cloud security assessment guide and aligns with NIST SP 800-190, the CIS Kubernetes Benchmark, OWASP Kubernetes Top 10, and MITRE ATT&CK for Containers.
Why Kubernetes Needs Its Own Engagement
A cloud pentest answers what is exposed in the provider account. A network pentest answers how far an attacker can pivot at the IP layer. A Kubernetes pentest answers a different question: how the workload identity and orchestration layer fails. The cluster runs as a tenant on the cloud account, but it has its own identity model, its own RBAC, its own network plane, and its own admission system. Each of these is a distinct attack surface that does not appear in a traditional cloud or network test.
Modern Kubernetes attacks are well documented and broadly automated. Peirates, kdigger, and BadPods reproduce escape, RBAC abuse, and metadata theft in seconds. The objective is not to prove Kubernetes is broken; it is to find the specific paths an attacker would take in this cluster, demonstrate them safely, and hand platform engineering a list they can fix.
For broader context, see our guides on penetration testing methodology and the difference between red team and pentest engagements.
1. Scoping a Kubernetes Engagement
Kubernetes scoping is harder than a flat web or network test. Clusters have control planes, data planes, multiple namespaces, mesh sidecars, GitOps pipelines, and trust boundaries into the cloud account. Lock the details down on paper.
- Cluster inventory: in-scope clusters, provider type (EKS, AKS, GKE, OpenShift, Rancher, kubeadm), Kubernetes version, node pool composition, namespaces, and CNI plugin.
- Starting position: external network only, compromised pod with default service account, developer kubeconfig, CI/CD service account token, or all of these. Document the source explicitly.
- Control plane scope: for managed clusters confirm that the API server, etcd, scheduler, and controller manager are out of scope. For self-hosted clusters confirm authorisation to interact with kube-apiserver flags, etcd certificates, and node-level configuration.
- Cloud boundary: agree whether the test crosses into the cloud account via node IAM, IRSA, Workload Identity, or the metadata service, and which actions are authorised once across.
- Workload constraints: production vs staging, blackout windows, denial-of-service limits, and what happens if a noisy security control quarantines a pod mid-test.
- Rules of engagement: testing windows, allowed and disallowed payloads, escalation contacts. Lock them in the scope of work.
- Authorisation: a signed letter of authorisation from a senior stakeholder, plus written agreement on emergency stop conditions.
- Deliverable expectations: report depth, retest scope, portal access, attack chain diagrams, and time-bound remediation tracking.
2. External Reconnaissance
Before any in-cluster access, gather what the network leaks for free. External recon against the cluster perimeter is fast, low-risk, and frequently produces the first foothold by itself.
- Identify ingress controllers, load balancers, and API endpoints exposed to the internet
- Probe the API server endpoint for anonymous access (managed providers usually patch this; self-hosted often does not)
- Enumerate node external IPs and probe for the kubelet read-only port (10255) and the kubelet API (10250)
- Check for exposed dashboards, kube-state-metrics, Prometheus, Grafana, ArgoCD, and Kiali endpoints
- Look for unauthenticated etcd (port 2379) or container runtime APIs reachable from outside
- Run kube-hunter against the perimeter to surface known issues quickly
- Inspect TLS certificates on the API server and ingress for SAN values that reveal cluster names, namespaces, and tenant identifiers
- Look for misconfigured cloud storage buckets used for backups, manifests, or kubeconfig files
3. Initial Access and Foothold Validation
The realistic starting point in most engagements is a pod with the default service account, a developer kubeconfig, or a leaked CI/CD token. Confirm the foothold and establish baseline visibility.
- Inspect the projected service account token at /var/run/secrets/kubernetes.io/serviceaccount/token
- Decode the JWT to read the namespace, service account name, audience, and expiry
- Probe the API server from inside the pod with curl using the token; confirm what verbs are permitted
- Run kubectl auth can-i --list to enumerate granted permissions in the current context
- Check the pod spec for hostPath, hostNetwork, hostPID, hostIPC, privileged, capabilities, and securityContext
- Check for mounted secrets, configmaps, projected volumes, and downward API exposures
- Read environment variables for connection strings, API keys, and registry credentials accidentally injected as plaintext
- Identify the CNI plugin and test pod-to-pod reachability across namespaces
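Decoding the projected token's payload takes a few lines of standard-library Python. The sketch below builds a synthetic token purely for illustration (the namespace, service account name, and claims are invented); against a real foothold you would read the token from /var/run/secrets/kubernetes.io/serviceaccount/token instead.

```python
import base64, json

def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))

# Synthetic token standing in for the projected service account token;
# all claim values here are illustrative, not from a real cluster.
claims = {
    "iss": "https://kubernetes.default.svc",
    "aud": ["https://kubernetes.default.svc"],
    "kubernetes.io": {"namespace": "payments",
                      "serviceaccount": {"name": "default"}},
    "exp": 1893456000,
}
seg = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=").decode()
token = f"eyJhbGciOiJSUzI1NiJ9.{seg}.sig"

decoded = jwt_claims(token)
print(decoded["kubernetes.io"]["namespace"])                    # payments
print(decoded["kubernetes.io"]["serviceaccount"]["name"])       # default
```

The namespace and service account name recovered this way tell you immediately which RoleBindings to enumerate first.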
4. RBAC Enumeration and Abuse
RBAC is the most common path to cluster compromise. Overprivileged ServiceAccounts and ClusterRoleBindings show up in nearly every assessment. Map the graph systematically.
- Enumerate every ClusterRole and Role; flag wildcard verbs (verbs: ["*"]) and wildcard resources (resources: ["*"])
- Map ClusterRoleBindings and RoleBindings; identify subjects bound to cluster-admin or near-equivalents
- Use rbac-tool and kubectl-who-can to find principals with create/update on pods, deployments, daemonsets, jobs, and cronjobs
- Check for the escalate, bind, and impersonate verbs on roles or clusterroles (privilege escalation primitives)
- Identify ServiceAccounts with token automount enabled in workloads that do not need API access
- Find roles granting get/list on secrets at namespace or cluster scope
- Trace from compromised ServiceAccount to all reachable resources via the binding graph
- Validate the path: can the current token actually create a pod that mounts the host filesystem? Demonstrate it safely in a scratch namespace if authorised
Capture the binding graph as evidence. RBAC review screenshots in the final report are far more persuasive when they show the exact paths an attacker would take.
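The wildcard and escalation-primitive checks above are easy to script over the JSON that `kubectl get clusterroles -o json` returns. A minimal sketch, using an invented sample role rather than real cluster output:

```python
# Flag risky grants in an RBAC role object (the dict shape matches a single
# item from `kubectl get clusterroles -o json`).
def risky_rules(role: dict) -> list:
    hits = []
    for rule in role.get("rules", []):
        verbs = set(rule.get("verbs", []))
        resources = set(rule.get("resources", []))
        if "*" in verbs or "*" in resources:
            hits.append(("wildcard", rule))
        if verbs & {"escalate", "bind", "impersonate"}:
            hits.append(("escalation-primitive", rule))
    return hits

# Illustrative role, not taken from any real cluster.
role = {
    "metadata": {"name": "debug-helper"},
    "rules": [
        {"apiGroups": [""], "resources": ["*"], "verbs": ["get", "list"]},
        {"apiGroups": ["rbac.authorization.k8s.io"],
         "resources": ["clusterroles"], "verbs": ["bind"]},
    ],
}
for reason, rule in risky_rules(role):
    print(reason, rule["resources"], rule["verbs"])
```

Running the same function over every Role and ClusterRole in scope gives you the raw material for the binding-graph evidence described above.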
5. Container Escape and Node Compromise
Once a pod can be created with elevated context, the host follows quickly. Test the escapes that real-world attackers use, validate them, and document the exact pod spec.
hostPath and hostNetwork abuse
A pod with a hostPath volume mounting / can read and write the entire node filesystem. Common impacts: stealing kubelet credentials from /var/lib/kubelet, planting a SUID binary, reading SSH keys, or writing to /etc/cron.d. Demonstrate read access and document it without persisting changes.
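A minimal sketch of the kind of pod spec this test looks for: the node's root filesystem mounted read-only into the pod. The namespace and image are illustrative; deploy only into an authorised scratch namespace.

```yaml
# Illustrative hostPath demonstration pod; mount is read-only so the
# evidence can be captured without modifying the node.
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo
  namespace: pentest-scratch
spec:
  containers:
    - name: shell
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: hostroot
          mountPath: /host
          readOnly: true
  volumes:
    - name: hostroot
      hostPath:
        path: /
```

From inside the pod, reading /host/var/lib/kubelet is usually sufficient evidence; capture it and delete the pod.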
Privileged pods
A pod with privileged: true (or CAP_SYS_ADMIN) can mount host devices, load kernel modules, and chroot into the host filesystem after mounting the node's root disk. Use BadPods or kdigger to demonstrate without leaving artefacts.
Container runtime socket mount
A pod with /var/run/docker.sock or the containerd/CRI-O socket mounted can issue arbitrary container creation commands, including pods that mount the host root. This finding is still common in CI/CD runners and image build pipelines.
CVE-driven escapes
CVE-2022-0185, CVE-2022-0492 (cgroup release_agent), CVE-2024-21626 (runc /proc/self/fd escape), and similar runtime CVEs have all been weaponised. Validate against the running runtime version before claiming impact; never deploy live exploit code without written authorisation.
Kubelet abuse
If the kubelet API (10250) accepts anonymous access or accepts the service account token, an attacker can list pods, exec into running containers across namespaces, and read node-level secrets. Test every node, not just one.
6. Secrets, Configuration, and Supply Chain
Secrets, configmaps, and the supply chain feeding workloads are recurring sources of cluster compromise. Search systematically.
- List every Secret in scope; identify those storing cloud credentials, registry tokens, database passwords, and TLS keys
- Check whether etcd encryption at rest is enabled and which provider (KMS, aescbc, aesgcm) is configured
- Read configmaps for database URLs, API keys, internal hostnames, and bootstrap tokens accidentally stored as plain text
- Inspect Deployment, StatefulSet, DaemonSet, and CronJob specs for env vars, envFrom, and volumeMounts referencing secrets
- Pull the images referenced in pod specs and scan with Trivy for known CVEs and exposed secrets in layers
- Check for mutable tags (latest, branch names) combined with imagePullPolicy: IfNotPresent, which makes it unclear which image version actually runs on each node; recommend digest pinning
- Inspect imagePullSecrets and the registries they target; test for weak registry auth
- Look for sidecars from helm charts or operators that ship with default credentials
- Review GitOps repositories (ArgoCD, Flux) for direct write access; a compromised dev account often deploys arbitrary manifests
- Verify image signatures with Cosign where claimed; flag clusters that allow unsigned images
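Searching configmaps for credential-shaped values is worth scripting rather than eyeballing. A heuristic sketch over the data map of a single configmap (the patterns and sample values are illustrative, and real engagements need a broader pattern set):

```python
import re

# Heuristic patterns for credential-shaped strings in configmap data
# (e.g. extracted from `kubectl get configmaps -A -o json`).
PATTERNS = {
    "connection-string": re.compile(r"://[^/\s:]+:[^@\s]+@"),  # user:pass@host
    "aws-access-key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

def scan_configmap(data: dict) -> list:
    """Return (key, finding-label) pairs for values matching any pattern."""
    findings = []
    for key, value in data.items():
        for label, pattern in PATTERNS.items():
            if pattern.search(value):
                findings.append((key, label))
    return findings

# Illustrative configmap data, not from a real cluster.
sample = {"DB_URL": "postgres://app:s3cret@db.internal:5432/app",
          "LOG_LEVEL": "info"}
print(scan_configmap(sample))   # [('DB_URL', 'connection-string')]
```

Each hit is a candidate finding: a credential that should live in a Secret (ideally with etcd encryption enabled), not a configmap.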
7. Cloud Trust Boundary
The cluster is a tenant of the cloud account. Test the boundary that pods cross when they request cloud credentials.
- From a pod, probe the cloud metadata service (169.254.169.254) and check whether IMDSv2 is enforced (EKS) or whether IMDS is reachable at all
- If reachable, retrieve the node instance role credentials and assess the resulting blast radius with the cloud CLI
- Check whether IRSA (EKS), Workload Identity (GKE), or Azure AD Workload Identity (AKS) is configured; inspect the trust policy of each pod-bound role
- Look for over-broad IAM policies on node roles (s3:*, ec2:*, kms:*) that should be scoped to specific resources
- From a pod, attempt to call cloud APIs that should not be reachable (cross-account assume role, KMS decrypt on unrelated keys)
- Identify whether NetworkPolicy or hostNetwork-based controls block pod access to internal cloud resources (RDS, internal ALBs, parameter store)
- Test for SSRF via in-cluster ingress that lets external clients reach the metadata service through a workload
- For self-hosted clusters, identify how nodes authenticate to the cloud account and whether bootstrap tokens are reachable from pods
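One of the controls this section tests for can be expressed as a NetworkPolicy that denies pod egress to the metadata endpoint while allowing everything else. The namespace name is illustrative, and this only works where the CNI enforces egress policy:

```yaml
# Example control under test: egress allowed everywhere except the
# cloud metadata endpoint. If a pod in this namespace can still reach
# 169.254.169.254, that is a finding against the CNI enforcement.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-metadata
  namespace: payments
spec:
  podSelector: {}
  policyTypes: ["Egress"]
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 169.254.169.254/32
```

During testing, probe the metadata address from a pod in the namespace both before and after the policy is claimed to apply, and record both results as evidence.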
8. Network Policy and East-West Movement
Most clusters run with no NetworkPolicy at all, which means any pod can talk to any other pod in the cluster. East-west movement is usually trivial.
- Enumerate every NetworkPolicy in scope; identify namespaces with no policy (default-allow)
- From a pod, attempt to reach the API server over the pod network and check whether it is restricted
- Probe internal services for default credentials (Redis, MongoDB, Elasticsearch, Cassandra)
- Test whether the kube-system namespace is reachable from workload namespaces; it often is
- Look for service mesh sidecars and check whether mTLS is in PERMISSIVE mode (still accepts plaintext)
- Check whether DNS policy allows arbitrary external DNS resolution from pods (data exfiltration via DNS)
- Test egress: can the pod reach the public internet, internal admin endpoints, or other tenant clusters?
- For multi-tenant clusters, demonstrate cross-namespace reachability and document the impact
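East-west reachability checks reduce to TCP connect attempts from a foothold pod. A minimal sketch; the service names and ports are illustrative stand-ins for whatever the engagement actually targets:

```python
import socket

def is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable names.
        return False

# Hypothetical in-cluster targets; replace with services enumerated in scope.
targets = [("kubernetes.default.svc", 443),   # API server over the pod network
           ("redis.payments.svc", 6379),      # default-credential candidate
           ("kube-dns.kube-system.svc", 53)]  # is kube-system reachable at all?
for host, port in targets:
    print(host, port, "open" if is_open(host, port) else "filtered/closed")
```

Run the sweep from a pod in each workload namespace; a namespace where everything reports open is a default-allow finding.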
9. Admission Controllers and Policy Bypass
Pod Security Admission, Kyverno, OPA Gatekeeper, and validating webhooks are the primary defence against the bad pod specs covered above. Test that they enforce what the policy claims.
- Identify whether Pod Security Admission is configured per namespace and at which profile (privileged, baseline, restricted)
- Try to deploy pods that should be rejected (hostPath, privileged, hostNetwork) into each namespace; record what is allowed
- For Kyverno or OPA Gatekeeper, list ClusterPolicies and ConstraintTemplates; check enforcement mode (audit vs enforce)
- Look for namespaces or service accounts excluded from policy (often debugging or vendor namespaces left exempt indefinitely)
- Test whether validating webhooks fail open (failurePolicy: Ignore) so a transient outage allows arbitrary pods
- Check for mutating webhooks that inject privileged sidecars; these become privilege escalation primitives if their image is controlled
- Look for ImagePolicyWebhook or admission controllers that enforce signed images, and confirm they are wired to a Cosign verifier or equivalent
- Demonstrate any bypass by deploying a non-compliant pod and capturing the admission audit log entry that should have rejected it
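The non-compliant probe pod described above can be as small as the sketch below (name, namespace, and image are illustrative). Under a baseline or restricted Pod Security level, admission should reject it; anywhere it is admitted, record the namespace and its exemption.

```yaml
# Deliberately non-compliant probe pod: privileged containers violate
# both the `baseline` and `restricted` Pod Security profiles, so a
# correctly enforcing namespace must reject this at admission.
apiVersion: v1
kind: Pod
metadata:
  name: psa-probe
  namespace: pentest-scratch
spec:
  containers:
    - name: probe
      image: busybox
      command: ["sleep", "60"]
      securityContext:
        privileged: true
```

Repeat with hostPath and hostNetwork variants so the report shows which specific controls each namespace enforces.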
10. Persistence and Detection (Where Authorised)
Persistence demonstrations should always be coordinated and time-boxed. The objective is to validate detection and recovery, not to leave artefacts in production. Confirm written authorisation before any of the techniques below.
- Static pods placed under /etc/kubernetes/manifests on a compromised node (kubelet picks them up automatically)
- CronJob persistence in a namespace the attacker can write to but defenders rarely audit
- Mutating webhook persistence: a webhook that injects a backdoor sidecar on every pod creation
- Service account token theft and reuse from outside the cluster (long-lived legacy tokens, JWT replay)
- RBAC-based persistence: granting the attacker-controlled identity escalate or impersonate verbs
- Kube-system DaemonSet impersonation (a pod that looks like a system component on every node)
- Document each technique, the detection result, the time of action, and the cleanup step
For programmes that want broader adversary simulation rather than cluster-only depth, see the comparison between red team and penetration test engagements.
11. Reporting and Remediation Tracking
Cluster findings chain. A permissive RBAC binding leads to a privileged pod leads to a host filesystem read leads to node IAM credentials leads to cloud account compromise. A list of isolated CVSS scores hides that story. Structure the deliverable so platform engineering can follow the path and prioritise the chokepoints.
- Executive summary with business impact, attack narrative, and the chokepoints that break each chain
- Technical findings with reproduction commands, evidence, and CVSS scores validated using the CVSS calculator
- RBAC graph screenshots that match the attack chain in the narrative
- Per-finding remediation guidance distinguishing root cause from compensating control
- Mapping to compliance frameworks (PCI DSS 6.4.3 and 11.4, ISO 27001 A.8.8, SOC 2 CC6, NIST SP 800-190, CIS Kubernetes Benchmark)
- Prioritisation using CVSS plus EPSS plus asset tier, with cluster-admin and node escape paths always at the top
- Delivery in a portal that supports retest workflows and persistent remediation status, not just a static PDF
SecPortal's findings management ships with templates for common cluster findings (overprivileged RBAC, missing NetworkPolicy, privileged pods, exposed kubelet, IMDS reachability, unsigned images), AI-generated executive and technical reports, and a branded client portal so platform and application teams remediate without losing context. See the report template for the full structure.
12. Between Engagements: Hardening and Monitoring
Annual cluster pentests find the issues. Continuous hygiene keeps them found. Pair the engagement with ongoing controls and detection so the next test does not surface the same drift.
- Enforce Pod Security Standards at the restricted level for all workload namespaces
- Default-deny NetworkPolicy in every namespace; allow specific egress only as required
- Use IRSA, Workload Identity, or Azure AD Workload Identity instead of node-level IAM
- Block pod access to the cloud metadata service via NetworkPolicy or eBPF-based controls
- Enable etcd encryption at rest with a KMS provider
- Sign images with Cosign and enforce verification at admission
- Run kube-bench (CIS) and Trivy on a schedule; alert on regression
- Audit RBAC monthly; remove cluster-admin grants and wildcard verbs
- Patch nodes, the kubelet, and the container runtime on a defined cadence
- Schedule recurring external scans and continuous monitoring against ingress endpoints so configuration drift is caught quickly
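The default-deny baseline from the list above can be expressed per namespace in a few lines (the namespace name is illustrative); more specific policies then grant only the traffic each workload needs.

```yaml
# Default-deny for both directions: with an empty podSelector and no
# ingress/egress rules, every pod in the namespace is isolated until a
# more specific policy allows traffic.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}
  policyTypes: ["Ingress", "Egress"]
```

Remember to pair this with an explicit DNS egress allowance, or workloads in the namespace will fail name resolution.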
For programme-level structure, see building continuous security monitoring and the vulnerability management programme guide.
The Quick Kubernetes Pentest Checklist
A condensed version to use during the engagement.
- Lock scope to clusters, provider type, namespaces, control-plane responsibility, and cloud trust boundary
- Confirm signed authorisation and emergency stop conditions before any traffic
- Enumerate ingress, kubelet, etcd, dashboard, and metrics endpoints from outside the cluster
- Validate the foothold: decode the projected service account token, run kubectl auth can-i --list
- Map the RBAC graph; flag wildcard verbs, escalate, bind, and impersonate; trace cluster-admin paths
- Inspect pod specs for hostPath, hostNetwork, privileged, capabilities, and runtime sockets
- Demonstrate container escape and node compromise where authorised; capture the exact pod spec
- Pull pod credentials from /var/lib/kubelet on a compromised node; test the resulting blast radius
- List secrets and configmaps; identify cleartext credentials and missing etcd encryption
- Scan images with Trivy; check for digest pinning, image signatures, and registry hardening
- Probe the cloud metadata service from pods; test IMDSv2 and Workload Identity boundaries
- Map east-west reachability; identify default-allow namespaces and weak NetworkPolicy
- Test admission controllers by deploying non-compliant pods; record what is allowed
- Run authorised persistence demonstrations only with written approval and detection coordination
- Score findings with CVSS, prioritise with EPSS plus asset tier, deliver in a portal that supports retest
- Schedule continuous coverage and configuration drift checks between assessments
Run Kubernetes penetration tests with findings, RBAC paths, retests, and reports in one place
SecPortal gives security teams findings management with templates for cluster attacks, CVSS scoring, AI-assisted reporting, external and authenticated scanning, continuous monitoring, and a branded client portal so platform and application teams remediate fast. See pricing or start free.