Kubernetes Security Best Practices - Securing Your Cluster Infrastructure

Master Kubernetes security with proven strategies, practical implementations, and real-world examples. Learn how to build and maintain secure Kubernetes clusters that protect your applications and data.

A Kubernetes cluster is only as secure as its access controls, workload settings, network rules, and secret handling. This guide covers each of these areas with manifests and commands you can adapt, along with the tooling and routine checks that keep a cluster secure over time.

Prerequisites

  • Basic understanding of Kubernetes
  • Access to a Kubernetes cluster
  • kubectl CLI tool installed
  • Familiarity with YAML configurations

Project Structure

.
├── security/
│   ├── network-policies/     # Network policy definitions
│   ├── rbac/                # Role-based access control configs
│   ├── pod-security/        # Pod Security Standards configs
│   └── secrets/             # Secret management
└── monitoring/
    ├── audit-logs/          # Audit logging configurations
    └── alerts/              # Security alert definitions

Core Security Components

1. Role-Based Access Control (RBAC)

Role Definition

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]

RoleBinding

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
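
One way to confirm that the binding behaves as intended is to impersonate the user with kubectl auth can-i. The sketch below assumes the Role and RoleBinding above are applied and that your own account is allowed to impersonate users; the helper name check_pod_reader is made up for illustration.

```shell
# Hypothetical helper: prints what user "jane" may do with pods in "default".
check_pod_reader() {
  for verb in get list watch delete; do
    # "kubectl auth can-i" prints "yes" or "no" and exits non-zero on "no",
    # so "|| true" keeps the loop going.
    answer=$(kubectl auth can-i "$verb" pods --as jane --namespace default 2>/dev/null || true)
    echo "$verb pods: ${answer:-unknown}"
  done
}

# Usage (against a real cluster):
#   check_pod_reader
```

With the pod-reader Role, get, list, and watch should report "yes" and delete should report "no".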

2. Pod Security Standards

Pod Security Context

apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: my-secure-app:1.0
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
      readOnlyRootFilesystem: true
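
To check that the security context actually took effect, you can ask the running container for its user ID. This is a minimal sketch; it assumes the image ships the id binary (minimal images may not), and pod_runs_as_non_root is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if the pod's default container runs as non-root.
# $1 = pod name
pod_runs_as_non_root() {
  uid=$(kubectl exec "$1" -- id -u) || return 2   # 2 = could not query the pod
  [ "$uid" != "0" ]
}

# Usage:
#   pod_runs_as_non_root secure-pod && echo "non-root" || echo "root or unreachable"
```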

Pod Security Admission

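apiVersion: v1
kind: Namespace
metadata:
  name: restricted
  labels:
    # Pod Security Admission is configured with namespace labels, not a dedicated resource
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted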
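
Pod Security Admission reads its configuration from labels on each namespace, so an existing namespace can be switched to a level with kubectl label. The following is a sketch under that assumption; set_pod_security_level is a hypothetical helper, and the namespace name in the usage line is an example.

```shell
# Hypothetical helper: sets the enforce, audit, and warn modes on a namespace.
# $1 = namespace, $2 = level (privileged, baseline, or restricted)
set_pod_security_level() {
  case "$2" in
    privileged|baseline|restricted) ;;
    *) echo "unknown level: $2" >&2; return 1 ;;
  esac
  kubectl label --overwrite namespace "$1" \
    "pod-security.kubernetes.io/enforce=$2" \
    "pod-security.kubernetes.io/audit=$2" \
    "pod-security.kubernetes.io/warn=$2"
}

# Usage:
#   set_pod_security_level my-namespace restricted
```

Adding --dry-run=server to the kubectl label call is a way to preview which existing pods would violate the new level before enforcing it.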

3. Network Policies

Default Deny All

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
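
A NetworkPolicy only affects the namespace it is created in, so a default-deny policy has to be applied per namespace. The sketch below generates the manifest above for a given namespace; default_deny_manifest is a hypothetical helper and the namespace names in the usage comment are examples.

```shell
# Hypothetical helper: prints a default-deny NetworkPolicy for namespace $1.
default_deny_manifest() {
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: $1
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
EOF
}

# Usage: apply the policy to several namespaces.
#   for ns in default staging; do
#     default_deny_manifest "$ns" | kubectl apply -f -
#   done
```

Keep in mind that denying all egress also blocks DNS lookups, so workloads usually need an explicit egress rule for DNS alongside this policy.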

Selective Access

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
  - Ingress
  ingress:
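  - from:
    # The two selectors below are separate list items, so they are ORed:
    # any pod in a namespace labeled environment=production, or any
    # role=frontend pod in this policy's own namespace.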
    - namespaceSelector:
        matchLabels:
          environment: production
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 8080
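
A quick way to test a policy like this is to start a throwaway pod with chosen labels and try to reach the protected workload. This sketch assumes a Service named api listening on port 8080 (not defined in this guide), a CNI plugin that enforces NetworkPolicy, and the busybox:1.36 image; probe_from is a hypothetical helper.

```shell
# Hypothetical helper: runs a temporary pod and tries to fetch a URL from it.
# $1 = pod name, $2 = labels (for example role=frontend), $3 = URL
probe_from() {
  kubectl run "$1" --rm -i --restart=Never --image=busybox:1.36 \
    --labels="$2" -- wget -q -T 3 -O /dev/null "$3"
}

# Usage:
#   probe_from probe-ok  role=frontend http://api:8080 && echo "allowed"
#   probe_from probe-bad role=other    http://api:8080 || echo "blocked"
```

Run the second probe from a namespace that is not labeled environment=production; otherwise the namespace selector admits it regardless of pod labels.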

Implementation Best Practices

1. Image Security

Container Image Policy

apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: ImagePolicyWebhook
  configuration:
    imagePolicy:
      kubeConfigFile: /etc/kubernetes/admission-control/image-policy.yaml
      allowTTL: 50
      denyTTL: 50
      retryBackoff: 500
      defaultAllow: false

Private Registry Authentication

apiVersion: v1
kind: Secret
metadata:
  name: registry-credentials
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: <base64-encoded-credentials>
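
The placeholder above is the base64 encoding of a small JSON document. As a sketch of how that payload is built, the hypothetical helper below assembles it by hand; it assumes the values contain no characters that need JSON escaping. In practice, kubectl create secret docker-registry produces the same structure for you.

```shell
# Hypothetical helper: prints the base64 payload for .dockerconfigjson.
# $1 = registry host, $2 = username, $3 = password
dockerconfigjson_b64() {
  # The "auth" field is base64("username:password").
  auth=$(printf '%s:%s' "$2" "$3" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}' \
    "$1" "$2" "$3" "$auth" | base64 | tr -d '\n'
}

# Usage:
#   dockerconfigjson_b64 harbor.example.com admin "$HARBOR_PASSWORD"
```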

2. Secrets Management

Encrypted Secrets

apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
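stringData:
  # stringData takes plain-text values. The API server stores them base64-encoded,
  # not encrypted, unless encryption at rest is enabled on the cluster.
  API_KEY: "your-api-key"
  DATABASE_URL: "your-db-url"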
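
It is worth seeing why a Secret manifest on its own does not protect the value: the stored form is base64, which anyone with read access can reverse without a key. The short sketch below demonstrates the round trip with a placeholder value.

```shell
# Encode a value the way it appears under .data in a Secret ...
encoded=$(printf '%s' 'your-api-key' | base64)
# ... and recover it again; no key or password is involved.
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "$encoded -> $decoded"
```

The same decoding works on a live cluster, for example by piping the output of kubectl get secret app-secrets -o jsonpath='{.data.API_KEY}' into base64 --decode. This is why RBAC on Secrets and encryption at rest matter.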

External Secrets Operator

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: aws-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secretsmanager
    kind: SecretStore
  target:
    name: application-secrets
  data:
  - secretKey: api-key
    remoteRef:
      key: production/api/key

3. Audit Logging

Audit Policy

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
  resources:
  - group: ""
    resources: ["pods", "services"]
- level: Request
  resources:
  - group: "rbac.authorization.k8s.io"
    resources: ["roles", "rolebindings"]
  namespaces: ["kube-system"]
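
Once the policy is active, the API server writes one JSON event per line to the audit log. As a sketch of how that log can be searched, the hypothetical helper below picks out requests that were answered with 403 Forbidden. It is a plain text match, so a JSON-aware tool is more robust, and the log path in the usage comment is an assumption that depends on how --audit-log-path is set.

```shell
# Hypothetical helper: reads audit events on stdin (one JSON object per line)
# and prints those whose text contains a 403 response code.
forbidden_events() {
  grep -F '"code":403'
}

# Usage (path is an assumption; it is whatever --audit-log-path points to):
#   forbidden_events < /var/log/kubernetes/audit.log
```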

Security Tools Integration

CNCF Graduated Tools

1. Harbor (CNCF Graduated)

Harbor is an open source container image registry that includes vulnerability scanning, image signing, and RBAC:

# Install Harbor using Helm
helm repo add harbor https://helm.goharbor.io
helm repo update
helm install harbor harbor/harbor \
  --namespace harbor \
  --create-namespace \
  --set expose.type=ingress \
  --set expose.tls.enabled=true \
  --set persistence.enabled=true

# Access Harbor
export HARBOR_PASSWORD=$(kubectl get secret --namespace harbor harbor-core -o jsonpath="{.data.HARBOR_ADMIN_PASSWORD}" | base64 --decode)
echo "Harbor Password: $HARBOR_PASSWORD"

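# Configure Docker to use Harbor (the password is read from stdin so it stays out of the process list)
printf '%s' "$HARBOR_PASSWORD" | docker login harbor.example.com -u admin --password-stdin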

2. CoreDNS (CNCF Graduated)

CoreDNS for secure DNS management and policy enforcement:

# Check CoreDNS configuration
kubectl get configmap coredns -n kube-system -o yaml

# Update CoreDNS configuration
kubectl edit configmap coredns -n kube-system

# Verify CoreDNS pods
kubectl get pods -n kube-system -l k8s-app=kube-dns

# Test DNS resolution
kubectl run dnsutils --image=registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3 --command -- sleep 3600
kubectl exec -it dnsutils -- nslookup kubernetes.default

3. Prometheus (CNCF Graduated)

For security monitoring and alerting:

# Install Prometheus Operator
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace

# Access Prometheus UI
kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090

# Check monitoring targets
curl localhost:9090/api/v1/targets

CNCF Incubating Tools

1. Falco (CNCF Incubating)

Runtime security monitoring and detection:

# Install Falco using Helm
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco \
  --namespace falco \
  --create-namespace \
  --set driver.kind=module

# View Falco logs
kubectl logs -n falco -l app.kubernetes.io/name=falco -f

# Test Falco rules
kubectl exec -it <pod-name> -- bash -c "cat /etc/shadow"  # Should trigger an alert

2. Notary (CNCF Incubating)

For container image signing and verification:

# Install Notary server
helm repo add theupdateframework https://notaryproject.github.io/helm-charts/
helm install notary theupdateframework/notary \
  --namespace notary \
  --create-namespace

# Sign an image
notary init example.com/repository/image
notary add example.com/repository/image v1.0 <path-to-file> --roles targets/releases
notary publish example.com/repository/image

3. SPIFFE/SPIRE (CNCF Incubating)

For workload identity and authentication:

# Install SPIRE server
helm repo add spiffe https://spiffe.github.io/helm-charts/
helm install spire spiffe/spire \
  --namespace spire \
  --create-namespace

# Register a workload
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://example.org/ns/default/sa/default \
  -parentID spiffe://example.org/ns/spire/sa/spire-agent \
  -selector k8s:ns:default \
  -selector k8s:sa:default

CNCF Sandbox Tools

1. Kyverno (CNCF Sandbox)

Policy management:

# Install Kyverno
kubectl create -f https://raw.githubusercontent.com/kyverno/kyverno/main/definitions/release/install.yaml

# Apply a policy
kubectl create -f - <<EOF
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-security-context
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-security-context
    match:
      resources:
        kinds:
        - Pod
    validate:
      message: "Security context must be set"
      pattern:
        spec:
          securityContext:
            runAsNonRoot: true
EOF

# Check policy status
kubectl get cpol

2. Cert-Manager (CNCF Sandbox)

Certificate management:

# Install Cert-Manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.12.0/cert-manager.yaml

# Create ClusterIssuer for Let's Encrypt
kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: user@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
EOF

# Check certificate status
kubectl get certificates,certificaterequests,orders,challenges -A

Security Audit Commands

# List pods that do not set runAsNonRoot at the pod level (container-level settings are not inspected)
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.securityContext.runAsNonRoot != true) | "\(.metadata.namespace)/\(.metadata.name)"'

# List service accounts and their permissions
kubectl get serviceaccounts --all-namespaces
kubectl get clusterrolebindings,rolebindings --all-namespaces

# Check for exposed secrets
kubectl get secrets --all-namespaces

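# Audit Pod Security Admission levels per namespace (PodSecurityPolicy was removed in Kubernetes v1.25)
kubectl get namespaces -L pod-security.kubernetes.io/enforce
kubectl describe namespace <namespace>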

# Network policy audit
kubectl get networkpolicies --all-namespaces
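
The network policy listing shows where policies exist, but the more useful audit question is often where they are missing. The sketch below compares the two lists; namespaces_without_netpol is a hypothetical helper.

```shell
# Hypothetical helper: prints every namespace that has no NetworkPolicy at all.
namespaces_without_netpol() {
  all=$(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}')
  covered=$(kubectl get networkpolicies --all-namespaces \
    -o jsonpath='{.items[*].metadata.namespace}')
  for ns in $all; do
    case " $covered " in
      *" $ns "*) ;;          # at least one policy exists in this namespace
      *) echo "$ns" ;;
    esac
  done
}

# Usage:
#   namespaces_without_netpol
```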

Security Monitoring Commands

# Check failed authentication attempts (assumes kubeadm-style static pods labeled component=kube-apiserver)
kubectl logs -n kube-system -l component=kube-apiserver --tail=-1 | grep -i "failed"

# Monitor security-related events
kubectl get events --all-namespaces | grep -i "security"

# Check pod security context
kubectl get pods -o json | jq '.items[].spec.securityContext'

# Audit RBAC permissions
kubectl auth can-i --list --namespace=default
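
The can-i listing above reports the permissions of whoever runs it. To audit a workload instead, you can impersonate its service account. The following is a minimal sketch that assumes your account may impersonate service accounts; sa_permissions is a hypothetical helper.

```shell
# Hypothetical helper: lists what a service account may do in its own namespace.
# $1 = namespace, $2 = service account name
sa_permissions() {
  kubectl auth can-i --list --namespace "$1" \
    --as "system:serviceaccount:$1:$2"
}

# Usage:
#   sa_permissions default default
```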

Maintenance Guidelines

Daily Tasks

  1. Monitor security events
  2. Review audit logs
  3. Check pod security violations

Weekly Tasks

  1. Update security policies
  2. Review RBAC permissions
  3. Scan container images

Monthly Tasks

  1. Security assessment
  2. Policy review
  3. Compliance audit

Security Hardening Examples

1. Secure Pod Template

apiVersion: v1
kind: Pod
metadata:
  name: hardened-pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: my-app:1.0
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      readOnlyRootFilesystem: true
    resources:
      limits:
        cpu: "1"
        memory: "1Gi"
      requests:
        cpu: "500m"
        memory: "512Mi"
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 30

2. Service Account Configuration

apiVersion: v1
kind: ServiceAccount
metadata:
  name: restricted-sa
  annotations:
    kubernetes.io/enforce-mountable-secrets: "true"
automountServiceAccountToken: false
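
To confirm that a pod using this service account really runs without an API token, you can look for the token file inside the container. This sketch assumes the image provides the test command; token_is_mounted is a hypothetical helper and the pod name in the usage line is a placeholder.

```shell
# Hypothetical helper: succeeds if the service account token file exists in the pod.
# $1 = pod name
token_is_mounted() {
  kubectl exec "$1" -- test -e /var/run/secrets/kubernetes.io/serviceaccount/token
}

# Usage:
#   token_is_mounted <pod-name> && echo "token present" || echo "no token (or pod unreachable)"
```

Note that a pod spec can set automountServiceAccountToken itself, and the pod-level value takes precedence over the service account's setting.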

3. Ingress with TLS

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: secure-ingress
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  tls:
  - hosts:
    - secure.example.com
    secretName: tls-secret
  rules:
  - host: secure.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: secure-service
            port:
              number: 443
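
The Ingress refers to a Secret named tls-secret, which has to exist in the same namespace. As a sketch of creating it from an existing certificate and key, the hypothetical helper below checks the files first; the file names in the usage line are placeholders.

```shell
# Hypothetical helper: creates the TLS Secret referenced by the Ingress.
# $1 = certificate file (PEM), $2 = private key file (PEM)
create_tls_secret() {
  [ -r "$1" ] && [ -r "$2" ] || { echo "certificate or key not readable" >&2; return 1; }
  kubectl create secret tls tls-secret --cert="$1" --key="$2"
}

# Usage:
#   create_tls_secret tls.crt tls.key
```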

Troubleshooting Guide

Common Security Issues

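  1. RBAC Misconfiguration

    • Symptom: Requests fail with "Forbidden" (permission denied) errors
    • Check: Review the Roles and RoleBindings that apply to the affected user or service account
    • Solution: Grant the missing verbs or resources, keeping permissions as narrow as possible

  2. Network Policy Issues

    • Symptom: Connections between pods time out
    • Check: Review the NetworkPolicy rules that select the source and destination pods
    • Solution: Add or correct the ingress or egress rule for the blocked traffic

  3. Pod Security Violations

    • Symptom: Pod creation is rejected
    • Check: Compare the pod's security context with the Pod Security level enforced on its namespace
    • Solution: Adjust the security context to meet the enforced level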
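
The checks above can be turned into a small lookup that suggests a first diagnostic command for each class of issue. This is only a sketch; first_check is a hypothetical helper and <namespace> is a placeholder to fill in.

```shell
# Hypothetical helper: prints a first diagnostic command for an issue class.
# $1 = rbac, network, or pod
first_check() {
  case "$1" in
    rbac)    echo "kubectl auth can-i --list --namespace <namespace>" ;;
    network) echo "kubectl describe networkpolicy --namespace <namespace>" ;;
    pod)     echo "kubectl get events --namespace <namespace> --field-selector type=Warning" ;;
    *)       echo "usage: first_check rbac|network|pod" >&2; return 1 ;;
  esac
}

# Usage:
#   first_check network
```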

Resources