Kubernetes Cost Optimization - A Practical Guide
Managing costs in Kubernetes environments is a challenge for organizations of all sizes. This guide presents field-tested strategies and practical implementations for optimizing Kubernetes costs without compromising performance or reliability.
Understanding Cost Components
Before implementing optimization strategies, it’s crucial to understand your Kubernetes cost structure:
Core Cost Drivers
- Compute Resources: CPU and memory usage across nodes
- Storage: Persistent volumes and their associated costs
- Network: Data transfer between zones/regions
- Management Overhead: Control plane and monitoring costs
Cost Analysis Example
Let’s analyze a typical microservices application:
Production Environment Example:
├── Frontend Service: 10 pods × 0.5 CPU, 1Gi RAM
├── Backend APIs: 15 pods × 1 CPU, 2Gi RAM
├── Cache Layer: 3 pods × 2 CPU, 4Gi RAM
└── Database: 2 pods × 4 CPU, 8Gi RAM
Monthly Cost Breakdown:
- Compute: $2,500
- Storage: $800
- Network: $400
- Management: $200
Total: $3,900/month
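To sanity-check a breakdown like this, start from the requested capacity. The figures below simply total the requests listed above (the dollar amounts remain illustrative):
Requested CPU:    (10 × 0.5) + (15 × 1) + (3 × 2) + (2 × 4) = 34 vCPU
Requested memory: (10 × 1) + (15 × 2) + (3 × 4) + (2 × 8) = 68 Gi
At the example’s $2,500/month compute line, that works out to roughly $74 per requested vCPU-month; comparing that ratio against your provider’s instance pricing quickly shows how much you are paying for idle headroom. In a live cluster, the same numbers come from the scheduler’s view of each node (assumes kubectl access):
# Show how much of each node's allocatable capacity is already requested
kubectl describe nodes | grep -A 7 "Allocated resources"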
Implementation Requirements
Technical Prerequisites
- Access to a Kubernetes cluster (v1.24+)
- kubectl CLI tool installed (v1.24+)
- Helm package manager (v3.0+)
- A cost monitoring solution (e.g., Kubecost, OpenCost)
- Cloud provider access (if using cloud-managed Kubernetes)
- Basic understanding of Kubernetes resource management
Core Optimization Strategies
1. Resource Management
Resource Quotas
Prevent resource hogging and overprovisioning:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: production
spec:
  hard:
    # Compute Resources
    requests.cpu: "100"
    requests.memory: 200Gi
    limits.cpu: "200"
    limits.memory: 400Gi
    # Storage
    requests.storage: 500Gi
    persistentvolumeclaims: "20"
    # Object Counts
    pods: "100"
    services: "50"
    configmaps: "50"
    secrets: "50"
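Once applied, current consumption against the quota can be verified directly (team-quota.yaml is whatever file you saved the manifest above as):
kubectl apply -f team-quota.yaml
kubectl describe resourcequota team-quota -n production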
Container Limits
Set default resource constraints:
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
  - default:
      memory: "512Mi"
      cpu: "500m"
    defaultRequest:
      memory: "256Mi"
      cpu: "200m"
    max:
      memory: "2Gi"
      cpu: "2"
    min:
      memory: "128Mi"
      cpu: "100m"
    type: Container
2. Dynamic Scaling
Horizontal Pod Autoscaling
Implement intelligent scaling based on metrics:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  minReplicas: 3
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 60
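After the HPA is created, it is worth watching it react to load before trusting it with cost decisions (standard kubectl commands; the load test itself is up to you):
# Watch current vs. target utilization and replica count
kubectl get hpa api-hpa -n production --watch
# Inspect recent scaling events and decisions
kubectl describe hpa api-hpa -n production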
Vertical Pod Autoscaling
Optimize resource allocation automatically:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: api-service
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 1
        memory: 2Gi
      controlledResources: ["cpu", "memory"]
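Note that the VerticalPodAutoscaler CRD and its controllers are not part of a stock cluster; they come from the kubernetes/autoscaler project and must be installed separately. A cautious rollout is to run with updateMode "Off" first and inspect the recommendations before letting VPA evict pods:
# View the recommendations VPA has computed for api-service
kubectl describe vpa api-vpa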
3. Infrastructure Optimization
Node Pool Strategies
Implement cost-effective instance management:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: production-cluster
  region: us-west-2
nodeGroups:
- name: mixed-instances-1
  desiredCapacity: 3
  minSize: 1
  maxSize: 10
  instancesDistribution:
    # 100% Spot capacity, spread across several similarly sized instance types
    instanceTypes:
    - t3.large
    - t3a.large
    - m5.large
    - m5a.large
    onDemandBaseCapacity: 0
    onDemandPercentageAboveBaseCapacity: 0
    spotAllocationStrategy: capacity-optimized
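Assuming eksctl is installed, the node group above can be created as part of a new cluster or added to an existing one (the config file name is just an example):
# New cluster from the config file
eksctl create cluster -f cluster.yaml
# Or add the node group to an existing cluster defined in the same file
eksctl create nodegroup --config-file=cluster.yaml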
Workload Placement
Optimize pod scheduling for cost efficiency:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cost-optimized-app
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          # Prefer (but do not require) nodes carrying a custom instance-type=spot label
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: instance-type
                operator: In
                values:
                - spot
      # Allow scheduling onto nodes tainted spot=true:NoSchedule
      tolerations:
      - key: spot
        operator: Equal
        value: "true"
        effect: NoSchedule
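The affinity and toleration above assume the spot nodes carry a custom instance-type=spot label and a spot=true:NoSchedule taint. Managed node groups usually apply this metadata for you; for a self-managed or test setup it can be added by hand (the node name is a placeholder):
kubectl label node <node-name> instance-type=spot
kubectl taint node <node-name> spot=true:NoSchedule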
4. Storage Optimization
StorageClass Configuration
Implement cost-effective storage tiers:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-retain
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
# gp3 IOPS and throughput are absolute values and are configured via the EBS CSI driver
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
Cost Monitoring and Analysis
Implementing a robust cost monitoring solution is crucial for maintaining visibility into your Kubernetes spending and identifying optimization opportunities.
Kubecost Implementation
Prerequisites
- Kubernetes cluster (v1.16+)
- Helm 3
- Metrics Server installed
- At least 2 CPU cores and 4GB RAM available
- Storage class for persistent volumes
Installation Methods
Quick Installation
For a quick setup with default configuration:
# Create namespace
kubectl create namespace kubecost
# Add Helm repository
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update
# Quick install with basic configuration
helm install kubecost kubecost/cost-analyzer \
--namespace kubecost \
--set kubecostToken="" \
--set prometheus.nodeExporter.enabled=true \
--set prometheus.serviceMonitor.enabled=true \
--set networkCosts.enabled=true \
--set grafana.enabled=true
Advanced Installation
For production environments or custom configurations:
- Create Configuration File
# kubecost-values.yaml
global:
  prometheus:
    enabled: true
    fqdn: http://kubecost-prometheus-server.kubecost.svc
kubecostProductConfigs:
  clusters: ["cluster-one"]
networkCosts:
  enabled: true
  config:
    services: true
    pods: true
    nodes: true
prometheus:
  nodeExporter:
    enabled: true
  serviceMonitor:
    enabled: true
  server:
    retention: 15d
    resources:
      requests:
        cpu: 500m
        memory: 2Gi
      limits:
        cpu: 1000m
        memory: 4Gi
grafana:
  enabled: true
  sidecar:
    dashboards:
      enabled: true
  resources:
    requests:
      cpu: 100m
      memory: 512Mi
    limits:
      cpu: 200m
      memory: 1Gi
serviceMonitor:
  enabled: true
persistentVolume:
  size: "0.2Gi"
  dbSize: "32.0Gi"
notifications:
  slack:
    enabled: true
    webhook: "https://hooks.slack.com/services/your/webhook/url"
thanos:
  enabled: false # Enable for long-term storage
etl:
  enabled: true
  resources:
    requests:
      cpu: 200m
      memory: 1Gi
    limits:
      cpu: 400m
      memory: 2Gi
- Install with Custom Configuration
helm install kubecost kubecost/cost-analyzer \
--namespace kubecost \
-f kubecost-values.yaml
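Once the release is up, the dashboard can be reached by port-forwarding to the cost-analyzer deployment, which listens on port 9090 by default (the deployment name below is the default from the chart; adjust it if you changed the release name):
kubectl port-forward --namespace kubecost deployment/kubecost-cost-analyzer 9090
# Then open http://localhost:9090 in a browser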
OpenCost Implementation
Prerequisites
- Kubernetes cluster (v1.16+)
- Helm 3
- Prometheus installed
- Metrics Server
Installation Methods
Quick Installation
For a quick setup with basic configuration:
# Create namespace
kubectl create namespace opencost
# Add Helm repository
helm repo add opencost https://opencost.github.io/opencost-helm-chart
helm repo update
# Create a values file for annotations
cat <<EOF > opencost-values.yaml
opencost:
  prometheus:
    external:
      url: http://prometheus-server.monitoring.svc
  metrics:
    window: "1d" # Default window for queries
    resolution: "1h" # Default resolution for queries
  ui:
    enabled: true
    port: 9003
  exporter:
    enable: true
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 256Mi
service:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9003"
  type: ClusterIP
EOF
# Install OpenCost using values file
helm install opencost opencost/opencost \
--namespace opencost \
-f opencost-values.yaml
Advanced Installation
For production environments or custom configurations:
- Create Configuration File
# opencost-values.yaml
opencost:
  prometheus:
    external:
      url: http://prometheus-server.monitoring.svc
  exporter:
    enable: true
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 256Mi
  metrics:
    serviceMonitor:
      enabled: true
      interval: 30s
    window: 1h
    resolution: 1m
  cloudProvider:
    enabled: true
    aws:
      enabled: true
      secretName: aws-secret
      secretKey: credentials
  customPricing:
    enabled: true
    configPath: /models/pricing/configs/custom.json
    data:
      CPU: 0.031611
      RAM: 0.004237
      storage: 0.00005
      GPU: 0.95
  ui:
    enabled: true
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
      hosts:
      - host: opencost.example.com
        paths:
        - path: /
          pathType: Prefix
service:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9003"
persistentVolume:
  enabled: true
  size: 10Gi
- Install with Custom Configuration
helm install opencost opencost/opencost \
--namespace opencost \
-f opencost-values.yaml
Using OpenCost
- Access Dashboard
# Port forward the OpenCost service
kubectl port-forward -n opencost svc/opencost 9003:9003
- Access the UI
# The UI is available at
open http://localhost:9003
- API Endpoints
# Get cost allocation for the last 24 hours
curl "http://localhost:9003/allocation/compute?window=24h"
# Get cost allocation for a specific time window
curl "http://localhost:9003/allocation/compute?window=168h" # Last 7 days
# Get asset information with window
curl "http://localhost:9003/assets?window=24h"
# Get efficiency metrics
curl "http://localhost:9003/efficiency?window=24h"
# Get all available metrics
curl "http://localhost:9003/metrics"
# Get cost allocation by namespace
curl "http://localhost:9003/allocation/compute?window=24h&aggregate=namespace"
# Get cost allocation by label
curl "http://localhost:9003/allocation/compute?window=24h&aggregate=label:app"
Common window parameters:
- 1h: Last hour
- 24h: Last 24 hours
- 7d: Last 7 days
- 30d: Last 30 days
- Custom range: window=2023-01-01T00:00:00Z,2023-01-31T23:59:59Z
- Verify Installation
# Check if pods are running
kubectl get pods -n opencost
# Check service
kubectl get svc -n opencost
# View logs
kubectl logs -n opencost deployment/opencost
# Check if metrics are being collected
curl "http://localhost:9003/metrics" | grep "opencost_"
Key Differences:
- Port Numbers:
  - Kubecost uses port 9090 by default
  - OpenCost uses port 9003 by default
- UI Access:
  - Kubecost provides a full-featured web UI at the root path
  - OpenCost’s UI is available at the /ui path
- API Access:
  - The Kubecost API is primarily accessed through the UI
  - OpenCost provides direct REST API endpoints for programmatic access
- Dashboard Features:
  - Kubecost offers more built-in visualizations and reports
  - OpenCost takes an API-first approach with basic UI visualizations
Real-World Results
Cost Reduction Case Studies
- E-Commerce Platform
  - Initial Monthly Cost: $25,000
  - Optimized Cost: $15,000
  - Key Improvements:
    - Resource right-sizing: -25%
    - Spot instance usage: -15%
    - Storage optimization: -10%
    - Improved autoscaling: -10%
- SaaS Application
  - Initial Monthly Cost: $40,000
  - Optimized Cost: $28,000
  - Improvements:
    - Cluster consolidation: -20%
    - Network optimization: -15%
    - Resource quotas: -10%
Implementation Success Metrics
- Resource Efficiency
  - CPU utilization improved from 35% to 70%
  - Memory utilization improved from 45% to 75%
  - Storage utilization improved from 50% to 85%
- Cost Efficiency
  - Cost per transaction reduced by 40%
  - Cost per user reduced by 35%
  - Infrastructure cost per revenue dollar reduced by 25%
Best Practices for Cost Optimization
1. Resource Right-Sizing
Memory and CPU Optimization
apiVersion: apps/v1
kind: Deployment
metadata:
  name: optimized-app
spec:
  template:
    spec:
      containers:
      - name: app
        resources:
          requests:
            cpu: "200m" # Based on actual usage patterns
            memory: "256Mi"
          limits:
            cpu: "500m" # Prevent resource hogging
            memory: "512Mi"
        # Container-aware JVM sizing so the heap tracks the container memory limit
        env:
        - name: JAVA_OPTS
          value: "-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0"
Best Practice Tips:
- Start with metrics-based resource requests
- Set limits 2-3x higher than requests
- Use container-aware JVM settings
- Monitor and adjust based on actual usage
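For the metrics-based starting point, the simplest source of actual usage is the Metrics Server already listed in the prerequisites (per-container figures help when a pod runs sidecars):
# Current CPU/memory usage per container, to compare against requests
kubectl top pods -n production --containers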
2. Cost-Effective Storage
Storage Class Configuration
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: optimized-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com # EBS CSI driver; gp3 takes absolute iops/throughput values
parameters:
  type: gp3
  iops: "3000"
  encrypted: "true"
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
Storage Optimization Tips:
- Use appropriate storage tiers
- Enable volume expansion
- Implement automatic backup cleanup
- Configure volume snapshots
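As a sketch of the volume-snapshot tip, the snippet below assumes the CSI external-snapshotter CRDs and a snapshot-capable CSI driver are installed; the class and PVC names are illustrative:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: data-snapshot-daily
  namespace: production
spec:
  volumeSnapshotClassName: csi-snapclass # must match an existing VolumeSnapshotClass
  source:
    persistentVolumeClaimName: data-pvc # the PVC to snapshot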
3. Network Cost Management
Network Policy Implementation
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: optimize-egress
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          purpose: production
    ports:
    - port: 443
      protocol: TCP
Network Optimization Tips:
- Use regional clusters when possible
- Implement cross-zone traffic policies (see the Service sketch below)
- Enable VPC endpoints for cloud services
- Monitor and optimize egress traffic
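One way to act on the cross-zone tip is topology-aware routing, which asks Kubernetes to prefer same-zone endpoints and thereby avoids inter-zone data charges. A minimal sketch, assuming Kubernetes 1.27+ (older releases use the service.kubernetes.io/topology-aware-hints annotation instead):
apiVersion: v1
kind: Service
metadata:
  name: backend
  namespace: production
  annotations:
    # Prefer endpoints in the caller's zone when enough capacity exists there
    service.kubernetes.io/topology-mode: Auto
spec:
  selector:
    app: backend
  ports:
  - port: 80
    targetPort: 8080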
4. Workload Scheduling
Node Affinity and Anti-Affinity
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cost-optimized-deployment
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              # Prefer spot/preemptible capacity; the label is cloud-specific
              # (EKS: eks.amazonaws.com/capacityType=SPOT, GKE: cloud.google.com/gke-spot=true)
              - key: eks.amazonaws.com/capacityType
                operator: In
                values:
                - SPOT
        podAntiAffinity:
          # Spread replicas across nodes so a single reclaimed node cannot take them all
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  app: high-availability
Scheduling Best Practices:
- Use spot/preemptible instances for suitable workloads
- Implement proper pod disruption budgets (see the sketch below)
- Balance between cost and availability
- Consider time-based scaling
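A pod disruption budget keeps node drains (including graceful spot-node drains) from taking down too many replicas at once. A minimal sketch matching the app label used above:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: high-availability-pdb
spec:
  minAvailable: 2 # never voluntarily evict below two ready pods
  selector:
    matchLabels:
      app: high-availability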
Implementation Roadmap
Phase 1: Assessment and Planning (Week 1-2)
- Baseline Measurement
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cost-baseline
spec:
  schedule: "0 */6 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: metrics-collector
            image: metrics-collector:v1
            env:
            - name: METRICS_ENDPOINT
              value: "http://prometheus:9090"
            - name: EXPORT_BUCKET
              value: "s3://cost-metrics/baseline"
- Resource Audit
#!/bin/bash
# audit-resources.sh
kubectl get nodes -o json | jq '.items[] | {name: .metadata.name, capacity: .status.capacity, allocatable: .status.allocatable}'
kubectl get pods --all-namespaces -o json | jq '.items[] | {name: .metadata.name, namespace: .metadata.namespace, requests: .spec.containers[].resources.requests, limits: .spec.containers[].resources.limits}'
Phase 2: Initial Optimization (Week 3-4)
- Resource Quotas Implementation
apiVersion: v1
kind: ResourceQuota
metadata:
  name: phase-1-quota
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
- Monitoring Setup
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cost-monitor
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: cost-metrics
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics
  namespaceSelector:
    matchNames:
    - monitoring
Phase 3: Advanced Optimization (Week 5-8)
- Automated Scaling Implementation
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: phase-3-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: target-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
- Cost Allocation Implementation
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    cost-center: "1001"
    department: "engineering"
    environment: "production"
  annotations:
    billing.kubecost.com/alert: "true"
    billing.kubecost.com/alert-threshold: "1000"
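With labels like these in place, spend can be grouped by them through the same OpenCost allocation API shown earlier (the label name mirrors the namespace metadata above):
# Cost for the last 7 days, aggregated by the cost-center label
curl "http://localhost:9003/allocation/compute?window=7d&aggregate=label:cost-center"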
Phase 4: Continuous Optimization (Ongoing)
- Automated Cost Reports
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cost-report
spec:
  schedule: "0 0 * * 1" # Weekly
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: reporter
            image: cost-reporter:v1
            env:
            - name: REPORT_RECIPIENTS
              value: "finance@company.com,engineering@company.com"
            - name: COST_THRESHOLD
              value: "10000"
            - name: ALERT_ON_INCREASE
              value: "20" # Alert on 20% increase
- Continuous Monitoring
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cost-alerts
spec:
  groups:
  - name: costs
    rules:
    - alert: CostSpike
      # Compare recent CPU usage against its own 7-day average (via a subquery)
      expr: rate(container_cpu_usage_seconds_total[6h]) > 2 * avg_over_time(rate(container_cpu_usage_seconds_total[5m])[7d:1h])
      for: 1h
      labels:
        severity: warning
      annotations:
        summary: Cost spike detected
        description: Resource usage significantly higher than weekly average
Implementation Checklist
- Pre-Implementation
  - Baseline cost metrics collected
  - Resource utilization patterns analyzed
  - Team responsibilities assigned
  - Success metrics defined
- Phase 1 Checklist
  - Resource quotas implemented
  - Monitoring tools installed
  - Initial cost baseline established
  - Team training completed
- Phase 2 Checklist
  - Autoscaling configured
  - Storage optimizations implemented
  - Network policies defined
  - Initial cost reductions verified
- Phase 3 Checklist
  - Advanced monitoring in place
  - Cost allocation implemented
  - Automated reporting configured
  - Team dashboards created
- Continuous Optimization
  - Weekly cost reviews scheduled
  - Monthly optimization targets set
  - Quarterly strategy reviews planned
  - Annual cost analysis framework established
Monitoring and Control
Cost Monitoring
Prometheus Integration
Track resource utilization and costs:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cost-metrics
spec:
  selector:
    matchLabels:
      app: cost-exporter
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics
  namespaceSelector:
    matchNames:
    - monitoring
Alert Configuration
Implement proactive cost controls:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cost-alerts
spec:
  groups:
  - name: cost.rules
    rules:
    - alert: HighCostSpike
      # cluster_hourly_rate is assumed to be a single-series metric (or recording
      # rule) exposing the cluster's hourly cost per CPU core
      expr: |
        sum(
          rate(container_cpu_usage_seconds_total{container!=""}[1h])
        ) by (namespace) * on() group_left() cluster_hourly_rate > 100
      for: 1h
      labels:
        severity: warning
      annotations:
        summary: High cost detected in namespace {{ $labels.namespace }}
        description: Hourly cost has exceeded $100 threshold
Real-World Results
Cost Reduction Case Study
Before Optimization:
- Monthly Cost: $3,900
- Resource Utilization: 35%
- Idle Resources: 40%
After Optimization:
- Monthly Cost: $2,100 (46% reduction)
- Resource Utilization: 75%
- Idle Resources: 10%
Key Improvements:
1. Resource right-sizing: -25% cost
2. Spot instance usage: -15% cost
3. Autoscaling implementation: -20% cost
4. Storage optimization: -10% cost
Maintenance Guidelines
Daily Operations
- Resource Monitoring
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: daily-resource-monitor
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: resource-metrics
  endpoints:
  - port: metrics
    interval: 5m
    path: /metrics
- Cost Alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: daily-cost-alerts
spec:
  groups:
  - name: daily.costs
    rules:
    - alert: DailyCostSpike
      # day_over_day_threshold is a placeholder; replace it with a recording rule
      # or a fixed numeric threshold appropriate for your cluster
      expr: sum(rate(container_cpu_usage_seconds_total[1h])) by (namespace) > day_over_day_threshold
      for: 1h
      labels:
        severity: warning
      annotations:
        description: Daily cost spike detected in namespace {{ $labels.namespace }}
Weekly Tasks
- Resource Optimization
apiVersion: batch/v1
kind: CronJob
metadata:
  name: weekly-optimization
spec:
  schedule: "0 0 * * 0"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: optimizer
            image: resource-optimizer:v1
            args:
            - --analyze-usage
            - --suggest-optimizations
            - --generate-report
- Compliance Checks
apiVersion: batch/v1
kind: CronJob
metadata:
  name: compliance-check
spec:
  schedule: "0 0 * * 1"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: compliance-checker
            image: compliance-check:v1
            env:
            - name: CHECK_RESOURCE_QUOTAS
              value: "true"
            - name: CHECK_COST_LABELS
              value: "true"
Monthly Reviews
- Cost Analysis
  - Review monthly trends
  - Compare against budgets
  - Identify optimization opportunities
  - Update cost allocation models
- Performance Impact
  - Review service SLAs
  - Analyze resource efficiency
  - Update scaling policies
  - Optimize resource requests
Quarterly Planning
- Strategy Review
  - Evaluate cost optimization goals
  - Update resource allocation strategies
  - Review cloud provider pricing
  - Plan major optimizations
- Capacity Planning
  - Forecast resource needs
  - Plan cluster scaling
  - Review storage requirements
  - Update budget allocations
Automation Tools
- Resource Cleanup
apiVersion: batch/v1
kind: CronJob
metadata:
  name: resource-cleanup
spec:
  schedule: "0 0 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: cleanup
            image: resource-janitor:v1
            env:
            - name: CLEANUP_UNUSED_PVS
              value: "true"
            - name: CLEANUP_UNBOUND_PVS
              value: "true"
            - name: MAX_PV_AGE_DAYS
              value: "7"
- Cost Forecasting
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cost-forecasting
spec:
  schedule: "0 0 1 * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: forecaster
            image: cost-forecaster:v1
            env:
            - name: FORECAST_MONTHS
              value: "3"
            - name: INCLUDE_GROWTH_PATTERNS
              value: "true"
            - name: ALERT_ON_THRESHOLD
              value: "true"