Prometheus Service Discovery in Kubernetes
Implement dynamic service discovery in Prometheus for automatic target detection
Service Discovery in Prometheus enables automatic detection and monitoring of services in your Kubernetes cluster. This guide covers various service discovery mechanisms and their implementation.
Service Discovery Methods
1. Kubernetes Service Discovery
Pod Discovery
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
Service Discovery
scrape_configs:
  - job_name: 'kubernetes-services'
    kubernetes_sd_configs:
      - role: service
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
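The service role discovers one target per service port, addressed by the service's DNS name, so it is mostly useful for black-box probing of a service as a whole. To scrape the individual pods behind each annotated service, the endpoints role can be used with the same filter. The following is a minimal sketch; the job name and the `service` target label are assumptions.

```yaml
scrape_configs:
  - job_name: 'kubernetes-service-endpoints'   # hypothetical job name
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      # Keep only endpoints whose Service is annotated prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Carry the Service name over as a regular target label (assumed label name)
      - source_labels: [__meta_kubernetes_service_name]
        target_label: service
```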
2. ServiceMonitor CRD
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example
  endpoints:
    - port: web
      interval: 15s
      path: /metrics
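A ServiceMonitor selects Service objects by label and refers to a Service port by name, and it is only picked up by a Prometheus resource whose serviceMonitorSelector matches the ServiceMonitor's own labels. The sketch below shows what the two counterparts to the example above might look like. The Service, its port number, and the Prometheus resource name are assumptions and do not come from the original configuration.

```yaml
# Hypothetical Service selected by the ServiceMonitor above
apiVersion: v1
kind: Service
metadata:
  name: example-app
  labels:
    app: example          # matched by spec.selector.matchLabels
spec:
  selector:
    app: example
  ports:
    - name: web           # matched by endpoints[].port
      port: 8080          # assumed port number
      targetPort: 8080
---
# Prometheus resource that picks up ServiceMonitors labeled team: frontend
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus        # hypothetical name
spec:
  serviceMonitorSelector:
    matchLabels:
      team: frontend
```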
Implementation Patterns
1. Auto-Discovery with Annotations
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
spec:
  containers:
    - name: example-app
      image: example/app:v1
      ports:
        - containerPort: 8080
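The prometheus.io/* annotations are a convention rather than something Prometheus interprets on its own: each one only takes effect if the scrape job has a relabel rule that reads it. The prometheus.io/port annotation in particular needs a rule that rewrites __address__. A commonly used rule, shown here as a sketch to append to the relabel_configs of a pod-role job, is:

```yaml
relabel_configs:
  # Source labels are joined with ";" so the regex sees "<host>[:<port>];<annotated port>".
  # The replacement keeps the host and substitutes the annotated port.
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
```

Pods without the port annotation do not match the regex, so their address is left unchanged.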
2. Label-Based Discovery
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: app-monitor
spec:
  selector:
    matchLabels:
      monitoring: enabled
  namespaceSelector:
    matchNames:
      - default
      - prod
  endpoints:
    - port: http
      interval: 30s
Advanced Configurations
1. Multi-Target Discovery
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: multi-target
spec:
  endpoints:
    - port: http-metrics
      interval: 15s
    - port: additional-metrics
      interval: 30s
      path: /extra-metrics
  selector:
    matchLabels:
      app: multi-metric-app
2. Namespace Discovery
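scrape_configs:
  - job_name: 'kubernetes-namespaces'
    kubernetes_sd_configs:
      # `namespace` is not a valid role (valid roles: node, service, pod,
      # endpoints, endpointslice, ingress). Discover pods and restrict the
      # namespaces instead; the names below are examples.
      - role: pod
        namespaces:
          names:
            - default
            - prod
    relabel_configs:
      # Keep only pods labeled monitoring=enabled
      - source_labels: [__meta_kubernetes_pod_label_monitoring]
        regex: enabled
        action: keep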
Relabeling Configuration
1. Basic Relabeling
relabel_configs:
  - source_labels: [__meta_kubernetes_pod_label_app]
    target_label: application
  - source_labels: [__meta_kubernetes_namespace]
    target_label: namespace
2. Advanced Relabeling
relabel_configs:
  - source_labels: [__meta_kubernetes_pod_label_app, __meta_kubernetes_pod_label_version]
    separator: /
    target_label: app_version
  - source_labels: [__meta_kubernetes_pod_container_port_number]
    regex: '([0-9]+)'
    replacement: '$1'
    target_label: port
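As a worked example of the first rule, a pod labeled app=web and version=v2 ends up with app_version="web/v2", because the source label values are joined with the configured separator before being written to the target label. Two further actions are often useful alongside replace: labelmap, which copies a whole family of discovered labels, and labeldrop, which removes labels that should not reach the stored series. The following is a sketch only; the dropped label name is a hypothetical example.

```yaml
relabel_configs:
  # Copy every pod label onto the target, e.g.
  # __meta_kubernetes_pod_label_team="frontend" becomes team="frontend"
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  # Drop a label that is not wanted on the target (hypothetical label name)
  - action: labeldrop
    regex: pod_template_hash
```

Copying all pod labels is convenient but can raise cardinality, so it may be worth narrowing the labelmap regex to the labels that are actually queried.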
Filtering and Target Selection
1. Label-Based Filtering
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: filtered-monitor
spec:
  selector:
    matchExpressions:
      - key: environment
        operator: In
        values: [production, staging]
  endpoints:
    - port: metrics
2. Namespace Filtering
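apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: namespace-monitor
spec:
  # namespaceSelector accepts only `any: true` or a `matchNames` list,
  # so namespaces are listed by name rather than matched by label.
  namespaceSelector:
    matchNames:
      - prod
      - staging
  # A ServiceMonitor also needs a selector and at least one endpoint;
  # the values below are placeholders reused from the earlier examples.
  selector:
    matchLabels:
      monitoring: enabled
  endpoints:
    - port: metrics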
Custom Service Discovery
1. File-Based Discovery
scrape_configs:
  - job_name: 'file-discovery'
    file_sd_configs:
      - files:
          - '/etc/prometheus/file_sd/*.yaml'
    relabel_configs:
      - source_labels: [environment]
        target_label: env
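Each file matched by the glob contains a list of target groups, where every group has a targets list and an optional labels map. Prometheus re-reads the files when they change, so targets can be added or removed without a reload. A minimal target file that would feed the environment label used above might look like the following; the file name and addresses are hypothetical.

```yaml
# /etc/prometheus/file_sd/nodes.yaml (hypothetical file name and addresses)
- targets:
    - '10.0.0.11:9100'
    - '10.0.0.12:9100'
  labels:
    environment: production
```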
2. DNS-Based Discovery
scrape_configs:
  - job_name: 'dns-discovery'
    dns_sd_configs:
      - names:
          - 'service.consul.local'
        type: 'A'
        port: 9100
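With A records the port has to be set explicitly, as above, because the record carries only an address. SRV records include the port themselves, so the port field is not needed. The following sketch shows the SRV variant; the job name and record name are hypothetical.

```yaml
scrape_configs:
  - job_name: 'dns-srv-discovery'              # hypothetical job name
    dns_sd_configs:
      - names:
          - '_metrics._tcp.example.internal'   # hypothetical SRV record
        type: 'SRV'                            # the default record type
        refresh_interval: 30s                  # the default refresh interval
```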
Monitoring Service Discovery
1. Service Discovery Metrics
# Monitor discovered targets
sum(prometheus_sd_discovered_targets)
# Monitor scrape pool synchronization
rate(prometheus_target_sync_length_seconds_sum[5m])
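Discovery metrics are most useful when they drive an alert. As a sketch, the rule below fires when Prometheus reports service discovery configurations that failed to load, using the prometheus_sd_failed_configs gauge. The group name, alert name, duration, and severity label are assumptions to adapt.

```yaml
groups:
  - name: service-discovery                  # hypothetical group name
    rules:
      - alert: ServiceDiscoveryConfigFailed  # hypothetical alert name
        expr: prometheus_sd_failed_configs > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Prometheus failed to load a service discovery configuration"
```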
2. Target Status Dashboard
{
  "dashboard": {
    "panels": [
      {
        "title": "Active Targets",
        "targets": [
          {
            "expr": "sum(up) by (job)"
          }
        ]
      },
      {
        "title": "Scrape Duration",
        "targets": [
          {
            "expr": "avg(scrape_duration_seconds) by (job)"
          }
        ]
      }
    ]
  }
}
Best Practices
- Labeling Strategy
  - Use consistent label naming
  - Avoid high cardinality labels
  - Document label meanings
- Performance Optimization
  - Set appropriate scrape intervals
  - Use efficient relabeling
  - Monitor scrape duration
- Security
  - Use RBAC for service discovery
  - Secure endpoint access
  - Monitor unauthorized access attempts
- Maintenance
  - Regular configuration review
  - Monitor discovery errors
  - Update service selectors
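For the RBAC point above, Kubernetes service discovery needs read-only access to the objects it watches. The following is a minimal sketch of a ClusterRole and binding for the pod, service, and endpoints roles used in this guide; the resource names, ServiceAccount, and namespace are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus                 # hypothetical name
rules:
  # Read-only access to the objects that kubernetes_sd_configs watches
  - apiGroups: [""]
    resources: [nodes, services, endpoints, pods]
    verbs: [get, list, watch]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus               # hypothetical ServiceAccount
    namespace: monitoring          # hypothetical namespace
```

Where discovery is limited to a few namespaces, a namespaced Role and RoleBinding per namespace grants less than a cluster-wide role.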
Troubleshooting
Common Issues
- Target Not Found
# Check service monitor
kubectl get servicemonitor
# Verify labels
kubectl get pods --show-labels
- Scrape Failures
# Find jobs whose scrapes returned no samples
sum(scrape_samples_scraped) by (job) == 0
# Check scrape duration per job
avg(scrape_duration_seconds) by (job)
- Configuration Issues
# Validate configuration
promtool check config prometheus.yml
# Check Prometheus logs
kubectl logs -l app=prometheus