#100DaysOfSRE (Day 33): Monitoring Kubernetes Apps with Prometheus & Grafana
Why Monitor Kubernetes?
- Detect performance bottlenecks before they impact users
- Visualize metrics such as CPU, memory, and request rates
- Set up alerts for failures or resource exhaustion
- Improve system reliability and proactively respond to issues
Today, we’ll install, configure, and visualize Kubernetes metrics using Prometheus & Grafana on Minikube.
Step 1: Enable Monitoring Addons in Minikube
Minikube's addon system can give you a Prometheus/Grafana stack with a few commands, although which addons are available depends on your Minikube version (run minikube addons list to check). To enable monitoring:
minikube addons enable metrics-server
minikube addons enable prometheus
minikube addons enable grafana
If the prometheus or grafana addons aren't listed in your version, install the stack with Helm instead (see the alternative at the end of this step).
Check running pods:
kubectl get pods -n monitoring
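If the addons aren't available, a common alternative is the community kube-prometheus-stack Helm chart, which bundles Prometheus, Alertmanager, Grafana, and the Prometheus Operator. A minimal sketch (the release name prometheus and the monitoring namespace are choices, not requirements, and the resulting Service names may differ slightly from the ones used in the following steps):

```bash
# Add the community chart repository and install the full monitoring stack
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Installs Prometheus, Alertmanager, Grafana, and the Prometheus Operator
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace
```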
Step 2: Access Prometheus & Grafana
Access Prometheus Dashboard
Forward the Prometheus service to your local machine:
kubectl port-forward -n monitoring svc/prometheus-k8s 9090:9090
Now, open http://localhost:9090 in your browser.
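As a quick sanity check, confirm Prometheus is up and scraping: open Status > Targets in the UI, or query the HTTP API through the same port-forward (the up metric reports 1 for every target Prometheus can reach):

```bash
# Query the Prometheus HTTP API through the port-forward; each healthy target reports up == 1
curl 'http://localhost:9090/api/v1/query?query=up'
```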
Access Grafana Dashboard
Forward Grafana service:
kubectl port-forward -n monitoring svc/grafana 3000:3000
Now, open http://localhost:3000
(Default login: admin / prom-operator for kube-prometheus-stack installs; a stock Grafana deployment defaults to admin / admin and prompts you to change the password on first login.)
Step 3: Configure Prometheus to Collect Metrics
With the Prometheus Operator (used by kube-prometheus and the Helm chart above), scrape targets are declared through ServiceMonitor resources: a ServiceMonitor selects Services by label and tells Prometheus which named port to scrape and how often. Let's define one.
prometheus-servicemonitor.yaml
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp-monitor
  namespace: myapp
spec:
  selector:
    matchLabels:
      app: backend
  endpoints:
    - port: http
      interval: 15s
```
Apply it:
kubectl apply -f prometheus-servicemonitor.yaml
Prometheus will now discover any Service in the myapp namespace labeled app: backend and scrape its http port every 15 seconds. If the target doesn't show up under Status > Targets, check that the ServiceMonitor's labels match your Prometheus serviceMonitorSelector (the kube-prometheus-stack chart, for example, only picks up ServiceMonitors carrying its release label by default).
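For this to work, there must be a Service carrying the app: backend label that exposes a port named http pointing at the metrics port. The manifest below is a sketch under those assumptions (the backend name, the myapp namespace, and metrics port 8000 come from this walkthrough, not from an existing file):

```yaml
# backend-service.yaml -- Service the ServiceMonitor selects via app: backend
apiVersion: v1
kind: Service
metadata:
  name: backend
  namespace: myapp
  labels:
    app: backend              # must match spec.selector.matchLabels in the ServiceMonitor
spec:
  selector:
    app: backend              # pods backing the Service
  ports:
    - name: http              # must match endpoints[].port in the ServiceMonitor
      port: 8000              # Prometheus metrics port exposed by the app (see Step 4)
      targetPort: 8000
```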
Step 4: Expose Application Metrics
Prometheus collects data by scraping an HTTP endpoint, /metrics by convention. Let's modify our backend to expose one using the official Python client (prometheus_client).
backend/app.py
```python
from flask import Flask
from prometheus_client import start_http_server, Counter

app = Flask(__name__)
REQUESTS = Counter('http_requests_total', 'Total number of HTTP requests')

@app.route("/")
def home():
    REQUESTS.inc()
    return "Hello, Kubernetes Monitoring!"

if __name__ == "__main__":
    start_http_server(8000)  # Expose Prometheus metrics on port 8000 (/metrics)
    app.run(host="0.0.0.0", port=5000)
```
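Because start_http_server listens on its own port (8000) next to the Flask app (5000), both container ports should be exposed by the backend Deployment. Earlier days of this series presumably already define that Deployment; the sketch below only shows the shape it needs for the Service and ServiceMonitor above to match (the app: backend labels and the two ports):

```yaml
# backend-deployment.yaml -- minimal sketch; align labels and ports with the Service above
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  namespace: myapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend                      # matched by the Service selector
    spec:
      containers:
        - name: backend
          image: my-backend:latest
          imagePullPolicy: IfNotPresent   # use the locally built image (see the Minikube note below)
          ports:
            - name: web
              containerPort: 5000         # Flask application
            - name: http
              containerPort: 8000         # Prometheus metrics from start_http_server
```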
Rebuild the image and restart the backend (the deployment is assumed to live in the myapp namespace, like the ServiceMonitor above):
docker build -t my-backend .
kubectl rollout restart deployment backend -n myapp
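One Minikube-specific gotcha: an image built with your local Docker daemon isn't automatically visible inside the cluster. Either point your shell at Minikube's Docker daemon before building, or load the finished image into the cluster afterwards; both commands below are standard Minikube tooling, and the image name matches the build command above:

```bash
# Option 1: build directly inside Minikube's Docker daemon
eval $(minikube docker-env)
docker build -t my-backend .

# Option 2: build locally, then copy the image into the cluster
minikube image load my-backend
```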
Verify the metrics endpoint (the backend hostname only resolves inside the cluster, e.g. from another pod in the myapp namespace):
curl http://backend:8000/metrics
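From your workstation, it's usually easier to port-forward the Service (assuming the backend Service sketched above, in the myapp namespace) and look for the counter we defined:

```bash
# Forward the metrics port locally (leave this running in one terminal)
kubectl port-forward -n myapp svc/backend 8000:8000

# In a second terminal, check that our counter is exported
curl http://localhost:8000/metrics | grep http_requests_total
```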
Step 5: Visualize Metrics in Grafana
Import Prometheus as a Data Source
- Open Grafana (http://localhost:3000)
- Go to Configuration > Data Sources > Add Data Source
- Select Prometheus
- Enter http://prometheus-k8s.monitoring.svc:9090 as the URL
- Click Save & Test
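If you prefer configuration over clicking, Grafana can also pick the data source up from a provisioning file. This is only a sketch: the format below is Grafana's standard data source provisioning format, but how you get the file into the container (ConfigMap, Helm values, or a file mounted under /etc/grafana/provisioning/datasources/) depends on how Grafana was installed:

```yaml
# prometheus-datasource.yaml -- Grafana data source provisioning file
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                                    # Grafana proxies queries server-side
    url: http://prometheus-k8s.monitoring.svc:9090   # in-cluster Prometheus service
    isDefault: true
```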
Import a Prebuilt Dashboard
- Go to Dashboards > Import
- Use ID: 3119 (Kubernetes Cluster Monitoring)
- Click Load and Import
Now we can see real-time Kubernetes metrics!
Note that Grafana provides a variety of prebuilt dashboard templates that you can import and use directly. These are available in the Grafana dashboards library or shared by the community.
Some other example dashboards for Kubernetes are:
- Kubernetes Cluster Monitoring - ID: 315
- Kubernetes Node Monitoring - ID: 6417
Step 6: Set Up Alerts with Prometheus Alertmanager
Prometheus evaluates alerting rules and hands any firing alerts to Alertmanager, which takes care of routing and notification. Let's define an alert rule as a PrometheusRule resource:
```yaml
# alert-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: high-cpu-alert
  namespace: monitoring
spec:
  groups:
    - name: instance-rules
      rules:
        - alert: HighCPUUsage
          expr: rate(container_cpu_usage_seconds_total[2m]) > 0.5
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "High CPU Usage Detected"
            description: "CPU usage is above 50% for more than 2 minutes."
```
Apply it:
kubectl apply -f alert-rules.yaml
Prometheus will now fire the alert whenever a container's CPU usage stays above 0.5 cores (roughly 50% of one core) for two minutes and hand it to Alertmanager.
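Firing alerts only become notifications once Alertmanager knows where to route them. Below is a minimal sketch of an Alertmanager configuration with a single Slack receiver; the webhook URL and channel are placeholders, and how you apply the file depends on your install (Helm values, the alertmanager-main Secret in kube-prometheus, or an AlertmanagerConfig resource):

```yaml
# alertmanager.yaml -- minimal routing sketch (placeholder Slack webhook)
route:
  receiver: slack-notifications           # default receiver for all alerts
  group_by: ['alertname', 'namespace']    # batch related alerts into one notification
  group_wait: 30s
  repeat_interval: 4h

receivers:
  - name: slack-notifications
    slack_configs:
      - api_url: https://hooks.slack.com/services/XXX/YYY/ZZZ   # placeholder webhook
        channel: '#alerts'
        send_resolved: true
```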
Remarks
Key Takeaways
✅ Prometheus collects metrics from Kubernetes workloads
✅ Grafana visualizes metrics in real-time dashboards
✅ ServiceMonitors configure scraping rules for Prometheus
✅ Alerts notify teams when performance degrades
What’s Next?
🔹 Day 34: Automating Kubernetes Deployments with ArgoCD & GitOps
🔹 Day 35: Building a Kubernetes CI/CD Pipeline