Kubernetes Metrics Server in Depth: Resource Metrics, HPA, and kubectl top
Run kubectl top nodes or scale a Deployment on CPU and Kubernetes needs a live answer: how much CPU and memory are Pods using right now? That answer does not come from Prometheus by default—it comes from metrics-server, a small cluster add-on that scrapes the kubelet, aggregates usage, and exposes the Resource Metrics API (metrics.k8s.io). Without it, kubectl top fails, built-in HPA cannot see resource utilization, and many platform runbooks hit a dead end.
In short
metrics-server is not your monitoring stack—it feeds autoscaling and CLI snapshots (15-second-ish resolution, no long-term storage). Install it in kube-system, register the APIService, ensure kubelets are reachable with valid serving certs, set Pod resources.requests so HPA percentages mean something, and use Prometheus/Grafana for dashboards and alerts. For queue lag and custom signals, add KEDA or a Prometheus adapter on top—not instead of metrics-server for CPU/memory HPA.
Why metrics-server exists
Kubernetes separates orchestration from observability products. The control plane must make scaling decisions without importing a particular vendor’s time-series database. The community standardized a narrow API: current CPU and memory usage per node, Pod, and container.
metrics-server (Kubernetes SIG Instrumentation) implements that API. It:
- Pulls resource usage from each node’s kubelet (which collects container stats via cAdvisor).
- Aggregates and caches values in memory.
- Serves them through the API aggregation layer as metrics.k8s.io/v1beta1.
Consumers include kubectl top, the Horizontal Pod Autoscaler (HPA) when using CPU or memory metrics, and tools like the Vertical Pod Autoscaler (VPA) recommender that need recent utilization. It is listed as a cluster add-on alongside CoreDNS and kube-proxy on many distributions.
For cluster anatomy (kubelet, API server, aggregation), see Kubernetes architecture in simple terms. For event-driven scaling beyond CPU, see KEDA in depth.
metrics-server vs Prometheus (and “monitoring”)
Teams new to Kubernetes often install Prometheus and assume HPA and kubectl top will work. They are different pipelines:
| Aspect | metrics-server | Prometheus (typical) |
|---|---|---|
| Purpose | Resource Metrics API for control plane & CLI | Monitoring, alerting, SLO dashboards, ad-hoc queries |
| Retention | In-memory only; no history | Configurable TSDB retention (days–months) |
| Resolution | ~15s scrape interval (tunable) | Often 15s–1m scrape; recording rules |
| Metrics | CPU, memory (working set) per container/Pod/node | Rich: HTTP, queues, JVM, business KPIs, etc. |
| HPA CPU/memory | Native via metrics.k8s.io | Requires Prometheus Adapter for custom/external metrics |
| Ops model | Small Deployment, cluster-critical add-on | Operator, storage, HA, cardinality discipline |
Rule of thumb: metrics-server answers “what is usage now for scaling and quick checks?” Prometheus answers “what happened over time, and should we page someone?” Run both in production platforms; neither replaces the other.
End-to-end data flow
Container(s) in Pod
        │  cgroups / cAdvisor
        ▼
kubelet (Summary API / stats)
        │  HTTPS scrape every ~15s
        ▼
metrics-server (aggregate + cache)
        │  registers APIService
        ▼
kube-apiserver (aggregation layer)
        │  metrics.k8s.io
        ├─► kubectl top nodes|pods
        ├─► HPA controller (CPU/memory % of request)
        └─► VPA / other consumers
- Each kubelet exposes resource usage for Pods on that node.
- metrics-server discovers nodes, dials kubelets (preferring internal IPs), and stores the latest sample.
- The kube-apiserver proxies /apis/metrics.k8s.io to metrics-server via an APIService object.
- HPA compares current usage to resources.requests (for resource metrics) and adjusts replica count.
This is the Resource Metrics API. Custom Metrics API (custom.metrics.k8s.io) and External Metrics API (external.metrics.k8s.io) are separate extension points—KEDA and prometheus-adapter register there for queue depth, PromQL results, etc.
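To make the shape of the Resource Metrics API concrete, here is a minimal sketch that totals per-Pod usage from a PodMetricsList, the kind of JSON served under /apis/metrics.k8s.io/v1beta1/namespaces/production/pods. The sample payload, Pod and container names, and the helper functions are illustrative assumptions, and the parser handles only the quantity suffixes listed in the code.

```python
import json

# Illustrative payload shaped like a metrics.k8s.io/v1beta1 PodMetricsList.
# Names and values are made up for this example.
SAMPLE = json.loads("""
{
  "kind": "PodMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "items": [
    {
      "metadata": {"name": "api-7d4f8b", "namespace": "production"},
      "window": "15s",
      "containers": [
        {"name": "app", "usage": {"cpu": "125000000n", "memory": "65536Ki"}},
        {"name": "sidecar", "usage": {"cpu": "25m", "memory": "16Mi"}}
      ]
    }
  ]
}
""")

CPU_DIVISORS = {"n": 1_000_000, "u": 1_000, "m": 1}        # suffix -> divisor to millicores
MEM_FACTORS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}   # suffix -> bytes


def cpu_millicores(quantity):
    """Convert a CPU quantity such as '125000000n', '25m' or '2' to millicores."""
    suffix = quantity[-1]
    if suffix in CPU_DIVISORS:
        return float(quantity[:-1]) / CPU_DIVISORS[suffix]
    return float(quantity) * 1000  # a bare number means whole cores


def memory_bytes(quantity):
    """Convert a memory quantity such as '65536Ki' or '16Mi' to bytes."""
    for suffix, factor in MEM_FACTORS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # a bare number is already bytes


def pod_totals(pod_metrics_list):
    """Sum container usage per Pod: {pod_name: (millicores, bytes)}."""
    totals = {}
    for item in pod_metrics_list["items"]:
        cpu = sum(cpu_millicores(c["usage"]["cpu"]) for c in item["containers"])
        mem = sum(memory_bytes(c["usage"]["memory"]) for c in item["containers"])
        totals[item["metadata"]["name"]] = (cpu, mem)
    return totals
```

Calling pod_totals(SAMPLE) returns one entry per Pod with usage summed across its containers, which is the same aggregation the HPA performs before comparing usage to requests.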
What you get: kubectl top and HPA
kubectl top
kubectl top nodes
kubectl top pods -n production
kubectl top pod my-app-7d4f8b -n production --containers
These commands query metrics.k8s.io. If metrics-server is missing or unhealthy, you see:
error: Metrics API not available
Values are current snapshots, not averages over the last hour. For trends, use your monitoring stack.
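Because each call returns a single snapshot, a common quick check is to rank Pods within that snapshot. The sketch below parses text in the column layout that kubectl top pods prints for one namespace; the sample output, Pod names, and function names are invented for illustration, and the parser assumes CPU in millicores (m) and memory in Mi.

```python
# Illustrative text in the layout of `kubectl top pods -n <namespace>`.
SAMPLE_TOP = """\
NAME            CPU(cores)   MEMORY(bytes)
api-7d4f8b      212m         180Mi
worker-5c9d7    35m          412Mi
cache-0         4m           96Mi
"""


def parse_top(output):
    """Turn kubectl top text into a list of dicts with integer values."""
    rows = []
    for line in output.splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        name, cpu, mem = line.split()
        rows.append({
            "name": name,
            "cpu_m": int(cpu[:-1]) if cpu.endswith("m") else int(cpu) * 1000,
            "mem_mi": int(mem[:-2]),  # assumes an "Mi" suffix
        })
    return rows


def heaviest(rows, key):
    """Name of the Pod with the largest value for key ('cpu_m' or 'mem_mi')."""
    return max(rows, key=lambda row: row[key])["name"]
```

With the sample above, heaviest(parse_top(SAMPLE_TOP), "cpu_m") picks api-7d4f8b, while ranking by "mem_mi" picks worker-5c9d7.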
HPA and resource requests
HPA v2 can scale on CPU, memory, or external/custom metrics. For CPU utilization percentage, Kubernetes compares usage to the Pod’s CPU request:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
If any container in the target Pods has no CPU request, the HPA cannot compute a utilization percentage: it reports the metric as unknown and takes no scaling action on it. Day-one practice: set requests (and limits) before enabling CPU HPA. See Kubernetes hands-on: day-one practices.
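The arithmetic behind a utilization target can be sketched in a few lines. This is a simplified model of the documented HPA formula, desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), with the default 10% tolerance; it ignores not-ready Pods, missing samples, and scaling behavior policies, and the function names are hypothetical.

```python
import math


def utilization_percent(usage_millicores, request_millicores):
    """Average utilization across Pods: total usage over total requests."""
    total_request = sum(request_millicores)
    if total_request == 0:
        raise ValueError("no CPU requests set; utilization is undefined")
    return 100.0 * sum(usage_millicores) / total_request


def desired_replicas(current_replicas, current_util, target_util, tolerance=0.10):
    """Apply the ceil() formula unless the ratio sits inside the tolerance band."""
    ratio = current_util / target_util
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target; no change
    return math.ceil(current_replicas * ratio)


def clamp(replicas, min_replicas, max_replicas):
    """Keep the result within the HPA's minReplicas and maxReplicas."""
    return max(min_replicas, min(max_replicas, replicas))
```

For example, four Pods that each request 500m and each use 450m sit at 90% utilization; against a 70% target, desired_replicas(4, 90.0, 70) gives 6, which clamp(6, 2, 20) leaves unchanged.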
Memory HPA is supported but less common: memory is not compressible like CPU; scaling out does not always fix OOM pressure on a single replica. Prefer memory requests/limits, VPA recommendations, or app-level tuning alongside cautious memory targets.
Installation and upgrades
Managed clusters (EKS, GKE, AKS) often ship metrics-server enabled or offer a one-click add-on. Self-managed and local labs install explicitly.
Official manifest (pin a release)
# Check compatibility with your Kubernetes minor version:
# https://github.com/kubernetes-sigs/metrics-server/releases
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Production: pin a version in Git (GitOps); don’t rely on the latest redirect in CI. The namespace is typically kube-system. The default components.yaml runs a single replica; upstream also publishes a separate high-availability manifest that runs two.
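One way to enforce pinning in a pipeline is to build the manifest URL from an explicit tag and reject anything else. The sketch below assumes the standard GitHub release download path; the function name is hypothetical, and v0.0.0 in the usage note is a placeholder tag, not a real release.

```python
import re

RELEASE_BASE = "https://github.com/kubernetes-sigs/metrics-server/releases"
TAG_PATTERN = re.compile(r"^v\d+\.\d+\.\d+$")  # e.g. v1.2.3; rejects "latest"


def pinned_manifest_url(tag):
    """Build the components.yaml URL for one explicit release tag."""
    if not TAG_PATTERN.match(tag):
        raise ValueError(f"refusing unpinned or malformed tag: {tag!r}")
    return f"{RELEASE_BASE}/download/{tag}/components.yaml"
```

pinned_manifest_url("v0.0.0") returns a URL under /releases/download/v0.0.0/, while pinned_manifest_url("latest") raises ValueError, so an unpinned reference fails fast in CI.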
Helm
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm upgrade --install metrics-server metrics-server/metrics-server \
-n kube-system \
--set args="{--kubelet-preferred-address-types=InternalIP}"
Local clusters (kind, minikube, k3s)
Kubelets often use self-signed serving certificates. metrics-server must trust them or skip verification (lab only):
# Example extra args for kind/minikube — NOT for production
--kubelet-insecure-tls
--kubelet-preferred-address-types=InternalIP,Hostname,ExternalIP
minikube: minikube addons enable metrics-server. kind: apply upstream manifest plus kubelet TLS flags documented in the metrics-server install guide.
Architecture inside the cluster
- Deployment: usually 1–2 Pods; low CPU/memory footprint; cluster-critical.
- Service: ClusterIP fronting metrics-server Pods.
- APIService: v1beta1.metrics.k8s.io. Setting insecureSkipTLSVerify: true on the APIService spec is common because the Service cert is often self-signed. With it set, the API server does not verify that certificate; metrics-server still authenticates and authorizes incoming requests by delegating those checks to the API server.
- RBAC: ClusterRole to read nodes/Pods and scrape kubelets; system:auth-delegator for extension API authentication; aggregated ClusterRole system:aggregated-metrics-reader so users with appropriate RBAC can call metrics.k8s.io.
- PodSecurity / NetworkPolicy: allow metrics-server → kubelet:10250 (or configured port) on every node; allow API server → metrics-server Service.
Verify registration:
kubectl get apiservice v1beta1.metrics.k8s.io -o wide
kubectl get deployment metrics-server -n kube-system
kubectl get pods -n kube-system -l k8s-app=metrics-server
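For an automated check, the Available condition on the APIService is the signal to read. The sketch below inspects JSON such as kubectl get apiservice v1beta1.metrics.k8s.io -o json would return; the two sample objects are trimmed and illustrative (the message strings are placeholders), and the function name is hypothetical.

```python
import json

# Trimmed, illustrative APIService objects; message text is a placeholder.
HEALTHY = json.loads("""
{"metadata": {"name": "v1beta1.metrics.k8s.io"},
 "status": {"conditions": [
   {"type": "Available", "status": "True",
    "reason": "Passed", "message": "example: all checks passed"}]}}
""")
BROKEN = json.loads("""
{"metadata": {"name": "v1beta1.metrics.k8s.io"},
 "status": {"conditions": [
   {"type": "Available", "status": "False",
    "reason": "FailedDiscoveryCheck", "message": "example: no response"}]}}
""")


def apiservice_available(obj):
    """Return (ok, detail) based on the APIService's Available condition."""
    for cond in obj.get("status", {}).get("conditions", []):
        if cond.get("type") == "Available":
            ok = cond.get("status") == "True"
            return ok, f'{cond.get("reason", "")}: {cond.get("message", "")}'
    return False, "no Available condition reported"
```

A monitoring job could run this against the live object and alert whenever the first element of the result is False.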
Important configuration flags
| Flag | Effect |
|---|---|
| --metric-resolution=15s | How often metrics-server rescrapes kubelets; lower = fresher HPA signal, more kubelet load. |
| --kubelet-preferred-address-types=InternalIP,... | Which node address to use when dialing kubelets; fixes wrong-IP and TLS SAN mismatches. |
| --kubelet-insecure-tls | Skip kubelet serving cert verification (lab only). |
| --kubelet-use-node-status-port | Use the port from Node status (useful when kubelet listens on non-default ports). |
| --kubelet-certificate-authority | Trust anchor for kubelet certs in hardened clusters. |
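The flags in the table above lend themselves to a small pre-deploy lint. The sketch below scans a container args list for the two settings most likely to cause trouble in production; the function name is hypothetical, and the 15-second floor is an assumption made for this example rather than a hard limit.

```python
MIN_RESOLUTION_SECONDS = 15  # assumed floor for this sketch


def lint_args(args, production=True):
    """Return a list of warnings for a metrics-server container args list."""
    flags = {}
    for arg in args:
        name, _, value = arg.partition("=")
        flags[name] = value

    warnings = []
    if production and "--kubelet-insecure-tls" in flags:
        warnings.append("--kubelet-insecure-tls skips kubelet certificate verification")

    resolution = flags.get("--metric-resolution", "")
    # Only plain second values such as "15s" are understood here.
    if resolution.endswith("s") and resolution[:-1].isdigit():
        if int(resolution[:-1]) < MIN_RESOLUTION_SECONDS:
            warnings.append(f"--metric-resolution={resolution} adds kubelet load")
    return warnings
```

Passing production=False suppresses the TLS warning, which matches the lab-only exception for kind and minikube.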
Resource metrics expose memory working set (what HPA uses for memory), not cache that the kernel could reclaim under pressure—another reason memory autoscaling needs careful testing.
Security and RBAC
metrics-server is a high-privilege component: it reads all Pod metrics cluster-wide. Treat it like control-plane software:
- Run in kube-system (or a dedicated platform namespace) with restricted Pod security and no public Ingress.
- Restrict who can create or patch APIService objects; a compromised APIService registration is an attack path.
- Audit kubelet read access; on hardened nodes, ensure only metrics-server’s ServiceAccount can reach the kubelet stats/summary endpoints.
- Upgrade metrics-server along with the cluster; watch release notes for breaking changes to API versions (v1beta1 vs future stable versions).
For broader authorization patterns, see Kubernetes cluster RBAC.
Limitations you should plan around
- No historical data — Cannot graph last week’s CPU in Grafana from metrics-server alone.
- Only CPU and memory — No HTTP RPS, queue depth, or GPU metrics (GPU may appear in future ecosystem paths; today use vendor exporters + custom metrics).
- Not for alerting — Nothing to alert on except “metrics API down.”
- Latency — Several scrape intervals between usage spike and HPA reaction; combine with sensible stabilization windows and cooldowns.
- Windows nodes — Support exists but validate versions and test HPA on your node OS mix.
- Very large clusters — Tune replicas, scrape interval, and kubelet load; follow SIG guidance for sharding/high availability options in release notes.
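The stabilization window mentioned under latency is easy to model. For scale-down, the HPA acts on the highest replica recommendation seen within the window (300 seconds by default), which keeps a brief dip in usage from removing Pods. The sketch below is a simplified model with hypothetical names; timestamps are plain seconds.

```python
def stabilized_scale_down(recommendations, now, window_seconds=300):
    """Pick the highest recommendation made within the stabilization window.

    recommendations: list of (timestamp_seconds, desired_replicas) tuples.
    """
    recent = [replicas for ts, replicas in recommendations
              if now - ts <= window_seconds]
    if not recent:
        raise ValueError("no recommendations inside the window")
    return max(recent)
```

With recommendations [(0, 10), (100, 6), (200, 4), (290, 3)], the result at now=300 is still 10; at now=350 the oldest entry has aged out and the result drops to 6.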
Troubleshooting playbook
| Symptom | Likely cause | What to check |
|---|---|---|
| Metrics API not available | Not installed, APIService not available, Pods crash | kubectl get apiservice, metrics-server Pod logs, events |
| top works for nodes, not pods | Kubelet scrape errors for some nodes | Logs: x509, connection refused, wrong address type |
| HPA shows unknown for CPU | Missing metrics-server or missing requests | kubectl describe hpa; set CPU requests on containers |
| FailedDiscoveryCheck on APIService | Service endpoints empty, network policy, TLS | kubectl describe apiservice v1beta1.metrics.k8s.io |
| Intermittent gaps in top | Kubelet overload, network, single replica | Increase replicas; check node health; scrape interval |
# Direct API check (needs RBAC to read metrics.k8s.io)
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | head -c 500
kubectl logs -n kube-system deploy/metrics-server --tail=50
kubectl describe apiservice v1beta1.metrics.k8s.io
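When the logs are long, a keyword pass can point at the likely row of the table. The sketch below maps substrings to hints; the sample log text is illustrative and not verbatim metrics-server output, the function name is hypothetical, and the "no such host" rule is an added assumption beyond the symptoms listed above.

```python
import re

# (pattern, hint) pairs; order determines the order of returned hints.
RULES = [
    (re.compile(r"x509", re.I),
     "kubelet serving certificate not trusted or SAN mismatch"),
    (re.compile(r"connection refused", re.I),
     "kubelet port unreachable or wrong node address type"),
    (re.compile(r"no such host", re.I),
     "node hostname not resolvable; try InternalIP as the preferred address type"),
]

# Illustrative log lines, paraphrased for the example.
SAMPLE_LOG = (
    'Failed to scrape node "node-1": x509: certificate is not valid for this address\n'
    'Failed to scrape node "node-2": dial tcp: connection refused\n'
)


def diagnose(log_text):
    """Return the hints whose pattern appears anywhere in the log text."""
    return [hint for pattern, hint in RULES if pattern.search(log_text)]
```

Feeding it the output of kubectl logs -n kube-system deploy/metrics-server gives a short list of candidate causes to check first.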
More cluster-wide debug patterns: Kubernetes troubleshooting playbook.
Extending beyond resource metrics
When CPU/memory are not the right scaling signals:
- KEDA — Event-driven scalers (Kafka lag, SQS depth, Prometheus query, cron) via external metrics + managed HPAs.
- prometheus-adapter — Expose PromQL results on custom.metrics.k8s.io for HPA.
- Cluster Autoscaler / Karpenter — Scale nodes when Pending Pods need capacity; orthogonal to metrics-server but part of the same autoscaling story.
metrics-server remains required for standard resource-based HPA and kubectl top even when KEDA handles queue-shaped workloads.
Production checklist
- metrics-server installed, version pinned, tracked in GitOps.
- APIService Available; at least one healthy metrics-server Pod.
- Kubelet addresses and TLS configured for your cloud/network (no insecure TLS in prod).
- All HPA targets have meaningful resources.requests (CPU at minimum).
- HPA stabilization windows and min/max replicas reviewed under load tests.
- Separate Prometheus (or vendor APM) for dashboards, SLOs, and alerts.
- Runbooks document “Metrics API down” vs “app high CPU” (different fixes).
- Platform monitors metrics-server availability (synthetic top or APIService condition).
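The requests item on this checklist can be verified mechanically. The sketch below walks a Deployment-shaped dict, for example one loaded from kubectl get deployment api -o json, and lists containers with no CPU request; the sample object and function name are hypothetical.

```python
# Illustrative Deployment, trimmed to the fields this check reads.
SAMPLE_DEPLOYMENT = {
    "kind": "Deployment",
    "metadata": {"name": "api"},
    "spec": {"template": {"spec": {"containers": [
        {"name": "app", "resources": {"requests": {"cpu": "250m", "memory": "128Mi"}}},
        {"name": "sidecar", "resources": {"limits": {"cpu": "100m"}}},
        {"name": "logger"},
    ]}}},
}


def containers_missing_cpu_request(workload):
    """Names of containers in the Pod template that set no CPU request."""
    containers = workload["spec"]["template"]["spec"]["containers"]
    return [
        c["name"] for c in containers
        if not c.get("resources", {}).get("requests", {}).get("cpu")
    ]
```

An empty result means every container in the template declares a CPU request, so a CPU utilization target on that workload has something to divide by.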
Hands-on lab (local cluster)
- Before install: kubectl top nodes → expect Metrics API error.
- Install metrics-server (addon or manifest); wait for Pods Ready.
- kubectl top nodes and kubectl top pods -A → see CPU (millicores) and memory.
- Deploy a CPU-hog demo with requests; attach HPA at 50% CPU; watch kubectl get hpa -w and replica count under load.
- Remove CPU requests from the Deployment → observe HPA status unknown or flat scaling; restore requests and confirm recovery.
Lab prerequisites: Part 1 — local lab, Part 3 — first workloads, Part 5 — debug and next steps.
Further reading
- metrics-server repository — install, flags, compatibility matrix
- Resource metrics pipeline — official architecture doc
- Horizontal Pod Autoscaler — metrics types and behavior
- NodeMetrics / PodMetrics API — schema reference
Further reading on this site
- Kubernetes architecture — kubelet and control plane
- KEDA in depth — external metrics and event-driven HPA
- Karpenter in depth — node scaling when HPA adds Pods
- Day-one best practices — requests, limits, probes
- Troubleshooting playbook — Pending, HPA, and resource issues
- GitOps principles — version metrics-server manifests like any add-on