One of the most useful features of kube-prometheus-stack is its collection of pre-built Grafana dashboards. As soon as you install the stack, roughly 30 dashboards (the exact number depends on the chart version) are loaded into Grafana automatically, giving you instant visibility into cluster health, node performance, pod behavior, and Kubernetes control plane metrics without writing a single PromQL query.

This guide walks through every major dashboard category, explains what each one monitors, and shows you how to add custom dashboards for your own applications. It is written both for readers new to Kubernetes monitoring and for experienced SREs tuning an existing observability setup.

Grafana Dashboards Overview in Kube-Prometheus-Stack

Most of the dashboards shipped with kube-prometheus-stack come from the kubernetes-mixin project, a community-maintained collection of Grafana dashboards and Prometheus alerting rules designed for Kubernetes environments. The remainder come from sibling mixins such as those for node_exporter, Prometheus, and Alertmanager. The chart packages each dashboard as a Kubernetes ConfigMap, so they are loaded into Grafana without any manual import or configuration.

The Grafana deployment in the stack includes a sidecar container that watches for ConfigMaps with the label grafana_dashboard: "1". When it finds a matching ConfigMap, it automatically loads the dashboard JSON into Grafana. This mechanism is what makes the dashboards "self-provisioning" — they appear the moment the stack is installed and update automatically when the chart is upgraded.
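As a rough sketch of how this is wired up, the sidecar is configured through the Grafana section of the chart's values. The keys below follow the Grafana subchart as commonly documented; treat them as assumptions and check them against the values.yaml of the chart version you run.

```yaml
# values.yaml fragment (sketch): dashboard sidecar settings.
# Key names are assumed from the Grafana subchart; verify for your chart version.
grafana:
  sidecar:
    dashboards:
      enabled: true              # run the sidecar that watches for dashboards
      label: grafana_dashboard   # label key the sidecar looks for
      labelValue: "1"            # label value that marks a dashboard ConfigMap
      searchNamespace: ALL       # watch every namespace, not only Grafana's own
```

With searchNamespace set to ALL, an application team can ship a labeled dashboard ConfigMap in its own namespace and still have it picked up.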

All dashboards use Prometheus as the data source, querying the same metrics that the stack collects automatically from your cluster. No additional data source configuration is needed — the connection between Grafana and Prometheus is established during installation.

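Default Dashboards by Category

The kube-prometheus-stack ships with dashboards organized into several categories. The lists below cover the main ones you get out of the box. The exact set varies by chart version, and a few additional dashboards (for example Kubernetes / Persistent Volumes, CoreDNS, and etcd) are not described here.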

Kubernetes Cluster Dashboards

  • Kubernetes / Compute Resources / Cluster — Overall cluster CPU, memory, and network usage across all nodes and namespaces.
  • Kubernetes / Compute Resources / Namespace (Pods) — Resource consumption broken down by namespace, showing each pod's CPU and memory requests, limits, and actual usage.
  • Kubernetes / Compute Resources / Namespace (Workloads) — Same data grouped by workload type (Deployment, StatefulSet, DaemonSet).
  • Kubernetes / Compute Resources / Node (Pods) — Per-node resource utilization showing which pods consume the most resources on each node.
  • Kubernetes / Compute Resources / Pod — Deep dive into a single pod's CPU, memory, network, and filesystem usage over time.
  • Kubernetes / Compute Resources / Workload — Resource metrics for a specific workload across all its replicas.

Kubernetes Networking Dashboards

  • Kubernetes / Networking / Cluster — Cluster-wide network bandwidth, packet rates, and error counts.
  • Kubernetes / Networking / Namespace (Pods) — Network traffic per namespace with pod-level breakdowns.
  • Kubernetes / Networking / Namespace (Workload) — Network metrics organized by workload type within each namespace.
  • Kubernetes / Networking / Pod — Per-pod network receive/transmit rates, packets, and drops.

Control Plane Dashboards

  • Kubernetes / API Server — Request rates, latency percentiles, error rates, and in-flight request counts for the Kubernetes API server.
  • Kubernetes / Controller Manager — Work queue depths, processing rates, and reconciliation latencies.
  • Kubernetes / Scheduler — Scheduling attempt rates, latencies, pending pods, and scheduler health.
  • Kubernetes / Kubelet — Pod lifecycle operations, container runtime metrics, and volume manager statistics per node.
  • Kubernetes / Proxy — kube-proxy sync rates and rule counts.

Node and Infrastructure Dashboards

  • Node Exporter / Nodes — CPU, memory, disk, and network metrics for all nodes in a single view.
  • Node Exporter / USE Method / Node — Utilization, saturation, and errors for each node — following the USE methodology by Brendan Gregg.
  • Node Exporter / MacOS — Specialized view for macOS development nodes (if applicable).

Prometheus Internal Dashboards

  • Prometheus / Overview — Prometheus server health: ingestion rate, active time series, WAL size, and query performance.
  • Prometheus / Remote Write — Metrics for remote_write pipelines when using Thanos, Cortex, or Grafana Cloud.
  • Alertmanager / Overview — Active alerts, notification rates, and Alertmanager cluster health.

Using the Cluster Health Dashboard

The Kubernetes / Compute Resources / Cluster dashboard is typically the first dashboard teams use daily. It provides a bird's-eye view of your entire cluster's resource utilization. The key panels to focus on are:

CPU Usage vs. Requests vs. Limits: This panel reveals whether your cluster is over-provisioned or under-provisioned. If actual CPU usage is consistently far below requests, you are paying for idle compute. If usage regularly exceeds requests, the scheduler is placing pods based on numbers that understate real demand, so pods can be starved of CPU when a node gets busy. Throttling occurs when a container reaches its CPU limit, not its request.

Memory Usage vs. Requests vs. Limits: Memory overcommitment is more dangerous than CPU overcommitment. CPU is compressible, so a container that hits its CPU limit is only slowed down, whereas a container that exceeds its memory limit is OOM-killed. Use this panel to check how close usage runs to requests and limits across the cluster.

Namespace Breakdown: The cluster dashboard breaks down resource consumption by namespace, making it easy to identify which teams or applications consume the most cluster resources. This data is useful for cost allocation and capacity planning.

Node Monitoring Dashboards

The Node Exporter dashboards provide infrastructure-level visibility that Kubernetes-native metrics alone cannot. The Node Exporter / USE Method / Node dashboard is particularly valuable because it follows a proven methodology:

Utilization: What percentage of each resource (CPU, memory, disk, network) is being used? High utilization is not always a problem, but sustained utilization above 90% usually means the node needs more capacity or fewer workloads.

Saturation: Is work being queued because the resource is fully utilized? CPU saturation shows up as high load averages relative to CPU count. Memory saturation appears as swap usage or OOM pressure. Disk saturation manifests as I/O wait times.

Errors: Are there hardware or software errors on the node? Network interface errors, disk read/write errors, and non-maskable interrupts (NMIs) can indicate failing hardware before it causes an outage.
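To make the three USE signals concrete, here is a minimal sketch of how each could be expressed in PromQL, wrapped in a PrometheusRule so it can be applied to the cluster. These are illustrative expressions over standard node_exporter metrics, not the queries the bundled dashboard uses. The resource name, the record names, and the release label value are hypothetical.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-node-use-signals      # hypothetical name
  namespace: monitoring
  labels:
    release: kube-prometheus-stack    # assumption: must match your Helm release name
spec:
  groups:
    - name: example-node-use.rules
      rules:
        # Utilization: fraction of CPU time spent non-idle, per node.
        - record: example:node_cpu_utilization:ratio
          expr: |
            1 - avg by (instance) (
              rate(node_cpu_seconds_total{mode="idle"}[5m])
            )
        # Saturation: 1-minute load average divided by the node's CPU count.
        - record: example:node_load1_per_cpu:ratio
          expr: |
            node_load1
              / on (instance)
            count by (instance) (node_cpu_seconds_total{mode="idle"})
        # Errors: receive plus transmit errors per second, per network interface.
        - record: example:node_network_errors:rate5m
          expr: |
            rate(node_network_receive_errs_total[5m])
              + rate(node_network_transmit_errs_total[5m])
```

A load-per-CPU ratio that stays above 1 suggests runnable work is queuing, which is the saturation case described above.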

Pod and Workload Dashboards

The pod-level dashboards allow you to drill down into individual application behavior. The Kubernetes / Compute Resources / Pod dashboard shows:

  • CPU Usage: Actual CPU consumption compared to requests and limits. Helps identify pods that need resource tuning.
  • Memory Usage (WSS): Working set size, the memory in active use that the kernel cannot readily reclaim. This is the figure to compare against the memory limit, because the kubelet bases eviction decisions on it and a container whose working set reaches its limit is OOM-killed. RSS and cache are reported separately.
  • Network I/O: Receive and transmit bandwidth per pod, useful for identifying chatty services or unexpected network spikes.
  • Filesystem Reads/Writes: Disk I/O patterns that help identify storage-intensive workloads or pods with excessive logging.

Adding Custom Dashboards to Kube-Prometheus-Stack

While the default dashboards cover Kubernetes infrastructure comprehensively, most teams need custom dashboards for their own applications. Once you have set up ServiceMonitors to scrape your application metrics, you will want dashboards to visualize them. There are three methods to add custom dashboards:

Method 1: ConfigMap Provisioning (Recommended)

Create a ConfigMap containing your dashboard JSON with the grafana_dashboard: "1" label. The Grafana sidecar detects it automatically:

custom-dashboard-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"
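data:
  # Provisioned files hold the dashboard model itself (as exported from
  # Grafana), not the {"dashboard": ...} envelope used by the HTTP API.
  my-app-dashboard.json: |
    { ... your dashboard JSON ... }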

Method 2: Helm Values

You can embed dashboards directly in your Helm values.yaml under the grafana.dashboards section, paired with a grafana.dashboardProviders entry that tells Grafana where the files are mounted. This approach keeps everything in one place but can make values files very large.
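A minimal sketch of the Helm values approach follows. It assumes the Grafana subchart's dashboardProviders and dashboards keys, which work as a pair: the provider declares a directory, and each entry under dashboards is written into the directory of the provider with the same name. The dashboard name "my-app" and the tiny JSON body are placeholders, not a real dashboard.

```yaml
# values.yaml fragment (sketch): inline dashboard via Helm values.
# Key names are assumed from the Grafana subchart; verify for your chart version.
grafana:
  dashboardProviders:
    dashboardproviders.yaml:
      apiVersion: 1
      providers:
        - name: default
          orgId: 1
          folder: ""
          type: file
          disableDeletion: false
          editable: true
          options:
            path: /var/lib/grafana/dashboards/default
  dashboards:
    default:                 # must match the provider name above
      my-app:                # hypothetical dashboard key
        json: |
          {
            "title": "My App (placeholder)",
            "uid": "my-app-placeholder",
            "panels": []
          }
```

In practice the json value would be the full export of a dashboard built in the Grafana UI, which is why values files grow quickly with this method.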

Method 3: Grafana UI

Create dashboards manually in the Grafana UI. Dashboards created this way are stored in Grafana's own database, which is not persistent by default: they are lost when the Grafana pod restarts unless you enable Grafana persistence in your values.yaml. For production, ConfigMap provisioning is the more reliable choice.
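If you do rely on dashboards created in the UI, persistence can be switched on from the chart values. The fragment below is a sketch that assumes the Grafana subchart's persistence keys. The size is an arbitrary example, and the storage class line is left commented out because it depends on your cluster.

```yaml
# values.yaml fragment (sketch): keep Grafana's database across pod restarts.
# Key names are assumed from the Grafana subchart; verify for your chart version.
grafana:
  persistence:
    enabled: true
    size: 10Gi                       # example size, adjust to your needs
    # storageClassName: standard     # hypothetical; omit to use the cluster default
```

Persistence protects against pod restarts, but the dashboards still live only inside that volume, so exporting them to Git remains worthwhile.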

Dashboard Best Practices for Production

  1. Use dashboard variables — All default dashboards use Grafana template variables (cluster, namespace, pod) that let users filter views without editing the dashboard. Follow this pattern in your custom dashboards.
  2. Set appropriate time ranges — Default to 1-hour or 6-hour views for operational dashboards. Use 7-day or 30-day views for capacity planning dashboards.
  3. Create tiered dashboards — Build a hierarchy: cluster overview → namespace → workload → pod. Each level links to the next for drill-down navigation.
  4. Export dashboards as code — Always export working dashboards as JSON and store them in Git. This ensures they survive cluster migrations and chart upgrades.
  5. Monitor Grafana itself — The Prometheus / Overview dashboard shows Prometheus health, but also monitor Grafana's datasource latency and rendering times.
  6. Use recording rules — For dashboards that query expensive PromQL expressions, create Prometheus recording rules to pre-compute results and improve dashboard load times.
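As an illustration of the recording-rule practice, the sketch below pre-computes a per-namespace ratio of CPU usage to CPU requests, the kind of aggregation that gets expensive when a panel evaluates it over raw container series on every refresh. The resource name, group name, and record name are hypothetical, and the release label is an assumption about how your Prometheus instance selects rules.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-dashboard-recording-rules   # hypothetical name
  namespace: monitoring
  labels:
    release: kube-prometheus-stack          # assumption: must match your Helm release name
spec:
  groups:
    - name: example-dashboard.rules
      interval: 1m                          # how often the rule is evaluated
      rules:
        # CPU cores used divided by CPU cores requested, per namespace.
        - record: example:namespace_cpu_usage_vs_requests:ratio
          expr: |
            sum by (namespace) (
              rate(container_cpu_usage_seconds_total{container!=""}[5m])
            )
              /
            sum by (namespace) (
              kube_pod_container_resource_requests{resource="cpu"}
            )
```

A dashboard panel can then query example:namespace_cpu_usage_vs_requests:ratio directly, which reads one pre-aggregated series per namespace instead of every container series.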

Conclusion

The Grafana dashboards in kube-prometheus-stack are one of the stack's greatest strengths. They give teams immediate production-grade visibility into every layer of their Kubernetes infrastructure — from hardware metrics on individual nodes to API server performance and pod-level resource consumption.

By understanding what each dashboard monitors, customizing views for your specific applications, and following dashboard best practices, you can build an observability setup that genuinely serves your team when it matters most — during incidents, capacity planning, and day-to-day operations.
