Kube-Prometheus-Stack Helm Chart Documentation: The Complete 2026 Guide
If you have searched for the kube-prometheus-stack Helm chart documentation, you have probably already seen the official GitHub README, the Artifact Hub page, and a handful of blog posts that cover only the basics. What most of those resources miss is the depth that platform engineers, site reliability engineers, and DevOps teams actually need — the full picture from first install all the way through production-grade configuration, CRD management, upgrading, and real-world troubleshooting.
This guide fills that gap. It is written for engineers who want to understand not just how to run the commands, but why the chart works the way it does, what every critical configuration option actually controls, and how to build a monitoring stack that survives the real pressures of production Kubernetes clusters.
What Is the Kube-Prometheus-Stack Helm Chart?
The kube-prometheus-stack is an official Helm chart maintained by the prometheus-community organization on GitHub. It packages the entire kube-prometheus monitoring stack — Prometheus, Grafana, Alertmanager, Prometheus Operator, Node Exporter, and kube-state-metrics — into a single installable unit.
The chart was originally called the prometheus-operator chart. It was renamed to kube-prometheus-stack to better reflect what it actually deploys: not just the Prometheus Operator, but the entire upstream kube-prometheus project stack. This distinction matters because the Prometheus Operator is just one component among many.
The chart is distributed in two ways. The first is via the traditional Helm repository at https://prometheus-community.github.io/helm-charts. The second, and increasingly preferred method, is via the OCI registry at oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack. The OCI distribution method aligns with modern Helm 3 standards and offers better versioning guarantees.
As of 2026, the latest stable release is version 84.5.0, reflecting years of continuous development and close alignment with upstream Prometheus Operator releases.
Prerequisites Before Installing the Helm Chart
Before running a single Helm command, your environment needs to meet several requirements. Skipping this verification step is the most common cause of failed installations and confusing errors.
- Kubernetes v1.20+ — Earlier versions are unsupported and may produce unexpected behavior around CRD API versions.
- Helm v3 — Helm 2 is not supported. Ensure `helm version` returns a v3.x release.
- kubectl configured with cluster access and cluster-admin RBAC — required for creating ClusterRoles, ClusterRoleBindings, and cluster-scoped CRDs.
- A default StorageClass — Without persistent storage, Prometheus and Alertmanager lose all data on pod restart. AWS, GCP, Azure, and DigitalOcean provide default storage classes automatically. Bare-metal clusters need Rook-Ceph, Longhorn, or OpenEBS.
Adding the Prometheus Community Helm Repository
Before installing, register the prometheus-community repository in your local Helm configuration:
```shell
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```
The helm repo update command fetches the latest chart index. Run it every time before installing or upgrading — Helm caches the repository index locally and that cache can become stale. To see all available chart versions:
helm search repo kube-prometheus-stack --versions
Installing the Kube-Prometheus-Stack Helm Chart
The simplest installation command creates a dedicated monitoring namespace and installs the chart with all default values:
```shell
# Install via Helm repository
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace

# Or install via OCI registry (modern approach)
helm install kube-prometheus-stack \
  oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace
```
After installation, verify all pods are running:
```shell
# Check all pods are Running
kubectl get pods -n monitoring

# Check services created
kubectl get svc -n monitoring
```
A healthy installation shows pods for the Prometheus Operator, Prometheus server, Alertmanager, Grafana, Node Exporter (DaemonSet on every node), and kube-state-metrics — all in Running state within two to three minutes.
Understanding the Chart's Default Values
Every configuration option in the kube-prometheus-stack Helm chart is controlled by the values.yaml file. The default values.yaml is one of the most comprehensive in the entire Helm ecosystem — it runs to thousands of lines and controls every aspect of every component.
```shell
# View full default values
helm show values prometheus-community/kube-prometheus-stack

# Save to file for inspection and customization
helm show values prometheus-community/kube-prometheus-stack > default-values.yaml
```
The best practice is to create your own custom values file that overrides only the settings you want to change, and pass it to Helm with the -f flag. This keeps customizations clean, version-controllable, and easy to review.
Key Configuration Sections in values.yaml
Prometheus Configuration
The prometheus.prometheusSpec section controls the Prometheus server. The most critical settings for production are storage, retention, resources, and scrape intervals.
```yaml
prometheus:
  prometheusSpec:
    retention: 30d
    retentionSize: 45GB
    scrapeInterval: 30s
    evaluationInterval: 30s
    resources:
      requests:
        memory: 2Gi
        cpu: 500m
      limits:
        memory: 4Gi
        cpu: 2000m
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi
```
Memory sizing is one of the most commonly misconfigured settings. As a practical rule, every 100,000 active time series requires approximately 2–4 GB of RAM. A medium-sized cluster with 50 nodes and several hundred pods might generate 200,000–500,000 active series. Plan your resource requests accordingly.
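The rule of thumb above is easy to turn into a back-of-the-envelope calculation. The sketch below is purely illustrative, not a chart feature; the GB-per-100k-series figure is the assumption stated in the text:

```python
# Rough Prometheus memory estimate from the rule of thumb:
# roughly 2-4 GB of RAM per 100,000 active time series.

def estimate_prometheus_memory_gb(active_series: int,
                                  gb_per_100k: float = 3.0) -> float:
    """Approximate RAM requirement in GB for a given active series count."""
    return active_series / 100_000 * gb_per_100k

# A 50-node cluster at the high end of the range given above:
print(estimate_prometheus_memory_gb(500_000))       # mid-range assumption -> 15.0
print(estimate_prometheus_memory_gb(500_000, 4.0))  # worst-case assumption -> 20.0
```

Round the result up when setting `resources.requests.memory`, since series counts spike during deployments and node churn.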
Alertmanager Configuration
The alertmanager.alertmanagerSpec section controls the Alertmanager deployment. The Alertmanager routing configuration — defining which alerts go to which channels — is managed through a Kubernetes Secret or an AlertmanagerConfig CRD.
```yaml
alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
    resources:
      requests:
        memory: 256Mi
        cpu: 100m
      limits:
        memory: 512Mi
        cpu: 500m
```
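To illustrate the AlertmanagerConfig CRD route mentioned above, the sketch below sends critical alerts to a hypothetical Slack receiver. The Secret name, channel, and matcher values are assumptions; the webhook URL is read from a Secret rather than stored in plain text:

```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: my-app-routing              # hypothetical name
  namespace: monitoring
  labels:
    release: kube-prometheus-stack
spec:
  route:
    receiver: slack-critical
    matchers:
      - name: severity
        value: critical
        matchType: "="
  receivers:
    - name: slack-critical
      slackConfigs:
        - apiURL:
            name: slack-webhook-secret  # hypothetical Secret holding the webhook URL
            key: url
          channel: "#alerts"
```

Note that AlertmanagerConfig objects are namespaced, and the Operator scopes their routes to alerts from that namespace by default.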
Grafana Configuration
Key Grafana settings include enabling persistence, changing default credentials, and configuring Ingress. The default admin password is prom-operator — change this immediately in production.
```yaml
grafana:
  enabled: true
  adminPassword: "your-secure-password"
  persistence:
    enabled: true
    storageClassName: standard
    size: 10Gi
  ingress:
    enabled: true
    hosts:
      - grafana.your-domain.com
```
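Hard-coding `adminPassword` in values.yaml exposes it to anyone who can read your Git history or Helm release data. The bundled Grafana subchart can instead read credentials from a pre-created Secret; the fragment below is a sketch (the Secret name and key names are hypothetical, and the exact values keys may vary by chart version, so verify against your version's values.yaml):

```yaml
grafana:
  admin:
    existingSecret: grafana-admin-credentials  # hypothetical Secret in the same namespace
    userKey: admin-user
    passwordKey: admin-password
```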
Node Exporter Configuration
Node Exporter runs as a DaemonSet. Without the appropriate tolerations, it will not run on tainted control plane nodes, leaving gaps in your infrastructure monitoring coverage.
```yaml
nodeExporter:
  enabled: true
  tolerations:
    - operator: "Exists"
```
Understanding Custom Resource Definitions (CRDs)
CRD management is the most important aspect of the kube-prometheus-stack, and the one most write-ups underexplain. Getting CRDs wrong is the most common cause of painful upgrade failures. The chart installs these CRDs:
- alertmanagerconfigs.monitoring.coreos.com
- alertmanagers.monitoring.coreos.com
- podmonitors.monitoring.coreos.com
- probes.monitoring.coreos.com
- prometheusagents.monitoring.coreos.com
- prometheuses.monitoring.coreos.com
- prometheusrules.monitoring.coreos.com
- scrapeconfigs.monitoring.coreos.com
- servicemonitors.monitoring.coreos.com
- thanosrulers.monitoring.coreos.com
Helm installs CRDs on first install but never updates them on `helm upgrade`. This is deliberate — CRD updates can be destructive. You must apply CRD updates manually before upgrading when a new chart version changes CRD schemas.
```shell
# Apply updated CRDs manually before helm upgrade
kubectl apply --server-side \
  -f https://raw.githubusercontent.com/prometheus-community/helm-charts/main/charts/kube-prometheus-stack/charts/crds/crds/
```
ServiceMonitor and PodMonitor: Monitoring Your Own Applications
ServiceMonitor and PodMonitor CRDs allow you to add monitoring for your own applications beyond Kubernetes system components. A ServiceMonitor tells Prometheus to scrape metrics from a Kubernetes Service:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-webapp-monitor
  namespace: monitoring
  labels:
    release: kube-prometheus-stack  # Required label
spec:
  selector:
    matchLabels:
      app: my-webapp
  endpoints:
    - port: http
      path: /metrics
      interval: 30s
  namespaceSelector:
    matchNames:
      - production
```
The `release: kube-prometheus-stack` label is required. By default, Prometheus only picks up ServiceMonitors carrying this label. Without it, Prometheus will silently ignore your ServiceMonitor. This behavior is configurable via prometheus.prometheusSpec.serviceMonitorSelector.
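If you would rather have Prometheus discover every ServiceMonitor in the cluster regardless of labels, the chart exposes a flag for that. The fragment below is a sketch against recent chart versions (verify the key exists in your version's values.yaml before relying on it):

```yaml
prometheus:
  prometheusSpec:
    # Select all ServiceMonitors in the cluster instead of only
    # those carrying the release label.
    serviceMonitorSelectorNilUsesHelmValues: false
```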
PrometheusRule: Defining Alerting Rules
PrometheusRule CRDs allow you to define alerting and recording rules as native Kubernetes objects. The Prometheus Operator picks them up automatically — no restart required.
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-app-alerts
  namespace: monitoring
  labels:
    release: kube-prometheus-stack  # Required label
spec:
  groups:
    - name: my-app.rules
      rules:
        - alert: MyAppCrashLooping
          expr: rate(kube_pod_container_status_restarts_total{namespace="production"}[5m]) > 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Pod is crash looping"
            runbook_url: "https://your-runbooks.com/crash-loop"
```
Upgrading the Kube-Prometheus-Stack Helm Chart
Upgrading must be done carefully, especially for major version bumps that include CRD changes. The basic upgrade command is:
```shell
# Preview changes with helm-diff plugin first
helm diff upgrade kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  -f your-custom-values.yaml

# Run the upgrade
helm upgrade kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  -f your-custom-values.yaml

# Rollback if something goes wrong
helm rollback kube-prometheus-stack --namespace monitoring
```
Upgrade checklist: Check release notes for breaking changes → apply CRD updates manually if required → test in staging first → use helm diff to preview changes → upgrade production.
Accessing Prometheus, Grafana, and Alertmanager
By default, all three UIs are only accessible within the cluster via ClusterIP services. Use kubectl port-forward for local access:
```shell
# Prometheus — http://localhost:9090
kubectl port-forward -n monitoring svc/kube-prometheus-stack-prometheus 9090:9090

# Grafana — http://localhost:3000 (admin / prom-operator)
kubectl port-forward -n monitoring svc/kube-prometheus-stack-grafana 3000:80

# Alertmanager — http://localhost:9093
kubectl port-forward -n monitoring svc/kube-prometheus-stack-alertmanager 9093:9093
```
For persistent production access, configure Ingress in values.yaml for each component, or change the service type to LoadBalancer for a cloud provider IP.
Production Best Practices
- Enable persistent storage for all stateful components. Configure `storageSpec` for Prometheus and Alertmanager, and `persistence` for Grafana.
- Set resource requests and limits for every component. Prometheus can become very memory-hungry as your cluster grows. Starting without limits is a common path to destabilizing your cluster during metrics spikes.
- Configure Pod Disruption Budgets (PDBs) to ensure at least one replica remains running during node maintenance, preventing monitoring blackouts.
- Use a dedicated monitoring namespace. Isolate all monitoring workloads from application workloads for cleaner RBAC and to prevent resource contention.
- Pin chart versions in production. Never deploy `latest`. Specify an exact chart version and upgrade deliberately after testing in staging.
- Integrate with GitOps tooling (ArgoCD or Flux). Store your custom values.yaml in Git alongside other infrastructure configuration for a full audit trail.
- Plan for high-cardinality metrics. Labels with very high unique values (user_id, request_id) cause Prometheus memory to grow dramatically. Use recording rules to pre-aggregate high-cardinality queries.
- Consider Thanos for scale. For large clusters or multi-cluster scenarios, the chart integrates cleanly with Thanos for long-term metric storage in S3/GCS and global querying.
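Combining version pinning with GitOps, a chart version can be pinned declaratively. The sketch below is an Argo CD Application under assumed names and sync options; custom values would typically be supplied from a Git-tracked values file (for example via Argo CD's multiple-sources feature):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kube-prometheus-stack     # hypothetical Application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://prometheus-community.github.io/helm-charts
    chart: kube-prometheus-stack
    targetRevision: 84.5.0        # pinned chart version; bump deliberately
    helm:
      releaseName: kube-prometheus-stack
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring
  syncPolicy:
    syncOptions:
      - ServerSideApply=true      # helps apply the very large CRD manifests
```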
Uninstalling the Helm Chart
```shell
# Uninstall the chart (CRDs are NOT deleted automatically)
helm uninstall kube-prometheus-stack --namespace monitoring

# Only if you want to fully remove ALL CRDs (destructive — removes all custom resources)
kubectl delete crd alertmanagerconfigs.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com
kubectl delete crd podmonitors.monitoring.coreos.com
kubectl delete crd probes.monitoring.coreos.com
kubectl delete crd prometheusagents.monitoring.coreos.com
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
kubectl delete crd scrapeconfigs.monitoring.coreos.com
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd thanosrulers.monitoring.coreos.com
```
Common Troubleshooting Scenarios
Pods stuck in Pending state
Almost always means missing PVCs (no storage class available) or insufficient cluster resources. Run kubectl describe pod <pod-name> -n monitoring for the specific reason.
Grafana shows no data
Confirm Prometheus is listed as a data source in Grafana (Configuration → Data Sources). If the source exists but shows errors, verify the Prometheus service URL matches the actual service name and port in your monitoring namespace.
ServiceMonitor not being picked up
The most common cause is a missing or incorrect release: kube-prometheus-stack label on the ServiceMonitor object. Check that the label matches prometheus.prometheusSpec.serviceMonitorSelector.matchLabels.
Helm upgrade failing with CRD errors
Apply CRD updates manually before upgrading when the new chart version changes CRD schemas. Follow the CRD management section above.
High Prometheus memory usage
Investigate high-cardinality metrics via the Prometheus UI: navigate to Status → TSDB Status to see which labels consume the most memory. Address by relabeling, dropping, or aggregating high-cardinality time series.
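Beyond the TSDB Status page, cardinality can also be inspected with a query. A commonly used PromQL expression, which can itself be expensive on large instances:

```promql
# Ten metric names with the most active series.
# Run sparingly: it touches every series in the TSDB head.
topk(10, count by (__name__) ({__name__=~".+"}))
```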
Conclusion
The kube-prometheus-stack Helm chart is the most comprehensive and production-proven monitoring solution available for Kubernetes in 2026. Understanding the chart deeply — beyond the basic install command — is what separates a monitoring setup that just runs from one that genuinely serves your team when things go wrong.
Proper CRD management, persistent storage configuration, meaningful resource sizing, ServiceMonitor and PrometheusRule best practices, and integration with GitOps workflows are all the difference between a fragile monitoring installation and a truly production-grade observability platform.
Ready to Deploy?
Get your full Kubernetes observability stack running in minutes with the official Helm chart.