Prometheus Operator ServiceMonitor Guide: How to Monitor Custom Applications in Kubernetes (2026)
When you deploy kube-prometheus-stack, Prometheus automatically scrapes metrics from Kubernetes internals — the API server, kubelet, node-exporter, and the control plane. But the moment you deploy your own application, Prometheus has no idea it exists. That is where ServiceMonitor and PodMonitor come in — the Prometheus Operator's custom resource definitions (CRDs) that let you declaratively tell Prometheus exactly how to discover and scrape metrics from your services.
This guide covers everything you need to know about ServiceMonitor and PodMonitor resources: how they work, how to write them correctly, how to debug them when they fail, and how to combine them with PrometheusRule to build a complete custom monitoring pipeline for any application running in Kubernetes.
What is the Prometheus Operator?
The Prometheus Operator is a Kubernetes operator that manages Prometheus instances, Alertmanager clusters, and related monitoring resources using Kubernetes-native custom resource definitions. It was originally created by CoreOS (now part of Red Hat) and is maintained under the prometheus-operator GitHub organization. When you install the kube-prometheus-stack Helm chart, the Prometheus Operator is the core component that orchestrates everything.
Without the Prometheus Operator, configuring Prometheus to scrape a new target requires editing the prometheus.yml configuration file, adding a new scrape_config block, and then reloading or restarting Prometheus. In a Kubernetes environment where services scale dynamically and pods come and go, this static approach becomes unmanageable.
The Prometheus Operator solves this by introducing several CRDs that translate Kubernetes-native resource definitions into Prometheus configuration:
- Prometheus — Defines a Prometheus server instance, including its retention, storage, replicas, and which ServiceMonitors and PodMonitors it should watch.
- ServiceMonitor — Declares how Prometheus should discover and scrape metrics from Kubernetes Services.
- PodMonitor — Declares how Prometheus should discover and scrape metrics directly from pods, without requiring a Service.
- PrometheusRule — Defines recording rules and alerting rules that Prometheus should evaluate.
- Alertmanager — Defines an Alertmanager instance for routing and deduplicating alert notifications.
- AlertmanagerConfig — Allows namespaced, fine-grained configuration of alert routing and receivers.
The operator watches for changes to these custom resources and automatically regenerates the Prometheus configuration. When you create a ServiceMonitor, the operator detects it, generates the corresponding scrape_config, and writes the result to the Secret that holds the Prometheus configuration. A config-reloader sidecar then triggers a configuration reload, all without downtime or manual intervention.
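As a rough illustration of what "generates the corresponding scrape_config" means, the block below shows what a generated entry might resemble for a ServiceMonitor named my-app-monitor in the monitoring namespace that targets Services labeled app: my-app in default. This is an abridged sketch, not the operator's literal output: the names are assumptions for illustration, and the real configuration carries many more relabeling rules.

```yaml
# Abridged sketch of a generated scrape job (illustrative names, not literal operator output)
scrape_configs:
  - job_name: serviceMonitor/monitoring/my-app-monitor/0
    honor_labels: false
    scrape_interval: 30s
    scrape_timeout: 10s
    metrics_path: /metrics
    kubernetes_sd_configs:
      - role: endpoints              # Prometheus discovers Endpoints via the Kubernetes API
        namespaces:
          names:
            - default
    relabel_configs:
      # Keep only targets whose Service carries the label app=my-app
      - source_labels: [__meta_kubernetes_service_label_app]
        regex: my-app
        action: keep
      # Keep only the Service port named "metrics"
      - source_labels: [__meta_kubernetes_endpoint_port_name]
        regex: metrics
        action: keep
```

The point of the sketch is that you never write or maintain this block yourself. The selector, namespace list, and port name all come from the custom resource.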
Understanding ServiceMonitor CRDs
A ServiceMonitor is the most common way to tell Prometheus how to scrape your application. It works by targeting Kubernetes Services: it matches Service labels with a selector and defines which ports and paths to scrape. The Prometheus Operator translates the ServiceMonitor into a scrape configuration that uses Prometheus's Kubernetes service discovery to find the Endpoints behind those Services, so each backing pod is scraped individually.
Here is the anatomy of a ServiceMonitor resource:
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: my-app-monitor
      namespace: monitoring
      labels:
        release: kube-prometheus-stack  # Must match Prometheus serviceMonitorSelector
    spec:
      selector:                         # Which Services to target
        matchLabels:
          app: my-app
      namespaceSelector:                # Which namespaces to search
        matchNames:
          - default
      endpoints:                        # How to scrape each Service port
        - port: metrics                 # Must match a named port in the Service
          path: /metrics
          interval: 30s
          scrapeTimeout: 10s
The flow is as follows: the Prometheus Operator watches for ServiceMonitor resources. When it finds one that the Prometheus resource selects, it translates the selector, namespaceSelector, and endpoints into a Prometheus scrape_config built on kubernetes_sd_configs, with relabeling rules that keep only the Endpoints belonging to matching Services and the named port. Prometheus then uses that configuration to discover the individual pod IPs through the Kubernetes API and scrapes each pod on the specified port and path.
The difference from hand-written kubernetes_sd_config blocks is not who performs discovery (Prometheus still does) but who writes the configuration. The operator generates and maintains the scrape configuration from small, namespaced resources, so application teams never edit prometheus.yml directly. This separation of concerns makes the system more maintainable and allows fine-grained RBAC control over who can create monitoring configurations.
Key Fields Explained
- spec.selector.matchLabels: Label selector that must match the labels on the target Kubernetes Service (not the pods, not the Deployment, but the Service itself).
- spec.endpoints[].port: The name of the port in the Service spec to scrape. This is the port name, not the port number. The name must exactly match a named port in the Service definition.
- spec.endpoints[].path: The HTTP path where metrics are exposed. Defaults to /metrics if omitted.
- spec.endpoints[].interval: How often Prometheus scrapes this target. Overrides the global scrape interval.
- spec.namespaceSelector: Restricts which namespaces the selector searches. If omitted, only the ServiceMonitor's own namespace is searched.
- metadata.labels: Critical for discoverability. The Prometheus resource has a serviceMonitorSelector that filters which ServiceMonitors it watches. Your ServiceMonitor must have labels that match this selector.
Creating Your First ServiceMonitor
Let us walk through a complete, working example. Suppose you have a Go application that serves HTTP traffic on port 8080 and exposes Prometheus metrics on port 9090 at /metrics. Here is the full setup from Deployment to ServiceMonitor.
Step 1: Application Deployment
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-web-app
      namespace: default
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: my-web-app
      template:
        metadata:
          labels:
            app: my-web-app
            version: v1.2.0
        spec:
          containers:
            - name: app
              image: my-registry/my-web-app:v1.2.0
              ports:
                - name: http
                  containerPort: 8080
                - name: metrics
                  containerPort: 9090
Step 2: Service Definition
    apiVersion: v1
    kind: Service
    metadata:
      name: my-web-app
      namespace: default
      labels:
        app: my-web-app       # ServiceMonitor selector matches THIS label
    spec:
      selector:
        app: my-web-app
      ports:
        - name: http
          port: 80
          targetPort: 8080
        - name: metrics       # ServiceMonitor endpoints[].port matches THIS name
          port: 9090
          targetPort: 9090
Step 3: ServiceMonitor
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: my-web-app-monitor
      namespace: monitoring
      labels:
        release: kube-prometheus-stack
    spec:
      selector:
        matchLabels:
          app: my-web-app
      namespaceSelector:
        matchNames:
          - default
      endpoints:
        - port: metrics
          path: /metrics
          interval: 30s
          scrapeTimeout: 10s
          honorLabels: false
Apply all three resources, then verify the ServiceMonitor was created:
    kubectl apply -f deployment.yaml -f service.yaml -f servicemonitor.yaml
    kubectl get servicemonitor -n monitoring
    kubectl get endpoints my-web-app -n default
After roughly 30-60 seconds, the Prometheus Operator will detect the new ServiceMonitor, generate the scrape configuration, and reload Prometheus. You can verify the targets appear in the Prometheus UI by port-forwarding to the Prometheus service and navigating to Status > Targets.
The critical label to understand here is release: kube-prometheus-stack on the ServiceMonitor. By default, the kube-prometheus-stack Helm chart configures the Prometheus resource with a serviceMonitorSelector that matches the release label set to your Helm release name, which is kube-prometheus-stack when the chart is installed under that name. If your ServiceMonitor does not have this label, Prometheus will never pick it up. You can change this behavior by setting prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues: false in your Helm values to make Prometheus watch all ServiceMonitors regardless of labels.
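A Helm values fragment for that setting might look like the sketch below. The two additional keys, which apply the same behavior to PodMonitors and PrometheusRules, are included on the assumption that you want all three resource types treated alike. Check the values.yaml of your chart version before relying on them.

```yaml
# values.yaml fragment for kube-prometheus-stack (sketch; verify against your chart version)
prometheus:
  prometheusSpec:
    # Do not restrict ServiceMonitors to those labeled with the Helm release name
    serviceMonitorSelectorNilUsesHelmValues: false
    # Assumed to be wanted as well: same behavior for PodMonitors and PrometheusRules
    podMonitorSelectorNilUsesHelmValues: false
    ruleSelectorNilUsesHelmValues: false
```

The trade-off is scope: with these set to false, any team that can create a ServiceMonitor in a watched namespace can add scrape targets, so the label filter no longer acts as a gate.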
PodMonitor: When to Use It Instead
A PodMonitor works the same way as a ServiceMonitor, except it targets pods directly rather than going through a Service. The Prometheus Operator watches for PodMonitor resources and generates scrape configurations that discover matching pods through the Kubernetes API and scrape each one.
Use a PodMonitor when:
- The application has no Service: DaemonSets that expose node-level metrics, batch Jobs that run periodically, or standalone pods without a Service definition.
- Metrics are on a different port than the Service exposes: for example, the application's main Service exposes port 8080 for HTTP, but metrics are served on a sidecar container's port 9090 that is not included in the Service spec.
- You need pod-level label selection: when you want to scrape pods based on pod labels rather than Service labels, PodMonitor gives you direct access to pod metadata.
    apiVersion: monitoring.coreos.com/v1
    kind: PodMonitor
    metadata:
      name: my-daemonset-monitor
      namespace: monitoring
      labels:
        release: kube-prometheus-stack
    spec:
      selector:
        matchLabels:
          app: node-metrics-agent
      namespaceSelector:
        matchNames:
          - kube-system
      podMetricsEndpoints:
        - port: metrics
          path: /metrics
          interval: 60s
          scrapeTimeout: 15s
Notice the key difference: PodMonitor uses podMetricsEndpoints instead of endpoints, and the selector matches pod labels directly instead of Service labels. The port name must match a named containerPort on the pod spec. Everything else — namespace selection, label matching, scrape intervals — works identically to ServiceMonitor.
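To make the pod-side requirements concrete, here is a sketch of a DaemonSet that the PodMonitor above would select. The image name and the port number 9100 are hypothetical. What matters is the pod label and the named containerPort.

```yaml
# Hypothetical DaemonSet selected by the PodMonitor above
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-metrics-agent
  namespace: kube-system             # Listed in the PodMonitor namespaceSelector
spec:
  selector:
    matchLabels:
      app: node-metrics-agent
  template:
    metadata:
      labels:
        app: node-metrics-agent      # PodMonitor selector matches THIS pod label
    spec:
      containers:
        - name: agent
          image: my-registry/node-metrics-agent:v0.1.0   # hypothetical image
          ports:
            - name: metrics          # podMetricsEndpoints[].port matches THIS name
              containerPort: 9100    # assumed port number
```

No Service is involved at any point. If the containerPort had no name, the PodMonitor's port: metrics reference would have nothing to match.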
In practice, ServiceMonitor is the right choice for most workloads. Most applications in Kubernetes have a Service, and using ServiceMonitor aligns with how the Prometheus Operator is designed to work. Reserve PodMonitor for the edge cases where a Service either does not exist or does not expose the metrics port.
Label Matching and Selector Configuration
Label selectors are where most ServiceMonitor configuration errors occur. There are three distinct layers of label matching that must all align for monitoring to work:
Layer 1: Prometheus to ServiceMonitor
The Prometheus custom resource has a serviceMonitorSelector field that determines which ServiceMonitors the operator picks up. In kube-prometheus-stack, this is typically configured as:
    apiVersion: monitoring.coreos.com/v1
    kind: Prometheus
    metadata:
      name: kube-prometheus-stack-prometheus
    spec:
      serviceMonitorSelector:
        matchLabels:
          release: kube-prometheus-stack    # ServiceMonitor must have this label
      serviceMonitorNamespaceSelector: {}   # Empty = all namespaces
Your ServiceMonitor's metadata.labels must include release: kube-prometheus-stack (or whatever your Prometheus resource expects). To check what your Prometheus expects:
    kubectl get prometheus -n monitoring -o jsonpath='{.items[*].spec.serviceMonitorSelector}'
Layer 2: ServiceMonitor to Service
The spec.selector inside the ServiceMonitor must match labels on the target Service. This is the label on the Service's metadata.labels, not the Service's spec.selector (which selects pods).
Layer 3: Port Name Matching
The endpoints[].port value must exactly match a named port in the Service's spec.ports[]. If the Service defines the port as http-metrics but the ServiceMonitor references metrics, it will silently fail with no targets discovered.
For more complex selection logic, you can use matchExpressions instead of matchLabels:
    spec:
      selector:
        matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
              - my-app
              - my-app-canary
          - key: monitoring
            operator: NotIn
            values:
              - disabled
This selector targets Services that have app.kubernetes.io/name set to either my-app or my-app-canary, and do not have the label monitoring: disabled. The matchExpressions syntax supports operators: In, NotIn, Exists, and DoesNotExist.
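The Exists and DoesNotExist operators take no values list, which makes them a natural fit for opt-in or opt-out labels. The sketch below assumes two hypothetical label keys, metrics-enabled and legacy, that your Services may or may not carry.

```yaml
# Sketch: select by label presence rather than label value (label keys are hypothetical)
spec:
  selector:
    matchExpressions:
      - key: metrics-enabled     # Service must carry this label, with any value
        operator: Exists
      - key: legacy              # Service must NOT carry this label at all
        operator: DoesNotExist
```

All expressions in the list are combined with AND, so a Service is selected only when it satisfies every entry.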
Namespace Selectors and Cross-Namespace Monitoring
By default, a ServiceMonitor only looks for Services in its own namespace. In production environments, you typically place all ServiceMonitors in the monitoring namespace while your applications run in different namespaces. The namespaceSelector field controls this behavior.
Monitor Specific Namespaces
    spec:
      namespaceSelector:
        matchNames:
          - production
          - staging
          - backend-services
Monitor All Namespaces
    spec:
      namespaceSelector:
        any: true   # Discovers Services across ALL namespaces
Using any: true is convenient but has implications. First, the Prometheus server's service account needs RBAC permissions to list and watch Services, Endpoints, and Pods in all namespaces, because Prometheus performs the discovery. The kube-prometheus-stack chart grants these permissions by default. Second, in large clusters with many namespaces, a broadly scoped ServiceMonitor can increase Prometheus load as it discovers more targets. Be intentional about which namespaces you monitor and use specific namespace selectors when possible.
There is also the serviceMonitorNamespaceSelector on the Prometheus resource itself, which controls which namespaces the operator looks for ServiceMonitor resources in (as opposed to which namespaces the ServiceMonitor searches for Services). Setting it to {} (empty) means the operator watches all namespaces for ServiceMonitors.
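Between "own namespace only" and "all namespaces" there is a middle ground: serviceMonitorNamespaceSelector is a regular label selector over Namespace objects. The sketch below assumes a hypothetical namespace label monitoring: enabled that you apply to the namespaces allowed to carry ServiceMonitors.

```yaml
# Sketch: only pick up ServiceMonitors from namespaces labeled monitoring=enabled
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: kube-prometheus-stack-prometheus
  namespace: monitoring
spec:
  serviceMonitorSelector:
    matchLabels:
      release: kube-prometheus-stack
  serviceMonitorNamespaceSelector:
    matchLabels:
      monitoring: enabled        # hypothetical label set on opted-in namespaces
```

With kube-prometheus-stack you would normally set this through the chart's prometheus.prometheusSpec values rather than editing the Prometheus resource directly, since Helm owns that object.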
Configuring Scrape Intervals and Timeouts
Every endpoint in a ServiceMonitor can specify its own scrape interval and timeout, overriding the global Prometheus defaults. Choosing the right values requires balancing data granularity against Prometheus resource consumption.
    spec:
      endpoints:
        - port: metrics
          path: /metrics
          interval: 15s            # Scrape every 15 seconds
          scrapeTimeout: 10s       # Must not exceed the interval
          honorLabels: false       # Prevent target from overriding job/instance labels
          honorTimestamps: true    # Use timestamps from the target if present
          scheme: http             # Use https for TLS-enabled endpoints
          metricRelabelings:       # Drop or rename metrics after scraping
            - sourceLabels: [__name__]
              regex: 'go_.*'
              action: drop         # Drop all Go runtime metrics to save storage
Choosing the Right Interval
- 15s: Suitable for critical production services where you need near-real-time alerting. Sample volume, and with it Prometheus CPU, memory, and storage consumption, grows roughly in proportion to scrape frequency.
- 30s: A common default for most applications. Provides good granularity for dashboards and alerting without excessive overhead. This is the recommended starting point.
- 60s: Appropriate for infrastructure metrics, batch jobs, or lower-priority services. Roughly halves the number of samples ingested per target compared to a 30s interval.
- 300s (5m): Use for metrics that change slowly, such as disk capacity, certificate expiry, or configuration drift metrics. Not suitable for latency or error rate alerting.
The scrapeTimeout must not exceed the interval; Prometheus rejects a scrape configuration whose timeout is longer than its interval. If a scrape takes longer than the timeout, the scrape fails and Prometheus marks the target as down. For most applications, a timeout of 10s with a 30s interval works well. If your application exposes thousands of metrics and scrapes are slow, increase both values proportionally.
The metricRelabelings field is extremely useful for controlling storage costs. Applications instrumented with the Go Prometheus client library expose dozens of Go runtime metrics (go_gc_*, go_memstats_*) that you may not need. Dropping them at scrape time prevents them from ever entering the time-series database, which reduces storage use and keeps queries fast.
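Several drop rules can be stacked on one endpoint, as in the sketch below. The second metric name is hypothetical and stands in for any high-volume series you have decided you do not need.

```yaml
# Sketch: stacking drop rules on one endpoint (second metric name is hypothetical)
spec:
  endpoints:
    - port: metrics
      metricRelabelings:
        # Drop Go runtime and process metrics with a single rule
        - sourceLabels: [__name__]
          regex: '(go|process)_.*'
          action: drop
        # Drop only the histogram buckets of one noisy metric
        - sourceLabels: [__name__]
          regex: 'my_app_cache_lookup_seconds_bucket'
          action: drop
```

Prometheus anchors relabeling regexes at both ends, so go_.* matches only metric names that start with go_. Rules run in order after the scrape, and a series dropped by an earlier rule is never seen by later ones.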
Debugging ServiceMonitor Issues
When a ServiceMonitor is not working, the symptoms are usually the same: the target does not appear in Prometheus, and no metrics are collected. Debugging requires checking each layer systematically.
Step 1: Verify the ServiceMonitor exists
    kubectl get servicemonitor -n monitoring
    kubectl describe servicemonitor my-web-app-monitor -n monitoring
Step 2: Check the Prometheus operator logs
    kubectl logs -n monitoring deployment/kube-prometheus-stack-operator --tail=50
The operator logs will tell you if it detected the ServiceMonitor and whether it encountered errors generating the scrape config. Common errors include RBAC permission issues or invalid ServiceMonitor specs.
Step 3: Verify label selectors match
    # Check what labels the Prometheus resource expects
    kubectl get prometheus -n monitoring -o yaml | grep -A5 serviceMonitorSelector
    # Check what labels your ServiceMonitor has
    kubectl get servicemonitor my-web-app-monitor -n monitoring --show-labels
    # Verify the target Service exists and has matching labels
    kubectl get svc -n default --show-labels | grep my-web-app
    # Confirm Endpoints exist (pods are running and selected by the Service)
    kubectl get endpoints my-web-app -n default
Step 4: Check the Prometheus targets directly
    kubectl port-forward -n monitoring svc/kube-prometheus-stack-prometheus 9090:9090
Open http://localhost:9090/targets in your browser. If the target is listed but marked as DOWN, the issue is with the scrape itself (wrong port, wrong path, network policy blocking access, or the application is not exposing metrics correctly). If the target is not listed at all, the issue is with the label selectors or namespace selectors.
Step 5: Test the metrics endpoint directly
    # Run a one-off curl pod to test the metrics endpoint from within the cluster
    kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- \
      curl -s http://my-web-app.default.svc:9090/metrics | head -20
This confirms whether the application is actually exposing valid Prometheus metrics. The output should be plain text in the Prometheus exposition format, with lines like http_requests_total{method="GET",status="200"} 1234.
Common Issues Checklist
- Missing release label: The ServiceMonitor does not have release: kube-prometheus-stack (or the label your Prometheus expects).
- Port name mismatch: The endpoints[].port in the ServiceMonitor does not match any named port in the Service spec.
- Wrong namespace selector: The ServiceMonitor's namespaceSelector does not include the namespace where the Service lives.
- Service selector mismatch: The ServiceMonitor's selector.matchLabels does not match the Service's metadata.labels.
- No Endpoints: The Service has no Endpoints because pods are not running, are not ready, or the Service's spec.selector does not match the pod labels.
- Network policies: A NetworkPolicy is blocking Prometheus from reaching the target pods on the metrics port.
- RBAC permissions: The Prometheus service account lacks permissions to list and watch Services, Endpoints, or Pods in the target namespace.
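For the last two items, fixes might look like the sketch below. Prometheus itself performs discovery and scraping, so both the network rule and the RBAC binding refer to the Prometheus pods and their service account. The pod label app.kubernetes.io/name: prometheus and the service account name kube-prometheus-stack-prometheus are assumptions. Check them against your installation.

```yaml
# Sketch: allow Prometheus pods in "monitoring" to reach the metrics port of my-web-app
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: my-web-app
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
          podSelector:
            matchLabels:
              app.kubernetes.io/name: prometheus   # assumed Prometheus pod label
      ports:
        - protocol: TCP
          port: 9090
---
# Sketch: namespace-scoped discovery permissions for the Prometheus service account
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-discovery
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-discovery
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-discovery
subjects:
  - kind: ServiceAccount
    name: kube-prometheus-stack-prometheus   # assumed service account name
    namespace: monitoring
```

The NetworkPolicy is meant to sit alongside policies that already restrict ingress to these pods. Applied to pods that no policy selects yet, it would isolate them and block all other inbound traffic, including the application's own HTTP port. The Role is only needed if the cluster-wide permissions that kube-prometheus-stack normally grants have been removed or narrowed.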
PrometheusRule for Custom Alerts
Once your ServiceMonitor is collecting metrics, the next step is to create alerting rules using the PrometheusRule CRD. This gives you a complete monitoring pipeline: your application exposes metrics, ServiceMonitor tells Prometheus how to scrape them, and PrometheusRule defines what conditions should trigger alerts that are sent to Alertmanager for notification routing.
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: my-web-app-alerts
      namespace: monitoring
      labels:
        release: kube-prometheus-stack
    spec:
      groups:
        - name: my-web-app.rules
          rules:
            # Alert: High error rate
            - alert: MyWebAppHighErrorRate
              expr: |
                sum(rate(http_requests_total{job="my-web-app",status=~"5.."}[5m]))
                /
                sum(rate(http_requests_total{job="my-web-app"}[5m]))
                > 0.05
              for: 5m
              labels:
                severity: critical
                team: backend
              annotations:
                summary: "High error rate on my-web-app"
                description: "Error rate is {{ $value | humanizePercentage }} (threshold: 5%)"
                runbook_url: "https://wiki.internal/runbooks/my-web-app-errors"
            # Alert: High latency on P99
            - alert: MyWebAppHighLatency
              expr: |
                histogram_quantile(0.99,
                  sum(rate(http_request_duration_seconds_bucket{job="my-web-app"}[5m])) by (le)
                ) > 2.0
              for: 10m
              labels:
                severity: warning
                team: backend
              annotations:
                summary: "P99 latency above 2s on my-web-app"
                description: "P99 latency is {{ $value | humanizeDuration }}"
            # Recording rule: Pre-compute request rate for dashboards
            - record: my_web_app:http_requests:rate5m
              expr: sum(rate(http_requests_total{job="my-web-app"}[5m])) by (status, method)
The PrometheusRule resource follows the same label convention as ServiceMonitor — it must have the release: kube-prometheus-stack label (or whatever your Prometheus resource's ruleSelector expects). The Prometheus Operator detects the PrometheusRule, injects the rules into the Prometheus configuration, and reloads Prometheus automatically.
A few best practices for PrometheusRule definitions:
- Always include a for duration: This prevents transient spikes from triggering alerts. A for: 5m clause means the condition must be true for 5 consecutive minutes before the alert fires.
- Use severity labels: Standardize on severity: critical, severity: warning, and severity: info. Alertmanager can route alerts differently based on severity.
- Add runbook URLs: Include an annotations.runbook_url in every alert. When an engineer gets paged at 3am, a runbook link is the most valuable thing you can provide.
- Use recording rules for dashboards: Pre-compute expensive PromQL expressions as recording rules. Dashboards that query recording rules load significantly faster than those computing aggregations on the fly.
- Test rules before deploying: Use promtool check rules to validate rule syntax locally before applying to the cluster. promtool expects a plain Prometheus rule file, so run it against the groups list found under spec rather than against the full PrometheusRule manifest.
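Because promtool reads plain Prometheus rule files rather than Kubernetes manifests, one way to validate the rules above is to copy the content of spec into a standalone file. The sketch below does that for the first alert only. The filename my-web-app.rules.yaml is a hypothetical choice.

```yaml
# my-web-app.rules.yaml (hypothetical filename): spec.groups in plain rule-file form
groups:
  - name: my-web-app.rules
    rules:
      - alert: MyWebAppHighErrorRate
        expr: |
          sum(rate(http_requests_total{job="my-web-app",status=~"5.."}[5m]))
          /
          sum(rate(http_requests_total{job="my-web-app"}[5m]))
          > 0.05
        for: 5m
        labels:
          severity: critical
          team: backend
        annotations:
          summary: "High error rate on my-web-app"
```

Running promtool check rules my-web-app.rules.yaml against this file reports syntax and expression errors without touching the cluster. The only structural difference from the PrometheusRule is the missing apiVersion, kind, metadata, and spec wrapper.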
Conclusion
The Prometheus Operator's ServiceMonitor and PodMonitor CRDs are the standard way to extend Prometheus monitoring to your own applications in Kubernetes. They replace manual scrape configuration with a declarative, Kubernetes-native approach that integrates seamlessly with the kube-prometheus-stack.
The key principles to remember are: always check the three layers of label matching (Prometheus to ServiceMonitor, ServiceMonitor to Service, port name matching); use namespaceSelector to enable cross-namespace monitoring; choose scrape intervals that balance data granularity against resource consumption; and always pair your ServiceMonitors with PrometheusRule definitions to create a complete monitoring pipeline from metrics collection through alerting.
Once metrics are flowing, build custom Grafana dashboards to visualize your application-specific data. For fine-grained control over scrape behavior, resource limits, and storage, review the values.yaml configuration guide. And for clusters that need long-term metric retention beyond what local Prometheus storage provides, integrate Thanos for unlimited historical data.
When something is not working, debug systematically from the operator logs through the label selectors to the metrics endpoint itself. The most common issues are label mismatches that silently prevent the operator from picking up your ServiceMonitor — a problem that is easy to fix once you know where to look.