What Prometheus Does
Prometheus is a time-series database built specifically for monitoring. It works on a pull model: rather than waiting for applications to send metrics, Prometheus scrapes HTTP endpoints exposed by your services at regular intervals.
How Prometheus Collects Metrics
Applications expose metrics over HTTP (typically at /metrics) in the Prometheus text exposition format. Prometheus scrapes each endpoint at a configurable interval; 15 to 30 seconds is common, and the default is one minute:
# Example /metrics endpoint output (Prometheus format)
# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET",status="200"} 15243
http_requests_total{method="POST",status="200"} 3421
http_requests_total{method="POST",status="500"} 47
# HELP http_request_duration_seconds Request latency histogram
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.05"} 12841
http_request_duration_seconds_bucket{le="0.1"} 14987
http_request_duration_seconds_bucket{le="0.5"} 15201
http_request_duration_seconds_bucket{le="+Inf"} 15243
http_request_duration_seconds_sum 892.3
http_request_duration_seconds_count 15243

Prometheus Scrape Configuration
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'my-api'
    static_configs:
      - targets: ['localhost:8080']
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'postgres-exporter'
    static_configs:
      - targets: ['localhost:9187']

Prometheus Alerting Rules
Prometheus evaluates PromQL expressions against your stored metrics and fires alerts when conditions are met:
# Alert rule: high error rate
groups:
  - name: api-alerts
    rules:
      - alert: HighErrorRate
        # Aggregate both sides per instance. Dividing the raw series would
        # match each 5xx series against itself and evaluate to 1.
        expr: |
          sum by (instance) (rate(http_requests_total{status=~"5.."}[5m]))
            / sum by (instance) (rate(http_requests_total[5m])) > 0.01
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High error rate on {{ $labels.instance }}"
          description: "Error rate is {{ humanizePercentage $value }}"
      - alert: HighLatency
        expr: |
          histogram_quantile(0.99,
            rate(http_request_duration_seconds_bucket[5m])
          ) > 0.5
        for: 5m
        labels:
          severity: critical

What Grafana Does
Grafana is a data visualization and dashboard platform. It stores no metric data itself; instead, it connects to data sources such as Prometheus, CloudWatch, Elasticsearch, InfluxDB, and 50+ others and displays their data as charts, heatmaps, tables, and dashboards.
Grafana's Key Capabilities
- Multi-datasource dashboards. A single Grafana panel can overlay Prometheus metrics, CloudWatch metrics, and database query results on the same chart.
- Pre-built dashboard library. Grafana.com hosts thousands of community dashboards for common stacks (Kubernetes, PostgreSQL, nginx, Redis) that you can import with one click.
- Grafana Alerting. Since version 8, Grafana has shipped its own alerting engine, which can query multiple data sources (not just Prometheus) and route notifications via Alertmanager, PagerDuty, Slack, email, and 30+ others.
- Team access control. RBAC, org-level permissions, folder-based access, and dashboard provisioning via GitOps.
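The provisioning mentioned in the last item also covers data sources. As a minimal sketch, assuming Prometheus is reachable from Grafana at `http://prometheus:9090` (an assumed hostname, not one from this article), a data source file kept in Git might look like this:

```yaml
# Hypothetical file name: datasources/prometheus.yml
# Grafana reads files like this from /etc/grafana/provisioning/datasources/ at startup.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                # Grafana's backend issues the queries
    url: http://prometheus:9090  # assumed address; adjust to your network
    isDefault: true
```

Mounting the file into the Grafana container means the data source exists on first boot, with no manual setup in the UI.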
Grafana + Prometheus: How They Work Together
The typical open-source monitoring stack looks like this:
| Layer | Tool | Role |
|---|---|---|
| Collection | Prometheus + exporters | Scrape and store metrics |
| Visualization | Grafana | Dashboards and charts |
| Alerting | Prometheus Alertmanager or Grafana Alerting | Route notifications to on-call responders |
| Logs | Loki (Grafana Labs) | Log aggregation, queryable via Grafana |
| Traces | Tempo (Grafana Labs) or Jaeger | Distributed tracing, viewable via Grafana |
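To connect the collection and alerting layers in the table, Prometheus has to be told where its rule files live and where Alertmanager is listening. The fragment below is a sketch of the relevant prometheus.yml additions; the rule file path and the `alertmanager:9093` address are assumptions for illustration (9093 is Alertmanager's default port).

```yaml
# prometheus.yml (additions)
rule_files:
  - /etc/prometheus/alerts.yml        # assumed path to the alert rules file

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']   # assumed hostname
```

Prometheus evaluates the rules itself and only forwards firing alerts; grouping, silencing, and notification routing happen in Alertmanager.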
Quick Start: Docker Compose Stack
version: '3.8'
services:
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=30d'
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=secret
    volumes:
      - grafana-storage:/var/lib/grafana
    depends_on:
      - prometheus
  node-exporter:
    image: prom/node-exporter:latest
    ports:
      - "9100:9100"
volumes:
  grafana-storage:

Grafana vs Prometheus: Side-by-Side Comparison
| Feature | Prometheus | Grafana |
|---|---|---|
| Primary function | Metrics storage + collection | Visualization + dashboards |
| Data model | Time-series with labels | No storage — queries data sources |
| Query language | PromQL | Datasource-native (PromQL, SQL, etc.) |
| Data collection | Pull (scrapes /metrics) | Reads from connected data sources |
| Alerting | Yes (Alertmanager) | Yes (Grafana Alerting, multi-datasource) |
| Dashboards | Basic expression browser | Rich, customizable dashboards |
| Data sources | Exporters + Pushgateway | 50+ including Prometheus, CloudWatch, SQL |
| License | Apache 2.0 | AGPL-3.0 (OSS) / Enterprise |
When to Use Grafana Without Prometheus
Grafana is useful on its own when you want to visualize data sources you already have, without replicating their data into Prometheus:
- AWS CloudWatch as the data source. Skip the Prometheus + CloudWatch exporter complexity — just connect Grafana directly to CloudWatch.
- Mixed database + metrics dashboards. Grafana can query PostgreSQL or MySQL directly to display business metrics alongside technical ones.
- Elasticsearch / Loki logs. Grafana Explore mode provides excellent log analysis UX without needing Prometheus.
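The first two cases above can be sketched as a Grafana data source provisioning file. Everything specific here is an assumption for illustration: the region, the database host, the user, and the `POSTGRES_PASSWORD` environment variable. Field names follow Grafana's provisioning format as commonly documented, but they can differ between versions, so check them against the release you run.

```yaml
# Hypothetical file name: datasources/standalone.yml
apiVersion: 1
datasources:
  - name: CloudWatch
    type: cloudwatch
    jsonData:
      authType: default          # use the AWS SDK's default credential chain
      defaultRegion: us-east-1   # assumed region

  - name: BusinessDB
    type: postgres
    url: db.example.internal:5432    # assumed host
    user: grafana_reader             # assumed read-only user
    secureJsonData:
      password: $POSTGRES_PASSWORD   # expanded from the environment
    jsonData:
      database: app                  # assumed database name
      sslmode: require
```

A read-only database user is a sensible default here, since dashboard queries never need write access.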
Limitations of the Prometheus + Grafana Stack
The open-source stack is powerful but not without tradeoffs:
- Operational burden. You're responsible for running, scaling, and backing up Prometheus. At scale (millions of time series), Prometheus's memory footprint grows significantly.
- No built-in long-term storage. Prometheus default retention is 15 days. Long-term storage requires Thanos, Cortex, or VictoriaMetrics.
- No built-in on-call management. Alertmanager handles routing but doesn't provide on-call schedules, escalation policies, or incident management. You need PagerDuty, OpsGenie, or Better Stack on top.
- Pull model limitations. Prometheus can't scrape targets behind NAT or short-lived batch jobs without Pushgateway or agent-based collection.
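Two of these limitations are usually addressed in prometheus.yml itself. The sketch below assumes a remote storage endpoint at a placeholder URL and a Pushgateway running on its default port 9091; neither comes from this article.

```yaml
# prometheus.yml (additions)

# Long-term storage: forward samples to a remote-write compatible backend.
remote_write:
  - url: https://metrics.example.com/api/v1/write   # placeholder URL

scrape_configs:
  # Short-lived batch jobs push to Pushgateway; Prometheus then scrapes it.
  - job_name: 'pushgateway'
    honor_labels: true          # keep the job/instance labels the batch job pushed
    static_configs:
      - targets: ['localhost:9091']
```

Without `honor_labels: true`, Prometheus would overwrite the pushed `job` and `instance` labels with those of the Pushgateway target, which makes batch jobs hard to tell apart.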
Grafana Cloud vs Self-Hosted
| Option | Cost | Best For |
|---|---|---|
| Grafana OSS (self-hosted) | Free | Teams with ops capacity to self-host |
| Grafana Cloud Free | Free (10K metrics, 50GB logs, 50GB traces) | Small teams, getting started |
| Grafana Cloud Pro | $0.01/series/mo + usage | Teams wanting managed infrastructure |
| Grafana Enterprise | Custom pricing | Large orgs with compliance/SSO needs |
The Verdict
Don't choose between Grafana and Prometheus; use both. Prometheus collects and stores your metrics, and Grafana makes them readable and actionable. They're the foundation of the most widely used open-source monitoring stack in the world for a reason.
If you want to skip the operational overhead of self-hosting both, consider Grafana Cloud (managed SaaS with built-in Prometheus-compatible storage) or a SaaS tool like Better Stack for simpler uptime + alerting needs.