DORA Metrics: Complete Guide for Engineering Teams (2026)
DORA metrics are the gold standard for measuring software delivery performance. Used by Google, Netflix, and thousands of engineering teams worldwide, these four metrics separate elite teams from average ones — and they're measurable, actionable, and directly tied to business outcomes.
The 4 DORA Metrics at a Glance
Deployment Frequency
How often you ship
Lead Time for Changes
How fast code reaches prod
Change Failure Rate
How often deploys break things
Mean Time to Restore
How fast you recover
What Are DORA Metrics?
DORA (DevOps Research and Assessment) is a Google-backed research program that has studied high-performing engineering teams since 2014. Their annual State of DevOps Report surveys tens of thousands of professionals to identify what separates elite teams from low performers.
The research identifies four key metrics that consistently predict software delivery performance and organizational outcomes. In the 2019 State of DevOps Report, elite performers deployed 208x more frequently, had 106x faster lead times, recovered from incidents 2,604x faster, and had a 7x lower change failure rate than low performers.
Two metrics measure throughput (speed): Deployment Frequency and Lead Time for Changes. Two metrics measure stability (quality): Change Failure Rate and MTTR. Elite teams excel at both — they ship fast and safely.
MTTR starts with knowing about incidents the moment they happen
Better Stack provides 30-second interval monitoring, multi-channel alerting, and incident management — the foundation for achieving elite MTTR scores.
Try Better Stack Free →

The 4 DORA Metrics: Deep Dive
Deployment Frequency
How often your team successfully releases to production.
Why it matters
High frequency forces smaller changes, reducing risk per deployment and enabling faster feedback loops.
How to measure it
Count successful production deployments per day/week/month. Track in your CI/CD pipeline (GitHub Actions, Jenkins, CircleCI).
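Counting deploys per window can be sketched in a few lines. This is a minimal illustration with a hypothetical deploy log; in practice the timestamps would come from your CI/CD tool's deployment events.

```python
from datetime import date

# Hypothetical log: one date per successful production deploy.
deploys = [date(2026, 1, 5), date(2026, 1, 5), date(2026, 1, 7), date(2026, 1, 12)]

def deploys_per_week(deploys, start, end):
    """Average successful production deploys per 7-day window in [start, end]."""
    in_range = [d for d in deploys if start <= d <= end]
    weeks = ((end - start).days + 1) / 7
    return len(in_range) / weeks

freq = deploys_per_week(deploys, date(2026, 1, 1), date(2026, 1, 14))  # 4 deploys / 2 weeks
```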
Performance benchmarks
Elite
On-demand (multiple times/day)
High
1-7 times/week
Medium
1-6 times/month
Low
Less than once/month
How to improve
Implement trunk-based development, feature flags to decouple deploy from release, automated testing to build deploy confidence.
Lead Time for Changes
Time from code committed to that code running in production.
Why it matters
Short lead time enables rapid iteration and quick bug fixes. Long lead times are often caused by manual approvals, slow CI, or large batch sizes.
How to measure it
Measure from first commit of a change to deployment to production. Most Git platforms (GitHub, GitLab) can track this via deployment tracking.
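A sketch of the calculation, assuming you can export (first commit time, deploy time) pairs from your Git platform. DORA reporting typically uses the median, which resists distortion from the occasional long-lived PR better than the mean.

```python
from datetime import datetime
from statistics import median

# Hypothetical (first_commit_time, deployed_time) pairs for merged changes.
changes = [
    (datetime(2026, 1, 5, 9, 0), datetime(2026, 1, 5, 11, 30)),   # 2.5 hours
    (datetime(2026, 1, 6, 14, 0), datetime(2026, 1, 7, 10, 0)),   # 20 hours
]

def lead_times_hours(changes):
    """Lead time in hours for each change, commit to production."""
    return [(deployed - committed).total_seconds() / 3600
            for committed, deployed in changes]

median_lead_time = median(lead_times_hours(changes))
```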
Performance benchmarks
Elite
Less than 1 hour
High
1 day to 1 week
Medium
1 week to 1 month
Low
More than 1 month
How to improve
Reduce PR size, automate approvals for low-risk changes, speed up CI pipelines, eliminate manual handoff steps.
Change Failure Rate
Percentage of deployments that cause a degraded service requiring hotfix or rollback.
Why it matters
High CFR signals insufficient testing, poor deployment practices, or missing staging environments. A lower CFR means deployments are safer, so teams can ship more often with confidence.
How to measure it
Divide failed deployments (rollbacks + hotfixes) by total deployments in a period. Tag incidents with "deployment-caused" to make this trackable.
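The arithmetic is simple; the work is in reliably tagging which deploys failed. A minimal sketch:

```python
def change_failure_rate(total_deploys, failed_deploys):
    """CFR as a percentage. A failed deploy is one that needed a rollback or hotfix."""
    if total_deploys == 0:
        return 0.0
    return failed_deploys / total_deploys * 100

cfr = change_failure_rate(total_deploys=40, failed_deploys=3)  # 7.5% -> "High" band
```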
Performance benchmarks
Elite
0–5%
High
5–10%
Medium
10–15%
Low
More than 15%
How to improve
Improve test coverage, add canary deployments, implement blue-green deployments, require pre-deployment smoke tests.
Mean Time to Restore
Average time to recover from a production failure or incident.
Why it matters
Fast MTTR minimizes customer impact and revenue loss. Low MTTR is enabled by good observability, runbooks, and on-call practices.
How to measure it
Track incident start time (when monitoring first detects it) to resolution time (when service is restored). Average across all incidents.
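As a sketch, with hypothetical incident timestamps taken from monitoring alerts rather than manual reports:

```python
from datetime import datetime

# Hypothetical incidents: (detected_at, resolved_at) from monitoring alerts.
incidents = [
    (datetime(2026, 1, 3, 2, 15), datetime(2026, 1, 3, 2, 55)),   # 40 minutes
    (datetime(2026, 1, 9, 14, 0), datetime(2026, 1, 9, 15, 20)),  # 80 minutes
]

def mttr_minutes(incidents):
    """Mean time to restore, in minutes, across all incidents."""
    durations = [(end - start).total_seconds() / 60 for start, end in incidents]
    return sum(durations) / len(durations)

avg_mttr = mttr_minutes(incidents)
```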
Performance benchmarks
Elite
Less than 1 hour
High
Less than 1 day
Medium
1 day to 1 week
Low
More than 1 week
How to improve
Implement real-time alerting, write runbooks for common failures, practice incident response, improve observability.
DORA Elite vs. Low Performer Benchmarks
| Metric | Elite | Low |
|---|---|---|
| Deployment Frequency | Multiple times/day | Less than once/month |
| Lead Time for Changes | Under 1 hour | Over 1 month |
| Change Failure Rate | 0–5% | Over 15% |
| Mean Time to Restore | Under 1 hour | Over 1 week |
Source: DORA State of DevOps Report. Benchmarks updated annually.
How to Start Measuring DORA Metrics
Step 1: Define what counts as a "deployment"
Teams differ on this. Standardize on: successful deploys to production (not staging). Exclude hotfixes that were immediately rolled back. If you have multiple services, decide whether to measure per-service or aggregate.
Step 2: Instrument your CI/CD pipeline
Most modern CI/CD tools (GitHub Actions, GitLab CI, CircleCI) can track deployment events. Use deployment markers to record: deploy timestamp, commit SHA, deploy outcome (success/failure/rollback). This gives you Deployment Frequency and enables Lead Time calculation.
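As an illustration of the deployment marker itself, the snippet below builds a JSON event record that a CI/CD job could emit at the end of a deploy. The field names are assumptions for the sketch, not a standard schema.

```python
import json
from datetime import datetime, timezone

def deployment_marker(commit_sha, outcome):
    """Build a deployment event record to emit at the end of a CI/CD job.

    `outcome` must be one of "success", "failure", or "rollback".
    """
    if outcome not in {"success", "failure", "rollback"}:
        raise ValueError(f"unknown outcome: {outcome}")
    return json.dumps({
        "deployed_at": datetime.now(timezone.utc).isoformat(),
        "commit_sha": commit_sha,
        "outcome": outcome,
    })
```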
Step 3: Tag incidents as deployment-caused
In your incident management tool (PagerDuty, OpsGenie, Better Stack), add a "deployment-caused" label to incidents triggered by a deployment within the preceding 60 minutes. This lets you automatically calculate Change Failure Rate.
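The 60-minute window check can be automated. A minimal sketch, assuming you have incident start times and recent deploy times available:

```python
from datetime import datetime, timedelta

DEPLOY_WINDOW = timedelta(minutes=60)

def is_deployment_caused(incident_start, deploy_times):
    """True if any deploy happened within the 60 minutes before the incident."""
    return any(timedelta(0) <= incident_start - d <= DEPLOY_WINDOW
               for d in deploy_times)
```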
Step 4: Track incident start and resolution times
MTTR requires accurate incident timestamps. Ensure your monitoring alerts fire automatically (don't rely on manual incident creation). The clock starts when monitoring detects the issue, not when an engineer notices it. Use automated alerting with API Status Check or Better Stack.
Step 5: Build a DORA dashboard
Consolidate your four metrics in a weekly engineering dashboard. Review in sprint retrospectives. Focus on trends over time, not point-in-time scores. Improving one metric often requires accepting tradeoffs in another — monitor all four together.
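The steps above can be tied together in a weekly summary. This sketch classifies each metric against the elite benchmarks from the table above; treating "on-demand" deployment as 7+ deploys per week is an assumption for the example.

```python
def dora_summary(deploys_per_week, median_lead_hours, cfr_pct, mttr_minutes):
    """Classify each DORA metric against the elite thresholds (assumption:
    on-demand deployment approximated as 7+ deploys/week)."""
    return {
        "deployment_frequency": "elite" if deploys_per_week >= 7 else "below elite",
        "lead_time": "elite" if median_lead_hours < 1 else "below elite",
        "change_failure_rate": "elite" if cfr_pct <= 5 else "below elite",
        "mttr": "elite" if mttr_minutes < 60 else "below elite",
    }
```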
Build your MTTR foundation with real-time monitoring
Better Stack provides the incident detection, on-call alerting, and incident timeline data you need to track and improve MTTR — a key DORA metric for elite engineering teams.
Try Better Stack Free →

Common DORA Metrics Mistakes
✗ Measuring deployments to staging, not production
→ Only count successful production deployments. Staging deploys inflate Deployment Frequency without reflecting real delivery.
✗ Using human-reported incident start times
→ Your MTTR clock should start when your monitoring system detects the issue — not when an engineer creates an incident. This makes MTTR accurate and incentivizes fast detection.
✗ Gaming metrics with meaningless micro-deployments
→ Deploy small batches of real value, not empty commits. DORA metrics measure delivery performance — manipulating them misleads leadership and hides real problems.
✗ Measuring team-level without organizational context
→ DORA research shows that high-performing teams also need supportive culture and leadership. If your team scores well but shipping is blocked by organizational approvals, the org process is the bottleneck.
✗ Ignoring the relationship between metrics
→ Deployment Frequency and Lead Time often improve together. Change Failure Rate and MTTR often move inversely. Watch all four — obsessing on one metric alone creates blind spots.
Frequently Asked Questions
What are DORA metrics?
Four KPIs from Google's DevOps Research and Assessment program: Deployment Frequency, Lead Time for Changes, Change Failure Rate, and Mean Time to Restore. They predict software delivery performance and business outcomes.
What is a good DORA metric score for elite teams?
Elite benchmarks: Deployment Frequency — multiple times per day, Lead Time — under 1 hour, Change Failure Rate — 0-5%, MTTR — under 1 hour. These come from the annual DORA State of DevOps Report.
How do you measure MTTR for DORA?
Track incident start time (when monitoring alerts fire) to resolution time (when service is restored). Average across all incidents in a period. Use automated alerting — manual incident creation inflates MTTR.
Is MTTR the same as MTBF?
No. MTTR measures recovery speed; MTBF measures reliability (frequency of incidents). DORA uses MTTR and Change Failure Rate together to assess stability.