Is New Relic Down? How to Check New Relic Status in Real-Time
Quick Answer: To check if New Relic is down, visit apistatuscheck.com/api/new-relic for real-time monitoring, or check the official status.newrelic.com page. Common signs include agent connectivity failures, data ingestion lag, NRQL query timeouts, missing metrics, alert condition failures, and Synthetics monitor issues.
When your observability platform goes dark, you're flying blind. New Relic monitors your entire infrastructure, applications, and business metrics—making any downtime a critical incident. Whether you're seeing agents disconnected, queries timing out, or alerts failing to fire, knowing how to quickly verify New Relic's status can mean the difference between rapid incident resolution and hours of misdirected troubleshooting.
How to Check New Relic Status in Real-Time
1. API Status Check (Fastest Method)
The fastest way to verify New Relic's operational status is through apistatuscheck.com/api/new-relic. This real-time monitoring service:
- Tests actual API endpoints every 60 seconds
- Monitors data ingestion and query performance
- Tracks historical uptime over 30/60/90 days
- Provides instant alerts when issues are detected
- Monitors multiple regions (US, EU)
Unlike status pages that rely on manual updates, API Status Check performs active health checks against New Relic's production endpoints, including GraphQL API, REST API, and data ingestion pipelines, giving you the most accurate real-time picture of service availability.
2. Official New Relic Status Page
New Relic maintains status.newrelic.com as their official communication channel for service incidents. The page displays:
- Current operational status for all products
- Active incidents and investigations
- Scheduled maintenance windows
- Historical incident reports
- Component-specific status (APM, Infrastructure, Browser, Synthetics, Alerts, NRDB, UI)
Pro tip: Subscribe to status updates via email, webhook, or RSS feed on the status page. You can filter by specific products and regions to receive only relevant notifications.
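Subscriptions push updates to you, but you can also pull the same information programmatically. The sketch below assumes status.newrelic.com is a Statuspage-hosted site exposing the conventional /api/v2/status.json summary endpoint; verify the path against the page before relying on it:

```python
import json
import urllib.request

# Assumed endpoint: Statuspage-hosted sites conventionally expose
# /api/v2/status.json -- confirm this against status.newrelic.com
STATUS_URL = "https://status.newrelic.com/api/v2/status.json"

def summarize_status(payload):
    """Reduce a Statuspage status.json payload to a one-line summary."""
    status = payload.get("status", {})
    return "{}: {}".format(
        status.get("indicator", "unknown"),   # none / minor / major / critical
        status.get("description", "n/a"),
    )

def fetch_status(url=STATUS_URL, timeout=10):
    """Fetch the live status page summary (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return summarize_status(json.load(resp))

# Usage (requires network):
# print(fetch_status())
```

An `indicator` of anything other than "none" means New Relic has acknowledged an issue, which you can feed into your own alerting.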
3. Check Your New Relic One UI
If the New Relic One platform at one.newrelic.com is experiencing issues, you'll notice:
- Login failures or authentication timeouts
- Dashboard widgets stuck loading
- NRQL queries timing out or returning errors
- Entity explorer not populating
- Alert policy pages failing to load
- APM transaction traces unavailable
UI responsiveness is often the first indicator of backend database or API gateway issues.
4. Test API Endpoints Directly
For developers, making test API calls can quickly confirm connectivity and performance:
GraphQL API (NerdGraph) health check:
curl https://api.newrelic.com/graphql \
-H 'Content-Type: application/json' \
-H 'API-Key: YOUR_USER_KEY' \
-d '{"query": "{ actor { user { name email } } }"}'
REST API health check:
curl -X GET 'https://api.newrelic.com/v2/applications.json' \
-H 'Api-Key: YOUR_REST_API_KEY'
NRQL query via API:
curl -X GET "https://insights-api.newrelic.com/v1/accounts/YOUR_ACCOUNT_ID/query?nrql=SELECT%20count(*)%20FROM%20Transaction%20SINCE%201%20hour%20ago" \
-H "Accept: application/json" \
-H "X-Query-Key: YOUR_QUERY_KEY"
Look for HTTP response codes outside the 2xx range, timeout errors (>30s), or error responses indicating service degradation.
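That rule of thumb can be encoded directly if you want to script these probes. This is an illustrative helper; the thresholds and verdict strings are our own, not a New Relic API:

```python
def classify_response(status_code, elapsed_seconds, timeout_threshold=30.0):
    """Map an API probe result (HTTP status + latency) to a coarse verdict."""
    if elapsed_seconds > timeout_threshold:
        return "degraded: slower than {:.0f}s timeout threshold".format(timeout_threshold)
    if 200 <= status_code < 300:
        return "healthy"
    if status_code in (401, 403):
        # Auth failures usually mean a bad key on your side, not an outage
        return "auth error: check your API key, not New Relic status"
    if 500 <= status_code < 600:
        return "degraded: server-side error from New Relic"
    return "unexpected: HTTP {}".format(status_code)

print(classify_response(200, 0.4))   # healthy
print(classify_response(503, 1.2))   # degraded: server-side error from New Relic
```

Feed it the status code and `time_total` from any of the curl checks above to get a consistent verdict across endpoints.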
5. Use New Relic Diagnostics CLI
New Relic provides a diagnostic tool that can help identify connectivity and configuration issues:
# Download and run New Relic Diagnostics
curl -O https://download.newrelic.com/nrdiag/nrdiag_latest.zip
unzip nrdiag_latest.zip
./nrdiag -t GUID -attach
This tool performs comprehensive checks including:
- Network connectivity to New Relic collectors
- Agent configuration validation
- Proxy and firewall checks
- SSL/TLS certificate validation
- Local log file analysis
Common New Relic Issues and How to Identify Them
Agent Connectivity Failures
Symptoms:
- Agents showing "disconnected" in Entity Explorer
- No new data appearing in APM, Infrastructure, or Browser
- Agent logs showing connection timeouts or 503 errors
- Multiple applications across different hosts failing simultaneously
What it means: When agent connectivity fails across multiple hosts or regions, it typically indicates issues with New Relic's collector endpoints rather than your infrastructure. Single-agent failures are usually configuration or network issues on your end.
Diagnostic check:
import requests
import time

def check_agent_connectivity():
    """Test connectivity to New Relic collector endpoints"""
    collectors = [
        "https://collector.newrelic.com/status/mongrel",
        "https://rpm.newrelic.com/status/mongrel",
        "https://gov-collector.newrelic.com/status/mongrel"  # FedRAMP
    ]
    for collector in collectors:
        try:
            start = time.time()
            response = requests.get(collector, timeout=10)
            latency = (time.time() - start) * 1000
            if response.status_code == 200:
                print(f"✓ {collector}: OK ({latency:.0f}ms)")
            else:
                print(f"✗ {collector}: HTTP {response.status_code}")
        except requests.exceptions.Timeout:
            print(f"✗ {collector}: Timeout (>10s)")
        except requests.exceptions.RequestException as e:
            print(f"✗ {collector}: {str(e)}")

check_agent_connectivity()
Data Ingestion Lag
Indicators:
- Metrics delayed by 5+ minutes (normal is 10-60 seconds)
- "Data may be incomplete" warnings in dashboards
- Recent time windows showing significantly fewer data points
- Alert conditions not triggering despite threshold breaches
- APM transaction traces missing for recent requests
Impact: Data ingestion lag means you're making decisions based on stale information. A 10-minute lag during an incident can cost precious troubleshooting time.
Detection query:
-- Check data freshness across different data types
SELECT
latest(timestamp) as lastSeen,
(now() - latest(timestamp)) / 1000 as lagSeconds
FROM Transaction
SINCE 5 minutes ago
-- Compare with other event types
SELECT
latest(timestamp) as lastMetric
FROM Metric
WHERE metricName = 'apm.service.transaction.duration'
SINCE 5 minutes ago
If lagSeconds exceeds 300 (5 minutes) consistently, ingestion is degraded.
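The same freshness check can be automated from outside the UI by running the NRQL through NerdGraph. A sketch, following New Relic's documented `actor.account.nrql` query pattern (treat the exact GraphQL field names and the `Nrql` variable type as something to verify against current NerdGraph docs):

```python
import time
import requests

NERDGRAPH_URL = "https://api.newrelic.com/graphql"

def lag_seconds(last_seen_ms, now_s=None):
    """Convert a latest-event timestamp (milliseconds) into lag in seconds."""
    if now_s is None:
        now_s = time.time()
    return now_s - last_seen_ms / 1000.0

def ingestion_lag(api_key, account_id):
    """Fetch latest(timestamp) from Transaction via NerdGraph, return lag."""
    gql = """
    query($accountId: Int!, $nrql: Nrql!) {
      actor { account(id: $accountId) { nrql(query: $nrql) { results } } }
    }
    """
    nrql = "SELECT latest(timestamp) AS lastSeen FROM Transaction SINCE 5 minutes ago"
    resp = requests.post(
        NERDGRAPH_URL,
        headers={"Content-Type": "application/json", "API-Key": api_key},
        json={"query": gql, "variables": {"accountId": account_id, "nrql": nrql}},
        timeout=15,
    )
    resp.raise_for_status()
    results = resp.json()["data"]["actor"]["account"]["nrql"]["results"]
    return lag_seconds(results[0]["lastSeen"])

# Usage (requires network and a valid user key):
# lag = ingestion_lag("YOUR_USER_KEY", YOUR_ACCOUNT_ID)
# print("degraded" if lag > 300 else "ok")
```

Run this on a schedule from outside your New Relic-monitored infrastructure so the check still works when agents can't connect.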
NRQL Query Timeouts
Common error patterns:
- Queries that normally return in <2s timing out after 30-60s
- "Query timeout" errors in dashboards
- GraphQL queries returning 500 errors
- NRDB (New Relic Database) performance degradation
When NRDB is impacted:
- All queries slow down, not just complex ones
- Simple SELECT count(*) FROM Transaction queries fail
- Historical data queries (SINCE 30 days ago) are especially affected
Diagnostic NRQL:
-- Test query performance with progressively larger time windows
SELECT count(*)
FROM Transaction
SINCE 1 hour ago
-- If this works but "SINCE 1 day ago" times out, NRDB is struggling
-- Check for query performance issues
SELECT percentile(duration, 50, 95, 99)
FROM NrdbQuery
WHERE query LIKE '%Transaction%'
SINCE 1 hour ago
FACET query
Alert Condition Failures
Critical symptoms:
- Alerts not firing despite threshold breaches visible in charts
- Alert violations showing in UI but no notifications sent
- Webhook and email integrations failing
- Incident timelines missing expected violations
- PagerDuty/Slack notifications not arriving
When this happens during a real incident: You lose your primary detection mechanism. Your production issues go undetected until customers report them.
Validation approach:
import time
import requests

def test_alert_evaluation():
    """Trigger a test alert to verify the alert pipeline"""
    API_KEY = "YOUR_USER_KEY"
    ACCOUNT_ID = "YOUR_ACCOUNT_ID"
    # Send a custom event that should trigger the test alert
    payload = [{
        "eventType": "TestAlertMetric",
        "value": 1000,  # above threshold
        "timestamp": int(time.time())
    }]
    response = requests.post(
        f"https://insights-collector.newrelic.com/v1/accounts/{ACCOUNT_ID}/events",
        headers={
            "Content-Type": "application/json",
            "Api-Key": API_KEY
        },
        json=payload
    )
    if response.status_code == 200:
        print("✓ Event ingestion working")
        print("Check if alert fires within 3-5 minutes")
        print("If data arrives but alert doesn't fire, alert pipeline is down")
    else:
        print(f"✗ Event ingestion failed: {response.status_code}")

test_alert_evaluation()
Synthetics Monitor Problems
Failure patterns:
- All monitors showing failures simultaneously across locations
- Monitors stuck in "pending" state
- Monitor results not appearing in UI
- Scripted browser monitors timing out
- API monitors returning connection errors
Distinguishing between target and New Relic issues:
- If monitors for DIFFERENT targets all fail → New Relic Synthetics issue
- If monitors for SAME target from multiple locations fail → Your target is down
- If public monitors (google.com, etc.) succeed but yours fail → Your target issue
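Those three rules amount to a small decision table you can encode for an automated runbook. A sketch (the input flags and verdict strings are our own naming, not a New Relic API):

```python
def triage_synthetics(my_monitors_failing,
                      other_targets_failing,
                      reference_monitors_failing):
    """Apply the target-vs-platform triage rules to Synthetics results."""
    if not my_monitors_failing:
        return "no action: your monitors are passing"
    if other_targets_failing or reference_monitors_failing:
        # Failures across unrelated targets (including public reference
        # monitors like google.com) point at the Synthetics platform itself
        return "suspect New Relic Synthetics"
    # Only your target fails, from every location: the target itself is down
    return "suspect your target"

print(triage_synthetics(True, True, True))    # suspect New Relic Synthetics
print(triage_synthetics(True, False, False))  # suspect your target
```

Keeping a couple of reference monitors against well-known public sites makes the third input cheap to compute.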
Diagnostic script:
// Synthetics scripted browser test to verify Synthetics runtime
$browser.get("https://one.newrelic.com/");
$browser.wait($driver.until.elementLocated($driver.By.css("body")), 5000)
  .then(function() {
    console.log("✓ Synthetics can reach external sites");
    return $browser.findElement($driver.By.css("body")).getAttribute("innerHTML");
  })
  .then(function(html) {
    if (html.length > 100) {
      console.log("✓ Synthetics browser runtime healthy");
    } else {
      console.log("✗ Response too short, possible issue");
    }
  });
Browser Agent Issues
Symptoms:
- JavaScript errors spike across all applications
- Browser agent script (js-agent.newrelic.com) failing to load
- PageView events not appearing in Browser data
- Session traces unavailable
- Core Web Vitals metrics missing
CDN vs data collection distinction:
- Agent script fails to load → CDN issue
- Agent loads but no data in UI → Data collection pipeline issue
Detection snippet:
// Add to your application to detect Browser agent health
window.addEventListener('load', function() {
  setTimeout(function() {
    if (typeof newrelic === 'undefined') {
      console.error('New Relic Browser agent failed to load');
      // Report to backup monitoring
      fetch('/api/monitoring/alert', {
        method: 'POST',
        body: JSON.stringify({
          severity: 'high',
          message: 'New Relic Browser agent unavailable'
        })
      });
    } else {
      console.log('✓ New Relic Browser agent loaded');
    }
  }, 3000);
});
The Real Impact When New Relic Goes Down
Observability Blind Spots
When New Relic is unavailable, you lose visibility into:
- Application performance: No APM data means you can't see transaction response times, error rates, or throughput
- Infrastructure health: Missing CPU, memory, disk, and network metrics
- Business metrics: Custom events and metrics stop flowing
- User experience: Browser monitoring and real user data unavailable
- Synthetic monitoring: Proactive checks stop running
The compounding effect: If a production incident occurs WHILE New Relic is down, you're troubleshooting blind. You can't:
- Identify which service is causing the issue
- See error traces and stack traces
- Analyze database query performance
- Understand user impact geography
- Correlate infrastructure metrics with application behavior
Increased Mean Time to Resolution (MTTR)
Without observability, MTTR skyrockets:
- Normal MTTR with full observability: 15-45 minutes
- MTTR without observability tools: 2-6 hours or more
Why the dramatic increase:
- You must manually SSH into servers to check logs
- No centralized error aggregation or filtering
- No transaction traces to pinpoint slow components
- Can't correlate issues across services
- Must rely on customer reports instead of proactive detection
Cost calculation: If your average incident costs $10,000/hour in lost revenue and team time, losing New Relic during a critical incident can add $20,000-$50,000 in additional costs from extended downtime.
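The arithmetic behind that estimate is just the extra hours of MTTR times hourly incident cost. With the figures above (a $10,000/hour incident, MTTR growing from roughly 30 minutes to 2.5-5.5 hours) the additional cost lands in the quoted $20,000-$50,000 range:

```python
def extra_outage_cost(hourly_cost, normal_mttr_h, blind_mttr_h):
    """Additional incident cost attributable to losing observability."""
    return hourly_cost * (blind_mttr_h - normal_mttr_h)

print(extra_outage_cost(10_000, 0.5, 2.5))  # 20000.0
print(extra_outage_cost(10_000, 0.5, 5.5))  # 50000.0
```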
Capacity Planning Gaps
New Relic outages create blind spots in capacity planning:
- Missing trend data: Can't analyze growth patterns during outage window
- Incomplete historical analysis: Gaps in 30/60/90-day reports
- Auto-scaling failures: If infrastructure decisions rely on New Relic metrics
- Inaccurate forecasting: Models trained on incomplete data
For businesses planning Black Friday, product launches, or other high-traffic events, even a 2-hour gap in historical data can impact capacity decisions worth millions.
SLA Reporting Failures
Many businesses rely on New Relic data for SLA reporting:
- Customer-facing SLAs: Can't prove 99.9% uptime if monitoring was down
- Internal SLIs/SLOs: Service Level Indicators incomplete
- Compliance requirements: Audit trails with gaps
- Financial implications: SLA breach penalties if you can't prove uptime
The double bind: If your service AND monitoring are both down, you can't definitively prove the duration of the outage, potentially triggering maximum SLA credits.
Alert Fatigue and Missed Incidents
When New Relic comes back online after an outage:
- Alert storm: Backlog of alerts fire simultaneously
- False positives: Transient issues during recovery trigger alerts
- Missed incidents: Real issues buried in noise
- Team burnout: Engineers overwhelmed by notification flood
This can lead to future alerts being ignored or deprioritized, reducing the effectiveness of your monitoring strategy for weeks after the incident.
What to Do When New Relic Goes Down
1. Implement Multi-Provider Observability
Never rely on a single observability platform. Implement defense in depth:
# Multi-provider metrics router with automatic failover
import time
import requests
from enum import Enum

class MetricsProvider(Enum):
    NEW_RELIC = "newrelic"
    DATADOG = "datadog"
    GRAFANA_CLOUD = "grafana"

class ObservabilityRouter:
    def __init__(self):
        self.providers = {
            MetricsProvider.NEW_RELIC: {
                "url": "https://insights-collector.newrelic.com/v1/accounts/{account}/events",
                "api_key": "YOUR_NR_KEY",
                "healthy": True,
                "last_check": 0
            },
            MetricsProvider.DATADOG: {
                "url": "https://api.datadoghq.com/api/v1/series",
                "api_key": "YOUR_DD_KEY",
                "healthy": True,
                "last_check": 0
            }
        }
        self.health_check_interval = 60  # seconds

    def send_metric(self, metric_name, value, tags=None):
        """Send metric to all healthy providers"""
        results = []
        for provider, config in self.providers.items():
            if self._is_healthy(provider):
                try:
                    self._send_to_provider(provider, metric_name, value, tags)
                    results.append((provider, True))
                except Exception as e:
                    print(f"Failed to send to {provider.value}: {e}")
                    self._mark_unhealthy(provider)
                    results.append((provider, False))
        # At least one provider must succeed
        if not any(success for _, success in results):
            raise Exception("All observability providers failed")
        return results

    def _is_healthy(self, provider):
        """Check if provider is healthy (with caching)"""
        config = self.providers[provider]
        now = time.time()
        # Re-check health every 60 seconds
        if now - config["last_check"] > self.health_check_interval:
            config["healthy"] = self._health_check(provider)
            config["last_check"] = now
        return config["healthy"]

    def _health_check(self, provider):
        """Perform actual health check against provider"""
        config = self.providers[provider]
        try:
            if provider == MetricsProvider.NEW_RELIC:
                # Test New Relic ingestion endpoint
                response = requests.post(
                    config["url"],
                    headers={"Api-Key": config["api_key"]},
                    json=[{"eventType": "HealthCheck", "value": 1}],
                    timeout=5
                )
                return response.status_code == 200
            elif provider == MetricsProvider.DATADOG:
                # Test Datadog API
                response = requests.get(
                    "https://api.datadoghq.com/api/v1/validate",
                    headers={"DD-API-KEY": config["api_key"]},
                    timeout=5
                )
                return response.status_code == 200
        except Exception as e:
            print(f"Health check failed for {provider.value}: {e}")
        return False  # unknown provider or failed check

    def _mark_unhealthy(self, provider):
        """Mark provider as unhealthy"""
        self.providers[provider]["healthy"] = False
        self.providers[provider]["last_check"] = time.time()

    def _send_to_provider(self, provider, metric_name, value, tags):
        """Provider-specific metric sending logic"""
        config = self.providers[provider]
        if provider == MetricsProvider.NEW_RELIC:
            payload = [{
                "eventType": "CustomMetric",
                "metricName": metric_name,
                "value": value,
                "timestamp": int(time.time()),
                **(tags or {})
            }]
            response = requests.post(
                config["url"],
                headers={"Api-Key": config["api_key"]},
                json=payload,
                timeout=10
            )
            response.raise_for_status()
        elif provider == MetricsProvider.DATADOG:
            payload = {
                "series": [{
                    "metric": metric_name,
                    "points": [[int(time.time()), value]],
                    "type": "gauge",
                    "tags": [f"{k}:{v}" for k, v in (tags or {}).items()]
                }]
            }
            response = requests.post(
                config["url"],
                headers={"DD-API-KEY": config["api_key"]},
                json=payload,
                timeout=10
            )
            response.raise_for_status()

# Usage
router = ObservabilityRouter()
# Automatically routes to all healthy providers
router.send_metric("api.response_time", 145, tags={
    "endpoint": "/api/users",
    "status": "200"
})
Recommended backup observability stack:
- Metrics: Datadog, Grafana Cloud, or Prometheus
- Logs: Splunk, Elasticsearch, or Loki
- Errors: Sentry for application errors
- Uptime: PagerDuty for synthetic monitoring and alerting
2. Implement Local Metrics Collection
Don't send ALL metrics to the cloud. Maintain local collection for critical data:
# Local metrics collector backed by SQLite
import json
import sqlite3
import time

import requests

class LocalMetricsStore:
    """SQLite-based local metrics storage for New Relic outages"""

    def __init__(self, db_path="metrics.db"):
        self.conn = sqlite3.connect(db_path, check_same_thread=False)
        self._create_tables()

    def _create_tables(self):
        self.conn.execute("""
            CREATE TABLE IF NOT EXISTS metrics (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                timestamp INTEGER NOT NULL,
                metric_name TEXT NOT NULL,
                value REAL NOT NULL,
                tags TEXT,
                synced_to_newrelic INTEGER DEFAULT 0
            )
        """)
        # SQLite does not support inline INDEX clauses in CREATE TABLE;
        # indexes must be created as separate statements
        self.conn.execute("CREATE INDEX IF NOT EXISTS idx_timestamp ON metrics (timestamp)")
        self.conn.execute("CREATE INDEX IF NOT EXISTS idx_metric ON metrics (metric_name)")
        self.conn.execute("CREATE INDEX IF NOT EXISTS idx_synced ON metrics (synced_to_newrelic)")
        self.conn.commit()

    def record(self, metric_name, value, tags=None):
        """Record metric locally"""
        self.conn.execute(
            "INSERT INTO metrics (timestamp, metric_name, value, tags) VALUES (?, ?, ?, ?)",
            (int(time.time()), metric_name, value, json.dumps(tags or {}))
        )
        self.conn.commit()

    def sync_to_newrelic(self, api_key, account_id):
        """Backfill unsynced metrics to New Relic once it's back online"""
        cursor = self.conn.execute("""
            SELECT id, timestamp, metric_name, value, tags
            FROM metrics
            WHERE synced_to_newrelic = 0
            ORDER BY timestamp ASC
            LIMIT 1000
        """)
        unsynced = cursor.fetchall()
        if not unsynced:
            print("All metrics synced!")
            return 0

        # Batch send to New Relic
        events = []
        for row_id, ts, name, value, tags_json in unsynced:
            events.append({
                "eventType": "BackfilledMetric",
                "metricName": name,
                "value": value,
                "timestamp": ts,
                **json.loads(tags_json)
            })

        try:
            response = requests.post(
                f"https://insights-collector.newrelic.com/v1/accounts/{account_id}/events",
                headers={"Api-Key": api_key},
                json=events,
                timeout=30
            )
            if response.status_code == 200:
                # Mark as synced
                ids = [row[0] for row in unsynced]
                placeholders = ','.join('?' * len(ids))
                self.conn.execute(
                    f"UPDATE metrics SET synced_to_newrelic = 1 WHERE id IN ({placeholders})",
                    ids
                )
                self.conn.commit()
                print(f"✓ Synced {len(unsynced)} metrics to New Relic")
                return len(unsynced)
            else:
                print(f"✗ Sync failed: HTTP {response.status_code}")
                return 0
        except Exception as e:
            print(f"✗ Sync error: {e}")
            return 0

    def query(self, metric_name, start_time, end_time):
        """Query local metrics (for emergency dashboards)"""
        cursor = self.conn.execute("""
            SELECT timestamp, value, tags
            FROM metrics
            WHERE metric_name = ?
              AND timestamp BETWEEN ? AND ?
            ORDER BY timestamp ASC
        """, (metric_name, start_time, end_time))
        return cursor.fetchall()

# Usage during New Relic outage
local_store = LocalMetricsStore()

# Record metrics locally
local_store.record("api.requests", 150, {"endpoint": "/api/users", "status": "200"})

# Once New Relic is back online, backfill
local_store.sync_to_newrelic("YOUR_NR_KEY", "YOUR_ACCOUNT_ID")
3. Fallback Alert Mechanisms
Don't rely solely on New Relic alerts. Implement multi-layer alerting:
#!/bin/bash
# emergency-monitor.sh - Runs when New Relic is down

while true; do
    # Check critical endpoint
    response_time=$(curl -o /dev/null -s -w '%{time_total}\n' https://api.yourapp.com/health)
    response_code=$(curl -o /dev/null -s -w '%{http_code}\n' https://api.yourapp.com/health)

    # Check server resources
    cpu_usage=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
    mem_usage=$(free | grep Mem | awk '{print ($3/$2) * 100.0}')

    # Alert if thresholds breached
    if (( $(echo "$response_time > 2.0" | bc -l) )); then
        # Note: PagerDuty's REST API requires a From header identifying
        # a valid user email on the account
        curl -X POST "https://api.pagerduty.com/incidents" \
            -H "Authorization: Token token=YOUR_PD_TOKEN" \
            -H "From: oncall@yourcompany.com" \
            -H "Content-Type: application/json" \
            -d '{
                "incident": {
                    "type": "incident",
                    "title": "API response time > 2s (New Relic down, using fallback)",
                    "service": {"id": "YOUR_SERVICE_ID", "type": "service_reference"},
                    "urgency": "high",
                    "body": {"type": "incident_body", "details": "Response time: '"$response_time"'s"}
                }
            }'
    fi

    if [ "$response_code" != "200" ]; then
        # Send to backup alerting
        echo "CRITICAL: API health check returned $response_code" | \
            mail -s "Emergency Alert" oncall@yourcompany.com
    fi

    sleep 60
done
4. Diagnostic NRQL Queries for New Relic Health
When you suspect New Relic issues, run these diagnostic queries:
-- 1. Check data freshness across event types
SELECT
count(*) as events,
latest(timestamp) as mostRecent,
(now() - latest(timestamp)) / 1000 as lagSeconds
FROM Transaction
SINCE 10 minutes ago
-- If lagSeconds > 300, data ingestion is lagging
-- 2. Identify gaps in metric reporting
SELECT
histogram(timestamp, 60000, 10)
FROM Metric
WHERE metricName = 'apm.service.transaction.duration'
SINCE 30 minutes ago
-- Look for missing buckets indicating ingestion gaps
-- 3. Check agent connectivity over time
SELECT
uniqueCount(entityGuid) as connectedAgents
FROM SystemSample
SINCE 1 hour ago
TIMESERIES 1 minute
-- Sudden drops indicate agent connectivity issues
-- 4. Verify alert condition evaluation
SELECT
count(*) as evaluations,
filter(count(*), WHERE result = 'violation') as violations
FROM NrAiIncident
WHERE conditionName = 'YOUR_CRITICAL_ALERT'
SINCE 1 hour ago
TIMESERIES 5 minutes
-- If evaluations = 0, alert pipeline is not running
-- 5. Synthetics monitor health check
SELECT
percentage(count(*), WHERE result = 'SUCCESS') as successRate,
average(duration) as avgDuration
FROM SyntheticCheck
SINCE 30 minutes ago
FACET monitorName
-- If ALL monitors show low success rate, Synthetics platform issue
5. Create an Emergency Runbook
Document your New Relic outage response procedure:
Immediate actions (0-5 minutes):
- Verify outage via status.newrelic.com and API Status Check
- Enable fallback monitoring scripts
- Switch to backup observability platform dashboards
- Notify engineering team via Slack/PagerDuty
- Start incident timeline documentation
Short-term mitigation (5-30 minutes):
- Increase log verbosity on critical services
- Enable local metrics collection
- Set up emergency health check endpoints
- Brief on-call engineers about reduced observability
- Defer non-critical deployments until New Relic returns
Recovery actions (after restoration):
- Backfill local metrics to New Relic
- Review alert conditions for missed violations
- Check for data gaps in dashboards
- Validate agent connectivity across all hosts
- Document incident and improve runbook
Frequently Asked Questions
How often does New Relic go down?
New Relic maintains strong uptime, typically 99.9%+ availability. Major platform-wide outages are rare (2-4 times per year), though regional or component-specific issues (affecting only APM, or only EU region, etc.) may occur more frequently. Most businesses experience zero downtime from New Relic in a typical quarter.
What's the difference between New Relic status page and API Status Check?
The official New Relic status page (status.newrelic.com) is manually updated by New Relic's team during incidents, which can lag behind actual issues by 5-15 minutes. API Status Check performs automated health checks every 60 seconds against live endpoints (GraphQL API, REST API, data ingestion), often detecting issues before they're officially reported. Use both for comprehensive awareness.
Can I get SLA credits for New Relic outages?
New Relic offers SLA credits for eligible customers (typically Pro and Enterprise tiers) when uptime falls below 99.95% in a calendar month. Credits are calculated as a percentage of monthly fees based on achieved uptime. Review your specific contract or contact New Relic account team for your tier's SLA terms. Standard tier typically does not include SLA guarantees.
Should I rely on New Relic for critical production alerts?
While New Relic Alerts is highly reliable, best practice for mission-critical alerts is defense in depth: use New Relic as primary alerting but implement backup alerting via PagerDuty, Datadog, or custom scripts that directly monitor your services. This ensures alert redundancy if any single platform experiences issues.
How do I prevent duplicate metrics during New Relic outages?
When using local metrics collection with later backfill, mark backfilled events with a distinct eventType (e.g., "BackfilledMetric" instead of "Metric") and include a backfilled: true attribute. This allows you to filter them in queries and avoid double-counting in dashboards that aggregate both real-time and historical data.
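In practice that means constructing every backfilled payload the same way. A small helper sketch (the attribute names mirror the convention suggested above; they are not a New Relic requirement):

```python
def make_backfill_event(metric_name, value, original_ts, tags=None):
    """Build an event payload marked so it can be filtered out of live queries."""
    return {
        "eventType": "BackfilledMetric",  # distinct from live event types
        "metricName": metric_name,
        "value": value,
        "timestamp": original_ts,  # original collection time, not send time
        "backfilled": True,        # lets NRQL exclude: WHERE backfilled IS NOT TRUE
        **(tags or {}),
    }

event = make_backfill_event("api.requests", 150, 1700000000, {"endpoint": "/api/users"})
```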
What causes New Relic agent disconnections?
Agent disconnections can result from: (1) Network issues between your infrastructure and New Relic collectors, (2) Proxy or firewall blocking collector endpoints, (3) New Relic collector outages, (4) Agent configuration errors, (5) Certificate validation failures. Use New Relic Diagnostics CLI to distinguish between local vs platform issues.
How long does New Relic retain data during outages?
New Relic has built-in buffering and retry logic. Agents typically buffer data locally for 1-2 hours during collector outages. If the outage exceeds this window, data may be lost. Event data has different retention (Real-time: seconds, Standard: 1 minute, Extended: up to 1 hour). Plan for local persistence if data is business-critical.
Can I test if New Relic is working without affecting production data?
Yes. Create a test application in New Relic and send synthetic events:
curl -X POST "https://insights-collector.newrelic.com/v1/accounts/YOUR_ACCOUNT/events" \
-H "Api-Key: YOUR_INSERT_KEY" \
-H "Content-Type: application/json" \
-d '[{"eventType":"HealthCheckTest","value":1}]'
Then query for these events in NRDB. This tests the full pipeline (ingestion, storage, query) without impacting production data.
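To make that round-trip verifiable, tag each test event with a unique run ID so the follow-up NRQL matches only this run's data. A sketch (the `runId` attribute is our own convention, not part of the API):

```python
import time

def health_check_event(run_id):
    """A uniquely tagged synthetic event for pipeline round-trip testing."""
    return {"eventType": "HealthCheckTest", "runId": run_id, "value": 1}

def verification_nrql(run_id, window="10 minutes ago"):
    """NRQL that should return count >= 1 once the event lands in NRDB."""
    return ("SELECT count(*) FROM HealthCheckTest "
            "WHERE runId = '{}' SINCE {}".format(run_id, window))

run_id = str(int(time.time()))
print(verification_nrql(run_id))
```

Send the event via the curl command above (with `runId` added to the JSON body), wait a minute for ingestion, then run the generated NRQL; a zero count indicates the ingestion or query side of the pipeline is degraded.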
Why do my NRQL queries work in EU but not US (or vice versa)?
New Relic operates separate data centers for US and EU regions. If queries work in one region but not another, it indicates regional infrastructure issues. Check status.newrelic.com for region-specific incident reports. Your account data resides in ONE region based on account creation; cross-region queries aren't supported.
Should I increase agent data collection during New Relic outages?
No—counter-intuitively, reduce data collection during outages. Agents buffer data locally, and excessive buffering can cause memory issues. Instead, maintain minimal critical metrics locally and reduce transaction trace collection, browser agent sampling, and custom event volume until New Relic service is restored.
Stay Ahead of New Relic Outages
Don't let observability blind spots derail your incident response. Subscribe to real-time New Relic alerts and get notified instantly when issues are detected—before your monitoring goes dark.
API Status Check monitors New Relic 24/7 with:
- 60-second health checks across API, ingestion, and query endpoints
- Instant alerts via email, Slack, Discord, or webhook
- Historical uptime tracking and incident reports
- Multi-API monitoring for your entire observability stack
Start monitoring New Relic now →
Last updated: February 4, 2026. New Relic status information is provided in real-time based on active monitoring. For official incident reports, always refer to status.newrelic.com.
Related guides:
Monitor Your APIs
Check the real-time status of 100+ popular APIs used by developers.
View API Status →