
Python Monitoring Guide: Django, FastAPI & Application Observability (2026)

Python's GIL, process-based worker model, and dynamic typing create unique monitoring challenges. This guide covers how to instrument Django and FastAPI apps with Prometheus and OpenTelemetry, detect memory leaks and GIL contention, and choose the right APM tool.

Updated April 2026 · 12 min read · Python / Django / FastAPI


TL;DR — Python Monitoring Checklist

  • ✅ Track per-worker RSS memory — growing workers = memory leak
  • ✅ Monitor Gunicorn worker saturation — all workers busy = add capacity
  • ✅ Add Prometheus with prometheus_client or prometheus-fastapi-instrumentator
  • ✅ Track database query time — usually the #1 Django performance bottleneck
  • ✅ Use Sentry for exception tracking — captures full stack traces with context
  • ✅ Configure Celery monitoring if using async tasks (queue depth, failure rate)

Python-Specific Monitoring Considerations

Python's architecture differs from Java or Node.js in ways that directly affect how you monitor it:

The GIL (Global Interpreter Lock)

CPython's GIL allows only one thread to execute Python bytecode at a time. In practice:

  • CPU-bound threads block each other rather than running in parallel
  • C extensions (numpy, database drivers) release the GIL during native work, so those threads can run in parallel
  • Multi-processing is more effective than multi-threading for CPU-heavy work

Monitor single-core CPU saturation — a Python process maxing one core while others sit idle is a GIL bottleneck.
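The effect is easy to reproduce with just the standard library. A sketch comparing a thread pool and a process pool on the same CPU-bound function (loop size and worker count are arbitrary); on a multi-core machine the process pool typically finishes much faster:

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_task(n: int) -> int:
    # Pure-Python arithmetic: holds the GIL for its entire runtime
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed_run(executor_cls, n=500_000, workers=4):
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as ex:
        results = list(ex.map(cpu_task, [n] * workers))
    return time.perf_counter() - start, results

if __name__ == "__main__":
    t_thr, _ = timed_run(ThreadPoolExecutor)    # serialized by the GIL
    t_proc, _ = timed_run(ProcessPoolExecutor)  # true parallelism
    print(f"threads: {t_thr:.2f}s, processes: {t_proc:.2f}s")
```

This is the same pattern your monitoring should catch: the thread-pool run saturates a single core no matter how many workers you give it.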

Worker process model

Django (Gunicorn) typically runs as multiple separate processes (workers). Each worker has its own memory — monitor RSS per worker, not just total process memory. A memory leak compounds: 8 workers × 200MB leak = 1.6GB over time. Gunicorn's max_requests setting can mitigate slow leaks by recycling workers periodically.
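That mitigation is a few lines of Gunicorn configuration. A sketch using real Gunicorn settings, with illustrative numbers rather than recommendations:

```python
# gunicorn.conf.py
workers = 8                # separate processes, each with its own RSS to monitor
max_requests = 1000        # recycle a worker after it has served 1000 requests
max_requests_jitter = 100  # randomize the threshold so workers don't all restart at once
```

Worker recycling caps how far a slow leak can grow, but it is a mitigation, not a fix: keep tracking per-worker RSS to find the leak itself.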

Dynamic typing makes errors runtime surprises

Python's lack of compile-time type checking means type errors surface at runtime in production. Exception tracking (Sentry) is not optional for Python — it's the primary way to catch type errors, AttributeErrors, and KeyErrors that static languages would catch at build time.
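Wiring Sentry into Django is a short settings change. A minimal sketch, assuming the sentry-sdk package is installed and the DSN placeholder is replaced with your project's real DSN:

```python
# settings.py — minimal Sentry setup for Django
import sentry_sdk
from sentry_sdk.integrations.django import DjangoIntegration

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    integrations=[DjangoIntegration()],
    traces_sample_rate=0.1,   # sample 10% of transactions for performance data
    send_default_pii=False,   # don't attach user PII to error events
)
```

Once initialized, unhandled AttributeErrors, KeyErrors, and TypeErrors arrive in Sentry with the full stack trace and local variable context.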

FastAPI Metrics with prometheus-fastapi-instrumentator

The fastest way to add production-grade metrics to a FastAPI app:

# Install
pip install prometheus-fastapi-instrumentator prometheus-client

# main.py
from fastapi import FastAPI
from prometheus_fastapi_instrumentator import Instrumentator
from prometheus_client import Counter, Histogram, Gauge
import time

app = FastAPI()

# Auto-instrument all routes (adds http_request_duration_seconds,
# http_requests_total, http_request_size_bytes)
Instrumentator().instrument(app).expose(app)

# Custom business metrics
orders_total = Counter(
    "orders_total",
    "Total orders placed",
    ["payment_method", "status"]
)

payment_processing_seconds = Histogram(
    "payment_processing_seconds",
    "Time spent processing payments",
    buckets=[0.1, 0.25, 0.5, 1.0, 2.5, 5.0]
)

active_users = Gauge(
    "active_users",
    "Currently active users in the system"
)

# Minimal stubs so the example is self-contained (replace with real logic)
from pydantic import BaseModel

class OrderRequest(BaseModel):
    payment_method: str

class PaymentError(Exception):
    pass

async def process_order(order: OrderRequest):
    ...

@app.post("/orders")
async def create_order(order: OrderRequest):
    start = time.time()
    try:
        result = await process_order(order)
        orders_total.labels(
            payment_method=order.payment_method,
            status="success"
        ).inc()
        return result
    except PaymentError:
        orders_total.labels(
            payment_method=order.payment_method,
            status="failed"
        ).inc()
        raise
    finally:
        payment_processing_seconds.observe(time.time() - start)

# Metrics available at GET /metrics

Django Monitoring Setup

For Django, use django-prometheus for automatic ORM and view metrics, plus middleware timing:

# Install
pip install django-prometheus

# settings.py
INSTALLED_APPS = [
    'django_prometheus',
    # ... other apps
]

MIDDLEWARE = [
    'django_prometheus.middleware.PrometheusBeforeMiddleware',
    # ... your middleware
    'django_prometheus.middleware.PrometheusAfterMiddleware',
]

DATABASES = {
    'default': {
        'ENGINE': 'django_prometheus.db.backends.postgresql',
        # ... connection params
    }
}

# urls.py
from django.urls import include, path

urlpatterns = [
    path('metrics/', include('django_prometheus.urls')),
    # ... other URLs
]

# This automatically provides:
# django_http_requests_total (by method, view, code)
# django_http_request_duration_seconds (by view)
# django_db_query_duration_seconds (by alias, type)
# django_db_query_total
# django_model_inserts_total, django_model_updates_total

Django Slow Query Monitoring

Django ORM slow queries are the #1 performance issue in Django apps. Log them at the database level:

# settings.py — log slow queries in development
import logging

LOGGING = {
    'version': 1,
    'handlers': {
        'console': {'class': 'logging.StreamHandler'},
    },
    'loggers': {
        'django.db.backends': {
            'level': 'DEBUG',
            'handlers': ['console'],
        },
    },
}

# In production: use django-silk or query-inspector
# django-silk captures request + query timeline in middleware
pip install django-silk

# settings.py (production: only enable for sampling)
MIDDLEWARE = [
    'silk.middleware.SilkyMiddleware',  # Sample mode: SILKY_INTERCEPT_PERCENT
    # ...
]
SILKY_PYTHON_PROFILER = True
SILKY_INTERCEPT_PERCENT = 5  # Profile 5% of requests
SILKY_MAX_REQUEST_BODY_SIZE = -1

OpenTelemetry for Python

# Install core packages
pip install opentelemetry-sdk opentelemetry-exporter-otlp
pip install opentelemetry-instrumentation-django  # or fastapi
pip install opentelemetry-instrumentation-psycopg2  # PostgreSQL
pip install opentelemetry-instrumentation-redis
pip install opentelemetry-instrumentation-celery

# Option 1: Auto-instrumentation (zero code changes)
# Wrap your Gunicorn/uvicorn command:
opentelemetry-instrument \
  --traces_exporter otlp \
  --service_name my-django-app \
  --exporter_otlp_endpoint http://localhost:4317 \
  gunicorn myapp.wsgi:application -w 4

# Option 2: Manual setup in settings.py (more control)
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317"))
)
trace.set_tracer_provider(provider)

# Manual span creation in views
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def process_order(request):
    with tracer.start_as_current_span("process-order") as span:
        # order_id and customer come from your own request parsing
        span.set_attribute("order.id", order_id)
        span.set_attribute("customer.tier", customer.tier)
        # ... your business logic


Celery Task Monitoring

Celery async tasks are a black box without monitoring. flower provides a real-time dashboard; celery-prometheus-exporter sends metrics to Prometheus.

# Celery configuration for monitoring
# celery.py
from celery import Celery
from celery.signals import task_prerun, task_postrun, task_failure
from prometheus_client import Counter, Histogram
import time

app = Celery('myapp')

celery_tasks_total = Counter(
    "celery_tasks_total",
    "Total Celery tasks",
    ["task_name", "status"]
)

celery_task_duration = Histogram(
    "celery_task_duration_seconds",
    "Celery task execution time",
    ["task_name"]
)

@task_prerun.connect
def task_prerun_handler(task_id, task, *args, **kwargs):
    task.start_time = time.time()

@task_postrun.connect
def task_postrun_handler(task_id, task, state, *args, **kwargs):
    duration = time.time() - getattr(task, 'start_time', time.time())
    celery_tasks_total.labels(
        task_name=task.name,
        status=state
    ).inc()
    celery_task_duration.labels(task_name=task.name).observe(duration)

@task_failure.connect
def task_failure_handler(sender=None, exception=None, **kwargs):
    # task_failure passes the failing task as `sender`
    celery_tasks_total.labels(
        task_name=sender.name,
        status="FAILURE"
    ).inc()

# Key metrics to alert on:
# - Queue depth > N (backlog building up)
# - Task failure rate > 5%
# - Task duration p95 > expected SLA

Python APM Tools Comparison

Tool | Python Support | Standout Feature | Pricing
Sentry | Excellent | Error tracking with full stack + variable context, Performance | Free 5K errors/mo + $26/mo
New Relic Python Agent | Excellent | Django ORM tracing, free 100GB/mo, Celery support | Free + $0.35/GB
Datadog APM | Excellent | Continuous profiler, Django/FastAPI auto-instrument | $31/host/month
Better Stack | Good | Log ingestion + uptime checks; simple Django log shipping | Free + $20/mo
Grafana Cloud | Good | prometheus_client metrics, Loki logs, free tier | Free tier + usage-based
Elastic APM | Good | Django + Celery agent, integrates with Elasticsearch | Free tier + $16/mo

FAQ

What metrics should I monitor for a Python web application?

For Django/FastAPI: request rate and p95/p99 latency, error rate by exception type, Gunicorn worker saturation, database query count and duration per request, RSS memory per worker, GC collections, and Celery task queue depth/failure rate. Database query time is typically the #1 bottleneck in Django apps — track slow queries first.

How do I add Prometheus metrics to a FastAPI application?

Install prometheus-fastapi-instrumentator and prometheus_client. Add Instrumentator().instrument(app).expose(app) after creating your FastAPI app instance. This auto-creates http_request_duration_seconds, http_requests_total, and http_request_size_bytes. For Django, use django-prometheus with its database backends and middleware.

How does the Python GIL affect monitoring?

The GIL means only one thread runs Python bytecode at a time. Watch for single-core CPU saturation even when multiple threads are running — that's a GIL bottleneck. CPU-bound work needs multi-processing (separate workers), not multi-threading. I/O-bound work (database, HTTP) releases the GIL so async frameworks (FastAPI + uvicorn) handle it well.

How do I detect memory leaks in a Python application?

Monitor RSS memory per Gunicorn/uWSGI worker over time — growing RSS is the primary leak signal. To investigate: use tracemalloc to snapshot and diff memory at intervals. Common sources: module-level dicts accumulating references, Django ORM queryset caching, global lists without clearing, unclosed database connections. Gunicorn's max_requests setting recycles workers to mitigate slow leaks.

How do I add OpenTelemetry to a Django application?

Install opentelemetry-instrumentation-django and opentelemetry-exporter-otlp. Use the opentelemetry-instrument CLI wrapper around your gunicorn command — zero code changes needed. This auto-instruments Django middleware, ORM queries, Redis, Celery, and HTTP clients. For manual spans: use trace.get_tracer(__name__) and start_as_current_span context manager.

