Health Checks

Allegro exposes two read-only HTTP endpoints designed to be polled by an external monitor (an uptime checker, a metrics scraper, or an on-call alerting system). One reports the health of a single tenant's queues; the other reports the health of the platform-wide job queues. Both return an HTTP 503 when their backing service can't be reached, so even a naive "is the status code 200?" check trips on a degraded state.

These endpoints are part of the REST API — see the REST API reference for the authoritative request and response schemas. This guide explains what they report and how to wire up monitoring against them.

Tenant health

GET /api/v1/health

Authenticated with a Sanctum bearer token; the token's user must be able to view the tenant. The endpoint reports the depth and oldest-item age of that tenant's Redis-backed queues — the Event Queue and the Heartbeat Queue.

{
  "status": "ok",
  "tenant": "tenant_123",
  "redis": { "reachable": true },
  "queues": {
    "events": { "depth": 0, "oldest_age_seconds": null },
    "heartbeats": { "depth": 12, "oldest_age_seconds": 34 }
  }
}

| Field | Meaning |
| --- | --- |
| status | ok when Redis is reachable, degraded otherwise. |
| redis.reachable | Whether the queue's Redis backend responded. |
| queues.*.depth | Number of items currently waiting in the queue. |
| queues.*.oldest_age_seconds | Age in seconds of the oldest item still waiting, or null when the queue is empty. |

When Redis cannot be reached, status is degraded, every queue value is null, and the response status code is 503.
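
A monitor that reads the body, rather than only the status code, has to tolerate the null values in the degraded case. The following Python sketch shows one way to reduce a response to the values a monitor typically tracks. The function name `summarize_tenant_health` and its return shape are hypothetical, and the null handling is an assumption about the degraded payload; the REST API reference remains the authority on the exact schema.

```python
from __future__ import annotations

from typing import Any, Optional


def summarize_tenant_health(status_code: int, body: dict[str, Any]) -> dict[str, Any]:
    """Reduce a tenant health response to the values a monitor usually needs.

    Hypothetical helper: field names follow the example response, but the
    tolerance for null queue entries is an assumption, not documented behavior.
    """
    healthy = status_code == 200 and body.get("status") == "ok"
    queues: dict[str, dict[str, Optional[int]]] = {}
    for name, stats in (body.get("queues") or {}).items():
        stats = stats or {}  # tolerate a queue that is reported as null
        queues[name] = {
            "depth": stats.get("depth"),
            "oldest_age_seconds": stats.get("oldest_age_seconds"),
        }
    return {"healthy": healthy, "queues": queues}
```

Keeping the parsing separate from the HTTP call makes it easy to unit-test against saved responses, including a captured 503 body.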

A steadily rising depth or oldest_age_seconds means a consumer has fallen behind and is no longer draining the queue fast enough — see the Event Queue and Heartbeat Queue guides for how consumers read and delete items.
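
"Steadily rising" can be made concrete by comparing consecutive polls. The sketch below is one possible definition, not a rule prescribed by Allegro: it treats a metric as climbing when the most recent readings strictly increase. The name `is_climbing` and the default of three polls are assumptions to tune for your own poll interval.

```python
from typing import Optional, Sequence


def is_climbing(samples: Sequence[Optional[int]], min_polls: int = 3) -> bool:
    """Return True when the last `min_polls` readings strictly increase.

    `samples` holds one reading per poll, oldest first. A None reading
    (an empty queue's age, or a degraded response) breaks the run, so it
    never counts as part of a climb.
    """
    if min_polls < 2 or len(samples) < min_polls:
        return False
    recent = samples[-min_polls:]
    if any(value is None for value in recent):
        return False
    return all(earlier < later for earlier, later in zip(recent, recent[1:]))
```

Feeding it the history of depth or oldest_age_seconds for one queue separates a momentary spike, such as `[40, 3, 50]`, from a sustained climb, such as `[5, 12, 30]`.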

Platform health

GET /api/v1/landlord/health

Authenticated with a Sanctum bearer token belonging to a super admin, and served only in the landlord context. It reports the backlog of every Horizon-managed background queue plus the recent failed-job count.

{
  "status": "ok",
  "redis": { "reachable": true },
  "queues": {
    "high": { "pending": 0, "wait_seconds": 0 },
    "default": { "pending": 2, "wait_seconds": 1 },
    "audience-sync": { "pending": 0, "wait_seconds": 0 }
  },
  "failed_jobs": 0
}

| Field | Meaning |
| --- | --- |
| status | ok when the queue backend is reachable, degraded otherwise. |
| redis.reachable | Whether the queue backend responded. |
| queues.*.pending | Number of jobs waiting on that queue. |
| queues.*.wait_seconds | Horizon's estimated wait, in seconds, before a new job on that queue starts. |
| failed_jobs | Count of jobs that have recently failed across all queues. |

Each queue is reported separately, keyed by its name. For the meaning of the high, default, and audience-sync queues and how they're prioritized, see Queue Management. When the backend is unreachable, status is degraded, queues is empty, failed_jobs is null, and the response status code is 503.
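
Because the queues are keyed by name, a monitor can iterate over whatever queues are present instead of hard-coding them. The Python sketch below illustrates that under stated assumptions: `platform_alerts` is a hypothetical helper, the 60-second wait threshold is a placeholder rather than a recommended value, and it flags any non-zero failed_jobs count where a stricter monitor might compare against the previous poll.

```python
from __future__ import annotations

from typing import Any


def platform_alerts(
    status_code: int,
    body: dict[str, Any],
    max_wait_seconds: int = 60,  # hypothetical threshold, tune per deployment
) -> list[str]:
    """Return human-readable alert messages for one platform health poll."""
    if status_code != 200 or body.get("status") != "ok":
        # Degraded: queues is empty and failed_jobs is null, so stop here.
        return [f"platform health degraded (HTTP {status_code})"]

    alerts = []
    for name, stats in body.get("queues", {}).items():
        if stats["wait_seconds"] > max_wait_seconds:
            alerts.append(
                f"queue {name}: wait {stats['wait_seconds']}s, {stats['pending']} pending"
            )
    if body.get("failed_jobs"):
        alerts.append(f"{body['failed_jobs']} recently failed jobs")
    return alerts
```

An empty list means the poll found nothing to report; returning early on a degraded response avoids treating the empty queues object as "all queues idle".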

To enumerate the tenants a monitor should poll for tenant-level health, a super admin can list them with GET /api/v1/landlord/tenants.

Setting up monitoring

  • Alert on the status code. Treat any non-200 (especially 503) as a page-worthy event — the endpoints are built so a plain HTTP check is enough.
  • Trend the depths. A single high reading can be a momentary spike; a depth or backlog age that climbs across consecutive polls points to a stuck or slow consumer.
  • Watch failed_jobs. A non-zero, growing count signals jobs erroring out rather than simply queuing.
Note: Both endpoints require authentication, so configure your monitor with a Sanctum token. See REST API Authentication for how to issue one.
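
For a custom poller, the token travels in a standard Authorization: Bearer header. The sketch below uses only Python's standard library; `build_health_request` and `fetch_health` are hypothetical helper names, and the base URL passed in is whatever host serves your Allegro API. It also handles a detail that is easy to miss: urllib raises an exception on a 503, yet the degraded JSON body is still readable from that exception.

```python
import json
import urllib.error
import urllib.request


def build_health_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying the Sanctum token as a bearer credential."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        method="GET",
    )


def fetch_health(request: urllib.request.Request, timeout: float = 10.0):
    """Return (status_code, parsed_body) for both 200 and 503 responses.

    urllib raises HTTPError for non-2xx statuses; the error object still
    exposes the status code and the response body.
    """
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.load(response)
    except urllib.error.HTTPError as error:
        raw = error.read()
        try:
            return error.code, json.loads(raw)
        except ValueError:
            return error.code, {}  # body was not JSON (e.g. a proxy error page)
```

Connection failures and timeouts surface as urllib.error.URLError or a timeout exception and are deliberately not caught here, so the caller can treat "Allegro unreachable" differently from "Allegro reachable but degraded".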