# Health Checks

Allegro exposes two read-only HTTP endpoints designed to be polled by an external monitor (an uptime checker, a metrics scraper, or an on-call alerting system). One reports the health of a single tenant's queues; the other reports the health of the platform-wide job queues. Both return an HTTP `503` when their backing service can't be reached, so even a naive "is the status code 200?" check trips on a degraded state.

These endpoints are part of the REST API — see the [REST API reference](/rest-api) for the authoritative request and response schemas. This guide explains what they report and how to wire up monitoring against them.

## Tenant health

```text
GET /api/v1/health
```

Requests must carry a Sanctum bearer token, and the token's user must be able to view the tenant. The endpoint reports the depth and oldest-item age of that tenant's Redis-backed queues: the [Event Queue](/developer/platform/event-queue.md) and the [Heartbeat Queue](/developer/platform/heartbeat-queue.md).

```json
{
    "status": "ok",
    "tenant": "tenant_123",
    "redis": { "reachable": true },
    "queues": {
        "events": { "depth": 0, "oldest_age_seconds": null },
        "heartbeats": { "depth": 12, "oldest_age_seconds": 34 }
    }
}
```

| Field                         | Meaning                                                                             |
| ----------------------------- | ----------------------------------------------------------------------------------- |
| `status`                      | `ok` when Redis is reachable, `degraded` otherwise.                                 |
| `redis.reachable`             | Whether the queue's Redis backend responded.                                        |
| `queues.*.depth`              | Number of items currently waiting in the queue.                                     |
| `queues.*.oldest_age_seconds` | Age in seconds of the oldest item still waiting, or `null` when the queue is empty. |

When Redis cannot be reached, `status` is `degraded`, every queue value is `null`, and the response status code is `503`.
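As a rough illustration of how a monitor might interpret this response, the sketch below turns a status code and a decoded body into a list of problems. The function name and both thresholds are hypothetical and not part of Allegro; the field names come from the example response above, and the code tolerates the `null` values returned in the degraded case.

```python
from typing import Any, Optional

# Hypothetical alert thresholds; tune them to your own traffic.
MAX_DEPTH = 1000
MAX_OLDEST_AGE_SECONDS = 300


def evaluate_tenant_health(
    status_code: int, body: Optional[dict[str, Any]]
) -> list[str]:
    """Return human-readable problems; an empty list means healthy."""
    problems: list[str] = []
    if status_code != 200:
        problems.append(f"HTTP {status_code}")
    if not body:
        return problems

    if body.get("status") != "ok":
        problems.append(f"status={body.get('status')}")
    if not (body.get("redis") or {}).get("reachable", False):
        problems.append("redis unreachable")

    for name, queue in (body.get("queues") or {}).items():
        # In the degraded case queue values are null, so skip non-objects.
        if not isinstance(queue, dict):
            continue
        depth = queue.get("depth")
        age = queue.get("oldest_age_seconds")
        if depth is not None and depth > MAX_DEPTH:
            problems.append(f"{name}: depth {depth} > {MAX_DEPTH}")
        if age is not None and age > MAX_OLDEST_AGE_SECONDS:
            problems.append(
                f"{name}: oldest item {age}s > {MAX_OLDEST_AGE_SECONDS}s"
            )
    return problems
```

Returning a list rather than a boolean lets the monitor include every failing condition in a single alert message.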

A steadily rising `depth` or `oldest_age_seconds` means a consumer has fallen behind and is no longer draining the queue fast enough — see the [Event Queue](/developer/platform/event-queue.md) and [Heartbeat Queue](/developer/platform/heartbeat-queue.md) guides for how consumers read and delete items.

## Platform health

```text
GET /api/v1/landlord/health
```

Requests must carry a Sanctum bearer token belonging to a **super admin**, and the endpoint is served only in the landlord context. It reports the backlog of every Horizon-managed background queue plus the recent failed-job count.

```json
{
    "status": "ok",
    "redis": { "reachable": true },
    "queues": {
        "high": { "pending": 0, "wait_seconds": 0 },
        "default": { "pending": 2, "wait_seconds": 1 },
        "audience-sync": { "pending": 0, "wait_seconds": 0 }
    },
    "failed_jobs": 0
}
```

| Field                   | Meaning                                                                      |
| ----------------------- | ---------------------------------------------------------------------------- |
| `status`                | `ok` when the queue backend is reachable, `degraded` otherwise.              |
| `redis.reachable`       | Whether the queue backend responded.                                         |
| `queues.*.pending`      | Number of jobs waiting on that queue.                                        |
| `queues.*.wait_seconds` | Horizon's estimated wait, in seconds, before a new job on that queue starts. |
| `failed_jobs`           | Count of jobs that have recently failed across all queues.                   |

Each queue is reported separately, keyed by its name. For the meaning of the `high`, `default`, and `audience-sync` queues and how they're prioritized, see [Queue Management](/developer/platform/queue-management.md). When the backend is unreachable, `status` is `degraded`, `queues` is empty, `failed_jobs` is `null`, and the response status code is `503`.
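A monitor will often want a single summary rather than per-queue numbers. The sketch below is one possible way to collapse the platform response into a few metrics; `summarize_platform_health` and the keys of the returned dictionary are hypothetical names chosen for illustration, and only the input field names come from the example response above.

```python
from typing import Any


def summarize_platform_health(body: dict[str, Any]) -> dict[str, Any]:
    """Collapse the per-queue figures into a handful of metrics."""
    # `queues` is empty when the backend is unreachable.
    queues = body.get("queues") or {}

    total_pending = sum(q.get("pending") or 0 for q in queues.values())
    # Pick the queue with the longest estimated wait, if any are reported.
    slowest = max(
        queues.items(),
        key=lambda item: item[1].get("wait_seconds") or 0,
        default=None,
    )

    return {
        "healthy": body.get("status") == "ok",
        "total_pending": total_pending,
        "slowest_queue": slowest[0] if slowest else None,
        "max_wait_seconds": (slowest[1].get("wait_seconds") or 0)
        if slowest
        else None,
        # Stays None in the degraded case, so it is not mistaken for zero.
        "failed_jobs": body.get("failed_jobs"),
    }
```

Keeping `failed_jobs` as `None` when the backend is unreachable avoids reporting a misleading zero to a dashboard.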

To enumerate the tenants a monitor should poll for tenant-level health, a super admin can list them with `GET /api/v1/landlord/tenants`.

## Setting up monitoring

* **Alert on the status code.** Treat any non-`200` (especially `503`) as a page-worthy event — the endpoints are built so a plain HTTP check is enough.
* **Trend the depths.** A single high reading can be a momentary spike; a depth or backlog age that climbs across consecutive polls points to a stuck or slow consumer.
* **Watch `failed_jobs`.** A non-zero, growing count signals jobs erroring out rather than simply queuing.
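The trending advice above can be sketched as a small helper that flags a metric only when it has risen on every one of the last few polls. `RisingTrend` and its `window` parameter are hypothetical and not part of Allegro; the strict-increase rule is one simple choice among several reasonable ones (a slope or moving average would also work).

```python
from collections import deque
from typing import Optional


class RisingTrend:
    """Flags a metric that has strictly increased over the last `window` polls."""

    def __init__(self, window: int = 5) -> None:
        if window < 2:
            raise ValueError("window must be at least 2")
        self._window = window
        self._readings: deque[float] = deque(maxlen=window)

    def add(self, value: Optional[float]) -> bool:
        """Record one reading; return True when the window is strictly rising."""
        if value is None:
            # A null reading (empty queue or degraded response) resets the streak.
            self._readings.clear()
            return False
        self._readings.append(value)
        if len(self._readings) < self._window:
            return False
        values = list(self._readings)
        return all(earlier < later for earlier, later in zip(values, values[1:]))
```

One instance per metric is enough, for example one for each queue's `depth`, one for each `oldest_age_seconds`, and one for `failed_jobs`.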

**Note:** Both endpoints require authentication, so configure your monitor with a Sanctum token. See [REST API Authentication](/developer/api/authentication.md) for how to issue one.