API Reference
Usage
Aggregate per-collection usage statistics.
GET /v1/usage
Accepts a session cookie or an API key with audit:read. Returns the per-collection resource footprint over a trailing window, which is useful for quota enforcement, billing, and capacity planning.
GET /v1/status/usage returns the same response shape for admin UI polling. Collection-pinned API keys cannot use the status variant because it aggregates across collections.
Query parameters
| Parameter | Type | Default | Notes |
|---|---|---|---|
| window_days | integer | 30 | 1–365. Sliding window applied to query_log. Document totals are lifetime, not windowed. |
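A minimal sketch of building the request with a validated window_days value. The host name, the build_usage_request helper, and the Bearer-token header are assumptions made for illustration; the reference above only says that a session cookie or an API key is accepted, so check how your deployment expects the key to be sent.

```python
import urllib.request
from urllib.parse import urlencode

# Hypothetical host; replace with your own deployment's base URL.
BASE_URL = "https://api.example.com"


def build_usage_request(api_key: str, window_days: int = 30) -> urllib.request.Request:
    """Build (but do not send) a GET /v1/usage request."""
    # The documented range for window_days is 1-365.
    if not 1 <= window_days <= 365:
        raise ValueError("window_days must be between 1 and 365")
    url = f"{BASE_URL}/v1/usage?{urlencode({'window_days': window_days})}"
    # Assumption: the API key travels as a Bearer token.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


request = build_usage_request("my-api-key", window_days=7)
print(request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; validating the range client-side avoids a round trip for values the server would presumably reject.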
Response
{
"window_days": 30,
"queries_total": 9122,
"queries_per_day_avg": 304.07,
"documents_total": 482,
"chunks_total": 18420,
"storage_bytes_total": 52428800,
"embedding_tokens_total": 1820432,
"embedding_cost_usd_estimate": 0.36,
"avg_latency_ms": 142.31,
"timeline": [
{
"date": "2026-05-16T00:00:00Z",
"queries": 230,
"avg_latency_ms": 138.44
}
],
"by_collection": [
{
"collection": "knowledge_base",
"documents": 15,
"chunks": 482,
"storage_bytes": 52428800,
"embedding_tokens": 125000,
"embedding_cost_usd_estimate": 0.0250,
"queries": 1203,
"avg_latency_ms": 142.31
}
]
}
Cost estimates use a fixed rate card (text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002, Cohere embed-*-v3.0 variants). Self-hosted openai_compatible endpoints and any model not in the rate card are reported as 0.
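As a quick sanity check on the sample response, the sketch below derives two figures from its top-level fields. It assumes, based only on the sample values, that queries_per_day_avg is queries_total divided by window_days, rounded to two decimals; the reference does not state the formula. The helper names are illustrative and not part of the API.

```python
def daily_query_average(queries_total: int, window_days: int) -> float:
    """Average queries per day over the window, rounded to two decimals."""
    return round(queries_total / window_days, 2)


def bytes_to_mib(storage_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return storage_bytes / (1024 ** 2)


# Top-level fields copied from the sample response.
sample = {"window_days": 30, "queries_total": 9122, "storage_bytes_total": 52428800}

print(daily_query_average(sample["queries_total"], sample["window_days"]))  # 304.07
print(bytes_to_mib(sample["storage_bytes_total"]))  # 50.0
```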
Common patterns
- Billing join: map by_collection[].collection to your own tenant_id ↔ collection_name table and invoice. See Multi-tenant SaaS.
- Quota alert: poll with window_days=1 and alert when a tenant's embedding_tokens exceeds their plan.
- Capacity planning: pair with GET /v1/stats for global health alongside this per-collection roll-up.
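The billing join and quota alert can be sketched as two small functions over an already-parsed response. The tenant mapping, the per-tenant token limits, and the function names are hypothetical; only the by_collection, collection, and embedding_tokens fields come from the response shape documented above.

```python
from collections import defaultdict


def tokens_by_tenant(usage: dict, tenant_by_collection: dict) -> dict:
    """Sum embedding_tokens per tenant using a collection -> tenant_id mapping."""
    totals = defaultdict(int)
    for row in usage["by_collection"]:
        tenant = tenant_by_collection.get(row["collection"])
        if tenant is not None:  # skip collections that belong to no known tenant
            totals[tenant] += row["embedding_tokens"]
    return dict(totals)


def tenants_over_quota(totals: dict, token_limits: dict) -> list:
    """Return the sorted tenant IDs whose token usage exceeds their plan limit."""
    return sorted(
        tenant
        for tenant, used in totals.items()
        if tenant in token_limits and used > token_limits[tenant]
    )


# Trimmed-down response containing only the fields this sketch reads.
usage = {"by_collection": [{"collection": "knowledge_base", "embedding_tokens": 125000}]}
totals = tokens_by_tenant(usage, {"knowledge_base": "tenant_42"})
print(totals)                                              # {'tenant_42': 125000}
print(tenants_over_quota(totals, {"tenant_42": 100000}))   # ['tenant_42']
```

Summing per tenant before comparing lets one tenant own several collections; tenants without a configured limit are never flagged.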