Rate limiting
Every public API request is rate-limited per API key (when authenticated) or per source IP (when anonymous). Limits are hourly buckets — the bucket resets one hour after the first request in it.
Two buckets per key
Every authenticated key has two independent rate-limit buckets:
- General bucket: for navigation endpoints (/coverage, /politicians/:id, /bills, /votes, /committees/meetings, the free search auxiliaries, exports, …). Higher per-tier limits.
- Semantic-search bucket: for the TEI-embedded endpoints (/search/speeches, /search/speeches/count, /search/facets). Lower per-tier limits because each call hits the GPU.
The two buckets are independent: blowing through your hourly semantic
budget doesn't stop you from calling /bills or /coverage, and
vice-versa.
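As a rough client-side illustration of the split, the sketch below guesses which bucket a path will draw from, using only the three semantic-search endpoints named above. `SEMANTIC_PATHS` and `bucket_for` are hypothetical names, and the server remains the authority on which bucket a call is charged to.

```python
# Paths this guide lists as semantic-search (TEI-embedded) endpoints.
SEMANTIC_PATHS = {
    "/search/speeches",
    "/search/speeches/count",
    "/search/facets",
}


def bucket_for(path: str) -> str:
    """Guess which bucket a request path draws from.

    Hypothetical client-side helper: the server decides the real bucket.
    """
    # Drop any query string and trailing slash before comparing.
    clean = path.split("?", 1)[0].rstrip("/")
    return "semantic" if clean in SEMANTIC_PATHS else "general"
```

A client could use this to keep separate counters for the two buckets, so that semantic searches are budgeted independently of navigation calls.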
Limits by tier
| Tier | General bucket | Semantic-search bucket | How to get it |
|---|---|---|---|
| Anonymous | 30 / hr (per source IP) | not available (sign in for free key) | No setup — call without an Authorization header |
| Free | 60 / hr | 5 / hr | Create a key |
| Developer ($20/mo) | 1,000 / hr | 100 / hr | Subscribe |
| Pro ($200/mo) | 10,000 / hr | 10,000 / hr | Subscribe |
All authenticated buckets are per API key and sliding-hourly; anonymous calls are counted per source IP. When you subscribe to Developer or Pro, all of your existing keys are promoted automatically in both buckets, so you don't need to mint new keys.
Anonymous semantic search is not available: the three semantic-search endpoints (/search/speeches, /search/speeches/count, and /search/facets) require an API key. The navigation auxiliaries (/search/sessions, /search/chunks/:id, /search/meta) accept anonymous callers against the general IP bucket.
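To turn the table into a pacing figure, divide one hour by the bucket size. The sketch below does that arithmetic; the limits are copied from the table, while `TIER_LIMITS` and `min_interval_seconds` are illustrative names, not part of the API.

```python
from typing import Optional

# Hourly limits copied from the tier table: (general, semantic).
# None means the bucket is not available on that tier.
TIER_LIMITS = {
    "anonymous": (30, None),
    "free": (60, 5),
    "developer": (1_000, 100),
    "pro": (10_000, 10_000),
}


def min_interval_seconds(tier: str, bucket: str = "general") -> Optional[float]:
    """Spacing between calls that spends the hourly budget evenly."""
    general, semantic = TIER_LIMITS[tier]
    limit = general if bucket == "general" else semantic
    return None if limit is None else 3600 / limit
```

For example, a Free key works out to one general call per 60 seconds and one semantic search per 720 seconds if the budget is spent evenly across the hour.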
Response headers
Every response (success and 429) includes the standard
X-RateLimit-* headers, so you can pace your client up front instead
of hammering the API and then backing off:
HTTP/1.1 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 2387
Cache-Control: public, max-age=300
- X-RateLimit-Limit: your current bucket's max (depends on tier).
- X-RateLimit-Remaining: calls left in this bucket.
- X-RateLimit-Reset: seconds until the bucket resets.
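One way to use these headers for pacing is to spread the remaining calls evenly over the time left before the reset. The following is a minimal sketch of that idea, not an official client; `pacing_delay` is a hypothetical helper, and it assumes the header names are spelled exactly as shown (the `headers` object from the `requests` library is case-insensitive, a plain `dict` is not).

```python
def pacing_delay(headers) -> float:
    """Seconds to wait before the next call.

    Spreads the remaining calls evenly over the time left in the bucket.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "0"))
    reset = int(headers.get("X-RateLimit-Reset", "0"))
    if remaining <= 0:
        # Bucket is empty: wait for the reset.
        return float(reset)
    return reset / remaining
```

With the example response above (47 calls remaining, 2387 seconds until reset), this suggests waiting roughly 51 seconds between calls.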
When the bucket is empty, you get a 429 Too Many Requests:
HTTP/1.1 429 Too Many Requests
Retry-After: 2387
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 2387
Content-Type: application/json

{
  "statusCode": 429,
  "error": "Too Many Requests",
  "message": "Rate limit exceeded, retry in 2387 seconds"
}
The Retry-After header is in seconds — your client should respect it
rather than retrying immediately.
Retry strategy
For 429 responses:
import time

import requests


def get_with_retry(url, headers, max_attempts=3):
    """GET a URL, waiting out 429 responses up to max_attempts times."""
    for attempt in range(max_attempts):
        r = requests.get(url, headers=headers, timeout=10)
        if r.status_code == 429:
            # Retry-After is in seconds; fall back to 60 if it is missing
            retry_after = int(r.headers.get("Retry-After", "60"))
            # Cap the wait to something reasonable for interactive use
            time.sleep(min(retry_after, 300))
            continue
        r.raise_for_status()
        return r.json()
    raise RuntimeError(f"Exhausted retries for {url}")
For 503 responses (rare — occurs only on the search surface when the embedding service is degraded), retry with exponential backoff. See the errors guide.
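A minimal sketch of exponential backoff for the 503 case follows. It is one reasonable approach rather than a prescribed one: the base delay, cap, and "equal jitter" scheme are assumptions, and `backoff_delay` and `call_with_backoff` are hypothetical helpers. `send` is any zero-argument callable returning an object with `status_code` and `headers`, for example a lambda wrapping `requests.get`.

```python
import random
import time


def backoff_delay(attempt, retry_after=None, base=1.0, cap=30.0):
    """Delay in seconds before retry number `attempt` (0-based)."""
    ceiling = min(cap, base * 2 ** attempt)
    # "Equal jitter": half fixed, half random, so concurrent clients spread out.
    delay = ceiling / 2 + random.uniform(0, ceiling / 2)
    # Never retry sooner than the server asked for.
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay


def call_with_backoff(send, max_attempts=4, sleep=time.sleep):
    """Call `send()` until it returns a non-503 response or attempts run out."""
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 503:
            return response
        header = response.headers.get("Retry-After")
        retry_after = float(header) if header and header.isdigit() else None
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, retry_after))
    raise RuntimeError("Search service still unavailable after retries")
```

Usage might look like `call_with_backoff(lambda: requests.get(url, headers=headers, timeout=10))`. The jitter keeps several clients that were refused at the same moment from all retrying at the same moment.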
Caching
Most public endpoints set Cache-Control: public, max-age=N so
intermediate caches (your CDN, your local HTTP cache) can serve
repeats without consuming your rate-limit bucket:
| Endpoint | Cache TTL |
|---|---|
| /coverage | 5 minutes |
| /jurisdiction-sources | 5 minutes |
| /politicians/:id | 1 minute |
If you're hitting the rate limit, enable caching in your client before upgrading tiers. A simple in-process LRU + TTL keyed on URL will dramatically cut your API call volume for any read-heavy workload — these endpoints don't change second-to-second.
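The in-process LRU + TTL cache mentioned above could look something like the sketch below. It is an illustration under stated assumptions: `TTLCache` and `cached_get` are hypothetical names, a single TTL applies to every URL (you may prefer per-endpoint TTLs matching the table), and `fetch` stands in for whatever function performs the real HTTP call.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Small in-process LRU cache whose entries expire after `ttl` seconds."""

    def __init__(self, max_entries=256, ttl=60.0, clock=time.monotonic):
        self._entries = OrderedDict()  # url -> (expires_at, value)
        self._max_entries = max_entries
        self._ttl = ttl
        self._clock = clock

    def get(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[url]  # stale: drop it and report a miss
            return None
        self._entries.move_to_end(url)  # mark as most recently used
        return value

    def put(self, url, value):
        self._entries[url] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(url)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used


def cached_get(cache, url, fetch):
    """Return the cached body for `url`, calling `fetch(url)` only on a miss."""
    value = cache.get(url)
    if value is None:
        value = fetch(url)
        cache.put(url, value)
    return value
```

Repeated calls such as `cached_get(cache, url, fetch)` within the TTL are served from memory and never reach the API, so they do not count against either bucket.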
Upgrading
If you've optimized your client and still need more headroom:
- Visit /account/billing.
- Pick Developer ($20/mo, 1,000/hr) or Pro ($200/mo, 10,000/hr).
- Pay with any major card via Stripe-hosted Checkout.
- All of your existing API keys auto-promote to the new tier.
The next call you make will see X-RateLimit-Limit: 1000 (or 10000 for Pro). No code change required.
Rate-limited authenticated calls are also recorded in an audit log
(private.api_key_events.event_type = 'rate_limited') so you can
look back at usage patterns when deciding whether to upgrade.
TEI semaphore on semantic search
The semantic-search endpoints (/search/speeches,
/search/speeches/count, /search/facets) share a GPU concurrency
budget independent of your tier's rate limit. The embedding step
runs on a single GPU and serves both interactive search and
background ingest, so we cap simultaneous embed requests across all
public-API callers regardless of tier.
- Max concurrent: 2 embed requests in flight at a time.
- Max queued: 6 requests waiting for a slot.
- Total slots: 8.
When all 8 slots are full, additional requests get refused immediately
with 503 Service Unavailable + Retry-After: 5 so your
client can back off cleanly rather than waiting minutes:
HTTP/1.1 503 Service Unavailable
Retry-After: 5
Content-Type: application/json

{
  "code": "search_overloaded",
  "error": "Service Unavailable",
  "message": "public search service is at capacity, retry shortly"
}
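To make the slot arithmetic concrete, here is a toy model of the admission behavior described above: 2 running, 6 queued, and anything beyond that refused. It is purely illustrative and is not the service's implementation; `AdmissionGate` and its methods are hypothetical names.

```python
class AdmissionGate:
    """Toy model of a bounded semaphore with a bounded wait queue."""

    def __init__(self, max_concurrent=2, max_queued=6):
        self.max_concurrent = max_concurrent
        self.max_queued = max_queued
        self.running = 0
        self.queued = 0

    def admit(self):
        """Return 'run', 'queue', or 'reject' for a newly arriving request."""
        if self.running < self.max_concurrent:
            self.running += 1
            return "run"
        if self.queued < self.max_queued:
            self.queued += 1
            return "queue"
        # All 8 slots are taken: the API answers 503 with Retry-After: 5.
        return "reject"

    def finish(self):
        """A running request completed; promote one queued request if any."""
        self.running -= 1
        if self.queued:
            self.queued -= 1
            self.running += 1
```

Feeding nine simultaneous arrivals into a fresh gate yields two "run", six "queue", and one "reject", which corresponds to the ninth caller receiving the 503 shown above.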
This is independent of both the general and semantic per-tier hourly rate limits — it's a third, orthogonal protection layer:
- Blow through your semantic-search hourly bucket: 429.
- Hit the GPU concurrency ceiling because eight callers (regardless of tier) are all firing at once: 503 + Retry-After: 5.
- Hit your general bucket on /bills, /coverage, etc.: 429, separate from the semantic bucket.
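A client can tell these layers apart from the status code and, for the 503 case, the `code` field in the JSON body. The sketch below is one hypothetical way to do that; `classify_throttle` and its return labels are illustrative names, and it relies only on the status codes and the `search_overloaded` code shown in this guide.

```python
def classify_throttle(status_code, body=None):
    """Label which protection layer refused a request, or None if neither did."""
    if status_code == 429:
        # An hourly bucket (general or semantic) is empty: honor Retry-After.
        return "hourly_bucket"
    if status_code == 503 and (body or {}).get("code") == "search_overloaded":
        # GPU concurrency ceiling: a short wait is usually enough.
        return "gpu_concurrency"
    return None
```

Separating the two matters for wait times: an empty hourly bucket can mean waiting many minutes, while the concurrency ceiling asks for only a few seconds.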
Quotas vs. rate limits
The current limits are rate limits (sliding hourly buckets), not
quota limits (monthly caps). There's no "max requests per month"
ceiling — if your bucket has room, you can call. The
api_usage_daily counter table exists in the schema but is reserved
for future operator-side analytics; it's not enforced as a quota
today.