# Rate Limits
How Zilfu throttles API and publishing requests, and how to handle 429 responses.
## Why Limits Exist
Zilfu caps how often each token and user can call the API so a runaway script can't overload the platform or get our social media integrations flagged for abusive traffic. Normal usage — composing, scheduling, listing your queue — will never come close to these ceilings. They only kick in when a client misbehaves.
Two limits apply to API and MCP traffic: a global `api` ceiling on every authenticated request, and a tighter `publish` ceiling on endpoints that create or reshape scheduled posts.
## Limits at a Glance
| Limit | Scope | Cap | Keyed by |
|---|---|---|---|
| `api` | Every `/api/*` request | 120 / min | API token (falls back to user, then IP) |
| `publish` | `POST /api/spaces/{space}/posts`, `PUT /api/spaces/{space}/clusters/{cluster_id}` | 30 / min | User |
The two limits stack — a request to a publishing endpoint counts against both `publish` and `api`. In practice `publish` is always the ceiling that bites first.
## API Requests
Every authenticated request to `/api/*` is counted against the `api` limiter, capped at 120 requests per minute.
The bucket is keyed per personal access token, so each token you generate from Settings → API Tokens earns its own independent quota. If a noisy automation is eating its budget, splitting it onto a second token will give it a fresh 120/min without affecting your other integrations.
The MCP server authenticates with the same personal access token as the REST API, so MCP traffic and direct API calls made with the same token share one bucket.
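Because the server enforces its cap per token, a client can avoid 429s almost entirely by throttling itself just under the ceiling. The sketch below is illustrative, not part of Zilfu's API: `ClientThrottle` and the 100/min safety margin are assumptions of this example.

```python
import time


class ClientThrottle:
    """Client-side token bucket that stays under the server's 120/min cap.

    Running at 100/min leaves headroom so your client almost never
    sees a 429. This is a local sketch, not Zilfu's server logic.
    """

    def __init__(self, capacity=100, window=60.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = capacity / window  # tokens regained per second
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def acquire(self):
        """Block until a request slot is free, then consume one token."""
        while True:
            now = self.clock()
            elapsed = now - self.last
            self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            # Sleep exactly long enough for the next token to appear.
            time.sleep((1 - self.tokens) / self.refill_rate)
```

Call `throttle.acquire()` immediately before each API request; the bucket refills continuously, so sustained traffic settles at the configured rate.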
## Publishing
The `publish` limiter allows 30 requests per minute per user across:

- `POST /api/spaces/{space}/posts` — creating posts (immediate, scheduled, or queued).
- `PUT /api/spaces/{space}/clusters/{cluster_id}` — updating a multi-account cluster.
This bucket is keyed by user, not by token, so splitting a publishing workload across multiple tokens for the same account does not raise the ceiling. If you need to backfill a large volume of scheduled posts, spread them across more than one minute.
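One way to spread a backfill over multiple minutes is to plan it as timed batches up front. This helper is a sketch under the 30/min figure above; the payload shape and helper name are hypothetical.

```python
def paced_batches(posts, per_minute=30):
    """Split a backlog into batches that fit under the 30/min publish cap.

    Yields (delay_seconds, batch) pairs: sleep for delay_seconds,
    then send the batch. Payloads can be any sequence of items.
    """
    for i in range(0, len(posts), per_minute):
        batch = posts[i:i + per_minute]
        # Wait a full window before every batch except the first.
        delay = 0 if i == 0 else 60
        yield delay, batch
```

A driver loop would look like: for each `(delay, batch)` pair, `time.sleep(delay)` and then create each post in the batch.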
## Hitting a Limit
When you exceed a limit the API returns HTTP `429 Too Many Requests` with a JSON body:

```json
{
    "message": "Too Many Requests"
}
```

Every response — successful or throttled — carries headers describing the bucket state:
| Header | Meaning |
|---|---|
| `X-RateLimit-Limit` | Maximum requests allowed in the current window |
| `X-RateLimit-Remaining` | Requests still available in the current window |
| `Retry-After` | Seconds until the bucket frees up (only on 429 responses) |
| `X-RateLimit-Reset` | Unix timestamp of when the bucket frees up (only on 429 responses) |
When two limiters apply (publishing endpoints), the headers reflect the limit that was hit — usually `publish`.
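A client can turn those headers into a pause decision with a few lines. This is a sketch: `wait_seconds` is a hypothetical helper, and `headers` is assumed to be a plain dict of response headers.

```python
def wait_seconds(status, headers, default=1.0):
    """Decide how long to pause based on rate-limit headers.

    On a 429 we trust Retry-After; otherwise we only slow down
    when the bucket is nearly empty.
    """
    if status == 429:
        return float(headers.get("Retry-After", default))
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    # Nearly out of budget: pause briefly instead of running into the 429.
    return default if remaining <= 1 else 0.0
```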
### Handling 429s
Retrying immediately after a 429 will keep failing and may extend how long you stay throttled. Always wait for `Retry-After` before the next attempt.
A few practical rules:
- Respect `Retry-After`. Sleep for at least the number of seconds it returns before issuing the same request again.
- Back off exponentially on repeated failures — double the wait on each retry, with a sensible ceiling (for example 60 seconds), and give up after a handful of attempts.
- Watch `X-RateLimit-Remaining` on every response and slow your client down before you hit zero, rather than waiting for the 429.
- Spread bursty work over time. If you have a large backlog of posts to schedule, pace them out instead of firing all 100 at once.
- Use separate tokens for separate workloads. The `api` bucket is per-token, so a one-off backfill script on its own token won't starve your production integration.
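The rules above combine into a small retry loop. This is a sketch, not a supplied client: `do_request` is a hypothetical callable returning `(status, headers, body)`, and the doubling-with-ceiling schedule matches the guidance above.

```python
import time


def send_with_backoff(do_request, max_attempts=5, base=1.0, ceiling=60.0,
                      sleep=time.sleep):
    """Retry a throttled request with exponential backoff.

    Honours Retry-After when present, otherwise doubles the wait
    from `base` up to `ceiling`, and gives up after `max_attempts`.
    """
    wait = base
    for attempt in range(max_attempts):
        status, headers, body = do_request()
        if status != 429:
            return status, headers, body
        # Prefer the server's own figure, but never wait less than our backoff.
        retry_after = float(headers.get("Retry-After", 0))
        sleep(max(wait, retry_after))
        wait = min(wait * 2, ceiling)
    raise RuntimeError("still throttled after %d attempts" % max_attempts)
```

Injecting `sleep` makes the loop easy to test with a recorder instead of real delays.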