# Rate Limits
How Zilfu throttles API and publishing requests, and how to handle 429 responses.
## Why Limits Exist
Zilfu caps how often each token and user can call the API so a runaway script can't overload the platform or get our social media integrations flagged for abusive traffic. Normal usage — composing, scheduling, listing your queue — will never come close to these ceilings. They only kick in when a client misbehaves.
Two limits apply to API and MCP traffic: a global `api` ceiling on every authenticated request, and a tighter `publish` ceiling on endpoints that create or reshape scheduled posts.
## Limits at a Glance
| Limit | Scope | Cap | Keyed by |
|---|---|---|---|
| `api` | Every `/api/*` request | 120 / min | API token (falls back to user, then IP) |
| `publish` | `POST /api/spaces/{space}/posts`, `PUT /api/spaces/{space}/clusters/{cluster_id}` | 30 / min | User |
The two limits stack — a request to a publishing endpoint counts against both `publish` and `api`. In practice `publish` is always the ceiling that bites first.
## API Requests
Every authenticated request to `/api/*` is counted against the `api` limiter, capped at 120 requests per minute.
The bucket is keyed per personal access token, so each token you generate from Settings → API Tokens earns its own independent quota. If a noisy automation is eating its budget, splitting it onto a second token will give it a fresh 120/min without affecting your other integrations.
The MCP server authenticates with the same personal access token as the REST API, so MCP traffic and direct API calls made with the same token share one bucket.
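Because the server enforces its cap per token, a client can avoid 429s almost entirely by throttling itself just under the ceiling. The sketch below is illustrative, not part of Zilfu's API: `ClientThrottle` and the 100/min safety margin are assumptions of this example.

```python
import time


class ClientThrottle:
    """Client-side token bucket that stays under the server's 120/min cap.

    Running at 100/min leaves headroom so your client almost never
    sees a 429. This is a local sketch, not Zilfu's server logic.
    """

    def __init__(self, capacity=100, window=60.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = capacity / window  # tokens regained per second
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def acquire(self):
        """Block until a request slot is free, then consume one token."""
        while True:
            now = self.clock()
            elapsed = now - self.last
            self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            # Sleep exactly long enough for the next token to appear.
            time.sleep((1 - self.tokens) / self.refill_rate)
```

Call `throttle.acquire()` immediately before each API request; the bucket refills continuously, so sustained traffic settles at the configured rate.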
## Publishing
The `publish` limiter allows 30 requests per minute per user across:

- `POST /api/spaces/{space}/posts` — creating posts (immediate, scheduled, or queued).
- `PUT /api/spaces/{space}/clusters/{cluster_id}` — updating a multi-account cluster.
This bucket is keyed by user, not by token, so splitting a publishing workload across multiple tokens for the same account does not raise the ceiling. If you need to backfill a large volume of scheduled posts, spread them across more than one minute.
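One way to spread a backfill over multiple minutes is to plan it as timed batches up front. This helper is a sketch under the 30/min figure above; the payload shape and helper name are hypothetical.

```python
def paced_batches(posts, per_minute=30):
    """Split a backlog into batches that fit under the 30/min publish cap.

    Yields (delay_seconds, batch) pairs: sleep for delay_seconds,
    then send the batch. Payloads can be any sequence of items.
    """
    for i in range(0, len(posts), per_minute):
        batch = posts[i:i + per_minute]
        # Wait a full window before every batch except the first.
        delay = 0 if i == 0 else 60
        yield delay, batch
```

A driver loop would look like: for each `(delay, batch)` pair, `time.sleep(delay)` and then create each post in the batch.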
## Hitting a Limit
When you exceed a limit the API returns HTTP `429 Too Many Requests` with a JSON body:

```json
{
    "message": "Too Many Requests"
}
```

Every response — successful or throttled — carries headers describing the bucket state:
| Header | Meaning |
|---|---|
| `X-RateLimit-Limit` | Maximum requests allowed in the current window |
| `X-RateLimit-Remaining` | Requests still available in the current window |
| `Retry-After` | Seconds until the bucket frees up (only on 429 responses) |
| `X-RateLimit-Reset` | Unix timestamp of when the bucket frees up (only on 429 responses) |
When two limiters apply (publishing endpoints), the headers reflect the limit that was hit — usually `publish`.
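A client can turn those headers into a pause decision with a few lines. This is a sketch: `wait_seconds` is a hypothetical helper, and `headers` is assumed to be a plain dict of response headers.

```python
def wait_seconds(status, headers, default=1.0):
    """Decide how long to pause based on rate-limit headers.

    On a 429 we trust Retry-After; otherwise we only slow down
    when the bucket is nearly empty.
    """
    if status == 429:
        return float(headers.get("Retry-After", default))
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    # Nearly out of budget: pause briefly instead of running into the 429.
    return default if remaining <= 1 else 0.0
```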
### Handling 429s
Retrying immediately after a 429 will keep failing and may extend how long you stay throttled. Always wait for `Retry-After` before the next attempt.
A few practical rules:
- Respect `Retry-After`. Sleep for at least the number of seconds it returns before issuing the same request again.
- Back off exponentially on repeated failures — double the wait on each retry, with a sensible ceiling (for example 60 seconds), and give up after a handful of attempts.
- Watch `X-RateLimit-Remaining` on every response and slow your client down before you hit zero, rather than waiting for the 429.
- Spread bursty work over time. If you have a large backlog of posts to schedule, pace them out instead of firing all 100 at once.
- Use separate tokens for separate workloads. The `api` bucket is per-token, so a one-off backfill script on its own token won't starve your production integration.
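The rules above combine into a small retry loop. This is a sketch, not a supplied client: `do_request` is a hypothetical callable returning `(status, headers, body)`, and the doubling-with-ceiling schedule matches the guidance above.

```python
import time


def send_with_backoff(do_request, max_attempts=5, base=1.0, ceiling=60.0,
                      sleep=time.sleep):
    """Retry a throttled request with exponential backoff.

    Honours Retry-After when present, otherwise doubles the wait
    from `base` up to `ceiling`, and gives up after `max_attempts`.
    """
    wait = base
    for attempt in range(max_attempts):
        status, headers, body = do_request()
        if status != 429:
            return status, headers, body
        # Prefer the server's own figure, but never wait less than our backoff.
        retry_after = float(headers.get("Retry-After", 0))
        sleep(max(wait, retry_after))
        wait = min(wait * 2, ceiling)
    raise RuntimeError("still throttled after %d attempts" % max_attempts)
```

Injecting `sleep` makes the loop easy to test with a recorder instead of real delays.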