How It Works
Rate limiting is applied per-workspace, not per API key. This means all API keys belonging to the same workspace share the same rate limit pool.
Default Limits
All new workspaces start with a default limit of 1,000 requests per minute. Contact us to adjust limits for your workspace.
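Because the pool is shared, a client that uses several keys for one workspace may want a single local limiter for all of them. The following is a minimal sketch of a client-side sliding-window limiter. The names WorkspaceLimiter, tryAcquire, and limiterFor are hypothetical, and the sliding 60-second window is an assumption, since the server's exact windowing algorithm is not described here.

```javascript
// Client-side sliding-window limiter shared by every API key in one workspace.
// Class and function names are illustrative assumptions; only the default of
// 1,000 requests per minute comes from the documentation above.
class WorkspaceLimiter {
  constructor(limit = 1000, windowMs = 60_000, now = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;        // injectable clock, useful for tests
    this.timestamps = [];  // send times of requests inside the current window
  }

  // Returns 0 if a request may be sent now (and records it),
  // otherwise the number of milliseconds to wait before trying again.
  tryAcquire() {
    const t = this.now();
    // Drop timestamps that have aged out of the window.
    while (this.timestamps.length > 0 && t - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.limit) {
      this.timestamps.push(t);
      return 0;
    }
    // The oldest request leaves the window at timestamps[0] + windowMs.
    return this.timestamps[0] + this.windowMs - t;
  }
}

// One limiter per workspace, regardless of which key signs the request.
const limiters = new Map();
function limiterFor(workspaceId) {
  if (!limiters.has(workspaceId)) {
    limiters.set(workspaceId, new WorkspaceLimiter());
  }
  return limiters.get(workspaceId);
}
```

A caller would look up limiterFor(workspaceId) before each request and sleep for the returned number of milliseconds when it is non-zero. Keying the map by workspace rather than by API key mirrors the shared pool described above.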
Rate Limit Headers
Responses from rate-limited endpoints may include the following rate limit headers:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 995
X-RateLimit-Reset: 1705312260
Handling 429 Responses
When you exceed the rate limit, the API returns 429 Too Many Requests.
Use exponential backoff to retry:
async function callWithRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (err) {
      // Retry only on 429, and only while attempts remain.
      if (err.status === 429 && i < maxRetries - 1) {
        // Wait 1s, 2s, 4s, ... before the next attempt.
        const delay = Math.pow(2, i) * 1000;
        await new Promise(r => setTimeout(r, delay));
        continue;
      }
      throw err;
    }
  }
}
Best Practices
- Monitor X-RateLimit-Remaining to proactively throttle requests.
- If rate limit headers are absent, fall back to your own per-workspace queue.
- Use exponential backoff with jitter when retrying.
- There is no batch evaluation endpoint today. Queue individual evaluation requests and keep concurrency below the workspace limit.
- Contact us to increase limits for Enterprise plans.
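The header monitoring and jitter advice above can be sketched as a few small helpers. This is an illustration, not an official client. The helper names are hypothetical, X-RateLimit-Reset is assumed to be a Unix timestamp in seconds (its unit is not stated here), and "full jitter" is one common jitter strategy chosen for the example.

```javascript
// Sketch only: helper names are hypothetical, and X-RateLimit-Reset is ASSUMED
// to be a Unix timestamp in seconds.

// Reads rate limit headers from a fetch-style Headers object (anything with
// .get) or from a plain object with lowercase keys.
// Returns null when the headers are absent or unparsable.
function readRateLimit(headers) {
  const get = (name) =>
    typeof headers.get === 'function' ? headers.get(name) : headers[name.toLowerCase()];
  const remaining = Number.parseInt(get('X-RateLimit-Remaining'), 10);
  const reset = Number.parseInt(get('X-RateLimit-Reset'), 10);
  if (Number.isNaN(remaining) || Number.isNaN(reset)) return null;
  return { remaining, resetMs: reset * 1000 };
}

// Milliseconds to pause before the next request: 0 while budget remains,
// otherwise the time left until the (assumed) reset instant.
// Also returns 0 when headers are absent; in that case rely on your own queue.
function throttleDelayMs(headers, nowMs = Date.now()) {
  const info = readRateLimit(headers);
  if (info === null || info.remaining > 0) return 0;
  return Math.max(0, info.resetMs - nowMs);
}

// Exponential backoff with full jitter: a random delay in
// [0, min(maxMs, baseMs * 2^attempt)), so concurrent clients spread out.
function backoffWithJitterMs(attempt, baseMs = 1000, maxMs = 30_000, random = Math.random) {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

To add jitter to callWithRetry, the fixed delay Math.pow(2, i) * 1000 could be swapped for backoffWithJitterMs(i). Calling throttleDelayMs on each response's headers before sending the next request gives a simple form of proactive throttling.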