Rate Limiting

How It Works

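Rate limits apply per workspace, not per API key: all API keys that belong to the same workspace draw from one shared pool. For example, if one key uses 800 of the workspace's requests in a given minute, only the remainder is available to every other key in that workspace.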

Default Limits

All new workspaces start with a default limit of 1,000 requests per minute. Contact us to adjust limits for your workspace.

Rate Limit Headers

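API responses subject to rate limiting may include the following rate limit headers, which report the workspace's limit, the requests remaining in the current window, and when the window resets: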

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 995
X-RateLimit-Reset: 1705312260
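
Because these headers may be absent, client code should treat them as optional. The following is a minimal sketch of reading them, not an official client. It assumes that X-RateLimit-Reset is a Unix timestamp in seconds (the example value above is consistent with that, but this page does not state it) and that `headers` exposes a `get(name)` method, as the fetch `Headers` object does. The name `parseRateLimitHeaders` is hypothetical.

```javascript
// Reads the rate limit headers from a response, or returns null if any is missing.
// Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
function parseRateLimitHeaders(headers, nowMs = Date.now()) {
  const limit = headers.get("X-RateLimit-Limit");
  const remaining = headers.get("X-RateLimit-Remaining");
  const reset = headers.get("X-RateLimit-Reset");

  // The headers are optional, so callers must handle a null result.
  if (limit == null || remaining == null || reset == null) {
    return null;
  }

  const resetMs = Number(reset) * 1000; // seconds to milliseconds
  return {
    limit: Number(limit),
    remaining: Number(remaining),
    resetAt: new Date(resetMs),
    msUntilReset: Math.max(0, resetMs - nowMs), // never negative
  };
}
```

With fetch, this would be called as `parseRateLimitHeaders(response.headers)`. A null result means the headers were not sent and the caller has to rely on its own throttling.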

Handling 429 Responses

When you exceed the rate limit, the API returns 429 Too Many Requests. Use exponential backoff to retry:

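// Calls fn up to maxRetries times in total, retrying only on 429 responses.
// Assumes fn rejects with an error that exposes the HTTP status as err.status.
async function callWithRetry(fn, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const isLastAttempt = attempt === maxRetries - 1;
      if (err.status !== 429 || isLastAttempt) {
        throw err; // not a rate limit error, or attempts exhausted
      }
      // Exponential backoff: the base delay doubles on each attempt (1s, 2s, 4s, ...).
      const baseDelay = 2 ** attempt * 1000;
      // Jitter adds up to one extra base delay so clients do not retry in lockstep.
      const delay = baseDelay + Math.random() * baseDelay;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}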
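
callWithRetry retries only when the thrown error carries a status of 429, but the standard fetch API resolves normally on HTTP error statuses instead of throwing. One way to bridge that gap is a small wrapper, sketched below. `HttpError`, `fetchOrThrow`, the `fetchImpl` parameter, and the example URL are hypothetical names used for illustration and are not part of any SDK.

```javascript
// Error type that carries the HTTP status so retry logic can inspect err.status.
class HttpError extends Error {
  constructor(response) {
    super(`HTTP ${response.status}`);
    this.name = "HttpError";
    this.status = response.status;
  }
}

// Wraps fetch so that non-2xx responses reject instead of resolving.
// fetchImpl defaults to the global fetch and can be swapped out in tests.
async function fetchOrThrow(url, options, fetchImpl = fetch) {
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new HttpError(response);
  }
  return response;
}

// Usage with the retry helper above (the URL is a placeholder):
// const response = await callWithRetry(() =>
//   fetchOrThrow("https://api.example.com/v1/resource")
// );
```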

Best Practices

Monitor X-RateLimit-Remaining to proactively throttle requests.
If rate limit headers are absent, fall back to your own per-workspace queue.
Use exponential backoff with jitter when retrying.
There is no batch evaluation endpoint today. Queue individual evaluation requests and keep concurrency below the workspace limit.
Enterprise plans can request higher limits; contact us to arrange an increase.
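
The throttling and queueing advice above can be sketched as two small client-side helpers. This is an illustration under stated assumptions, not a prescribed implementation: `paceDelayMs` and `createQueue` are hypothetical names, and the pacing helper assumes you already know how many requests remain and how many milliseconds are left until the window resets.

```javascript
// Suggests how long to wait before the next request so that the remaining
// budget is spread evenly over the time left in the current window.
function paceDelayMs(remaining, msUntilReset) {
  if (remaining <= 0) {
    return msUntilReset; // budget spent: wait for the window to reset
  }
  return msUntilReset / remaining;
}

// Creates a queue that runs at most `concurrency` tasks at a time.
// Each task is a function returning a promise (for example, one API call).
function createQueue(concurrency) {
  let active = 0;
  const pending = [];

  function next() {
    if (active >= concurrency || pending.length === 0) {
      return;
    }
    active++;
    const { task, resolve, reject } = pending.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // start the next waiting task, if any
      });
  }

  // enqueue returns a promise that settles with the task's own result.
  return function enqueue(task) {
    return new Promise((resolve, reject) => {
      pending.push({ task, resolve, reject });
      next();
    });
  };
}
```

Sharing one queue across every API key in a workspace mirrors the shared limit pool. Note that a concurrency cap alone does not bound requests per minute when individual requests finish quickly, so it works best combined with pacing: for example, `paceDelayMs(500, 30000)` suggests waiting 60 ms between requests.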