Rate Limiting

Understand rate limits and how to handle them gracefully.

How It Works

Rate limiting is applied per workspace, not per API key: all API keys belonging to the same workspace share one rate limit pool.

Default Limits

Rate limits are configured per workspace. Typical defaults:

Workspace Type    Requests per Minute
Standard          60
Production        1,000
Enterprise        Custom

Contact us to adjust limits for your workspace.

Rate Limit Headers

Every response includes rate limit information:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 995
X-RateLimit-Reset: 1705312260
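
As a minimal sketch of how a client might read these headers, the helper below parses them from a fetch-style Headers object. The function name readRateLimit is hypothetical, and the code assumes that X-RateLimit-Reset is a Unix timestamp in seconds, which the sample value suggests but this page does not state.

```javascript
// Hypothetical helper, not part of any SDK.
// Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
function readRateLimit(headers, nowMs = Date.now()) {
  // A missing header yields null from get(), which Number() converts to 0.
  const limit = Number(headers.get("X-RateLimit-Limit"));
  const remaining = Number(headers.get("X-RateLimit-Remaining"));
  const resetAtMs = Number(headers.get("X-RateLimit-Reset")) * 1000;
  return {
    limit,
    remaining,
    resetAt: new Date(resetAtMs),
    // Never negative, even if the reset time has already passed.
    msUntilReset: Math.max(0, resetAtMs - nowMs),
  };
}

// Example using the header values shown above, with "now" fixed
// 60 seconds before the reset time.
const sample = new Headers({
  "X-RateLimit-Limit": "1000",
  "X-RateLimit-Remaining": "995",
  "X-RateLimit-Reset": "1705312260",
});
const info = readRateLimit(sample, 1705312200 * 1000);
console.log(info.remaining, info.msUntilReset); // prints: 995 60000
```

With a real response, the same call would be readRateLimit(response.headers).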

Handling 429 Responses

When you exceed the rate limit, the API returns 429 Too Many Requests. Use exponential backoff to retry:

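// Calls fn and retries on 429 responses with exponential backoff.
// fn must throw an error with a numeric `status` property on failure.
// maxRetries is the total number of attempts, including the first call.
async function callWithRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (err) {
      // Retry only on 429, and only while attempts remain.
      if (err.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // 1 s, 2 s, 4 s, ...
        await new Promise(r => setTimeout(r, delay));
        continue;
      }
      // Any other error, or a 429 on the final attempt, goes to the caller.
      throw err;
    }
  }
}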
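
callWithRetry only retries when fn throws an error carrying a status property, and the standard fetch function does not throw on HTTP error statuses such as 429. One way to bridge the two is a small wrapper like the one sketched below. The name fetchOrThrow, the fetchImpl parameter, and the URL in the usage comment are all hypothetical.

```javascript
// Hypothetical wrapper: converts a non-2xx fetch response into a thrown
// error with a `status` property, the shape callWithRetry checks for.
// fetchImpl defaults to the global fetch and can be replaced in tests.
async function fetchOrThrow(url, options = {}, fetchImpl = fetch) {
  const res = await fetchImpl(url, options);
  if (!res.ok) {
    const err = new Error(`Request failed with status ${res.status}`);
    err.status = res.status;
    throw err;
  }
  return res;
}

// Usage with the retry helper above (placeholder URL):
//   const res = await callWithRetry(() =>
//     fetchOrThrow("https://api.example.com/v1/resource"));
```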

Best Practices

  • Monitor X-RateLimit-Remaining to proactively throttle requests.
  • Use exponential backoff with jitter when retrying.
  • Batch evaluation requests where possible instead of sending them one by one.
  • Contact us to increase limits for Enterprise plans.
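
The first two practices can be sketched as two small pure functions. This is an illustration under stated assumptions, not a prescribed implementation: the function names, the 1-second base, the 30-second cap, and the threshold of 5 remaining requests are all arbitrary choices, and the jitter variant shown is "full jitter" (a uniformly random delay up to the exponential ceiling).

```javascript
// Full-jitter backoff: a random delay in [0, min(capMs, baseMs * 2^attempt)).
// `random` is injectable so the result can be made deterministic in tests.
// Assumed defaults: 1 s base, 30 s cap.
function backoffWithJitter(attempt, baseMs = 1000, capMs = 30000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Proactive throttling: once the remaining budget drops to `threshold`
// or below, wait until the window resets; otherwise send immediately.
// `remaining` comes from X-RateLimit-Remaining; `msUntilReset` is the
// time until the window resets, however the client computes it.
function throttleDelayMs(remaining, msUntilReset, threshold = 5) {
  return remaining <= threshold ? msUntilReset : 0;
}
```

In the retry loop shown earlier, backoffWithJitter(i) could replace the fixed Math.pow(2, i) * 1000 delay, so that clients sharing a workspace do not all retry at the same instant.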