API Rate Limits
Understanding and managing API rate limits for optimal performance.
Overview
Rate limits protect our API infrastructure and ensure fair usage across all users. Limits are applied per API key and vary based on your subscription plan.
Rate Limit Tiers
Plan-Based Limits
Plan | Requests/Minute | Requests/Hour | Requests/Day | Concurrent Requests |
---|---|---|---|---|
Free | 10 | 100 | 1,000 | 1 |
Pro | 60 | 1,000 | 10,000 | 5 |
Ultra | 120 | 5,000 | 50,000 | 10 |
Enterprise | Custom | Custom | Custom | Custom |
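For programmatic checks, the tiers above can be captured in a small lookup. The values are copied from the table; Enterprise is omitted because its limits are custom, and `smallestPlanFor` is an illustrative helper, not part of the API:

```javascript
// Per-plan limits from the table above (Enterprise omitted: custom limits).
const PLAN_LIMITS = {
  free: { perMinute: 10, perHour: 100, perDay: 1000, concurrent: 1 },
  pro: { perMinute: 60, perHour: 1000, perDay: 10000, concurrent: 5 },
  ultra: { perMinute: 120, perHour: 5000, perDay: 50000, concurrent: 10 },
};

// Smallest plan whose per-minute and per-day limits cover the given load.
function smallestPlanFor(perMinute, perDay) {
  for (const [plan, limits] of Object.entries(PLAN_LIMITS)) {
    if (limits.perMinute >= perMinute && limits.perDay >= perDay) return plan;
  }
  return 'enterprise';
}
```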
Service-Specific Limits
Some services have additional rate limits:
Service | Additional Limits | Notes |
---|---|---|
Image Generation | 20/minute | Resource-intensive operation |
Video Generation | 5/minute | Highest resource usage |
Chat API | 60/minute | Token-based pricing |
Detection Tools | 100/minute | Fast processing |
Blockchain Audit | 10/minute | Complex analysis |
Rate Limit Headers
Every API response includes rate limit information:
```http
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 57
X-RateLimit-Reset: 1703001600
X-RateLimit-Policy: 60;w=60
```
Header Descriptions
Header | Description | Example |
---|---|---|
X-RateLimit-Limit | Maximum requests allowed | 60 |
X-RateLimit-Remaining | Requests remaining in window | 57 |
X-RateLimit-Reset | Unix timestamp when limit resets | 1703001600 |
X-RateLimit-Policy | Rate limit policy details | 60;w=60 |
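The `limit;w=window` shape of `X-RateLimit-Policy` follows the IETF RateLimit header fields draft. A hypothetical parser for that value (the function name is ours, not part of any SDK):

```javascript
// Parse an "X-RateLimit-Policy" value such as "60;w=60" into its
// request limit and window length in seconds.
function parseRateLimitPolicy(policy) {
  const [limitPart, ...params] = policy.split(';');
  const result = { limit: parseInt(limitPart, 10), windowSeconds: null };
  for (const param of params) {
    const [key, value] = param.trim().split('=');
    if (key === 'w') result.windowSeconds = parseInt(value, 10);
  }
  return result;
}
```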
Rate Limit Responses
Approaching Limit Warning
When you have less than 20% of your rate limit remaining:
```http
X-RateLimit-Warning: Approaching rate limit
X-RateLimit-Remaining: 10
```
Rate Limit Exceeded
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 58
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1703001600
```

```json
{
  "error": "Rate limit exceeded",
  "code": "RATE_LIMIT_EXCEEDED",
  "message": "Too many requests. Please retry after 58 seconds.",
  "retryAfter": 58,
  "limit": 60,
  "window": "1 minute",
  "resetAt": "2023-12-19T16:00:00.000Z"
}
```
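A client can derive its wait time from either `Retry-After` (seconds) or `X-RateLimit-Reset` (Unix timestamp). One possible sketch, assuming headers are available as a plain object with lowercased keys (as Node's `http` module provides them):

```javascript
// Pick a wait time in ms from a 429 response: prefer Retry-After,
// fall back to X-RateLimit-Reset, else use a fixed default.
function waitTimeMs(headers, nowMs = Date.now()) {
  const retryAfter = headers['retry-after'];
  if (retryAfter !== undefined) return parseInt(retryAfter, 10) * 1000;

  const reset = headers['x-ratelimit-reset'];
  if (reset !== undefined) {
    // Reset is a Unix timestamp in seconds; never wait a negative amount
    return Math.max(0, parseInt(reset, 10) * 1000 - nowMs);
  }
  return 1000; // conservative default when neither header is present
}
```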
Handling Rate Limits
Exponential Backoff
Implement exponential backoff for retries:
```javascript
async function makeRequestWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      const response = await fetch(url, options);
      if (response.status === 429) {
        // Honor Retry-After when the server provides it;
        // otherwise back off exponentially (1s, 2s, 4s, ...)
        const retryAfter = response.headers.get('Retry-After');
        const delay = retryAfter ? parseInt(retryAfter, 10) * 1000 : Math.pow(2, i) * 1000;
        console.log(`Rate limited. Retrying after ${delay}ms`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      return response;
    } catch (error) {
      if (i === maxRetries - 1) throw error;
    }
  }
  // Still rate limited after every attempt
  throw new Error(`Rate limited after ${maxRetries} retries`);
}
```
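When many clients retry on the same schedule, their retries arrive in synchronized bursts. Adding random jitter spreads them out. A minimal sketch (the `rand` parameter is injectable purely so the function is testable; it is not an API feature):

```javascript
// Exponential backoff with "full jitter": the delay is drawn uniformly
// from [0, base * 2^attempt], capped at maxDelayMs.
function backoffDelay(attempt, baseMs = 1000, maxDelayMs = 30000, rand = Math.random) {
  const ceiling = Math.min(maxDelayMs, baseMs * 2 ** attempt);
  return Math.floor(rand() * ceiling);
}
```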
Request Queuing
Queue requests to stay within limits:
```javascript
class RateLimitedQueue {
  constructor(maxRequestsPerMinute) {
    this.maxRequests = maxRequestsPerMinute;
    this.queue = [];
    this.processing = false;
    this.requestTimes = [];
  }

  async add(requestFn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ requestFn, resolve, reject });
      this.process();
    });
  }

  async process() {
    if (this.processing || this.queue.length === 0) return;
    this.processing = true;

    const now = Date.now();
    // Drop request timestamps older than the 1-minute window
    this.requestTimes = this.requestTimes.filter(time => now - time < 60000);

    if (this.requestTimes.length < this.maxRequests) {
      const { requestFn, resolve, reject } = this.queue.shift();
      this.requestTimes.push(now);
      try {
        resolve(await requestFn());
      } catch (error) {
        reject(error);
      }
    }

    this.processing = false;

    if (this.queue.length > 0) {
      // At the limit: wait until the oldest request ages out of the window.
      // Under the limit: process the next queued request immediately.
      const delay = this.requestTimes.length >= this.maxRequests
        ? Math.max(0, 60000 - (Date.now() - this.requestTimes[0]))
        : 0;
      setTimeout(() => this.process(), delay);
    }
  }
}

// Usage
const queue = new RateLimitedQueue(60);
const result = await queue.add(() => fetch('/api/endpoint'));
```
Monitoring Rate Limits
Track your rate limit usage:
```javascript
function trackRateLimit(response) {
  // Header values arrive as strings; parse them before doing arithmetic
  const limit = parseInt(response.headers.get('X-RateLimit-Limit'), 10);
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
  const reset = parseInt(response.headers.get('X-RateLimit-Reset'), 10);

  const usage = ((limit - remaining) / limit) * 100;

  console.log(`Rate limit usage: ${usage.toFixed(1)}%`);
  console.log(`Remaining requests: ${remaining}/${limit}`);
  console.log(`Reset time: ${new Date(reset * 1000).toLocaleTimeString()}`);

  if (remaining < limit * 0.2) {
    console.warn('Approaching rate limit!');
  }
}
```
Best Practices
1. Batch Operations
Combine multiple operations into single requests:
```javascript
// Instead of multiple individual requests...
const individual = await Promise.all([
  generateImage(prompt1),
  generateImage(prompt2),
  generateImage(prompt3)
]);

// ...use a batch endpoint if one is available
const batched = await generateImageBatch([prompt1, prompt2, prompt3]);
```
2. Implement Caching
Cache responses to reduce API calls:
```javascript
const cache = new Map();
const CACHE_DURATION = 5 * 60 * 1000; // 5 minutes

async function getCached(key, fetchFn) {
  const cached = cache.get(key);
  if (cached && Date.now() - cached.timestamp < CACHE_DURATION) {
    return cached.data;
  }
  const data = await fetchFn();
  cache.set(key, { data, timestamp: Date.now() });
  return data;
}
```
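Cache hit rates depend on consistent keys: the same request must always produce the same key. A hypothetical helper that canonicalizes an endpoint plus JSON-serializable parameters (the function name is ours):

```javascript
// Build a stable cache key by sorting parameter names, so that
// { a: 1, b: 2 } and { b: 2, a: 1 } map to the same cache entry.
function cacheKey(endpoint, params = {}) {
  const sorted = Object.keys(params).sort()
    .map(k => `${k}=${JSON.stringify(params[k])}`)
    .join('&');
  return sorted ? `${endpoint}?${sorted}` : endpoint;
}
```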
3. Use Webhooks
For long-running operations, use webhooks instead of polling:
```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Instead of polling (every status check consumes rate limit)
async function pollForResult(jobId) {
  while (true) {
    const result = await checkStatus(jobId);
    if (result.complete) return result;
    await sleep(5000);
  }
}

// Use webhooks: submit once and get notified when the job completes
async function submitWithWebhook(data) {
  return await submitJob({
    ...data,
    webhookUrl: 'https://your-app.com/webhook/job-complete'
  });
}
```
Rate Limit Strategies
Time-Based Distribution
Spread requests evenly across time windows:
```javascript
class RateLimiter {
  constructor(requestsPerMinute) {
    this.interval = 60000 / requestsPerMinute;
    this.lastRequest = 0;
  }

  async throttle() {
    const now = Date.now();
    const timeSinceLastRequest = now - this.lastRequest;
    if (timeSinceLastRequest < this.interval) {
      const delay = this.interval - timeSinceLastRequest;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
    this.lastRequest = Date.now();
  }
}

const limiter = new RateLimiter(60);

async function makeRequest() {
  await limiter.throttle();
  return fetch('/api/endpoint');
}
```
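The fixed-interval throttle above enforces a perfectly even pace. If your traffic is bursty, a token bucket is a common alternative: it allows short bursts up to a capacity while refilling at a steady rate. A minimal sketch (the injectable `now` clock exists only to make the class testable):

```javascript
// Token bucket: allows bursts up to `capacity` requests, refilled
// continuously at `refillPerSecond` tokens per second.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.last = now();
  }

  // Returns true and consumes a token if one is available, else false.
  tryRemove() {
    const t = this.now();
    const refill = ((t - this.last) / 1000) * this.refillPerSecond;
    this.tokens = Math.min(this.capacity, this.tokens + refill);
    this.last = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```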
Priority Queuing
Prioritize important requests:
```javascript
class PriorityQueue {
  constructor() {
    this.queues = {
      high: [],
      normal: [],
      low: []
    };
  }

  add(request, priority = 'normal') {
    this.queues[priority].push(request);
  }

  next() {
    if (this.queues.high.length > 0) return this.queues.high.shift();
    if (this.queues.normal.length > 0) return this.queues.normal.shift();
    if (this.queues.low.length > 0) return this.queues.low.shift();
    return null;
  }
}
```
Monitoring and Alerts
Usage Tracking
```javascript
class RateLimitMonitor {
  constructor() {
    this.requests = [];
    this.alerts = [];
  }

  track(endpoint, remaining, limit) {
    this.requests.push({
      endpoint,
      timestamp: Date.now(),
      remaining,
      limit,
      usage: (limit - remaining) / limit
    });

    // Alert when usage exceeds 80% (fewer than 20% of requests remain)
    if (remaining < limit * 0.2) {
      this.alerts.push({
        type: 'high_usage',
        endpoint,
        usage: ((limit - remaining) / limit * 100).toFixed(1)
      });
    }
  }

  getStats() {
    const now = Date.now();
    // Only consider requests from the last 5 minutes
    const recentRequests = this.requests.filter(r => now - r.timestamp < 300000);
    const avgUsage = recentRequests.length > 0
      ? recentRequests.reduce((sum, r) => sum + r.usage, 0) / recentRequests.length
      : 0;
    return {
      totalRequests: recentRequests.length,
      avgUsage,
      alerts: this.alerts
    };
  }
}
```
Upgrading Limits
When to Upgrade
Consider upgrading when:
- Consistently hitting rate limits
- Business growth requires more capacity
- Need higher concurrent request limits
- Require dedicated infrastructure
Enterprise Options
Enterprise plans offer:
- Custom rate limits
- Dedicated endpoints
- Priority support
- SLA guarantees
- Advanced monitoring
Contact sales@inspirahub.net for enterprise pricing.
Troubleshooting
Common Issues
Sudden Rate Limit Errors
- Check for code loops
- Verify retry logic
- Monitor for traffic spikes
Inconsistent Limits
- Confirm API key is correct
- Check subscription status
- Verify endpoint-specific limits
Reset Time Issues
- Use server time, not local
- Account for timezone differences
- Check system clock accuracy