API Rate Limits
Understanding and managing API rate limits for optimal performance.
Overview
Rate limits protect our API infrastructure and ensure fair usage across all users. Limits are applied per API key and vary based on your subscription plan.
Rate Limit Tiers
Plan-Based Limits
Plan | Requests/Minute | Requests/Hour | Requests/Day | Concurrent Requests |
---|---|---|---|---|
Free | 10 | 100 | 1,000 | 1 |
Pro | 60 | 1,000 | 10,000 | 5 |
Ultra | 120 | 5,000 | 50,000 | 10 |
Enterprise | Custom | Custom | Custom | Custom |
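For programmatic checks, the tiers above can be captured in a small lookup. The values are copied from the table; Enterprise is omitted because its limits are custom, and `smallestPlanFor` is an illustrative helper, not part of the API:

```javascript
// Per-plan limits from the table above (Enterprise omitted: custom limits).
const PLAN_LIMITS = {
  free: { perMinute: 10, perHour: 100, perDay: 1000, concurrent: 1 },
  pro: { perMinute: 60, perHour: 1000, perDay: 10000, concurrent: 5 },
  ultra: { perMinute: 120, perHour: 5000, perDay: 50000, concurrent: 10 },
};

// Smallest plan whose per-minute and per-day limits cover the given load.
function smallestPlanFor(perMinute, perDay) {
  for (const [plan, limits] of Object.entries(PLAN_LIMITS)) {
    if (limits.perMinute >= perMinute && limits.perDay >= perDay) return plan;
  }
  return 'enterprise';
}
```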
Service-Specific Limits
Some services have additional rate limits:
Service | Additional Limits | Notes |
---|---|---|
Image Generation | 20/minute | Resource-intensive operation |
Video Generation | 5/minute | Highest resource usage |
Chat API | 60/minute | Token-based pricing |
Detection Tools | 100/minute | Fast processing |
Blockchain Audit | 10/minute | Complex analysis |
Rate Limit Headers
Every API response includes rate limit information:
```http
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 57
X-RateLimit-Reset: 1703001600
X-RateLimit-Policy: 60;w=60
```
Header Descriptions
Header | Description | Example |
---|---|---|
X-RateLimit-Limit | Maximum requests allowed | 60 |
X-RateLimit-Remaining | Requests remaining in window | 57 |
X-RateLimit-Reset | Unix timestamp when limit resets | 1703001600 |
X-RateLimit-Policy | Rate limit policy details | 60;w=60 |
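The `limit;w=window` shape of `X-RateLimit-Policy` follows the IETF RateLimit header fields draft. A hypothetical parser for that value (the function name is ours, not part of any SDK):

```javascript
// Parse an "X-RateLimit-Policy" value such as "60;w=60" into its
// request limit and window length in seconds.
function parseRateLimitPolicy(policy) {
  const [limitPart, ...params] = policy.split(';');
  const result = { limit: parseInt(limitPart, 10), windowSeconds: null };
  for (const param of params) {
    const [key, value] = param.trim().split('=');
    if (key === 'w') result.windowSeconds = parseInt(value, 10);
  }
  return result;
}
```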
Rate Limit Responses
Approaching Limit Warning
When you have less than 20% of your rate limit remaining:
```http
X-RateLimit-Warning: Approaching rate limit
X-RateLimit-Remaining: 10
```
Rate Limit Exceeded
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 58
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1703001600
```

```json
{
  "error": "Rate limit exceeded",
  "code": "RATE_LIMIT_EXCEEDED",
  "message": "Too many requests. Please retry after 58 seconds.",
  "retryAfter": 58,
  "limit": 60,
  "window": "1 minute",
  "resetAt": "2023-12-19T16:00:00.000Z"
}
```
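A client can derive its wait time from either `Retry-After` (seconds) or `X-RateLimit-Reset` (Unix timestamp). One possible sketch, assuming headers are available as a plain object with lowercased keys (as Node's `http` module provides them):

```javascript
// Pick a wait time in ms from a 429 response: prefer Retry-After,
// fall back to X-RateLimit-Reset, else use a fixed default.
function waitTimeMs(headers, nowMs = Date.now()) {
  const retryAfter = headers['retry-after'];
  if (retryAfter !== undefined) return parseInt(retryAfter, 10) * 1000;

  const reset = headers['x-ratelimit-reset'];
  if (reset !== undefined) {
    // Reset is a Unix timestamp in seconds; never wait a negative amount
    return Math.max(0, parseInt(reset, 10) * 1000 - nowMs);
  }
  return 1000; // conservative default when neither header is present
}
```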
Handling Rate Limits
Exponential Backoff
Implement exponential backoff for retries:
```javascript
async function makeRequestWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      const response = await fetch(url, options);
      if (response.status === 429) {
        // Honor Retry-After when the server provides it;
        // otherwise back off exponentially (1s, 2s, 4s, ...)
        const retryAfter = response.headers.get('Retry-After');
        const delay = retryAfter ? parseInt(retryAfter, 10) * 1000 : Math.pow(2, i) * 1000;
        console.log(`Rate limited. Retrying after ${delay}ms`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      return response;
    } catch (error) {
      if (i === maxRetries - 1) throw error;
    }
  }
  // Still rate limited after every attempt
  throw new Error(`Rate limited after ${maxRetries} retries`);
}
```
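When many clients retry on the same schedule, their retries arrive in synchronized bursts. Adding random jitter spreads them out. A minimal sketch (the `rand` parameter is injectable purely so the function is testable; it is not an API feature):

```javascript
// Exponential backoff with "full jitter": the delay is drawn uniformly
// from [0, base * 2^attempt], capped at maxDelayMs.
function backoffDelay(attempt, baseMs = 1000, maxDelayMs = 30000, rand = Math.random) {
  const ceiling = Math.min(maxDelayMs, baseMs * 2 ** attempt);
  return Math.floor(rand() * ceiling);
}
```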
Request Queuing
Queue requests to stay within limits:
```javascript
class RateLimitedQueue {
  constructor(maxRequestsPerMinute) {
    this.maxRequests = maxRequestsPerMinute;
    this.queue = [];
    this.processing = false;
    this.requestTimes = [];
  }

  async add(requestFn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ requestFn, resolve, reject });
      this.process();
    });
  }

  async process() {
    if (this.processing || this.queue.length === 0) return;
    this.processing = true;

    const now = Date.now();
    // Drop request timestamps older than the 1-minute window
    this.requestTimes = this.requestTimes.filter(time => now - time < 60000);

    if (this.requestTimes.length < this.maxRequests) {
      const { requestFn, resolve, reject } = this.queue.shift();
      this.requestTimes.push(now);
      try {
        resolve(await requestFn());
      } catch (error) {
        reject(error);
      }
    }

    this.processing = false;

    if (this.queue.length > 0) {
      // At the limit: wait until the oldest request ages out of the window.
      // Under the limit: process the next queued request immediately.
      const delay = this.requestTimes.length >= this.maxRequests
        ? Math.max(0, 60000 - (Date.now() - this.requestTimes[0]))
        : 0;
      setTimeout(() => this.process(), delay);
    }
  }
}

// Usage
const queue = new RateLimitedQueue(60);
const result = await queue.add(() => fetch('/api/endpoint'));
```
Monitoring Rate Limits
Track your rate limit usage:
```javascript
function trackRateLimit(response) {
  // Header values arrive as strings; parse them before doing arithmetic
  const limit = parseInt(response.headers.get('X-RateLimit-Limit'), 10);
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
  const reset = parseInt(response.headers.get('X-RateLimit-Reset'), 10);

  const usage = ((limit - remaining) / limit) * 100;

  console.log(`Rate limit usage: ${usage.toFixed(1)}%`);
  console.log(`Remaining requests: ${remaining}/${limit}`);
  console.log(`Reset time: ${new Date(reset * 1000).toLocaleTimeString()}`);

  if (remaining < limit * 0.2) {
    console.warn('Approaching rate limit!');
  }
}
```
Best Practices
1. Batch Operations
Combine multiple operations into single requests:
```javascript
// Instead of multiple individual requests...
const individual = await Promise.all([
  generateImage(prompt1),
  generateImage(prompt2),
  generateImage(prompt3)
]);

// ...use a batch endpoint if one is available
const batched = await generateImageBatch([prompt1, prompt2, prompt3]);
```
2. Implement Caching
Cache responses to reduce API calls:
```javascript
const cache = new Map();
const CACHE_DURATION = 5 * 60 * 1000; // 5 minutes

async function getCached(key, fetchFn) {
  const cached = cache.get(key);
  if (cached && Date.now() - cached.timestamp < CACHE_DURATION) {
    return cached.data;
  }
  const data = await fetchFn();
  cache.set(key, { data, timestamp: Date.now() });
  return data;
}
```
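Cache hit rates depend on consistent keys: the same request must always produce the same key. A hypothetical helper that canonicalizes an endpoint plus JSON-serializable parameters (the function name is ours):

```javascript
// Build a stable cache key by sorting parameter names, so that
// { a: 1, b: 2 } and { b: 2, a: 1 } map to the same cache entry.
function cacheKey(endpoint, params = {}) {
  const sorted = Object.keys(params).sort()
    .map(k => `${k}=${JSON.stringify(params[k])}`)
    .join('&');
  return sorted ? `${endpoint}?${sorted}` : endpoint;
}
```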
3. Use Webhooks
For long-running operations, use webhooks instead of polling:
```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Instead of polling (every status check consumes rate limit)
async function pollForResult(jobId) {
  while (true) {
    const result = await checkStatus(jobId);
    if (result.complete) return result;
    await sleep(5000);
  }
}

// Use webhooks: submit once and get notified when the job completes
async function submitWithWebhook(data) {
  return await submitJob({
    ...data,
    webhookUrl: 'https://your-app.com/webhook/job-complete'
  });
}
```
Rate Limit Strategies
Time-Based Distribution
Spread requests evenly across time windows:
```javascript
class RateLimiter {
  constructor(requestsPerMinute) {
    this.interval = 60000 / requestsPerMinute;
    this.lastRequest = 0;
  }

  async throttle() {
    const now = Date.now();
    const timeSinceLastRequest = now - this.lastRequest;
    if (timeSinceLastRequest < this.interval) {
      const delay = this.interval - timeSinceLastRequest;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
    this.lastRequest = Date.now();
  }
}

const limiter = new RateLimiter(60);

async function makeRequest() {
  await limiter.throttle();
  return fetch('/api/endpoint');
}
```
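The fixed-interval throttle above enforces a perfectly even pace. If your traffic is bursty, a token bucket is a common alternative: it allows short bursts up to a capacity while refilling at a steady rate. A minimal sketch (the injectable `now` clock exists only to make the class testable):

```javascript
// Token bucket: allows bursts up to `capacity` requests, refilled
// continuously at `refillPerSecond` tokens per second.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.last = now();
  }

  // Returns true and consumes a token if one is available, else false.
  tryRemove() {
    const t = this.now();
    const refill = ((t - this.last) / 1000) * this.refillPerSecond;
    this.tokens = Math.min(this.capacity, this.tokens + refill);
    this.last = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```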
Priority Queuing
Prioritize important requests:
```javascript
class PriorityQueue {
  constructor() {
    this.queues = {
      high: [],
      normal: [],
      low: []
    };
  }

  add(request, priority = 'normal') {
    this.queues[priority].push(request);
  }

  next() {
    if (this.queues.high.length > 0) return this.queues.high.shift();
    if (this.queues.normal.length > 0) return this.queues.normal.shift();
    if (this.queues.low.length > 0) return this.queues.low.shift();
    return null;
  }
}
```
Monitoring and Alerts
Usage Tracking
```javascript
class RateLimitMonitor {
  constructor() {
    this.requests = [];
    this.alerts = [];
  }

  track(endpoint, remaining, limit) {
    this.requests.push({
      endpoint,
      timestamp: Date.now(),
      remaining,
      limit,
      usage: (limit - remaining) / limit
    });

    // Alert when usage exceeds 80% (fewer than 20% of requests remain)
    if (remaining < limit * 0.2) {
      this.alerts.push({
        type: 'high_usage',
        endpoint,
        usage: ((limit - remaining) / limit * 100).toFixed(1)
      });
    }
  }

  getStats() {
    const now = Date.now();
    // Only consider requests from the last 5 minutes
    const recentRequests = this.requests.filter(r => now - r.timestamp < 300000);
    const avgUsage = recentRequests.length > 0
      ? recentRequests.reduce((sum, r) => sum + r.usage, 0) / recentRequests.length
      : 0;
    return {
      totalRequests: recentRequests.length,
      avgUsage,
      alerts: this.alerts
    };
  }
}
```
Upgrading Limits
When to Upgrade
Consider upgrading when:
- Consistently hitting rate limits
- Business growth requires more capacity
- Need higher concurrent request limits
- Require dedicated infrastructure
Enterprise Options
Enterprise plans offer:
- Custom rate limits
- Dedicated endpoints
- Priority support
- SLA guarantees
- Advanced monitoring
Contact sales@inspirahub.net for enterprise pricing.
Troubleshooting
Common Issues
Sudden Rate Limit Errors
- Check for code loops
- Verify retry logic
- Monitor for traffic spikes
Inconsistent Limits
- Confirm API key is correct
- Check subscription status
- Verify endpoint-specific limits
Reset Time Issues
- Use server time, not local
- Account for timezone differences
- Check system clock accuracy