Welcome to API Academy

Module 09 · Real-Time · ~20 min

Rate limits.

By the end of this module, your bot can hammer the exchange exactly as fast as it’s allowed, and not a tick faster, so it never gets cut off mid-trade for being rude to the API.

To get there, you’ll shape your outbound request rate with a client-side token bucket, page through bulk endpoints behind that bucket, and layer the SDK retry helpers on top for when a 429 slips through anyway.

Real-Time tier · Reference card
Quick answer

What are the rate limits on the Limitless API?

Limitless does not publish numeric per-second, per-minute, or per-day quotas; the one guaranteed signal is a 429 Too Many Requests when you’ve been throttled. Treat the limit as opaque and defend in two layers. Proactive: a client-side token bucket that every outbound call awaits, with the module’s conservative defaults of capacity 20 and a refill of 10 per second, tuned only once you see production traffic. Reactive: the SDK retry helpers (withRetry in TypeScript, retry_on_errors in Python, WithRetry in Go) back off on a [429, 500, 502, 503, 504] allow-list. Respect a Retry-After header verbatim when present; otherwise use jittered exponential backoff. Never retry 400, 401, or 403, because the request itself is wrong, and never retry a 409, which means a duplicate clientOrderId whose original order was already accepted.

Endpoints verified 2026-06-09 against the OpenAPI spec.

Section 01

Understanding quotas.

Limitless does not publish dedicated rate-limit quotas. What the docs do confirm: a 429 Too Many Requests is the signal that you’ve been throttled, and the fix is the same across TypeScript, Python, and Go: exponential backoff on 429 and 5xx, and never retry 400/401.

Status

429 Too Many Requests

The only rate-limit signal the API guarantees. Trip it and you retry with backoff; you don’t resubmit instantly.

Header

Retry-After

If present on a 429 or 503, respect it verbatim. If absent, fall back to your own exponential schedule.
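To make "respect it verbatim" concrete, here is a minimal sketch of turning a Retry-After value into a wait in milliseconds. The helper name retryAfterMs is hypothetical (it is not part of the Limitless SDK), and the sketch assumes the header arrives in one of the two standard HTTP forms: a whole number of seconds or an HTTP-date.

```typescript
// Hypothetical helper, not an SDK function: converts a Retry-After header
// value into a wait in milliseconds, or null when there is nothing usable.
export function retryAfterMs(
  header: string | null | undefined,
  nowMs: number = Date.now(),
): number | null {
  if (header == null) return null;
  const value = header.trim();
  if (value === '') return null;

  // Form 1: delay-seconds, a non-negative integer such as "2".
  if (/^\d+$/.test(value)) return Number(value) * 1000;

  // Form 2: an HTTP-date such as "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(value);
  if (Number.isNaN(dateMs)) return null; // unparseable: use your own backoff
  return Math.max(0, dateMs - nowMs);    // a date in the past means "retry now"
}
```

A null result is the cue to fall back to your own exponential schedule.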

Helpers

withRetry · retry_on_errors · WithRetry

All three SDKs ship a retry helper that implements this pattern: a status-code allow-list, max retries, and an exponential base. Use them instead of rolling your own.

What we actually know

Numeric per-second, per-minute, and per-day quotas are not publicly documented. Treat the rate limit as opaque: ship a client-side limiter to keep your app inside a self-imposed budget, handle 429s when they happen, and tune your numbers once you see production traffic.

Retry on

429 · 5xx

Rate limit + transient server errors. Safe to retry with backoff.

Never retry

400 · 401 · 403

Your request is wrong. Retrying won’t make it right; fix the code.

Order conflict

409

Duplicate clientOrderId on POST /orders. Do not retry: the original is already accepted.
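The three cards above boil down to a small decision table. The sketch below is one way to encode it; classifyStatus and RetryDecision are hypothetical names, and treating every status the module does not mention as a hard failure is an assumption on our part, not documented API behaviour.

```typescript
// Hypothetical names; the status codes themselves come from this module.
type RetryDecision = 'retry' | 'fail' | 'conflict';

// Rate limit + transient server errors: safe to retry with backoff.
const RETRYABLE = new Set([429, 500, 502, 503, 504]);

export function classifyStatus(status: number): RetryDecision {
  if (RETRYABLE.has(status)) return 'retry';
  // Duplicate clientOrderId on POST /orders: the original was accepted.
  if (status === 409) return 'conflict';
  // 400 / 401 / 403 land here. Assumption: any other status is also
  // treated as non-retryable until you have a reason to add it above.
  return 'fail';
}
```

Keeping 409 as its own outcome lets the caller look up the already-accepted order instead of treating the response as an error.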

Section 02

Backoff strategies.

When a request fails because of throttling or a transient server hiccup, you retry. The question is how you wait between tries. Get this wrong and every client in your fleet retries in lockstep, hammering the service the instant it recovers and tripping the rate limit again. The three cards below walk from the worst option to the one you should actually use: fixed, exponential, and jittered. Only the last one breaks the stampede by decorrelating clients, which is why the SDK retry helpers default to it.

Fixed

Sleep a constant interval between retries. Easy, predictable, and catastrophic under load: every client wakes up at the same moment and stampedes the service.

wait = 1000ms

Exponential

Double the delay on each failure. Gives the upstream room to recover but still correlates across clients: a thundering herd on the rebound.

wait = base * 2^n

Jittered

Exponential backoff with a random offset added to each wait. Decorrelates clients. This is the strategy you actually want in production: the only one of the three without a stampede failure mode.

wait = base * 2^n + rand(0, base)
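The three formulas above translate directly into code. This is a minimal sketch: the function names are hypothetical, n is assumed to be a zero-based retry index (so the first retry waits base), and the random source is injectable only so the jittered version can be checked deterministically.

```typescript
// Returns a value in [0, 1), same contract as Math.random.
type Rand = () => number;

// Fixed: wait = interval
export function fixedDelay(intervalMs: number): number {
  return intervalMs;
}

// Exponential: wait = base * 2^n
export function exponentialDelay(baseMs: number, n: number): number {
  return baseMs * 2 ** n;
}

// Jittered: wait = base * 2^n + rand(0, base)
export function jitteredDelay(baseMs: number, n: number, rand: Rand = Math.random): number {
  return exponentialDelay(baseMs, n) + rand() * baseMs;
}
```

With base = 1000 the exponential schedule is 1000, 2000, 4000 ms for n = 0, 1, 2; the jittered version adds up to one extra second to each step, so two clients that failed together are unlikely to retry together.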

Section 03

Implementing a client-side limiter.

The right place to rate-limit yourself is in your client, not at the exchange. A token bucket lets bursts happen up to a cap but enforces your long-term rate. Acquire a token before every request; refill at a steady pace.

// Module 09, Client-side token bucket (defense in depth).
//
// Limitless does not publish per-second / per-minute quotas, so
// the SDK-provided withRetry() + @retryOnErrors decorator cover
// the reactive case. A client-side limiter is your PROACTIVE
// layer: it stops your app from bursting in the first place.

import { withRetry } from '@limitless-exchange/sdk';

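class TokenBucket {
  private tokens: number;      // current balance; fractional while refilling
  private lastRefill: number;  // timestamp (ms) of the last refill

  constructor(private capacity: number, private refillPerSec: number) {
    this.tokens = capacity;    // start full so an initial burst is allowed
    this.lastRefill = Date.now();
  }

  // Credit tokens for the time elapsed since the last refill, capped at capacity.
  private refill() {
    const now = Date.now();
    const add = ((now - this.lastRefill) / 1000) * this.refillPerSec;
    this.tokens = Math.min(this.capacity, this.tokens + add);
    this.lastRefill = now;
  }

  // Resolve once a token is available, sleeping just long enough to earn it.
  async acquire(): Promise<void> {
    while (true) {
      this.refill();
      if (this.tokens >= 1) { this.tokens -= 1; return; }
      // Time until the balance reaches one full token at the refill rate.
      const waitMs = ((1 - this.tokens) / this.refillPerSec) * 1000;
      await new Promise(r => setTimeout(r, waitMs));
    }
  }
}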

// Pick conservative numbers until you see a 429 in the wild.
const limiter = new TokenBucket(20, 10);

// Limiter (proactive) + withRetry (reactive) = belt and braces.
export async function safeCall<T>(fn: () => Promise<T>): Promise<T> {
  await limiter.acquire();
  return withRetry(fn, {
    statusCodes: [429, 500, 502, 503, 504],
    maxRetries:  3,
    delays:      [1000, 2000, 4000],
    onRetry:     (err, attempt) => console.warn(`retry ${attempt}: ${err.message}`),
  });
}

How to run this

  1. Set LIMITLESS_API_KEY. The Go snippet also imports golang.org/x/time/rate; run go get golang.org/x/time/rate once inside the module. Keep capacity=20 / refill=10 until you’ve seen production traffic.
  2. Save the snippet above as token-bucket.ts and import safeCall from your other scripts, or call it directly with npx tsx token-bucket.ts after adding a top-level invocation.
  3. Fire the call in a tight loop (e.g. 50 iterations) and confirm it paces itself: the first 20 calls drain the bucket’s burst capacity immediately, then the rest go out at ~10/s. If the SDK onRetry hook logs a 429, tighten the bucket numbers; if it never fires, you’re safely inside the quota.
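If you want to know what pacing to expect before running the loop for real, the sketch below replays the bucket's arithmetic on a virtual clock instead of real timers. simulateSendTimes is a hypothetical helper, not part of the module's code; it assumes requests are issued back to back and ignores network latency.

```typescript
// Hypothetical helper: returns the earliest send time (ms from start) of
// each request when `count` calls queue up against a full token bucket.
export function simulateSendTimes(
  count: number,
  capacity: number,
  refillPerSec: number,
): number[] {
  const times: number[] = [];
  let tokens = capacity; // the bucket starts full
  let nowMs = 0;         // virtual clock, no real sleeping

  for (let i = 0; i < count; i++) {
    if (tokens < 1) {
      // Wait exactly long enough for the balance to reach one token.
      nowMs += ((1 - tokens) * 1000) / refillPerSec;
      tokens = 1;
    }
    tokens -= 1;
    times.push(nowMs);
  }
  return times;
}

// Module defaults: capacity 20, refill 10 per second, 50 calls.
const times = simulateSendTimes(50, 20, 10);
console.log(times[19], times[20], times[49]);
```

With the module defaults, calls 1 to 20 leave at time zero, call 21 leaves about 100 ms later, and call 50 leaves at roughly the three-second mark, which is the shape you should see from the real loop.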

Section 04

Pagination & bulk fetching.

When you genuinely need thousands of rows, page sequentially through the limiter. Parallelising pagination without a limiter is the fastest way to get your API key banned. With every request gated on the limiter, even maxed-out concurrency stays inside your self-imposed budget.

// Module 09, Page-based pagination through the limiter.
//
// ListMarketsResponseDto supports both offset pagination (via `page` +
// `limit`) and cursor pagination (via `cursor`). This example uses page
// mode for simplicity; for very large scans, prefer cursor mode to avoid
// the drift that can happen when markets are added/removed between pages.

import { HttpClient, MarketFetcher } from '@limitless-exchange/sdk';

const httpClient    = new HttpClient({
  baseURL: 'https://api.limitless.exchange',
  apiKey:  process.env.LIMITLESS_API_KEY,
});
const marketFetcher = new MarketFetcher(httpClient);

async function fetchAllMarkets() {
  const all: any[] = [];
  let page = 1;
  const limit = 100;

  while (true) {
    await limiter.acquire(); // token-bucket from section 03
    const res = await marketFetcher.getActiveMarkets({ page, limit });
    const batch = res.data ?? [];
    if (batch.length === 0) break;

    all.push(...batch);
    if (batch.length < limit) break; // last page
    page += 1;
  }

  return all;
}

How to run this

  1. Set LIMITLESS_API_KEY. This snippet reuses the limiter from Section 03, so paste it in the same file or import it. Add a top-level call (fetchAllMarkets().then(rows => console.log(rows.length))) to actually kick it off.
  2. Save the combined file as fetch-all-markets.ts, then run npx tsx fetch-all-markets.ts.
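  3. You see the total row count at the end. Each page costs one token, so once the initial burst of 20 is spent the loop needs at least (pages - 20) / refillPerSec seconds, where pages = ceil(total / limit); network latency adds to that. As written, the loop calls the fetcher directly rather than through safeCall, so the onRetry hook from Section 03 only fires if you route the fetch through safeCall (which acquires the token itself). In that setup, a noisy onRetry hook means the bucket is too loose; a silent one means you’re paced correctly.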
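For the cursor mode mentioned in the snippet's header comment, the loop has the same shape but follows a cursor instead of counting pages. The sketch below is deliberately SDK-agnostic: fetchPage, CursorPage, and the nextCursor field are all hypothetical stand-ins, because this module does not show the cursor response shape. Check the OpenAPI spec for the real field names before adapting it.

```typescript
// Anything with an acquire() method works, e.g. the TokenBucket from Section 03.
interface Limiter { acquire(): Promise<void>; }

// Hypothetical response shape: the real cursor field name may differ.
interface CursorPage<T> { data: T[]; nextCursor?: string | null; }

export async function fetchAllByCursor<T>(
  limiter: Limiter,
  fetchPage: (cursor?: string) => Promise<CursorPage<T>>,
  maxPages = 10_000, // safety cap so a cursor that never ends cannot loop forever
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | undefined;

  for (let pageCount = 0; pageCount < maxPages; pageCount++) {
    await limiter.acquire();          // one token per page request
    const res = await fetchPage(cursor);
    all.push(...res.data);
    if (!res.nextCursor) return all;  // no further cursor: scan complete
    cursor = res.nextCursor;
  }
  throw new Error(`gave up after ${maxPages} pages`);
}
```

Because each request continues from the server's own cursor rather than a page number, rows that appear or disappear mid-scan do not shift what the next request returns.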

Never parallelise pagination without a limiter.

Fan-out via Promise.all / asyncio.gather / goroutines without token acquisition will nuke your per-second budget and get you 429’d within a second. Always gate concurrency at the limiter; then you can fan out as aggressively as you like.
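Gating at the limiter can be as small as wrapping each task before handing it to Promise.all. This is a sketch, not SDK code: gatedAll is a hypothetical helper, and it assumes your limiter exposes an acquire() method like the TokenBucket from Section 03.

```typescript
// Anything with an acquire() method works, e.g. the TokenBucket from Section 03.
interface Limiter { acquire(): Promise<void>; }

// Starts every task concurrently, but each one waits for a token first,
// so the limiter (not Promise.all) decides when requests actually leave.
export async function gatedAll<T>(
  limiter: Limiter,
  tasks: Array<() => Promise<T>>,
): Promise<T[]> {
  return Promise.all(
    tasks.map(async (task) => {
      await limiter.acquire();
      return task();
    }),
  );
}
```

Note that tasks are passed as functions rather than promises: a promise has already started its request by the time you hold it, so there would be nothing left to gate.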

Common questions

Limitless rate limits: what people ask


  1. How should a bot handle a 429 from the Limitless API?
    Retry with backoff, never an instant resubmit. If the 429 (or a 503) carries a Retry-After header, respect it verbatim; if it’s absent, fall back to your own jittered exponential schedule. The SDK helpers, withRetry, retry_on_errors, and WithRetry, implement exactly this pattern with a status-code allow-list, max retries, and an exponential base, so use them instead of rolling your own.
  2. Which HTTP status codes are safe to retry?
    429 and the 5xx family (500, 502, 503, 504): rate limiting and transient server errors, safe with backoff. Never retry 400, 401, or 403: your request is wrong and retrying won’t make it right. And never retry a 409 on POST /orders: it means a duplicate clientOrderId, and the original order is already accepted, so there’s nothing to resend.
  3. What is a token bucket rate limiter?
    A client-side limiter that allows bursts up to a capacity while enforcing your long-term rate: every request acquires a token first, and the bucket refills at a steady pace. It’s the proactive layer that stops your app from bursting in the first place, with the SDK retry helpers as the reactive layer. The module’s defaults are capacity 20 and 10 tokens per second, conservative until a real 429 tells you otherwise.
  4. Why do retries need jitter?
    Because synchronized backoff recreates the storm you were retrying to avoid: after an outage, every client on the same wall-clock cadence hammers the server in lockstep at each step of the schedule. Add randomness to every delay: delay = min(cap, base * 2^attempt) + random(0, 1s), with defaults of base = 250ms, cap = 30s, and maxAttempts = 8, and cap concurrent reconnects per host at 1.
  5. How do you bulk-fetch thousands of rows without tripping the rate limit?
    Page sequentially through the limiter: acquire a token, fetch a page of 100 with page + limit, and repeat until a short page comes back. Never parallelise pagination without a limiter; fan-out via Promise.all, asyncio.gather, or goroutines without token acquisition will nuke your per-second budget and get you 429’d within a second. For very large scans, prefer cursor pagination, which stays stable when markets are added or removed between pages.

Section 05

Module checklist.


Module 09 complete

Throttle tamed.

Your bot stops getting itself banned. It paces its own requests, knows the difference between a real failure and a “try again in a second,” and keeps trading through the kind of network noise that takes naive bots offline.

Concretely, you can max out your quota without getting blocked. Here’s what you walk away with:

01

A client-side TokenBucket (or rate.Limiter in Go) that every outbound call awaits: the proactive layer that stops bursts before they leave your process.

02

A bulk-fetch routine that pages sequentially through the limiter, so even a job that pulls every market still respects the rate budget.

03

The SDK-provided reactive layer wired up: withRetry / retry_on_errors / WithRetry with a [429, 500, 502, 503, 504] allow-list, so the rare 429 that slips through handles itself.

Next up, the failure modes the limiter can’t prevent: idempotency keys on order placement, typed error hierarchies, and the retry patterns that keep Tier 2 bots alive through real production hiccups.
