Welcome to API Academy
Module 09 · Real-Time · ~20 min
Rate limits.
By the end of this module, your bot can hammer the exchange exactly as fast as it’s allowed, and not a tick faster, so it never gets cut off mid-trade for being rude to the API.
To get there, you’ll shape your outbound request rate with a client-side token bucket, route bulk-endpoint pagination through that bucket, and layer the SDK retry helpers on top for when a 429 slips through anyway.
Real-Time tier · Reference card
What are the rate limits on the Limitless API?
Limitless does not publish numeric per-second, per-minute, or per-day quotas; the one guaranteed signal is a 429 Too Many Requests when you’ve been throttled. Treat the limit as opaque and defend in two layers. Proactive: a client-side token bucket that every outbound call awaits, with the module’s conservative defaults of capacity 20 and a refill of 10 per second, tuned only once you see production traffic. Reactive: the SDK retry helpers (withRetry in TypeScript, retry_on_errors in Python, WithRetry in Go) back off on a [429, 500, 502, 503, 504] allow-list. Respect a Retry-After header verbatim when present; otherwise use jittered exponential backoff. Never retry 400, 401, or 403, because the request itself is wrong, and never retry a 409, which means a duplicate clientOrderId whose original order was already accepted.
Endpoints verified 2026-06-09 against the OpenAPI spec.
Section 01
Understanding quotas.
Limitless does not publish dedicated rate-limit quotas. What the docs do confirm: a 429 Too Many Requests is the signal that you’ve been throttled, and the fix is the same across TypeScript, Python, and Go: exponential backoff on 429 and 5xx, and never a retry on 400/401.
Status
429 Too Many Requests
The only rate-limit signal the API guarantees. Trip it and you retry with backoff; you don’t resubmit instantly.
Header
Retry-After
If present on a 429 or 503, respect it verbatim. If absent, fall back to your own exponential schedule.
Helpers
withRetry · retry_on_errors · WithRetry
All three SDKs ship a retry helper that implements this pattern: a status-code allow-list, max retries, and an exponential base. Use them instead of rolling your own.
What we actually know
Numeric per-second, per-minute, and per-day quotas are not publicly documented. Treat the rate limit as opaque: ship a client-side limiter to keep your app inside a self-imposed budget, handle 429s when they happen, and tune your numbers once you see production traffic.
Retry on
429 · 5xx
Rate limit + transient server errors. Safe to retry with backoff.
Never retry
400 · 401 · 403
Your request is wrong. Retrying won’t make it right; fix the code.
Order conflict
409
Duplicate clientOrderId on POST /orders. Do not retry: the original order is already accepted.
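The cards above amount to a small decision table plus one header rule. The sketch below writes both down as pure functions so the logic is easy to test. The names `classify` and `retry_after_seconds` are illustrative and not part of any SDK, and handling the HTTP-date form of Retry-After is an assumption based on the general HTTP standard rather than on anything Limitless documents. In practice the SDK retry helpers remain the recommended path.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

RETRYABLE = {429, 500, 502, 503, 504}  # rate limit + transient server errors
NEVER_RETRY = {400, 401, 403}          # the request itself is wrong
ORDER_CONFLICT = 409                   # duplicate clientOrderId on POST /orders


def classify(status: int) -> str:
    """Map an HTTP status code to the action the cards above prescribe."""
    if status in RETRYABLE:
        return "retry"      # back off, then try again
    if status == ORDER_CONFLICT:
        return "conflict"   # original order already accepted: do not resend
    if status in NEVER_RETRY:
        return "fix"        # retrying cannot help; fix the request
    return "other"          # not covered by this module


def retry_after_seconds(
    value: Optional[str], now: Optional[datetime] = None
) -> Optional[float]:
    """Parse a Retry-After header given as delta-seconds or an HTTP-date.

    Returns None when the header is absent or unparseable, so the caller
    can fall back to its own backoff schedule.
    """
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A caller would check `classify(status)` first and, only for `"retry"`, sleep for `retry_after_seconds(header)` when it returns a number, or for its own jittered delay when it returns None.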
Section 02
Backoff strategies.
When a request fails because of throttling or a transient server hiccup, you retry. The question is how long you wait between tries. Get this wrong and every client in your fleet retries in lockstep, hammering the service the instant it recovers and tripping the rate limit again. The three cards below walk from the worst option to the one you should actually use: fixed, exponential, and jittered. Only the last one breaks the stampede by decorrelating clients, which is why the SDK retry helpers default to it.
Fixed
Sleep a constant interval between retries. Easy, predictable, and catastrophic under load: every client wakes up at the same moment and stampedes the service.
wait = 1000ms
Exponential
Double the delay on each failure. Gives the upstream room to recover but still correlates across clients: a thundering herd on the rebound.
wait = base * 2^n
Jittered
Exponential backoff with a random offset added to each wait. Decorrelates clients. This is the strategy you actually want in production: the only one of the three without a stampede failure mode.
wait = base * 2^n + rand(0, base)
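To make the difference between the three cards concrete, here is a minimal sketch of the three formulas exactly as written above. The function names, the 1000 ms base, and the seeded `random.Random` instances standing in for two independent clients are all assumptions for illustration; they are not SDK behavior.

```python
import random
from typing import Optional


def fixed(attempt: int, interval_ms: float = 1000) -> float:
    """wait = 1000ms, regardless of the attempt number."""
    return interval_ms


def exponential(attempt: int, base_ms: float = 1000) -> float:
    """wait = base * 2^n."""
    return base_ms * 2 ** attempt


def jittered(
    attempt: int, base_ms: float = 1000, rng: Optional[random.Random] = None
) -> float:
    """wait = base * 2^n + rand(0, base)."""
    rng = rng or random.Random()
    return base_ms * 2 ** attempt + rng.uniform(0, base_ms)


if __name__ == "__main__":
    # Two clients that failed at the same instant.
    client_a, client_b = random.Random(1), random.Random(2)
    for n in range(4):
        print(
            n,
            fixed(n),
            exponential(n),
            round(jittered(n, rng=client_a)),
            round(jittered(n, rng=client_b)),
        )
```

The fixed and exponential columns are identical for every client, so all of them wake together. The two jittered columns differ from each other on every attempt, which is what spreads the retries out.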
Section 03
Implementing a client-side limiter.
The right place to rate-limit yourself is in your client, not at the exchange. A token bucket lets bursts happen up to a cap but enforces your long-term rate. Acquire a token before every request; refill at a steady pace.
// Module 09: Client-side token bucket (defense in depth).
//
// Limitless does not publish per-second / per-minute quotas, so
// the SDK-provided withRetry() + @retryOnErrors decorator cover
// the reactive case. A client-side limiter is your PROACTIVE
// layer: it stops your app from bursting in the first place.
import { withRetry } from '@limitless-exchange/sdk';

class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(private capacity: number, private refillPerSec: number) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  // Credit the tokens earned since the last refill, capped at capacity.
  private refill() {
    const now = Date.now();
    const add = ((now - this.lastRefill) / 1000) * this.refillPerSec;
    this.tokens = Math.min(this.capacity, this.tokens + add);
    this.lastRefill = now;
  }

  // Resolve once a token is available; sleep just long enough otherwise.
  async acquire(): Promise<void> {
    while (true) {
      this.refill();
      if (this.tokens >= 1) { this.tokens -= 1; return; }
      const waitMs = ((1 - this.tokens) / this.refillPerSec) * 1000;
      await new Promise(r => setTimeout(r, waitMs));
    }
  }
}

// Pick conservative numbers until you see a 429 in the wild.
const limiter = new TokenBucket(20, 10);

// Limiter (proactive) + withRetry (reactive) = belt and braces.
export async function safeCall<T>(fn: () => Promise<T>): Promise<T> {
  await limiter.acquire();
  return withRetry(fn, {
    statusCodes: [429, 500, 502, 503, 504],
    maxRetries: 3,
    delays: [1000, 2000, 4000],
    onRetry: (err, attempt) => console.warn(`retry ${attempt}: ${err.message}`),
  });
}
# Module 09: Client-side token bucket (defense in depth).
#
# Limitless does not publish per-second / per-minute quotas. The
# SDK-provided retry_on_errors decorator covers the reactive case
# (back off when you actually see a 429). A client-side limiter
# is your PROACTIVE layer: it stops the burst from happening
# in the first place.
import asyncio
import time

from limitless_sdk.api import retry_on_errors


class TokenBucket:
    def __init__(self, capacity: int, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()
        self._lock = asyncio.Lock()

    def _refill(self) -> None:
        # Credit the tokens earned since the last refill, capped at capacity.
        now = time.monotonic()
        self.tokens = min(
            self.capacity,
            self.tokens + (now - self.last_refill) * self.refill_per_sec,
        )
        self.last_refill = now

    async def acquire(self) -> None:
        while True:
            async with self._lock:
                self._refill()
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                wait = (1 - self.tokens) / self.refill_per_sec
            # Sleep outside the lock so other tasks can check the bucket.
            await asyncio.sleep(wait)


# Conservative defaults; tune once you see production traffic.
limiter = TokenBucket(capacity=20, refill_per_sec=10)


# Reactive layer: the SDK's retry_on_errors decorator backs off
# automatically on 429 + 5xx with exponential delays.
@retry_on_errors(status_codes={429, 500, 502, 503, 504}, max_retries=3, delays=[1, 2, 4])
async def limited_fetch(http_client, path: str):
    await limiter.acquire()
    return await http_client.get(path)
// Module 09, Client-side token bucket (defense in depth) +
// the SDK's WithRetry/RetryableClient for reactive backoff.
//
// Limitless does not publish per-second quotas, so we combine
// two layers: golang.org/x/time/rate keeps bursts in check,
// and limitless.WithRetry handles the occasional 429 that
// sneaks through.
package main

import (
	"context"
	"log"
	"time"

	limitless "github.com/limitless-labs-group/limitless-exchange-go-sdk/limitless"
	"golang.org/x/time/rate"
)

// 10 req/s sustained, burst up to 20.
var limiter = rate.NewLimiter(rate.Limit(10), 20)

func main() {
	ctx := context.Background()
	client := limitless.NewHttpClient()
	mf := limitless.NewMarketFetcher(client)

	// Proactive limiter + reactive retry helper.
	if err := limiter.Wait(ctx); err != nil {
		log.Fatal(err)
	}
	result, err := limitless.WithRetry(
		ctx,
		func() (*limitless.ActiveMarketsResponse, error) {
			return mf.GetActiveMarkets(ctx, &limitless.ActiveMarketsParams{Limit: 10})
		},
		limitless.RetryConfig{
			StatusCodes:     []int{429, 500, 502, 503, 504},
			MaxRetries:      3,
			ExponentialBase: 2.0,
			MaxDelay:        60 * time.Second,
			OnRetry: func(attempt int, err error, delay time.Duration) {
				log.Printf("retry %d after %v: %v", attempt, delay, err)
			},
		},
	)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("fetched %d markets", len(result.Data))
}
How to run this
- Set LIMITLESS_API_KEY. The Go snippet also imports golang.org/x/time/rate; run go get golang.org/x/time/rate once inside the module. Keep capacity=20 / refill=10 until you’ve seen production traffic.
- Save the snippet above as token-bucket.ts and import safeCall from your other scripts, or call it directly with npx tsx token-bucket.ts after adding a top-level invocation.
- Save the snippet above as token_bucket.py, import limited_fetch from your caller, or run it directly with python token_bucket.py after adding an asyncio.run(…) block.
- Save the snippet above as main.go inside a Go module, then run go run main.go.
- Fire the call in a tight loop (e.g. 50 iterations) and confirm it paces itself at ~10/s instead of bursting. If the SDK onRetry hook logs a 429, tighten the bucket numbers; if it never fires, you’re safely inside the quota.
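Before running the 50-iteration check against the real API, it can help to know what pacing to expect. The sketch below simulates the bucket arithmetic on a fake clock with no sleeping and no network. The function name `simulate_bucket` is illustrative, and it assumes every call is issued the moment a token is available.

```python
def simulate_bucket(
    n_calls: int, capacity: float = 20, refill_per_sec: float = 10
) -> list[float]:
    """Return the simulated send time, in seconds, of each call in a tight loop."""
    tokens = float(capacity)
    now = 0.0
    sent: list[float] = []
    for _ in range(n_calls):
        if tokens < 1:
            # Same wait formula as TokenBucket.acquire: time until one token exists.
            wait = (1 - tokens) / refill_per_sec
            now += wait
            tokens = min(capacity, tokens + wait * refill_per_sec)
        tokens -= 1
        sent.append(now)
    return sent


if __name__ == "__main__":
    times = simulate_bucket(50)
    print(f"call 20 at {times[19]:.1f}s, call 21 at {times[20]:.1f}s, "
          f"call 50 at {times[-1]:.1f}s")
```

With the module defaults, the first 20 calls leave immediately (the burst), and the remaining 30 go out one every 0.1 s, so the 50th call is sent at about 3.0 s. A real run that finishes much faster than that suggests some calls are bypassing the limiter.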
Section 04
Pagination & bulk fetching.
When you genuinely need thousands of rows, page sequentially through the limiter. Parallelising pagination without a limiter is the fastest way to get your API key banned. Because the real quota is unpublished, what the limiter guarantees is that even maxed-out concurrency stays inside your own self-imposed budget.
// Module 09: Page-based pagination through the limiter.
//
// ListMarketsResponseDto supports both offset pagination (via `page` +
// `limit`) and cursor pagination (via `cursor`). This example uses page
// mode for simplicity; for very large scans, prefer cursor mode to avoid
// the drift that can happen when markets are added/removed between pages.
import { HttpClient, MarketFetcher } from '@limitless-exchange/sdk';

const httpClient = new HttpClient({
  baseURL: 'https://api.limitless.exchange',
  apiKey: process.env.LIMITLESS_API_KEY,
});
const marketFetcher = new MarketFetcher(httpClient);

async function fetchAllMarkets() {
  const all: any[] = [];
  let page = 1;
  const limit = 100;
  while (true) {
    await limiter.acquire(); // token bucket from section 03
    const res = await marketFetcher.getActiveMarkets({ page, limit });
    const batch = res.data ?? [];
    if (batch.length === 0) break;
    all.push(...batch);
    if (batch.length < limit) break; // last page
    page += 1;
  }
  return all;
}
# Module 09: Page-based pagination through the limiter.
#
# ListMarketsResponseDto supports both offset (`page` + `limit`) and
# cursor pagination (`cursor`). This example uses page mode for
# simplicity; prefer cursor mode for very large scans since it's
# stable when markets are added/removed between pages.
from limitless_sdk.api import HttpClient
from limitless_sdk.markets import MarketFetcher

http_client = HttpClient()
market_fetcher = MarketFetcher(http_client)


async def fetch_all_markets() -> list[dict]:
    all_rows: list[dict] = []
    page, limit = 1, 100
    while True:
        await limiter.acquire()  # from section 03
        response = await market_fetcher.get_active_markets(page=page, limit=limit)
        batch = response["data"]
        if not batch:
            break
        all_rows.extend(batch)
        if len(batch) < limit:  # short page means last page
            break
        page += 1
    return all_rows
// Module 09: Page-based pagination through the limiter.
//
// GetActiveMarkets uses Page + Limit on ActiveMarketsParams.
// ListMarketsResponseDto also supports cursor pagination; use that
// for stable large scans. This example walks pages for simplicity.
func fetchAllMarkets(ctx context.Context, mf *limitless.MarketFetcher) ([]limitless.Market, error) {
	var all []limitless.Market
	page := 1
	limit := 100
	for {
		// Limiter from section 03: one token per page request.
		if err := limiter.Wait(ctx); err != nil {
			return nil, err
		}
		result, err := mf.GetActiveMarkets(ctx, &limitless.ActiveMarketsParams{
			Page:  page,
			Limit: limit,
		})
		if err != nil {
			return nil, err
		}
		if len(result.Data) == 0 {
			break
		}
		all = append(all, result.Data...)
		if len(result.Data) < limit { // short page means last page
			break
		}
		page++
	}
	return all, nil
}
How to run this
- Set LIMITLESS_API_KEY. This snippet reuses the limiter from Section 03, so paste it in the same file or import it. Add a top-level call (fetchAllMarkets().then(rows => console.log(rows.length))) to actually kick it off.
- Save the combined file as fetch-all-markets.ts, then run npx tsx fetch-all-markets.ts.
- Save the combined file as fetch_all_markets.py, then run python fetch_all_markets.py.
- Save the snippet above as main.go inside a Go module alongside Section 03’s limiter, then run go run main.go.
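- You see the total row count at the end. Each page costs one token, so once the initial burst is spent the loop needs at least pages / refillPerSec seconds (pages, not rows); with sequential paging, network latency often dominates before the limiter does. A noisy onRetry hook means the bucket is too loose; a silent one means you’re paced correctly.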
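The snippets in this section walk pages by number and only mention cursor mode in their comments. As a rough sketch of what the cursor variant of the same loop could look like, the example below runs against an in-memory fake. Everything about the fake is an assumption: `fake_fetch_page`, the `next_cursor` field name, and the cursor being a stringified offset are hypothetical stand-ins, not the SDK's actual response shape, so check the OpenAPI spec for the real field names before adapting it.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Fake dataset standing in for markets.
ROWS = [{"id": i} for i in range(250)]


async def fake_fetch_page(cursor: Optional[str], limit: int) -> dict:
    """Hypothetical cursor endpoint: the cursor is just a stringified offset."""
    start = int(cursor) if cursor else 0
    batch = ROWS[start:start + limit]
    nxt = str(start + limit) if start + limit < len(ROWS) else None
    return {"data": batch, "next_cursor": nxt}


async def fetch_all_by_cursor(
    fetch_page: Callable[[Optional[str], int], Awaitable[dict]],
    acquire: Callable[[], Awaitable[None]],
    limit: int = 100,
) -> list:
    """Follow cursors until the server stops returning one."""
    rows: list = []
    cursor: Optional[str] = None
    while True:
        await acquire()  # one token per page, exactly as in page mode
        page = await fetch_page(cursor, limit)
        rows.extend(page["data"])
        cursor = page.get("next_cursor")
        if not cursor:
            break
    return rows


async def no_limit() -> None:
    """Placeholder for limiter.acquire from Section 03."""


if __name__ == "__main__":
    print(len(asyncio.run(fetch_all_by_cursor(fake_fetch_page, no_limit))))
```

The loop has the same shape as the page-mode version. The only change is the stop condition, which relies on the server's cursor instead of a short page.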
Never parallelise pagination without a limiter.
Fan-out via Promise.all / asyncio.gather / goroutines without token acquisition will nuke your per-second budget and get you 429’d within a second. Always gate concurrency at the limiter; then you can fan out as aggressively as you like.
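Gating fan-out at the limiter can be sketched as follows. The example uses a compact copy of the Section 03 bucket so it is self-contained. `fetch_page` is a stand-in for a real HTTP call, and the tiny bucket (capacity 2, 50 tokens per second) is chosen only so the demo finishes quickly; neither reflects real API behavior.

```python
import asyncio
import time


class TokenBucket:
    """Compact copy of the Section 03 bucket, repeated so this runs on its own."""

    def __init__(self, capacity: int, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        while True:
            async with self._lock:
                now = time.monotonic()
                earned = (now - self.last_refill) * self.refill_per_sec
                self.tokens = min(self.capacity, self.tokens + earned)
                self.last_refill = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                wait = (1 - self.tokens) / self.refill_per_sec
            await asyncio.sleep(wait)


async def fetch_page(page: int) -> int:
    """Stand-in for a real request; yields once and echoes the page number."""
    await asyncio.sleep(0)
    return page


async def gated(bucket: TokenBucket, page: int) -> int:
    await bucket.acquire()  # every task pays a token before it may send
    return await fetch_page(page)


async def main():
    bucket = TokenBucket(capacity=2, refill_per_sec=50)
    start = time.monotonic()
    results = await asyncio.gather(*(gated(bucket, p) for p in range(1, 8)))
    return results, time.monotonic() - start


if __name__ == "__main__":
    pages, elapsed = asyncio.run(main())
    print(pages, f"{elapsed:.2f}s")
```

All seven tasks start at once, but only two can send immediately. The other five wait for tokens, so the batch cannot complete faster than the bucket's refill rate allows, no matter how wide the fan-out is.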
Limitless rate limits: what people ask
- How should a bot handle a 429 from the Limitless API?
  Retry with backoff, never an instant resubmit. If the 429 (or a 503) carries a Retry-After header, respect it verbatim; if it’s absent, fall back to your own jittered exponential schedule. The SDK helpers (withRetry, retry_on_errors, and WithRetry) implement exactly this pattern with a status-code allow-list, max retries, and an exponential base, so use them instead of rolling your own.
- Which HTTP status codes are safe to retry?
  429 and the 5xx family (500, 502, 503, 504): rate limiting and transient server errors, safe with backoff. Never retry 400, 401, or 403; your request is wrong and retrying won’t make it right. And never retry a 409 on POST /orders: it means a duplicate clientOrderId, and the original order is already accepted, so there’s nothing to resend.
- What is a token bucket rate limiter?
  A client-side limiter that allows bursts up to a capacity while enforcing your long-term rate: every request acquires a token first, and the bucket refills at a steady pace. It’s the proactive layer that stops your app from bursting in the first place, with the SDK retry helpers as the reactive layer. The module’s defaults are capacity 20 and 10 tokens per second, conservative until a real 429 tells you otherwise.
- Why do retries need jitter?
  Because synchronized backoff recreates the storm you were retrying to avoid: after an outage, every client on the same wall-clock cadence hammers the server in lockstep at each step of the schedule. Add randomness to every delay: delay = min(cap, base * 2^attempt) + random(0, 1s), with defaults of base = 250ms, cap = 30s, and maxAttempts = 8, and cap concurrent reconnects per host at 1.
- How do you bulk-fetch thousands of rows without tripping the rate limit?
  Page sequentially through the limiter: acquire a token, fetch a page of 100 with page + limit, and repeat until a short page comes back. Never parallelise pagination without a limiter; fan-out via Promise.all, asyncio.gather, or goroutines without token acquisition will nuke your per-second budget and get you 429’d within a second. For very large scans, prefer cursor pagination, which stays stable when markets are added or removed between pages.
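The capped formula in the jitter answer above is easy to check by hand. The sketch below evaluates it with the stated defaults (base = 250 ms, cap = 30 s, maxAttempts = 8). Counting attempts from 0 and the function name `capped_delay` are assumptions for illustration.

```python
import random
from typing import Optional


def capped_delay(
    attempt: int,
    base: float = 0.25,
    cap: float = 30.0,
    jitter: float = 1.0,
    rng: Optional[random.Random] = None,
) -> float:
    """delay = min(cap, base * 2^attempt) + random(0, 1s), in seconds."""
    rng = rng or random.Random()
    return min(cap, base * 2 ** attempt) + rng.uniform(0, jitter)


MAX_ATTEMPTS = 8

# The deterministic part of the schedule, before jitter is added.
schedule = [min(30.0, 0.25 * 2 ** attempt) for attempt in range(MAX_ATTEMPTS)]

if __name__ == "__main__":
    print(schedule)
    print(sum(schedule))
```

The deterministic schedule is 0.25, 0.5, 1, 2, 4, 8, 16, 30 seconds. The uncapped eighth step would be 32 s, so the cap only takes effect on the final attempt. Across all eight attempts that is 61.75 s of waiting plus up to 8 s of jitter.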
Section 05
Module checklist.
Tick each item once you’ve actually done it. The Continue button unlocks at 5/5.
I count my own outbound request rate (client-side limiter) since Limitless does not publish numeric quotas
I can explain why jittered backoff beats fixed and plain exponential
I wrapped my HTTP client in a token-bucket limiter with a sensible capacity and refill rate
My bulk-fetch code pages sequentially through the limiter
On a 429 I respect the Retry-After header (or jittered exponential backoff if absent) via the SDK retry helpers
Module 09 complete
Throttle tamed.
Your bot stops getting itself banned. It paces its own requests, knows the difference between a real failure and a “try again in a second,” and keeps trading through the kind of network noise that takes naive bots offline.
Concretely, you can max out your quota without getting blocked. Here’s what you walk away with:
A client-side TokenBucket (or rate.Limiter in Go) that every outbound call awaits: the proactive layer that stops bursts before they leave your process.
A bulk-fetch routine that pages sequentially through the limiter, so even a job that pulls every market still respects the rate budget.
The SDK-provided reactive layer wired up: withRetry / retry_on_errors / WithRetry with a [429, 500, 502, 503, 504] allow-list, so the rare 429 that slips through handles itself.
Next up, the failure modes the limiter can’t prevent: idempotency keys on order placement, typed error hierarchies, and the retry patterns that keep Tier 2 bots alive through real production hiccups.