GUIDE · RATE LIMITING · 11 min read

Rate limits, retries, and backpressure: how TrackLayer handles BFCM scale

A technical guide to what happens when Black Friday/Cyber Monday (BFCM) traffic hits the event pipeline: no edge drops, durable queueing, adaptive platform pacing, bounded retries, and clear backpressure signals for operators who need to know whether the system is healthy or merely surviving.

Context

Why rate limiting matters

BFCM does not break pipelines because a store handled more traffic than normal. It breaks pipelines because the shape of the traffic changes. A merchant can see 10x normal session volume in a few hours, but the real stress arrives when page views, carts, purchases, lead submits, and post-purchase events all compress into the same short interval. If collection, queueing, and delivery happen in the same synchronous request path, the slowest downstream partner becomes the system ceiling and the whole event path inherits that fragility.

Each ad platform also has its own idea of what a healthy request rate looks like. Some publish fixed request ceilings, some expose runtime limits through response headers, and some throttle based on token, account, pixel, or customer scope. That means one universal retry rule is not enough. TrackLayer treats rate limiting as a destination-specific scheduling problem: accept traffic once, preserve event identity, and pace each destination according to the contract it actually enforces.

Ingress

TrackLayer ingress limits

Ingress limits exist to protect shared infrastructure and billing boundaries, not to model downstream API constraints. They control how much volume a workspace can push into the system in steady state, while the queue and consumer layers handle the burst behavior that appears during campaign launches and checkout peaks.

| Tier | Events/second | Burst | Monthly cap |
| --- | --- | --- | --- |
| Starter | 25/sec | 250 | 2M events |
| Growth | 100/sec | 1,000 | 10M events |
| Scale | 500/sec | 5,000 | 100M events |
| Enterprise | 2,000+/sec | 20,000+ | Contracted |

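
A steady-state rate with a separate burst allowance is commonly modelled as a token bucket. The sketch below is illustrative only: it assumes the Burst column is the bucket capacity in events, and the names `TokenBucket`, `TIERS`, and `bucket_for` are hypothetical. The guide does not say what TrackLayer does when a workspace exceeds its bucket, so the function only reports whether an event fits.

```python
import time
from typing import Callable


class TokenBucket:
    """Refills `rate` tokens per second, holding at most `burst` tokens."""

    def __init__(self, rate: float, burst: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = burst          # start full so an initial burst is allowed
        self.updated = clock()

    def try_acquire(self, n: float = 1.0) -> bool:
        """Take n tokens if available; return False without blocking otherwise."""
        now = self.clock()
        refill = (now - self.updated) * self.rate
        self.tokens = min(self.burst, self.tokens + refill)
        self.updated = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


# (events/sec, burst) copied from the tier table; treating Burst as the
# bucket capacity is an assumption, not something the guide states.
TIERS = {"Starter": (25, 250), "Growth": (100, 1_000), "Scale": (500, 5_000)}


def bucket_for(tier: str, clock: Callable[[], float] = time.monotonic) -> TokenBucket:
    rate, burst = TIERS[tier]
    return TokenBucket(rate, burst, clock)
```

Passing the clock in as a parameter keeps the bucket deterministic under test; production code would leave the default monotonic clock in place.
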
Destinations

Downstream platform limits

The table below is a practical operator view, not a promise that every account on every platform gets the same quota forever. It reflects public platform guidance and effective ceilings commonly enforced as of April 23, 2026. Some APIs publish exact request numbers; others expose the active ceiling only through response headers or account-specific quotas. TrackLayer stores these values as pacing inputs and prefers platform-returned retry windows when available.

| Platform | Documented ceiling | Operational note |
| --- | --- | --- |
| Meta CAPI | 1,000 events/min per Pixel | Reference ceiling TrackLayer budgets for when a Pixel is hot during BFCM flash-sale windows. |
| Google Ads offline conversions | 100 conversions/sec per customer_id | High throughput, but still isolated per conversion customer, so large merchants need customer-aware worker keys. |
| TikTok Events API | 600 requests/min sliding window | TikTok publishes one-minute sliding-window enforcement, so short spikes can trip 429s even when hourly volume looks normal. |
| LinkedIn Conversions API | 600 requests/min and 300,000/day per member token | LinkedIn also supports batches up to 5,000 conversion events, which TrackLayer uses to lower request pressure. |
| Microsoft Ads offline conversions | 1,000 conversions/request | The practical limit is often request shaping plus account-level concurrency rather than pure daily volume. |
| X Conversion API | 100,000 requests/15 min/account and 500 events/request | Large envelope, but still worth pacing because sudden retries can burn the whole window quickly. |
| Pinterest Conversions API | Token-scoped API quota with 429 throttling | Pinterest exposes effective ceilings through runtime throttling, so TrackLayer treats response headers as the contract of record. |
| Snapchat CAPI | Business token quota with 429 throttling | Snap traffic is paced independently from other destinations so one campaign launch cannot starve other queues. |
| Reddit CAPI | App-level request quota with retry-after windows | TrackLayer obeys returned retry windows instead of guessing fixed sleeps for Reddit delivery workers. |
| Taboola S2S conversions | Account-managed throughput window | Often generous in steady state, but still sensitive to parallel replay storms after an outage. |
| Outbrain conversion imports | Advertiser-token throttle window | Outbrain limits matter most when replaying offline conversions in large batches from CRM sources. |
| Criteo server-side conversions | Advertiser-scoped request quota | Criteo is usually stable under batching, but burst retries should still preserve event IDs and original timestamps. |

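
For destinations that enforce a sliding window, such as the 600 requests/min figure quoted for TikTok, a fixed per-minute counter can still overshoot at window boundaries. The following is a minimal sketch of a sliding-window limiter, not TrackLayer's implementation; the class and method names are hypothetical, and the caller is assumed to supply timestamps in seconds.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allows at most `max_requests` in any trailing `window_seconds` interval."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.sent = deque()  # timestamps of requests still inside the window

    def wait_time(self, now: float) -> float:
        """Seconds to wait before one more request fits (0.0 if it fits now)."""
        # Evict timestamps that have aged out of the trailing window.
        while self.sent and now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        # The oldest request must leave the window before another is allowed.
        return self.sent[0] + self.window_seconds - now

    def record(self, now: float) -> None:
        """Register a request that was actually sent at time `now`."""
        self.sent.append(now)


# Example configuration using the TikTok figure from the table above.
tiktok_limiter = SlidingWindowLimiter(max_requests=600, window_seconds=60.0)
```

A worker would call `wait_time` before each request, sleep for the returned duration if it is non-zero, then call `record` once the request has been sent.
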
Burst Handling

How we handle bursts

The core design choice is simple: collection and delivery are not the same concern. Accepting an event at the edge should stay fast even when one partner API starts throttling. Delivery can then be scheduled with far more context about which destination, account, and event class is under pressure.

Step 01

Edge worker ingestion

Inbound requests terminate at the edge, are authenticated, normalized, and written to durable storage without applying a destination throttle. The goal at this stage is to avoid losing customer truth during the first spike.
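
As a rough illustration of this step, the sketch below normalizes an inbound event and appends it to a durable log with no throttle applied. Authentication is omitted, the field names (`event_id`, `event_name`, `event_time`, `received_at`, `payload`) are assumed rather than taken from TrackLayer's schema, and generating a UUID when the client sends no identifier is an assumption as well. A plain list stands in for durable storage.

```python
import time
import uuid
from typing import Optional


def normalize_event(raw: dict, received_at: Optional[float] = None) -> dict:
    """Return a normalized copy with a stable event_id and the original timestamp."""
    if received_at is None:
        received_at = time.time()
    return {
        # Keep the client's identifier when present so retries stay idempotent.
        "event_id": raw.get("event_id") or str(uuid.uuid4()),
        "event_name": str(raw.get("event_name", "")).strip().lower(),
        # Original event time is preserved; collection time is stored separately.
        "event_time": raw.get("event_time", received_at),
        "received_at": received_at,
        "payload": raw.get("payload", {}),
    }


def accept(raw: dict, durable_log: list) -> str:
    """Normalize and persist an event; no destination throttle is applied here."""
    event = normalize_event(raw)
    durable_log.append(event)  # stand-in for the durable write
    return event["event_id"]
```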

Step 02

Queue buffering

Cloudflare Queues absorb burst volume and decouple collection from delivery. During peak sale drops, a 1M+ message buffer is far safer than trying to push every event straight into twelve different partner APIs.

Step 03

Adaptive consumer pacing

Consumer workers pull from the queue and maintain separate token buckets, concurrency caps, and batch sizes per platform, account, pixel, or customer_id depending on the downstream API contract.
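
One way to picture destination-aware pacing is to group queued events by the scope each platform throttles on, then cut each group into request-sized batches. The sketch below assumes hypothetical event fields (`platform`, `pixel_id`, `customer_id`, `account_id`) and hypothetical platform keys; the batch sizes are the per-request figures quoted in the destination table, and a batch size of 1 for unlisted platforms is a conservative placeholder.

```python
from collections import defaultdict

# Which event field scopes the limit for each platform (field names are
# hypothetical; the scopes follow the destination table in this guide).
SCOPE_FIELD = {"meta_capi": "pixel_id", "google_ads": "customer_id"}

# Per-request batch sizes quoted in the destination table.
BATCH_SIZE = {"linkedin": 5_000, "microsoft_ads": 1_000, "x": 500}


def pacing_key(event: dict) -> tuple:
    """Return the (platform, scope) pair that should share one limiter."""
    platform = event["platform"]
    scope_field = SCOPE_FIELD.get(platform, "account_id")
    return (platform, event.get(scope_field, "default"))


def plan_batches(events: list) -> list:
    """Group events by pacing key, then split each group into batches."""
    grouped = defaultdict(list)
    for event in events:
        grouped[pacing_key(event)].append(event)
    batches = []
    for key, items in grouped.items():
        size = BATCH_SIZE.get(key[0], 1)  # assumed default: one event per request
        for start in range(0, len(items), size):
            batches.append((key, items[start:start + size]))
    return batches
```

Each distinct key would then get its own limiter and concurrency cap, so a hot Pixel or customer_id slows only its own batches.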

Step 04

Backoff + DLQ

Transient failures get retried with exponential backoff. Persistent failures move to a dead-letter queue with payload, response body, destination, retry count, and first-seen timestamp preserved for operator review.
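
The bounded retry loop and the dead-letter record can be sketched as follows. This is an assumption-laden outline: `send` and `pause` are caller-supplied hypothetical callables, the `DeadLetter` fields mirror the list in the paragraph above, and the retry count is recorded here as the number of delivery attempts made.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class DeadLetter:
    payload: dict
    response_body: str
    destination: str
    retry_count: int      # counted here as delivery attempts made
    first_seen: float     # original collection timestamp of the event


def deliver(event: dict, destination: str,
            send: Callable[[dict], Tuple[bool, str]],
            pause: Callable[[int], None],
            max_attempts: int = 8) -> Optional[DeadLetter]:
    """Try `send` up to max_attempts times.

    Returns None on success, or a DeadLetter once every attempt has failed.
    `pause(attempt)` is called between attempts and decides how long to wait.
    """
    body = ""
    for attempt in range(1, max_attempts + 1):
        ok, body = send(event)  # the same event (and event_id) on every attempt
        if ok:
            return None
        if attempt < max_attempts:
            pause(attempt)
    return DeadLetter(event, body, destination, max_attempts, event["received_at"])
```

Keeping the wait policy behind `pause` lets the same loop serve destinations with different backoff rules.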

Ingress accept
  → Cloudflare edge normalization
  → Cloudflare Queues buffer
  → destination-aware consumer workers
  → retry with exponential backoff
  → dead-letter queue after persistent failure

Retries

Retry strategy

Retries are useful only if they are bounded, idempotent, and separated from validation failures. TrackLayer keeps the original event identifier on every retry, applies exponential spacing, and stops after eight failed delivery attempts. After that point the problem is no longer transport noise. It is an incident that deserves operator attention.

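| Attempt | Wait | Total elapsed |
| --- | --- | --- |
| 1 | 1s | 1s |
| 2 | 2s | 3s |
| 3 | 4s | 7s |
| 4 | 8s | 15s |
| 5 | 16s | 31s |
| 6 | 32s | 63s |
| 7 | 64s | 127s |
| 8 | 128s | 255s |
| After 8 | Move to DLQ | Investigate payload, platform state, or destination auth |
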
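
The schedule follows a simple doubling rule: the wait listed for attempt n is 2^(n-1) seconds, so the cumulative total is 2^n - 1 seconds. A short sketch that reproduces the table (the function name and parameters are illustrative):

```python
from itertools import accumulate


def backoff_schedule(max_attempts: int = 8, base_seconds: int = 1) -> list:
    """Return (attempt, wait_seconds, total_elapsed_seconds) rows."""
    attempts = range(1, max_attempts + 1)
    waits = [base_seconds * 2 ** (n - 1) for n in attempts]  # 1, 2, 4, ...
    totals = list(accumulate(waits))                         # running sum
    return list(zip(attempts, waits, totals))


for attempt, wait, total in backoff_schedule():
    print(f"attempt {attempt}: wait {wait}s, total {total}s")
```

Many systems also add random jitter to waits like these so that failed workers do not retry in lockstep; the guide does not say whether TrackLayer does.
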
Observability

Backpressure signals

queue_depth

How many events are waiting to be processed per destination shard. Depth rising while ingress is flat usually means a downstream ceiling changed or a worker pool is undersized.

consumer_lag

The time delta between event ingestion and consumer pickup. Lag is the most direct merchant-facing signal because it maps to how fresh conversions will appear in ad platforms.

platform_latency_p95

The 95th percentile response time by destination. Rising latency often appears before 429 rates increase, especially when an API is degrading but not yet rejecting requests.

platform_error_rate

The rolling share of 4xx and 5xx responses by destination and by account shard. Broken down by status class, it separates bad payloads (400-class validation errors) from true platform throttling (429s) and lets operators route incidents correctly.

dlq_growth_rate

How fast the dead-letter queue is growing. A DLQ that grows faster than operators can drain it is a stronger alert than a single 429 spike.
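
The signals above reduce to a few small calculations over samples a worker already has. The helpers below are a sketch under assumptions: function names are hypothetical, timestamps are in seconds, the percentile uses the nearest-rank method, and each function expects a non-empty input.

```python
def consumer_lag(ingested_at: float, picked_up_at: float) -> float:
    """Seconds between event ingestion and consumer pickup."""
    return picked_up_at - ingested_at


def p95(samples: list) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def error_breakdown(status_codes: list) -> dict:
    """Share of responses that were throttled (429), other 4xx, or 5xx."""
    total = len(status_codes)
    throttled = sum(1 for c in status_codes if c == 429)
    client = sum(1 for c in status_codes if 400 <= c < 500 and c != 429)
    server = sum(1 for c in status_codes if 500 <= c < 600)
    return {"throttled": throttled / total,
            "client": client / total,
            "server": server / total}


def growth_rate(depth_then: int, depth_now: int, seconds: float) -> float:
    """Queue or DLQ growth in messages per second over an interval."""
    return (depth_now - depth_then) / seconds
```

Computing each value per destination shard, rather than once for the whole system, matches the granularity the signals are described at.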

Preparation

BFCM readiness checklist

01

Verify every destination uses a stable dedup key such as event_id, conversion_id, order_id, or transaction identifier before the BFCM window starts.

02

Warm up traffic gradually where platforms are sensitive to sudden jumps, instead of switching from low-volume weekdays to full promotional throughput in one deployment.

03

Separate high-value destinations from low-priority destinations so a slow social platform cannot delay warehouse, webhook, or email automation traffic.

04

Store original event timestamps at collection time and never overwrite them with queue processing time during retries or replays.

05

Define per-platform concurrency, batch size, and retry policy in configuration, not code, so on-call teams can tune throughput without a redeploy.

06

Alert on queue_depth, consumer_lag, platform_latency_p95, platform_error_rate, and dlq_growth_rate at destination granularity, not only system-wide averages.

07

Run a controlled replay drill before BFCM using production-like payload sizes to confirm workers, queue retention, and observability dashboards behave as expected.

08

Document which failures are safe to retry automatically and which require operator review, especially for 400-class validation errors that should not loop forever.
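
For item 08, the retry policy can be written down as a small classifier. This sketch assumes plain HTTP status codes and a deliberately narrow rule: 429 and 5xx responses are retried automatically, and every other non-2xx response goes to operator review. Platform-specific error codes carried in response bodies are not handled, and the function name and return labels are hypothetical.

```python
def classify(status: int) -> str:
    """Return 'ok', 'retry', or 'review' for an HTTP status code."""
    if 200 <= status < 300:
        return "ok"
    if status == 429 or 500 <= status < 600:
        return "retry"   # throttling or server-side trouble: safe to retry
    return "review"      # validation and auth errors must not loop forever


for code in (200, 400, 429, 503):
    print(code, classify(code))
```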

Diagnostics

Troubleshooting

Meta 613 or destination 429

Reduce worker concurrency for the hot pixel, keep the original event_id, honor Retry-After if returned, and let the queue absorb the backlog instead of issuing fresh parallel retries.
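
Honoring `Retry-After` means handling both forms the HTTP specification allows: a number of seconds or an HTTP date. A minimal parsing sketch follows; the function name and the 1-second fallback are assumptions, and the date parsing relies on the standard library's `email.utils.parsedate_to_datetime`.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: Optional[str],
                        now: Optional[datetime] = None,
                        default: float = 1.0) -> float:
    """Convert a Retry-After header into seconds to wait."""
    if not header_value:
        return default
    value = header_value.strip()
    if value.isdigit():                  # delay-seconds form, e.g. "30"
        return float(value)
    try:
        when = parsedate_to_datetime(value)   # HTTP-date form
    except (TypeError, ValueError):
        return default                   # unparseable header: use the fallback
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

The returned value would replace the default backoff wait for that destination key, not be added on top of it.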

Batch accepted with partial failures

Split successful and failed records immediately. Only failed records should re-enter the retry pipeline; otherwise accepted conversions get duplicated during replay.
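
A sketch of that split, assuming the platform returns one result per record in a hypothetical shape such as `{"event_id": "...", "accepted": True}` (real response formats differ by destination):

```python
def split_batch_result(events: list, results: list) -> tuple:
    """Partition a sent batch into (accepted, failed) using per-record results.

    Records missing from `results` are treated as failed, which is only safe
    because retries reuse the original event_id as the dedup key.
    """
    accepted_ids = {r["event_id"] for r in results if r.get("accepted")}
    accepted = [e for e in events if e["event_id"] in accepted_ids]
    failed = [e for e in events if e["event_id"] not in accepted_ids]
    return accepted, failed
```

Only the `failed` list would be handed back to the retry pipeline.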

Queue depth grows but error rate stays low

This usually means the consumer side is underscaled or batch size is too small. Increase workers or per-destination batch size before touching the global ingress limit.

DLQ growth after a schema change

Treat it as a payload regression, not a throughput issue. Pause retries for the bad destination, patch the serializer, and replay only the invalid window once validation passes.

Conversions arrive hours late after a flash sale

Inspect consumer_lag and platform_latency_p95 together. If consumer_lag is high, add consumer capacity. If platform latency is high while queue depth is low, the bottleneck is downstream and pacing must be reduced.
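
The same decision can be expressed as a small rule, shown here only to make the ordering of the checks explicit. Every threshold below is a hypothetical placeholder, not a TrackLayer default.

```python
def triage(consumer_lag_s: float, platform_latency_p95_ms: float, queue_depth: int,
           lag_limit_s: float = 300, latency_limit_ms: float = 2000,
           depth_limit: int = 10_000) -> str:
    """Suggest a first action from lag, downstream latency, and queue depth."""
    if consumer_lag_s > lag_limit_s:
        return "add consumer capacity"
    if platform_latency_p95_ms > latency_limit_ms and queue_depth < depth_limit:
        return "reduce pacing: downstream bottleneck"
    return "no action"
```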

FAQ

Common questions

Why not rate limit at the very edge and reject surplus traffic?

Because that throws away the only copy of customer truth during the most valuable moments. At BFCM scale, it is safer to accept events, persist them, and meter delivery later where platform-specific constraints actually apply.

Does every destination share the same retry policy?

No. The exponential schedule is the default transport policy, but TrackLayer can override pacing, max attempts, and batch size per destination when a platform exposes stricter semantics or a Retry-After header.

What makes backpressure different from rate limiting?

Rate limiting is a rule. Backpressure is the system response to saturation. A platform can be within its published quota and still create backpressure if latency rises, worker capacity drops, or validation failures start looping.

Can merchants safely replay a BFCM backlog the next morning?

Yes, if the replay preserves original timestamps and dedup keys and is paced per destination. A replay storm with new IDs is the fastest way to create duplicate conversions and another round of throttling.
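
A replay that is safe in this sense re-emits stored events unchanged apart from bookkeeping. The sketch below assumes events are dictionaries carrying `event_id` and `event_time`; `replayed_at` is a hypothetical metadata field, and per-destination pacing is left to the delivery workers.

```python
def build_replay(events: list, replayed_at: float) -> list:
    """Re-emit stored events with their original identity and timestamps."""
    replay = []
    seen = set()
    for event in events:
        if event["event_id"] in seen:   # drop duplicates inside the backlog
            continue
        seen.add(event["event_id"])
        replay.append({
            **event,                    # event_id and event_time stay unchanged
            "replayed_at": replayed_at, # extra metadata; never replaces event_time
        })
    return replay
```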

What should an operator look at first during an incident?

Start with queue_depth and consumer_lag by destination shard. If both are flat, the problem is usually payload quality. If they are rising, inspect platform latency and throttling responses before increasing capacity.

