Why rate limiting matters
BFCM does not break pipelines because a store did more traffic than normal. It breaks pipelines because the shape of the traffic changes. A merchant can see 10x normal session volume in a few hours, but the real stress arrives when page views, carts, purchases, lead submits, and post-purchase events all compress into the same short interval. If collection, queuing, and delivery happen in the same synchronous request path, the slowest downstream partner becomes the system ceiling and the whole event path inherits that fragility.
Each ad platform also has its own idea of what a healthy request rate looks like. Some publish fixed request ceilings, some expose runtime limits through response headers, and some throttle based on token, account, pixel, or customer scope. That means one universal retry rule is not enough. TrackLayer treats rate limiting as a destination-specific scheduling problem: accept traffic once, preserve event identity, and pace each destination according to the contract it actually enforces.
TrackLayer ingress limits
Ingress limits exist to protect shared infrastructure and billing boundaries, not to model downstream API constraints. They control how much volume a workspace can push into the system in steady state, while the queue and consumer layers handle the burst behavior that appears during campaign launches and checkout peaks.
| Tier | Events/second | Burst | Monthly cap |
|---|---|---|---|
| Starter | 25/sec | 250 | 2M events |
| Growth | 100/sec | 1,000 | 10M events |
| Scale | 500/sec | 5,000 | 100M events |
| Enterprise | 2,000+/sec | 20,000+ | Contracted |
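One common way to model a steady rate with a burst allowance is a token bucket. The sketch below assumes the "Burst" column is the bucket capacity and "Events/second" is the refill rate; the `TokenBucket` class and `TIERS` mapping are illustrative names, not TrackLayer's actual implementation.

```python
# Tier numbers copied from the table above as (events/sec, burst).
# Enterprise is contracted, so it is left out of this sketch.
TIERS = {
    "starter": (25, 250),
    "growth": (100, 1_000),
    "scale": (500, 5_000),
}


class TokenBucket:
    """Hypothetical ingress limiter: refills at `rate`, holds at most `burst`."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = burst      # start full so an initial burst is accepted
        self.updated = None      # timestamp of the last call, in seconds

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill according to the time elapsed since the previous call.
        if self.updated is not None:
            elapsed = max(0.0, now - self.updated)
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Passing `now` explicitly keeps the sketch deterministic; a real worker would supply a monotonic clock reading.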
Downstream platform limits
The table below is a practical operator view, not a promise that every account on every platform gets the same quota forever. It reflects public platform guidance and effective ceilings commonly enforced as of April 23, 2026. Some APIs publish exact request numbers; others expose the active ceiling through headers or account-specific quotas. TrackLayer stores these values as pacing inputs and prefers platform-returned retry windows when available.
| Platform | Documented ceiling | Operational note |
|---|---|---|
| Meta CAPI | 1,000 events/min per Pixel | Reference ceiling TrackLayer budgets for when a Pixel is hot during BFCM flash-sale windows. |
| Google Ads offline conversions | 100 conversions/sec per customer_id | High throughput, but the ceiling applies per customer_id, so large merchants need worker keys scoped to each customer_id. |
| TikTok Events API | 600 requests/min sliding window | TikTok publishes one-minute sliding-window enforcement, so short spikes can trip 429s even when hourly volume looks normal. |
| LinkedIn Conversions API | 600 requests/min and 300,000/day per member token | LinkedIn also supports batches up to 5,000 conversion events, which TrackLayer uses to lower request pressure. |
| Microsoft Ads offline conversions | 1,000 conversions/request | The practical limit is often request shaping plus account-level concurrency rather than pure daily volume. |
| X Conversion API | 100,000 requests/15 min/account and 500 events/request | Large envelope, but still worth pacing because sudden retries can burn the whole window quickly. |
| Pinterest Conversions API | Token-scoped API quota with 429 throttling | Pinterest exposes effective ceilings through runtime throttling, so TrackLayer treats response headers as the contract of record. |
| Snapchat CAPI | Business token quota with 429 throttling | Snap traffic is paced independently from other destinations so one campaign launch cannot starve other queues. |
| Reddit CAPI | App-level request quota with retry-after windows | TrackLayer obeys returned retry windows instead of guessing fixed sleeps for Reddit delivery workers. |
| Taboola S2S conversions | Account-managed throughput window | Often generous in steady state, but still sensitive to parallel replay storms after an outage. |
| Outbrain conversion imports | Advertiser-token throttle window | Outbrain limits matter most when replaying offline conversions in large batches from CRM sources. |
| Criteo server-side conversions | Advertiser-scoped request quota | Criteo is usually stable under batching, but burst retries should still preserve event IDs and original timestamps. |
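For rows that describe a sliding one-minute window, such as the TikTok entry, a local limiter can mirror the same shape so that short spikes are held back before the platform rejects them. The following is a minimal sketch; `SlidingWindowLimiter` is a hypothetical name, and the platform's real enforcement may differ in detail.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` sends in any trailing `window_s` seconds."""

    def __init__(self, limit: int, window_s: float) -> None:
        self.limit = limit
        self.window_s = window_s
        self.sent = deque()  # timestamps of sends still inside the window

    def try_send(self, now: float) -> bool:
        # Drop timestamps that have aged out of the trailing window.
        while self.sent and now - self.sent[0] >= self.window_s:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            self.sent.append(now)
            return True
        return False
```

With `SlidingWindowLimiter(600, 60.0)`, a burst of 600 requests in one second blocks further sends until those timestamps leave the window, which is why hourly totals can look healthy while 429s still appear.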
How we handle bursts
The core design choice is simple: collection and delivery are not the same concern. Accepting an event at the edge should stay fast even when one partner API starts throttling. Delivery can then be scheduled with far more context about which destination, account, and event class is under pressure.
Edge worker ingestion
Inbound requests terminate at the edge, where they are authenticated, normalized, and written to durable storage without any destination throttle applied. The goal at this stage is to avoid losing customer truth during the first spike.
Queue buffering
Cloudflare Queues absorb burst volume and decouple collection from delivery. During peak sale drops, a 1M+ message buffer is far safer than trying to push every event straight into twelve different partner APIs.
Adaptive consumer pacing
Consumer workers pull from the queue and maintain separate token buckets, concurrency caps, and batch sizes per platform, account, pixel, or customer_id depending on the downstream API contract.
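Keeping separate buckets per scope comes down to choosing the right key for each destination. The sketch below derives such a key; the destination names and event field names in `SCOPE_FIELD` are assumptions for illustration, chosen to match the scopes listed in the downstream limits table.

```python
from typing import Mapping

# Hypothetical mapping: destination name -> event field that scopes its limit.
SCOPE_FIELD = {
    "meta_capi": "pixel_id",
    "google_ads": "customer_id",
    "linkedin_capi": "member_token_id",
    "x_capi": "account_id",
}


def pacing_key(destination: str, event: Mapping[str, str]) -> str:
    """Return the key under which this event's pacing state is stored."""
    scope_field = SCOPE_FIELD.get(destination)
    if scope_field is None or scope_field not in event:
        # Unknown scope: fall back to a single bucket for the destination.
        return destination
    return f"{destination}:{scope_field}={event[scope_field]}"
```

A consumer could then keep one token bucket, concurrency cap, and batch size per returned key, so a hot Pixel does not slow delivery for other Pixels on the same platform.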
Backoff + DLQ
Transient failures get retried with exponential backoff. Persistent failures move to a dead-letter queue with payload, response body, destination, retry count, and first-seen timestamp preserved for operator review.
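The retry-or-DLQ decision can be sketched as a small routing function plus a record type that carries the fields listed above. Treating 429 and 5xx as transient and everything else as persistent is an assumption of this sketch, as are the names `DeadLetter` and `next_step`; the limit of eight attempts comes from the retry strategy described in this guide.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 8  # delivery attempts before an event is dead-lettered


@dataclass(frozen=True)
class DeadLetter:
    """Fields preserved for operator review (names are illustrative)."""
    payload: dict
    response_body: str
    destination: str
    retry_count: int
    first_seen_ts: float


def next_step(status: int, attempts: int) -> str:
    """Return 'done', 'retry', or 'dlq' after `attempts` delivery attempts."""
    if 200 <= status < 300:
        return "done"
    # Assumption: throttling and server errors are transient.
    transient = status == 429 or status >= 500
    if transient and attempts < MAX_ATTEMPTS:
        return "retry"
    # Validation errors and exhausted retries go to the dead-letter queue.
    return "dlq"
```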
Ingress accept
→ Cloudflare edge normalization
→ Cloudflare Queues buffer
→ destination-aware consumer workers
→ retry with exponential backoff
→ dead-letter queue after persistent failure
Retry strategy
Retries are useful only if they are bounded, idempotent, and separated from validation failures. TrackLayer keeps the original event identifier on every retry, applies exponential spacing, and stops after eight failed delivery attempts. After that point the problem is no longer transport noise. It is an incident that deserves operator attention.
| Attempt | Wait | Total elapsed |
|---|---|---|
| 1 | 1s | 1s |
| 2 | 2s | 3s |
| 3 | 4s | 7s |
| 4 | 8s | 15s |
| 5 | 16s | 31s |
| 6 | 32s | 63s |
| 7 | 64s | 127s |
| 8 | 128s | 255s |
| After 8 | Move to DLQ | Investigate payload, platform state, or destination auth |
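The schedule in the table follows a doubling rule: the wait before attempt n is 2^(n-1) seconds, so the cumulative wait after n attempts is 2^n - 1 seconds. The sketch below regenerates the rows; `backoff_schedule` is an illustrative helper, not a TrackLayer API.

```python
def backoff_schedule(max_attempts: int = 8, base_s: int = 1):
    """Return (attempt, wait_s, total_elapsed_s) rows for doubling backoff."""
    rows = []
    elapsed = 0
    for attempt in range(1, max_attempts + 1):
        wait = base_s * 2 ** (attempt - 1)  # 1, 2, 4, ... seconds
        elapsed += wait
        rows.append((attempt, wait, elapsed))
    return rows
```

Many retry implementations also add random jitter to each wait so that workers do not retry in lockstep. The table above shows fixed waits, so jitter is left out here.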
Backpressure signals
queue_depth
How many events are waiting to be processed per destination shard. Depth rising while ingress is flat usually means a downstream ceiling changed or a worker pool is undersized.
consumer_lag
The time delta between event ingestion and consumer pickup. Lag is the most direct merchant-facing signal because it maps to how fresh conversions will appear in ad platforms.
platform_latency_p95
The 95th percentile response time by destination. Rising latency often appears before 429 rates increase, especially when an API is degrading but not yet rejecting requests.
platform_error_rate
The rolling share of 4xx and 5xx responses by destination and by account shard. Breaking the rate down by status class separates bad payloads (400-class validation errors) from true platform throttling (429s) and lets operators route incidents correctly.
dlq_growth_rate
How fast the dead-letter queue is growing. A DLQ that grows faster than operators can drain it is a stronger alert than a single 429 spike.
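Three of these signals reduce to simple arithmetic over samples a worker already has. The sketch below assumes a nearest-rank percentile and plain Unix timestamps in seconds; the function names mirror the metric names, but the formulas are illustrative rather than TrackLayer's exact definitions.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not samples:
        raise ValueError("p95 needs at least one sample")
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n), which avoids float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def consumer_lag(ingested_at: float, picked_up_at: float) -> float:
    """Seconds between event ingestion and consumer pickup."""
    return picked_up_at - ingested_at


def dlq_growth_rate(depth_then: int, depth_now: int, interval_s: float) -> float:
    """Dead-letter messages added per second over the sampling interval."""
    return (depth_now - depth_then) / interval_s
```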
BFCM readiness checklist
Verify every destination uses a stable dedup key such as event_id, conversion_id, order_id, or transaction identifier before the BFCM window starts.
Warm up traffic gradually where platforms are sensitive to sudden jumps, instead of switching from low-volume weekdays to full promotional throughput in one deployment.
Separate high-value destinations from low-priority destinations so a slow social platform cannot delay warehouse, webhook, or email automation traffic.
Store original event timestamps at collection time and never overwrite them with queue processing time during retries or replays.
Define per-platform concurrency, batch size, and retry policy in configuration, not code, so on-call teams can tune throughput without a redeploy.
Alert on queue_depth, consumer_lag, platform_latency_p95, platform_error_rate, and dlq_growth_rate at destination granularity, not only system-wide averages.
Run a controlled replay drill before BFCM using production-like payload sizes to confirm workers, queue retention, and observability dashboards behave as expected.
Document which failures are safe to retry automatically and which require operator review, especially for 400-class validation errors that should not loop forever.
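For the checklist item on keeping delivery policy in configuration, a minimal sketch might look like the following. The JSON keys, destination names, and numeric values are placeholders chosen for illustration, not recommended settings.

```python
import json
from dataclasses import dataclass

# Stand-in for a config file or key-value entry; values are illustrative only.
CONFIG_JSON = """
{
  "meta_capi":     {"concurrency": 4, "batch_size": 500, "max_attempts": 8},
  "tiktok_events": {"concurrency": 2, "batch_size": 100, "max_attempts": 8}
}
"""


@dataclass(frozen=True)
class DestinationPolicy:
    concurrency: int
    batch_size: int
    max_attempts: int = 8


def load_policies(raw: str) -> dict:
    """Parse and validate per-destination delivery policies."""
    policies = {}
    for name, fields in json.loads(raw).items():
        policy = DestinationPolicy(**fields)
        if min(policy.concurrency, policy.batch_size, policy.max_attempts) < 1:
            raise ValueError(f"invalid policy for {name}: {policy}")
        policies[name] = policy
    return policies
```

Because workers read the policy at runtime, an on-call engineer could lower `concurrency` for one destination by editing the stored document instead of shipping a new build.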
Troubleshooting
Meta 613 or destination 429
Reduce worker concurrency for the hot pixel, keep the original event_id, honor Retry-After if returned, and let the queue absorb the backlog instead of issuing fresh parallel retries.
Batch accepted with partial failures
Split successful and failed records immediately. Only failed records should re-enter the retry pipeline; otherwise, accepted conversions get duplicated during replay.
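A minimal sketch of that split, assuming the platform response can be reduced to one accepted/rejected flag per submitted record in the same order (real response shapes vary by platform, and `split_batch` is a hypothetical helper):

```python
def split_batch(events, results):
    """Partition a batch into (accepted, failed) by per-record outcome.

    `results[i]` is True when `events[i]` was accepted by the platform.
    """
    if len(events) != len(results):
        raise ValueError("each event needs exactly one result")
    accepted = [e for e, ok in zip(events, results) if ok]
    failed = [e for e, ok in zip(events, results) if not ok]
    return accepted, failed
```

Only the `failed` list would be re-enqueued, with each record keeping its original event identifier.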
Queue depth grows but error rate stays low
This usually means the consumer side is underscaled or batch size is too small. Increase workers or per-destination batch size before touching the global ingress limit.
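When raising batch size, the per-request caps from the downstream limits table still apply. The sketch below chunks a backlog while clamping the requested size to those caps; the destination keys and the `batches` helper are illustrative names.

```python
from typing import Iterator, List, Sequence

# Per-request caps quoted in the downstream limits table.
MAX_BATCH = {"linkedin_capi": 5000, "microsoft_ads": 1000, "x_capi": 500}


def batches(events: Sequence[dict], destination: str,
            requested: int) -> Iterator[List[dict]]:
    """Yield chunks of at most min(requested, platform cap) events."""
    cap = MAX_BATCH.get(destination, requested)
    size = max(1, min(requested, cap))
    for start in range(0, len(events), size):
        yield list(events[start:start + size])
```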
DLQ growth after a schema change
Treat it as a payload regression, not a throughput issue. Pause retries for the bad destination, patch the serializer, and replay only the invalid window once validation passes.
Conversions arrive hours late after a flash sale
Inspect consumer_lag and platform_latency_p95 together. If consumer_lag is high, add consumer capacity. If platform latency is high while queue depth stays low, the bottleneck is downstream, and the send rate to that destination should be lowered rather than raised.
Common questions
Why not rate limit at the very edge and reject surplus traffic?
Because that throws away the only copy of customer truth during the most valuable moments. At BFCM scale, it is safer to accept events, persist them, and meter delivery later where platform-specific constraints actually apply.
Does every destination share the same retry policy?
No. The exponential schedule is the default transport policy, but TrackLayer can override pacing, max attempts, and batch size per destination when a platform exposes stricter semantics or a Retry-After header.
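Preferring a platform-supplied retry window over the default schedule could look like the sketch below. It handles both forms the HTTP `Retry-After` header can take (a number of seconds or an HTTP date) and falls back to the default wait when the header is missing or unparseable. The function name `retry_delay` is an assumption for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay(retry_after: Optional[str], default_s: float,
                now: Optional[datetime] = None) -> float:
    """Seconds to wait before the next attempt."""
    if retry_after is None:
        return default_s
    value = retry_after.strip()
    if value.isdigit():
        return float(value)  # delta-seconds form, e.g. "30"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default_s  # unparseable header: keep the default schedule
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```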
What makes backpressure different from rate limiting?
Rate limiting is a rule. Backpressure is the system response to saturation. A platform can be within its published quota and still create backpressure if latency rises, worker capacity drops, or validation failures start looping.
Can merchants safely replay a BFCM backlog the next morning?
Yes, if the replay preserves original timestamps and dedup keys and is paced per destination. A replay storm with new IDs is the fastest way to create duplicate conversions and another round of throttling.
What should an operator look at first during an incident?
Start with queue_depth and consumer_lag by destination shard. If both are flat, the problem is usually payload quality. If they are rising, inspect platform latency and throttling responses before increasing capacity.
Related implementation guides
BFCM playbook
Operational planning for peak-week traffic, deployment freezes, failover drills, and conversion recovery.
Deduplication explained
How to keep retries, browser events, and server events from multiplying one customer action into several conversions.
Meta CAPI setup guide
Destination-specific implementation details for Meta event IDs, user_data, batching, and diagnostics.