WebSocket tick feeds show trades that REST polling silently drops
I read “Is Your Real-Time Feed Lying to You? Streaming US Stock Tick Data with WebSockets” on DEV Community.
It is not about trading decisions. It is about what disappears from your saved data when you collect prices through REST polling instead of a persistent stream.
I compared the pipeline design with the WebSocket implementation in my own chat project and thought through connection stability and data integrity.
What REST polling loses
When you poll a REST API every few hundred milliseconds to one second, you get aggregated snapshots, not individual trades.
The typical format is the OHLCV bar.
Over a fixed interval (1 second, 1 minute, 5 minutes), the first trade price becomes Open, the highest becomes High, the lowest becomes Low, the last becomes Close, and total volume is Volume.
Five numbers, one bar.
An OHLCV bar summarizes “what happened in this interval,” but individual executions are gone.
Two intervals can produce identical bars while hiding completely different order flow: in one, a large sell hit the book first followed by small buybacks; in the other, many small trades preceded a single large order at the end.
Per-trade size, the exchange’s millisecond timestamp, and aggressor side are discarded at aggregation time.
OHLCV is fine for price display.
Order-flow analysis, volume profiling, anomalous trade detection, and accurate backtest logs all need data that is already gone.
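To make the loss concrete, here is a minimal sketch of the aggregation step; the tick fields and names are illustrative, not any provider’s schema. Everything except the five bar values is thrown away.

```typescript
// Illustrative types; the field names are assumptions, not a provider schema.
interface Tick {
  price: number;
  size: number;
  exchangeTs: number;   // exchange timestamp in ms
  side: "buy" | "sell"; // aggressor side
}

interface OhlcvBar {
  open: number;
  high: number;
  low: number;
  close: number;
  volume: number;
}

// Collapse a sequence of ticks into one bar. Per-trade size, timestamps,
// and aggressor side never make it into the output.
function toBar(ticks: Tick[]): OhlcvBar {
  const prices = ticks.map((t) => t.price);
  return {
    open: prices[0],
    high: Math.max(...prices),
    low: Math.min(...prices),
    close: prices[prices.length - 1],
    volume: ticks.reduce((sum, t) => sum + t.size, 0),
  };
}
```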
Receiving each trade event over WebSocket
WebSocket holds a persistent connection and the server pushes a message every time an event occurs.
Unlike REST polling, there is no request-response cycle per trade, so data arrives the instant an execution happens.
This is the classic “bidirectional, low-latency” use case I wrote up in the low-latency real-time sync article.
The question is what to do after receiving.
If the receive callback writes to a database or updates the UI directly, any stall there blocks WebSocket message processing itself.
The pipeline splits into receive, queue, and downstream processing.
```mermaid
graph LR
  A["Market Data API<br/>WebSocket"] --> B["Receiver"]
  B --> C["Queue"]
  C --> D["Persistence<br/>store every tick"]
  C --> E["Aggregation & Alerts"]
  E --> F["UI Update<br/>throttled to 4-5/sec"]
```
The receive callback parses JSON, does minimal validation, pushes to the queue, and returns.
Persistence, aggregation, alerts, and screen updates all run asynchronously downstream.
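As a sketch of that split, assuming a plain browser or Node WebSocket client with an in-memory queue (the endpoint URL, message shape, and downstream helpers are placeholders, not the article’s code):

```typescript
// Placeholder message shape and endpoint; adjust to the real feed's schema.
type TickMessage = { symbol: string; price: number; size: number; ts: number };

const queue: TickMessage[] = [];

const ws = new WebSocket("wss://example-feed/ws"); // placeholder endpoint

// Receive callback stays minimal: parse, validate, enqueue, return.
ws.onmessage = (event) => {
  const msg = JSON.parse(event.data as string) as TickMessage;
  if (typeof msg.price !== "number" || msg.price <= 0) return; // minimal validation
  queue.push(msg);
};

// Downstream consumer drains the queue asynchronously; persistence,
// aggregation, alerts, and UI updates hang off this loop, not the socket callback.
setInterval(() => {
  const batch = queue.splice(0, queue.length);
  if (batch.length === 0) return;
  persistBatch(batch);     // hypothetical: write every tick
  updateAggregates(batch); // hypothetical: rolling stats and alerts
}, 100);

declare function persistBatch(batch: TickMessage[]): void;
declare function updateAggregates(batch: TickMessage[]): void;
```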
Throttle rendering and persistence separately
Active symbols can produce tens to hundreds of executions per second.
Rewriting the DOM on every tick would drop frames in the browser, so screen updates should be throttled to around 4-5 times per second.
The values shown to the frontend are the latest price, recent volume, and perhaps a short moving average.
The critical point is that throttling screen updates must not throttle persistence along with it.
The persistence layer writes each received trade as-is: receive timestamp, source timestamp, symbol, price, size, message ID.
In the Cloudflare Workers real-time analytics article, live display was rounded to per-second counters in KV, while the persistent PostgreSQL store received events individually.
Mixing up the UI throttle with the storage throttle means your backtest later finds “only 5 trades in this second when there were actually 40.”
Never degrade storage granularity for UI convenience.
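One way to keep the two rates apart is to give each path its own clock: persistence fires per tick, rendering fires on a timer. A sketch, with hypothetical persist and render helpers:

```typescript
// Same TickMessage shape as the earlier sketch.
type TickMessage = { symbol: string; price: number; size: number; ts: number };

let latest: TickMessage | null = null;

// Persistence path: every tick, unthrottled.
function onTick(tick: TickMessage): void {
  persistTick(tick); // hypothetical: store receive ts, source ts, symbol, price, size, id
  latest = tick;     // the UI only ever needs the newest state
}

// UI path: latest snapshot only, ~5 times per second regardless of tick rate.
setInterval(() => {
  if (latest) renderSnapshot(latest); // hypothetical DOM/chart update
}, 200);

declare function persistTick(tick: TickMessage): void;
declare function renderSnapshot(tick: TickMessage): void;
```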
WebSocket disconnects and QUIC (HTTP/3)
WebSocket runs over TCP as a long-lived connection, so any change in the network path kills it: Wi-Fi switching, mobile handover, an ISP route change.
Client-side disconnect detection, reconnection, subscription restoration, and REST backfill for the gap are always required.
High-frequency trading firms minimize network latency and path instability by colocating servers next to the exchange, but for dashboards and daily backtests that level of effort is unnecessary.
What matters is how fast you detect the disconnect and how fast you recover.
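A sketch of that detect-and-recover loop: exponential backoff, resubscription on open, and a REST backfill hook for the gap. All of the names here are placeholders, not a provider SDK.

```typescript
// Reconnect skeleton with placeholder URL and hypothetical helpers.
let retryDelayMs = 500;
let lastReceivedTs = 0;

function connect(): void {
  const ws = new WebSocket("wss://example-feed/ws"); // placeholder

  ws.onopen = () => {
    retryDelayMs = 500;            // reset backoff after a successful connect
    resubscribe(ws);               // hypothetical: re-send the subscription list
    backfillSince(lastReceivedTs); // hypothetical: REST catch-up for the gap
  };

  ws.onmessage = (event) => {
    lastReceivedTs = Date.now();
    handleTick(event.data as string); // hypothetical: parse and enqueue
  };

  ws.onclose = () => {
    // Exponential backoff, capped, then try again.
    setTimeout(connect, retryDelayMs);
    retryDelayMs = Math.min(retryDelayMs * 2, 10_000);
  };
}

declare function resubscribe(ws: WebSocket): void;
declare function backfillSince(ts: number): void;
declare function handleTick(raw: string): void;

connect();
```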
QUIC (the transport layer of HTTP/3) may help here.
QUIC runs over UDP and manages sessions by connection ID, so the session survives an IP address change.
TCP connections die the moment the IP changes; QUIC can continue the same session.
0-RTT reconnection also speeds up recovery even if the connection does drop.
Most market data providers currently deliver over WebSocket (TCP), and few officially support streaming over QUIC.
That said, SSE (Server-Sent Events) over HTTP/3 and gRPC streaming over QUIC are starting to appear.
Being ready to switch when providers add support is enough for now.
Where to look when prices don’t match
Switching to WebSocket alone does not make “the trading screen and the saved log disagree” go away.
Discrepancies have multiple layers.
Start with the data type from the source.
Even if labeled “real-time,” it might be best bid/offer, last trade, an aggregated bar, or 15-minute delayed.
US equities involve exchanges, ATSs (alternative trading systems), the SIP (consolidated tape), and vendor-specific normalization. The visible price changes depending on the source.
Timestamps need careful separation too.
The exchange execution time, the vendor’s distribution time, and your application’s receive time are three different things.
Sorting late-arriving messages by receive time alone reorders them relative to actual market sequence.
Your own pipeline introduces causes as well: gaps during reconnection, queue overflow, DB unique-constraint violations, failed batch writes.
Keeping raw logs from immediately after receipt separate from normalized stored logs lets you trace at which stage data was lost.
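A stored record that keeps the three clocks apart might look like this; the field names are my own, not a standard schema:

```typescript
// Keep all three timestamps and the raw payload; never overwrite one with another.
interface StoredTrade {
  symbol: string;
  price: number;
  size: number;
  exchangeTs: number; // when the exchange executed the trade
  vendorTs: number;   // when the vendor published the message
  receivedTs: number; // when our process received it
  rawPayload: string; // untouched message, kept alongside the normalized fields
}
```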
AllTick’s minimal example code
The original article’s code connects to AllTick’s WebSocket endpoint, subscribes to AAPL, MSFT, and GOOGL, and extracts symbol, price, volume, and time in a few dozen lines.
Enough for a proof of concept, but far from production-ready.
Subscription restoration on reconnect, ping/pong heartbeat monitoring, sequence number gap detection, a retry queue for failed writes, and API key rotation are all missing.
On the data contract side, you need to verify which exchanges are covered, which trade types are included (pre-market and after-hours or not), and the latency and loss SLA before you can trace why a particular execution is absent.
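Some of those gaps are small additions. Sequence-gap detection, for instance, is a few lines once you assume each message carries a per-symbol, monotonically increasing sequence number; that field is my assumption here, not something the AllTick example documents.

```typescript
// Assumes a per-symbol, monotonically increasing `seq` field (an assumption,
// not a documented property of the feed). A jump larger than 1 means lost ticks.
const lastSeq = new Map<string, number>();

function checkGap(symbol: string, seq: number): void {
  const prev = lastSeq.get(symbol);
  if (prev !== undefined && seq > prev + 1) {
    console.warn(`gap on ${symbol}: missing seq ${prev + 1}..${seq - 1}`);
    // trigger a REST backfill or flag the interval as incomplete
  }
  lastSeq.set(symbol, seq);
}
```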
Structural comparison with Kana Chat
Kana Chat v1’s FastAPI backend also streams LLM response tokens to the browser over WebSocket.
It sends thinking, stream, and done events, and the UI renders tokens incrementally.
v2 added subscription restoration after reconnect and cancel notifications.
Structurally, market ticks and chat streaming are the same.
The server produces events, the client receives them and passes them downstream. Keeping the receive callback lightweight is common to both.
The difference is in the severity of dropped messages.
If a chat token is lost, surrounding context usually fills the gap, and at worst you regenerate.
A single missing tick changes the OHLCV calculation and breaks backtest reproducibility. There is no one to request a retransmission from, so the gap stays forever.
Because I had already built the WebSocket connection management and event separation for Kana Chat’s streaming, the receive-queue-downstream skeleton for a tick feed pipeline came together without hesitation.
Going the other direction, thinking through the “never drop a single tick” design made me notice that the chat side’s duplicate check on message IDs during reconnection was sloppier than it should be.
Can screen updates keep up with the data feed?
Earlier I wrote “throttle to 4-5 per second.”
If data arrives in near real-time, does screen rendering become the bottleneck?
The browser’s rendering loop synchronizes with the display refresh rate via requestAnimationFrame; at 60 Hz that is 16.6 ms per frame.
JavaScript execution, DOM updates, layout calculation, and paint all need to finish within that budget or frames get dropped.
But 16.6 ms is the ceiling for smooth animation, and rewriting a numeric ticker or table 60 times per second serves no purpose.
Humans can read changing numbers at about 3-4 updates per second at best; beyond that it is just flicker.
Passing every tick through to the screen is unnecessary; a few snapshots per second are enough.
The bottleneck is not data arrival speed but the cost of DOM manipulation.
Updating many table rows triggers the browser’s layout recalculation (reflow).
Even a single innerText change can cause the layout tree to be rebuilt up through ancestor elements.
With an active symbol pushing 100 executions per second, reflecting all of them in the DOM blows through the 16.6 ms budget instantly.
The practical approach is to render only the latest state inside requestAnimationFrame.
Overwrite a state object on each receive, and inside the rAF callback touch only the DOM elements that differ from the previous frame.
React’s virtual DOM works on the same principle, but for high-frequency tickers the virtual DOM diff computation itself becomes overhead, and raw DOM operations are often faster.
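A sketch of that latest-state loop; the element ID scheme and state shape are illustrative:

```typescript
// Overwrite state on every tick; touch the DOM at most once per frame,
// and only for values that actually changed since the last paint.
const state = new Map<string, number>();   // symbol -> latest price
const painted = new Map<string, number>(); // symbol -> last painted price

function onTick(symbol: string, price: number): void {
  state.set(symbol, price); // cheap: no DOM work in the receive path
}

function frame(): void {
  for (const [symbol, price] of state) {
    if (painted.get(symbol) !== price) {
      const el = document.getElementById(`price-${symbol}`); // illustrative ID scheme
      if (el) el.textContent = price.toFixed(2);
      painted.set(symbol, price);
    }
  }
  requestAnimationFrame(frame);
}

requestAnimationFrame(frame);
```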
For real-time charts or heatmaps, Canvas (including WebGL) bypasses the DOM layout engine entirely.
TradingView’s charts are Canvas-based for exactly this reason.
Drawing line charts with SVG scales layout computation linearly with node count, while Canvas draws directly to a pixel buffer and keeps load stable.
If you want a continuously scrolling tick log on screen, virtual scrolling is also needed.
Limit the DOM rows to the visible viewport plus a buffer, and swap displayed rows as the scroll position changes.
Dumping 10,000 rows straight into the DOM makes even scrolling sluggish, but virtual scrolling keeps the DOM node count at a few dozen at all times.
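The core calculation of virtual scrolling is short; this sketch assumes fixed-height rows and a plain scrolling container:

```typescript
// Render only the rows intersecting the viewport, plus a small buffer.
const ROW_HEIGHT = 24; // px, fixed-height rows assumed
const BUFFER = 10;     // extra rows above and below the viewport

function visibleRange(scrollTop: number, viewportHeight: number, totalRows: number) {
  const first = Math.max(0, Math.floor(scrollTop / ROW_HEIGHT) - BUFFER);
  const last = Math.min(totalRows, Math.ceil((scrollTop + viewportHeight) / ROW_HEIGHT) + BUFFER);
  return { first, last }; // only rows [first, last) get DOM nodes
}
```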