The O(n²) Trap Nobody Talks About
Open any LLM-powered chat. Stream a long response. Watch your DevTools. Every single token triggers a full re-parse of the entire accumulated markdown string.
On every token the whole pipeline runs again: re-parse the full string, rebuild the syntax tree, re-render the component tree, diff the DOM.
After 500 tokens, you're parsing a 2,000-character markdown string on every frame. After 2,000 tokens, roughly 8,000 characters. The work grows quadratically: at 100 tok/s the average app performs 100 full re-parses per second, each one touching content that hasn't changed.
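The arithmetic behind that claim, as a toy cost model (assuming roughly 4 characters per token; the function name is illustrative):

```typescript
// Toy cost model: a full re-parse on token t scans all t * charsPerToken
// characters accumulated so far, so total work is quadratic in token count.
function charsScanned(tokens: number, charsPerToken = 4): number {
  let total = 0;
  for (let t = 1; t <= tokens; t++) {
    total += t * charsPerToken; // document length at token t
  }
  return total;
}
```

`charsScanned(500)` comes to 501,000 characters scanned just to display a 2,000-character response, and doubling the token count roughly quadruples the work.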
The Fix: Block-Level Incremental Parsing
I asked: What if the parser only processed new characters?
StreamMD's StreamParser accepts the full accumulated text on each call, but internally tracks prevLength and only processes the delta. It classifies each line into block types (heading, code fence, table, list, paragraph) and maintains a running array of structured Block objects.
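A minimal sketch of that delta-plus-block-tracking loop, assuming a simplified Block shape and only two block types (StreamMD's real parser also classifies code fences, tables, and lists):

```typescript
interface Block {
  type: 'heading' | 'paragraph';
  text: string;
  closed: boolean;
}

// Simplified model of the StreamParser idea: accept the full accumulated
// text each call, but only process the characters after prevLength.
class IncrementalParser {
  private prevLength = 0;
  private buffer = ''; // holds the incomplete last line
  readonly blocks: Block[] = [];

  parse(fullText: string): void {
    const delta = fullText.slice(this.prevLength); // only the new characters
    this.prevLength = fullText.length;
    this.buffer += delta;
    let nl: number;
    while ((nl = this.buffer.indexOf('\n')) !== -1) {
      const line = this.buffer.slice(0, nl);
      this.buffer = this.buffer.slice(nl + 1);
      this.commitLine(line);
    }
  }

  private commitLine(line: string): void {
    const last = this.blocks[this.blocks.length - 1];
    if (line.trim() === '') {
      if (last) last.closed = true; // blank line closes the active block
      return;
    }
    const type: Block['type'] = line.startsWith('#') ? 'heading' : 'paragraph';
    if (last && !last.closed && last.type === 'paragraph' && type === 'paragraph') {
      last.text += '\n' + line; // continue the open paragraph
      return;
    }
    if (last) last.closed = true; // a new block boundary closes the previous one
    this.blocks.push({ type, text: line, closed: type === 'heading' });
  }
}
```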
The magic: completed blocks are frozen. When a block is closed (the parser encounters a blank line, a new heading, or a closing code fence), it's marked closed: true. The React layer wraps each block in React.memo, so closed blocks never re-render. Only the active (last, unclosed) block updates on each token.
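The win is easiest to see outside React: model the memoized block component as a render cache keyed by block id (a stand-in for React.memo's shallow prop comparison, not StreamMD's actual code):

```typescript
type Block = { id: number; text: string; closed: boolean };

let renderCount = 0;
const cache = new Map<number, string>();

// Stand-in for a memoized block component: closed blocks are served from
// the cache, so only the one open block pays a render on each token.
function renderBlock(b: Block): string {
  const hit = cache.get(b.id);
  if (b.closed && hit !== undefined) return hit; // frozen: zero work
  renderCount++;
  const out = `<p>${b.text}</p>`;
  if (b.closed) cache.set(b.id, out);
  return out;
}

function renderAll(blocks: Block[]): string {
  return blocks.map(renderBlock).join('');
}
```

Render the same block list twice and the closed block is only rendered once; per-token cost stays constant no matter how long the document gets.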
Architecture
The key insight: most blocks are frozen. In a typical streaming response, 95% of the rendered content is in completed blocks. The parser identifies block boundaries and the React layer leverages this to skip re-rendering everything that hasn't changed.
The incomplete-line tracking is critical ā when tokens arrive mid-line (e.g., "## He" before the "ading\n"), the partial text is held in a separate buffer and virtually appended at render time. This avoids the classic streaming bug where partial tokens get permanently committed and then duplicated when the rest of the line arrives.
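A sketch of that render-time append, assuming a simplified Block shape (in reality the parser may also classify the partial line; here it is always treated as an open paragraph):

```typescript
type Block = { type: string; text: string; closed: boolean };

// The committed block list is never mutated; the partial last line is
// layered on top as a transient open block each time we render.
function visibleBlocks(blocks: readonly Block[], partialLine: string): Block[] {
  if (partialLine === '') return [...blocks];
  return [...blocks, { type: 'paragraph', text: partialLine, closed: false }];
}
```

Because the partial text only ever exists in this transient block, nothing has to be un-committed when the rest of the line arrives, which is exactly what prevents the duplication bug.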
The Numbers
| Metric | react-markdown | StreamMD |
|---|---|---|
| Re-renders (500 tokens) | 500 | ~20 |
| Per-token complexity | O(n), full re-parse | O(1), delta only |
| Bundle size | 45kB + remark + rehype | 30kB total (incl. highlighter) |
| Runtime dependencies | unified + remark + rehype + … | 0 (React peer only) |
| Syntax highlighting | BYO (Prism/Shiki) | Built-in (15 languages) |
Usage
$ npm install stream-md
That's it. One import. One component. Your streaming goes from janky to buttery.
What Makes This Different
Incremental Diffing
The parser tracks prevLength and only processes the new characters; it never re-scans completed content.
Incomplete Line Buffer
Tokens that arrive mid-line are held in a separate buffer and virtually appended at render time, preventing the classic duplication bug.
Block-Level Memoization
Each block (heading, code, paragraph, table) is a React.memo component. Completed blocks never re-render.
Built-in Highlighter
Token-by-token syntax highlighting for 15 languages. Returns structured spans, so no dangerouslySetInnerHTML is needed.
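To illustrate the structured-span idea with a toy single-rule tokenizer (nothing like the real 15-language highlighter, but the same output shape):

```typescript
type Span = { className: string; text: string };

// Toy highlighter: split out two keywords via a capturing split and tag
// everything else as plain text. The output is plain data, not an HTML
// string, so the renderer can map spans to elements safely.
function highlightLine(line: string): Span[] {
  return line
    .split(/(\bconst\b|\blet\b)/)
    .filter((part) => part.length > 0)
    .map((part) =>
      part === 'const' || part === 'let'
        ? { className: 'keyword', text: part }
        : { className: 'plain', text: part }
    );
}
```

A React layer can then emit `<span className={s.className}>{s.text}</span>` for each span, letting React escape the text for free instead of reaching for dangerouslySetInnerHTML.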
Component Overrides
Full control: swap any element (pre, a, table, code) with your own component via the components prop.
Zero Dependencies
No unified, no remark, no rehype. Just React as a peer dependency. 30kB total ESM bundle.
The ZeroJitter + StreamMD Stack
ZeroJitter eliminates layout thrashing by rendering text to canvas. StreamMD eliminates redundant markdown parsing by incrementally tracking blocks.
Together, they own the "streaming LLM display" category. Use ZeroJitter for raw text streams. Use StreamMD when you need full markdown rendering with headings, code blocks, tables, and inline formatting.
The fastest LLM UI is the one that does the least work.
$ npm install stream-md