StreamMD

Stop re-parsing everything. Render only what changes.

0 Dependencies · Incremental Parser · React.memo Blocks · Built-in Syntax Highlighting

Naive (innerHTML): re-parses every token.

StreamMD (incremental): only the active block re-renders.
25x Fewer Renders · O(1) Per Token · 30kB Bundle Size

Technical Deep-Dive

I Made Streaming Markdown 25x Faster — Here's the Architecture

Every AI chat app has the same performance bug: they re-parse the entire conversation on every token. I built a library that makes this structurally impossible.

🔴 The O(n²) Trap Nobody Talks About

Open any LLM-powered chat. Stream a long response. Watch your DevTools. Every single token triggers a full re-parse of the entire accumulated markdown string.

The cascade on every token:

Token arrives → full string → parser → re-parse ALL blocks → diff entire VDOM → reconcile

After 500 tokens, you're parsing a 2,000-character markdown string on every frame. After 2,000 tokens, 8,000 characters. The total parsing work grows quadratically with response length. At 100 tok/s the average app performs 100 full re-parses per second, each one touching content that hasn't changed.
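A back-of-the-envelope sketch makes the blow-up concrete (names are hypothetical; `charsParsed` stands in for parser work):

```typescript
// Hypothetical model of the naive pattern: each token triggers a full
// re-parse of the accumulated string, so total work grows quadratically.
function naiveCharsParsed(tokens: string[]): number {
  let accumulated = '';
  let charsParsed = 0;
  for (const token of tokens) {
    accumulated += token;
    charsParsed += accumulated.length; // the whole string is re-parsed
  }
  return charsParsed;
}
```

With 500 four-character tokens, only 2,000 characters ever arrive, yet the parser touches roughly half a million.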

🟢 The Fix: Block-Level Incremental Parsing

I asked: What if the parser only processed new characters?

StreamMD's StreamParser accepts the full accumulated text on each call, but internally tracks prevLength and only processes the delta. It classifies each line into block types (heading, code fence, table, list, paragraph) and maintains a running array of structured Block objects.

The magic: completed blocks are frozen. When a block is closed (the parser encounters a blank line, a new heading, or a closing code fence), it's marked closed: true. The React layer wraps each block in React.memo — closed blocks never re-render. Only the active (last, unclosed) block updates on each token.

āš™ļøArchitecture

Data Flow

LLM tokens
  → StreamParser.push(fullText), diffed via prevLength
  ▼
Incremental Parser
  ▸ Process only new lines
  ▸ Classify: heading | code | table | list | paragraph
  ▸ Track incomplete lines separately (no duplication)
  ▸ Returns: Block[], activeIndex
  ▼ blocks[]
React.memo(BlockComponent)
  Only the active block re-renders

The key insight: most blocks are frozen. In a typical streaming response, 95% of the rendered content is in completed blocks. The parser identifies block boundaries and the React layer leverages this to skip re-rendering everything that hasn't changed.

The incomplete-line tracking is critical — when tokens arrive mid-line (e.g., "## He" before the "ading\n"), the partial text is held in a separate buffer and virtually appended at render time. This avoids the classic streaming bug where partial tokens get permanently committed and then duplicated when the rest of the line arrives.
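In sketch form (function and parameter names are hypothetical), the view concatenates the partial buffer at render time instead of committing it:

```typescript
// Hypothetical sketch: the partial line is appended virtually at render
// time, so it is never committed twice when the rest of the line arrives.
function visibleText(closedBlocks: string[], activeBlock: string, partial: string): string {
  // Closed blocks are frozen; only the last piece changes between tokens.
  return [...closedBlocks, activeBlock + partial].filter(Boolean).join('\n');
}
```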

📊 The Numbers

| Metric | react-markdown | StreamMD |
| --- | --- | --- |
| Re-renders (500 tokens) | 500 | ~20 |
| Per-token complexity | O(n) (full re-parse) | O(1) (delta only) |
| Bundle size | 45kB + remark + rehype | 30kB total (incl. highlighter) |
| Runtime dependencies | unified + remark + rehype + … | 0 (React peer only) |
| Syntax highlighting | BYO (Prism/Shiki) | Built-in (15 languages) |

💻 Usage

$ npm install stream-md

```tsx
// Drop-in replacement for react-markdown
import { useChat } from 'ai/react'; // Vercel AI SDK
import { StreamMD } from 'stream-md';
import 'stream-md/css';

function Chat() {
  const { messages } = useChat();
  const last = messages[messages.length - 1];

  return (
    <StreamMD
      text={last?.content || ''}
      theme="dark"
    />
  );
}
```

That's it. One import. One component. Your streaming goes from janky to buttery.

🎯 What Makes This Different

Incremental Diffing

The parser tracks prevLength and only processes the new characters — never re-scans completed content.

Incomplete Line Buffer

Tokens that arrive mid-line are held in a separate buffer and virtually appended at render time — preventing the classic duplication bug.

Block-Level Memoization

Each block (heading, code, paragraph, table) is a React.memo component. Completed blocks never re-render.
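The same idea in plain TypeScript, as a hypothetical stand-in for React.memo's shallow-props check (not the library's internals):

```typescript
// Hypothetical stand-in for React.memo: a block whose object reference is
// unchanged is served from cache; only replaced (active) blocks re-render.
interface Block { id: number; text: string; closed: boolean }

function makeBlockRenderer() {
  const cache = new Map<number, { block: Block; html: string }>();
  let renderCount = 0;
  return {
    render(blocks: Block[]): string {
      return blocks
        .map((b) => {
          const hit = cache.get(b.id);
          if (hit && hit.block === b) return hit.html; // memo hit: no work
          renderCount += 1;                            // miss: re-render
          const html = `<p>${b.text}</p>`;
          cache.set(b.id, { block: b, html });
          return html;
        })
        .join('');
    },
    get renderCount() { return renderCount; },
  };
}
```

Because frozen blocks keep the same object identity across parses, the equality check skips them for free.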

Built-in Highlighter

Token-by-token syntax highlighting for 15 languages. Returns structured spans — no dangerouslySetInnerHTML needed.
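A toy illustration of the structured-span idea (the real highlighter and its 15 grammars are more involved; all names here are hypothetical):

```typescript
// Hypothetical sketch of structured-span highlighting: the output is data,
// not an HTML string, so no dangerouslySetInnerHTML is needed.
interface HighlightSpan { className: string; text: string }

const KEYWORDS = new Set(['const', 'function', 'return', 'if', 'else']);

function highlightLine(line: string): HighlightSpan[] {
  // Split on word boundaries, keeping separators as plain spans.
  return line.split(/(\w+)/).filter(Boolean).map((part) => ({
    className: KEYWORDS.has(part) ? 'token-keyword' : 'token-plain',
    text: part,
  }));
}
```

Each span maps directly to a `<span className=…>` element, so the renderer stays on the safe React path.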

Component Overrides

Full control: swap any element (pre, a, table, code) with your own component via the components prop.

Zero Dependencies

No unified, no remark, no rehype. Just React as a peer dependency. 30kB total ESM bundle.

The ZeroJitter + StreamMD Stack

ZeroJitter eliminates layout thrashing by rendering text to canvas. StreamMD eliminates redundant markdown parsing by incrementally tracking blocks.

Together, they own the "streaming LLM display" category. Use ZeroJitter for raw text streams. Use StreamMD when you need full markdown rendering with headings, code blocks, tables, and inline formatting.

The fastest LLM UI is the one that does the least work.

$ npm install stream-md

Star on GitHub →