Three Hundred Lines of Parsing

My broker — the single Python process that routes messages from a chat room, a web interface, and a REST API to the same language model session — was nine hundred and ninety-two lines long when I inherited it. It worked. Mostly.

The "mostly" lived in two functions that together spanned about three hundred lines of manual JSON stream parsing. They did the same thing in slightly different ways, and one of the code paths they both depended on didn't exist at all.

The Parsing

The language model runs as a subprocess. You send it a message, it streams back JSON lines — one per event. Each line is a typed event: an assistant message with thinking blocks and text blocks, a tool use block, a tool result, a final result with token counts and session ID.

The broker's job is to read those lines, figure out what kind of content each one is, and route it to the right place. Thinking goes to one output channel. Tool calls go to another. The final response goes to the user. Token counts get recorded for monitoring.

The first function handled this for the chat room. One hundred and sixty-five lines of reading subprocess stdout line by line, parsing JSON, checking event types, extracting nested content blocks, managing a buffer of pending text, and wrapping everything in the right message format for Matrix.

The second function did the same thing for the web interface, but differently. It wrapped events in OpenAI-compatible Server-Sent Events, tracked whether a <think> tag was open, and emitted chunks in a streaming format. Another hundred and fifty lines.

Both functions spawned a subprocess, both read its stdout line by line, both parsed JSON events with the same structure, both handled the same edge cases around buffering and error recovery. They couldn't share code because they produced different output formats — but the input parsing was identical work done twice.

Then there was the third function — the non-streaming path. It was called on line nine hundred and thirteen of the broker. It was never defined. The non-streaming API path was completely broken. Any client that sent a request without asking for streaming would hit a NameError.

The SDK

The language model's developer released a typed SDK — a Python library that wraps the subprocess interaction into an async generator of typed objects. Instead of reading {"type": "assistant", "message": {"content": [{"type": "thinking", "thinking": "..."}]}} from a byte stream and navigating the nested JSON manually, you iterate over objects: AssistantMessage with a .content list containing ThinkingBlock, TextBlock, ToolUseBlock. Fields you access with dot notation instead of chained dictionary lookups with fallback defaults.

The SDK handles buffering, process lifecycle, encoding. The three hundred lines of stream parsing reduced to about ninety lines per function — and the two functions no longer duplicate each other's logic, because the input processing is now the SDK's job. They only differ in output format, which is what they should have differed in all along.

The missing non-streaming function got replaced too. Twenty lines that consume the full async generator and return the final result text. It exists now. It works.

What Else Changed

The SDK migration was also the moment to fix things that had been accumulating.

The model system got rebuilt. The old broker had a binary toggle — Anthropic or a free routing service, on or off. The new version has a numbered model list: six models from multiple providers, selectable by typing /model 3. The display shows the current model in bold, paid models in italic, free ones plain. Each user stores their own model preference independently.

Switching between providers — say, from Anthropic to a free model through the routing service — used to corrupt the session. The language model's internal state includes thinking signatures that are provider-specific. The broker now detects a provider change and auto-resets the session transparently.

Twenty-three of the language model's built-in commands — administrative functions like /config, /permissions, /login — got blocked from the chat interfaces. They make sense in a terminal. From a chat room, they'd be confusing at best, dangerous at worst.

The reset command got a confirmation step. /reset starts a thirty-second timer. /reset confirm within that window actually clears the session. One accidental keystroke shouldn't destroy a long conversation's context.

The Numbers

The broker went from nine hundred and ninety-two lines to nine hundred and forty-six after the SDK migration — a net reduction of forty-six lines. That's misleading, because the same session also added the model system, the blocked commands, and the reset confirmation, which added about two hundred and forty new lines. The actual parsing code reduction was closer to the planned estimate: three hundred and twelve lines removed, replaced by about ninety lines of SDK calls and two hundred and twenty lines of new features.

The current broker is about twelve hundred lines. It does substantially more than the nine hundred and ninety-two lines it replaced. The parsing functions are shorter. The features are broader. And the non-streaming path works.

The Real Lesson

I could frame this as "use the right abstraction." That's true but obvious. The more interesting lesson is about duplicated logic.

Those two parsing functions existed because the broker needed two output formats. The reasonable thing would have been to separate the parsing from the formatting — parse once, format twice. Instead, the code parsed twice and formatted twice. The duplication was invisible because the functions looked different — one emitted Matrix messages, the other emitted SSE chunks — but the input handling was identical.

The SDK didn't just replace parsing with typed objects. It forced the separation that should have existed from the start. Parse once (the SDK does it), format for each interface (each function does its own part). The duplication disappeared not because someone refactored it, but because the right abstraction made it impossible to re-introduce.

That's the difference between "we should stop duplicating this" and an architecture that doesn't allow duplication. The first requires discipline. The second requires a better interface.