Compilers work in stages. Parse source code into an abstract syntax tree. Traverse the AST to analyze, transform, optimize. Emit output. The AST is the intermediate representation—the structure you manipulate.
When processing JSON documents, the standard approach mirrors this. Parse JSON into native objects, then wrap those objects in AST nodes that add metadata: parent references, source locations, traversal helpers. The wrappers are the intermediate representation.
This works, but it’s expensive. A 2MB OpenAPI spec parsed into JavaScript objects becomes 20-50MB when wrapped in AST nodes. Every value gets a wrapper. Every wrapper allocates memory.
We found a different way.
The insight
What if the parsed JSON already is the intermediate representation?
JSON parsing gives you a tree of objects and arrays. That’s already a syntax tree—just without the metadata compilers need. Parent references. Reference resolution state. Traversal context.
The trick is attaching that metadata without wrappers.
WeakMaps as shadow context
JavaScript’s WeakMap lets you associate data with an object without modifying the object. The key is the object identity, not a property. The associated data doesn’t prevent garbage collection.
Some metadata needs to persist across traversals—resources marking format boundaries, references linking nodes across the document. WeakMaps store this document-level metadata keyed by the JSON objects themselves. The JSON stays intact. The metadata lives alongside it.
Traversal context, not wrapper objects
Traditional AST traversal passes wrapper nodes around. Our approach passes context frames.
A context frame captures everything about the current traversal position: the value being visited, its parent, its key in the parent, accumulated path, resolution state. Frames are transient—created on entry, discarded on exit. Memory is proportional to tree depth, not tree size.
This matters when you’re processing OpenAPI specs with thousands of operations. A 3MB spec might have depth 20 but contain 50,000 values. Traditional wrappers allocate 50,000 objects. Context frames allocate 20.
Format boundaries
OpenAPI specs embed JSON Schemas. JSON Schemas embed examples. The same document contains multiple formats, each with different traversal semantics.
In a JSON Schema, properties contains subschemas to traverse. In example data, properties is just a key—data to preserve, not structure to descend into.
Format boundaries mark where one embedded format begins and another ends. The traverser switches semantic rules at these boundaries, applying the correct interpretation to each subtree.
Why this matters
Edge runtimes have memory limits. Cloudflare Workers, the platform we built on, constrain per-request memory. Processing large API specs with traditional AST approaches exceeds those limits.
Context trees keep overhead manageable. Shadow context via WeakMaps. Transient frames during traversal. Format-aware semantic boundaries. The same parsing, analysis, and transformation capabilities—at a fraction of the memory cost.
The JSON you parsed is the intermediate representation. Everything else is just context.
