Most OpenAPI tooling takes a service-centric view. You generate a client library. You import the SDK. You call methods. The entire API surface ships as a unit because that’s how SDKs work.
But we’re not building an SDK. We’re building infrastructure for AI agents to discover and call API operations dynamically. And LLM tools are singular—one operation at a time, loaded on demand, discarded after use.
The unit isn’t the service. It’s the operation.
This changes everything about how you process specs. You don’t want to parse an 8MB Stripe spec every time an agent needs to create a customer. You want to extract just createCustomer with exactly the schemas, parameters, and security definitions it requires. Nothing else.
Why extraction is hard
Last month we built tree-shaking for JSON Schema. Given a schema, extract only what it references—no unused definitions, no bloat. One structural challenge: following $ref pointers and collecting schemas into $defs. One component type, one destination.
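Here's a minimal sketch of that single-bucket case in TypeScript. The helper names (`shake`, `resolvePointer`) are illustrative, not our actual API:

```ts
// Collect every schema reachable from `root` via local "$ref" pointers
// and gather them into one $defs bucket. Illustrative sketch only.
type Json = null | boolean | number | string | Json[] | { [k: string]: Json };

function resolvePointer(doc: Json, pointer: string): Json {
  // "#/$defs/User" -> walk segments ["$defs", "User"]
  return pointer
    .slice(2) // drop "#/"
    .split("/")
    .reduce((node: any, seg) => node[seg.replace(/~1/g, "/").replace(/~0/g, "~")], doc);
}

function shake(doc: Json, root: Json): Record<string, Json> {
  const defs: Record<string, Json> = {};
  const visit = (node: Json): void => {
    if (Array.isArray(node)) { node.forEach(visit); return; }
    if (node && typeof node === "object") {
      const ref = (node as any).$ref;
      if (typeof ref === "string" && ref.startsWith("#/$defs/")) {
        const name = ref.slice("#/$defs/".length);
        if (!(name in defs)) {
          defs[name] = resolvePointer(doc, ref);
          visit(defs[name]); // follow transitive references
        }
      }
      for (const v of Object.values(node)) visit(v);
    }
  };
  visit(root);
  return defs; // every reachable definition, nothing else
}
```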
OpenAPI has ten component types:
- schemas
- parameters
- responses
- examples
- requestBodies
- headers
- securitySchemes
- links
- callbacks
- pathItems
Each extracted component must land in the correct bucket. A parameter reference goes to components/parameters. A response reference goes to components/responses. Extraction needs to understand OpenAPI’s component taxonomy and organize output accordingly.
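In code, the taxonomy reduces to routing each reference into its bucket. A sketch, assuming local references of the form `#/components/<type>/<name>` (the `classifyRef` helper is an illustrative name):

```ts
// The ten OpenAPI component buckets, keyed by the path segment that
// appears in a local reference like "#/components/parameters/PageSize".
const COMPONENT_BUCKETS = [
  "schemas", "parameters", "responses", "examples", "requestBodies",
  "headers", "securitySchemes", "links", "callbacks", "pathItems",
] as const;

type Bucket = (typeof COMPONENT_BUCKETS)[number];

// Decide which bucket a $ref belongs to, so the extracted component
// lands under the right key in the output.
function classifyRef(ref: string): { bucket: Bucket; name: string } | null {
  const match = ref.match(/^#\/components\/([^/]+)\/(.+)$/);
  if (!match) return null; // external or non-component reference
  const [, segment, name] = match;
  return (COMPONENT_BUCKETS as readonly string[]).includes(segment)
    ? { bucket: segment as Bucket, name }
    : null;
}

// classifyRef("#/components/responses/NotFound")
//   -> { bucket: "responses", name: "NotFound" }
```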
But it gets worse.
Two layers
OpenAPI embeds JSON Schema but isn’t JSON Schema. An operation contains parameters. Parameters contain schemas. When extraction hits a schema, you’re in a different semantic domain with different reference patterns.
Extraction must cross this boundary cleanly. Follow OpenAPI references (parameters, responses, security schemes) at the OpenAPI layer. When you reach a schema, delegate to schema extraction. Follow schema references ($ref to $defs or other schemas). Return to OpenAPI when done.
Two reference systems, two extraction logics, one unified output. The layers must compose without coupling.
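To make the composition concrete, here's a hedged sketch of the handoff. The walker and collector names are hypothetical; the point is where the boundary crossing happens:

```ts
// Two extraction logics, composed at the boundary. The OpenAPI walker
// owns parameters, responses, and security schemes; the moment it hits
// a `schema` keyword, it hands off to the JSON Schema walker and
// resumes when that walker returns.
interface Collector {
  addOpenApiRef(ref: string): void; // e.g. "#/components/parameters/PageSize"
  addSchemaRef(ref: string): void;  // e.g. "#/components/schemas/User"
}

function walkOperation(op: any, out: Collector): void {
  // OpenAPI layer: parameters are either $refs or inline objects.
  for (const param of op.parameters ?? []) {
    if (param.$ref) out.addOpenApiRef(param.$ref);        // stay in OpenAPI land
    else if (param.schema) walkSchema(param.schema, out); // cross the boundary
  }
  // Request bodies hold media types, each with a schema.
  const content = op.requestBody?.content ?? {};
  for (const media of Object.values<any>(content)) {
    if (media.schema) walkSchema(media.schema, out); // delegate, then return
  }
}

function walkSchema(schema: any, out: Collector): void {
  // JSON Schema layer: different reference pattern, different logic.
  if (schema.$ref) { out.addSchemaRef(schema.$ref); return; }
  for (const sub of Object.values(schema)) {
    if (sub && typeof sub === "object") walkSchema(sub, out);
  }
}
```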
Implicit references
Here’s what makes extraction genuinely tricky: not all references are $ref.
Consider an operation with tags:
```yaml
paths:
  /users:
    post:
      tags:
        - Users
      operationId: createUser
```

Those tag names reference tag definitions elsewhere in the document:
```yaml
tags:
  - name: Users
    description: User management operations
```

No $ref. Just a string that implicitly references another part of the spec. If you extract the operation, you need to extract the tag definition too.
Security requirements work the same way:
```yaml
paths:
  /users:
    post:
      security:
        - bearerAuth: []
```

That bearerAuth key references a security scheme definition:
```yaml
components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
```

Again, no $ref. An implicit reference through a name that must be tracked and followed during extraction.
We use the same reference-tracking infrastructure for both explicit and implicit references. When we parse a security requirement, we register it as a reference to the corresponding security scheme. When we parse tag names, we register references to the tag definitions. The extraction machinery doesn’t care whether the reference was explicit or implicit—it follows them all.
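Roughly, that normalization looks like this. The registry interface is illustrative, and the tag pointer form is a stand-in (tags resolve by name, not by array index):

```ts
// Implicit references are normalized into the same shape as explicit
// $refs at parse time, so the extraction machinery sees one kind of edge.
type TrackedRef = { pointer: string; implicit: boolean };

class RefRegistry {
  readonly refs: TrackedRef[] = [];
  register(pointer: string, implicit = false): void {
    this.refs.push({ pointer, implicit });
  }
}

function parseOperation(op: any, refs: RefRegistry): void {
  // Explicit: a parameter that is just a $ref.
  for (const p of op.parameters ?? []) {
    if (p.$ref) refs.register(p.$ref);
  }
  // Implicit: each key in a security requirement names a security scheme.
  for (const requirement of op.security ?? []) {
    for (const schemeName of Object.keys(requirement)) {
      refs.register(`#/components/securitySchemes/${schemeName}`, true);
    }
  }
  // Implicit: each tag string names a top-level tag definition.
  for (const tag of op.tags ?? []) {
    refs.register(`#/tags/${tag}`, true); // stand-in pointer, resolved by name
  }
}
```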
Inheritance
OpenAPI has inheritance. Parameters declared on a path item apply to all operations within it. Servers declared at the root apply to all paths unless overridden. Security requirements cascade similarly.
When you extract an operation, you need its inherited components too—not just what’s explicitly declared on the operation itself. A path-level parameter is just as much a dependency as an operation-level one.
This is where the compiler architecture pays off. Our traversers already handle inheritance. When you traverse an operation, you traverse its inherited parameters, inherited servers, inherited security requirements. The inheritance resolution is baked into traversal.
The tree shaker builds on the traverser. It doesn’t know about inheritance. It just follows what the traverser visits. And the traverser visits everything reachable—explicit or inherited.
Layered architecture means inheritance complexity lives in one place. Extraction builds on top without reimplementing that logic.
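For concreteness, here's approximately what one inheritance rule looks like inside the traverser: operation-level parameters override path-level ones that share a name and location. A simplified sketch:

```ts
interface Parameter { name: string; in: string; [key: string]: unknown }

// Path-level parameters apply to every operation under the path item
// unless the operation redeclares them (same name and same location).
function effectiveParameters(
  pathLevel: Parameter[] = [],
  opLevel: Parameter[] = [],
): Parameter[] {
  const key = (p: Parameter) => `${p.in}:${p.name}`;
  const overridden = new Set(opLevel.map(key));
  // Inherited parameters survive only if the operation didn't override them.
  return [...opLevel, ...pathLevel.filter((p) => !overridden.has(key(p)))];
}

// The tree shaker never calls this directly: it follows whatever the
// traverser visits, and the traverser visits the merged list.
```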
Reference relocation
Extraction isn’t just filtering—it’s reorganization. References in the original spec point to their original locations:
```json
{ "$ref": "#/components/schemas/User" }
```

In the extracted output, that schema might be the only one. But we preserve the reference structure so the output remains a valid OpenAPI fragment. References get rewritten to point to their new locations in the minimal output.
This is subtle. You can’t just copy schemas verbatim—you need to track where each component lands and update all references accordingly. The same schema might be referenced from multiple places within the extracted operation. All those references must point to the single copy in the output.
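A sketch of that bookkeeping, with hypothetical names. The invariant: one landing spot per component, and every reference rewritten to point at it:

```ts
// Each extracted component gets exactly one landing spot in the output;
// every reference to it, from anywhere in the operation, is rewritten
// to that spot.
class Relocator {
  // original pointer -> new pointer in the minimal output
  private landed = new Map<string, string>();

  land(originalRef: string, bucket: string, name: string): string {
    const existing = this.landed.get(originalRef);
    if (existing) return existing; // second reference, same single copy
    const newRef = `#/components/${bucket}/${name}`;
    this.landed.set(originalRef, newRef);
    return newRef;
  }

  // Walk the extracted fragment and rewrite every $ref in place.
  rewrite(node: any): void {
    if (Array.isArray(node)) { node.forEach((n) => this.rewrite(n)); return; }
    if (node && typeof node === "object") {
      if (typeof node.$ref === "string" && this.landed.has(node.$ref)) {
        node.$ref = this.landed.get(node.$ref)!;
      }
      for (const v of Object.values(node)) this.rewrite(v);
    }
  }
}
```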
The result
Given an operation, extraction produces a minimal, self-contained spec fragment:
```json
{
  "components": {
    "schemas": {
      "User": {...},
      "CreateUserRequest": {...}
    },
    "securitySchemes": {
      "bearerAuth": {...}
    }
  },
  "tags": [{ "name": "Users", "description": "..." }]
}
```

Only the schemas that operation uses. Only the security schemes it requires. Only the tags it references. An 8MB specification becomes roughly 5KB per extracted operation.
Memory constraints
We’re targeting edge deployment. Cloudflare Workers give us 128MB of memory. The largest API specs we process are around 25MB of raw JSON.
This rules out building a full AST. We process JSON as a context tree—traversing the original structure, tracking references, transforming in place. The spec is never fully materialized as an object graph. We work with the JSON directly, following pointers, collecting what we need.
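A simplified picture of that traversal style (for brevity this sketch walks a parsed value; the real traversal carries more context and avoids materializing the full graph):

```ts
// No AST node per JSON value: the walk carries only a lightweight
// context, here just the current JSON Pointer. Values are visited in
// place and nothing is copied unless extraction demands it.
type Visitor = (pointer: string, value: unknown) => boolean; // false = skip subtree

function walk(value: unknown, visit: Visitor, pointer = "#"): void {
  if (!visit(pointer, value)) return;
  if (Array.isArray(value)) {
    value.forEach((v, i) => walk(v, visit, `${pointer}/${i}`));
  } else if (value && typeof value === "object") {
    for (const [k, v] of Object.entries(value)) {
      const seg = k.replace(/~/g, "~0").replace(/\//g, "~1"); // JSON Pointer escaping
      walk(v, visit, `${pointer}/${seg}`);
    }
  }
}

// Usage: collect $ref targets without copying anything else.
declare const spec: unknown; // the loaded document
const refs: string[] = [];
walk(spec, (ptr, v) => {
  if (ptr.endsWith("/$ref") && typeof v === "string") refs.push(v);
  return true; // keep descending
});
```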
Extracting a single operation from a 25MB spec completes in milliseconds with memory headroom to spare. That’s the difference between viable edge infrastructure and “works on a server with 8GB RAM.”
Why this matters
The operation becomes a self-contained unit. Load it on demand. Discard it when done. Scale to thousands of APIs without loading thousands of specs.
This is the foundation for dynamic API discovery. You can’t give AI agents access to every API if accessing an API means loading its entire specification. But if each operation is independently extractable, suddenly the scale problem disappears.
Operations, not services. That’s the unit that makes AI tool use practical at scale.
