After discovering an operation, AI needs to understand how to call it. learn_api synthesizes a complete interface—types, parameter structures, body schemas—in a format optimized for LLM comprehension. This isn’t just type conversion. It’s a multi-stage pipeline that tree-shakes massive specs down to minimal dependencies, transforms schemas into shapes LLMs can produce reliably, and renders optimized TypeScript that looks like what a human would write.
Low-level LLM tool calling uses JSON Schema directly. We could do the same—we have tree-shaken JSON Schemas for every operation. But we synthesize TypeScript types instead.
The reason: LLMs are trained on vastly more TypeScript than JSON Schema. TypeScript types are more native to how LLMs think.
Compare:
JSON Schema:
{ "type": "object", "properties": { "email": { "type": "string", "description": "Customer email" }, "name": { "type": "string", "description": "Customer name" }, "metadata": { "type": "object", "additionalProperties": { "type": "string" } } }, "required": ["email"]}TypeScript:
```typescript
interface CreateCustomerRequest {
  /** Customer email */
  email: string;

  /** Customer name */
  name?: string;

  /** Arbitrary key-value pairs */
  metadata?: Record<string, string>;
}
```

The TypeScript version uses roughly half as many tokens, is more reliably interpreted by LLMs, expresses the same constraints more clearly, and matches how LLMs naturally think about structure.
But the advantage goes deeper than format. The synthesis pipeline includes an optimizer that transforms complex schemas into cleaner types—flattening nested unions, eliminating redundant constraints, normalizing type ordering. The result is clean, minimal types that LLMs interpret correctly.
Type synthesis happens in stages: tree-shake the spec down to the operation’s dependencies, generate a type AST, optimize the AST, and render TypeScript.
Each stage is driven by the spec as data. The spec is the source of truth.
OpenAPI specs can be massive. Stripe’s defines hundreds of schemas. GitHub’s defines thousands. The full spec might be 8MB.
When an API is indexed, each operation’s spec is tree-shaken: starting from the operation itself, the pipeline follows every schema reference and keeps only what the operation can actually reach.
An 8MB spec becomes 5KB. The tree-shaken operation spec contains exactly what’s needed—nothing more.
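As a sketch of the idea, tree-shaking is a transitive reachability walk over $ref links. The types and function names below are illustrative, not Toolcog’s actual code:

```typescript
// Illustrative sketch: keep only the schemas an operation can reach.
type Spec = { components?: { schemas?: Record<string, unknown> } };

// Recursively collect every $ref name appearing under a node.
function collectRefs(node: unknown, refs: Set<string>): void {
  if (Array.isArray(node)) {
    node.forEach((item) => collectRefs(item, refs));
  } else if (node && typeof node === "object") {
    for (const [key, value] of Object.entries(node)) {
      if (key === "$ref" && typeof value === "string") {
        refs.add(value.replace("#/components/schemas/", ""));
      } else {
        collectRefs(value, refs);
      }
    }
  }
}

function treeShake(spec: Spec, operation: unknown): Spec {
  const reachable = new Set<string>();
  const pending = new Set<string>();
  collectRefs(operation, pending);
  // Follow $refs transitively until no new schemas appear.
  while (pending.size > 0) {
    const [name] = pending;
    pending.delete(name);
    if (reachable.has(name)) continue;
    reachable.add(name);
    collectRefs(spec.components?.schemas?.[name], pending);
  }
  // Retain only the reachable schemas.
  const schemas: Record<string, unknown> = {};
  for (const name of reachable) {
    const schema = spec.components?.schemas?.[name];
    if (schema !== undefined) schemas[name] = schema;
  }
  return { components: { schemas } };
}
```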
The type generator walks the schema and produces a type AST—an intermediate representation that captures the structure without committing to a specific output format. The AST represents primitives, objects, arrays, unions, intersections, literals, and records as typed nodes.
This intermediate form enables optimization. You can’t easily optimize raw TypeScript source, and you can’t optimize JSON Schema without understanding its full semantics across multiple dialects. The type AST provides a uniform representation that optimization passes can analyze and transform.
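One plausible shape for that AST is a discriminated union with one node kind per construct; the names below are assumptions for illustration:

```typescript
// Illustrative type AST: one node kind per construct listed above.
type TypeNode =
  | { kind: "primitive"; name: "string" | "number" | "boolean" | "null" }
  | { kind: "literal"; value: string | number | boolean }
  | { kind: "object"; properties: PropertyNode[] }
  | { kind: "array"; element: TypeNode }
  | { kind: "union"; members: TypeNode[] }
  | { kind: "intersection"; members: TypeNode[] }
  | { kind: "record"; value: TypeNode }
  | { kind: "ref"; name: string };

interface PropertyNode {
  name: string;
  type: TypeNode;
  optional: boolean;
  description?: string; // rendered as a JSDoc comment
}
```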
The optimizer runs three passes in sequence until the type reaches a fixed point—until no pass changes anything:
Expand — Inline type references to a controlled depth. A deeply nested reference like $ref: "#/components/schemas/Address" gets inlined so the LLM sees the actual structure. But we don’t inline everything—that would explode type size. The inlining depth balances comprehension against context efficiency.
Reduce — Simplify type expressions. Flatten nested unions ((A | B) | C → A | B | C). Eliminate redundant constraints. Remove never branches from unions. Collapse single-element unions. Deduplicate union members. The goal is algebraic simplification—same semantics, simpler structure.
Normalize — Establish consistent ordering. Union members sort by structural complexity. Object properties maintain declaration order. The same type always produces the same output, regardless of how the schema expressed it.
The passes repeat until nothing changes. Complex schemas might take several iterations to fully simplify.
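The driver is a standard fixed-point loop. A minimal sketch over the AST assumed above, with one Reduce rule shown concretely and the other passes stubbed as identities:

```typescript
// Expand and Normalize are elided; identity stubs keep the sketch runnable.
const expand = (node: TypeNode): TypeNode => node;
const normalize = (node: TypeNode): TypeNode => node;

// One Reduce rule: flatten nested unions ((A | B) | C -> A | B | C)
// and collapse single-member unions.
function reduce(node: TypeNode): TypeNode {
  if (node.kind !== "union") return node;
  const members = node.members
    .map(reduce)
    .flatMap((m) => (m.kind === "union" ? m.members : [m]));
  return members.length === 1 ? members[0] : { kind: "union", members };
}

// Run the passes until no pass changes the type.
function optimize(node: TypeNode): TypeNode {
  let current = node;
  for (;;) {
    const next = normalize(reduce(expand(current)));
    // Structural equality as a simple convergence check.
    if (JSON.stringify(next) === JSON.stringify(current)) return next;
    current = next;
  }
}
```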
The renderer takes the optimized AST and outputs TypeScript. It handles formatting—JSDoc comments for descriptions, proper indentation, type aliases for named schemas. The output looks like hand-written TypeScript, not mechanical transliteration.
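A recursive walk over the sketched AST suggests the renderer’s shape; indentation and JSDoc handling are simplified here for illustration:

```typescript
// Illustrative renderer: type AST in, TypeScript source out.
function render(node: TypeNode): string {
  switch (node.kind) {
    case "primitive": return node.name;
    case "literal": return JSON.stringify(node.value);
    case "array": return `${render(node.element)}[]`;
    case "union": return node.members.map(render).join(" | ");
    case "intersection": return node.members.map(render).join(" & ");
    case "record": return `Record<string, ${render(node.value)}>`;
    case "ref": return node.name;
    case "object": {
      const props = node.properties.map((p) => {
        const doc = p.description ? `  /** ${p.description} */\n` : "";
        return `${doc}  ${p.name}${p.optional ? "?" : ""}: ${render(p.type)};`;
      });
      return `{\n${props.join("\n")}\n}`;
    }
  }
}
```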
The API schema describes what goes over the wire. But that’s not always what an LLM should produce. File uploads expect binary streams. Multipart encoding has structural requirements invisible in the schema. Some content types have conventions the schema doesn’t capture.
The system maintains two views of every interface: the API schema that defines the wire format, and the semantic schema that the LLM actually sees. When these differ, transformation bridges the gap—in both directions.
Consider file uploads with multipart encoding:
{ "type": "object", "properties": { "file": { "type": "string", "format": "binary" } }}An LLM can’t produce binary data. The API schema describes what the HTTP request will contain, but that’s not what the LLM needs to produce.
The semantic schema transforms this into something the model can work with:
```typescript
interface UploadRequest {
  file: {
    $content: string;       // base64-encoded content
    $filename?: string;     // original filename
    $contentType?: string;  // MIME type
  };
}
```

The LLM provides data it can actually generate. When the LLM invokes the operation, the value transformation runs in reverse—base64 decoding, MIME type inference, multipart boundary construction—producing what the API expects from what the LLM provided.
This bidirectionality is essential. Schema transformation defines what the LLM sees; value transformation defines what gets sent. They’re two halves of the same contract. Without both, the system breaks—either the LLM can’t produce valid input, or the execution engine can’t interpret what the LLM produces.
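To make the file-upload contract concrete, here is a minimal sketch of the value half, assuming the $-prefixed semantic fields shown above; the function name and wire-value shape are hypothetical:

```typescript
// Semantic view: what the LLM produces (from the schema above).
interface SemanticFile {
  $content: string;       // base64-encoded content
  $filename?: string;     // original filename
  $contentType?: string;  // MIME type
}

// Wire view: what the multipart encoder needs for the file part.
interface WireFile {
  data: Uint8Array;
  filename: string;
  contentType: string;
}

// Value transformation, LLM -> API: decode base64 and fill defaults.
function toWireFile(file: SemanticFile): WireFile {
  return {
    data: Uint8Array.from(atob(file.$content), (c) => c.charCodeAt(0)),
    filename: file.$filename ?? "upload",
    contentType: file.$contentType ?? "application/octet-stream",
  };
}
```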
Content negotiation determines which transformation applies. The content type declared in the operation—application/json, multipart/form-data, application/x-www-form-urlencoded—selects the encoding, and the encoding defines both how to present the schema and how to encode values. JSON needs no transformation; the semantic schema matches the API schema. Multipart needs the domain property transformation shown above. Other content types have their own requirements.
This keeps the execution engine generic. It doesn’t need special cases for each encoding pattern—the transformation logic lives with the encoding that needs it, selected by content type at runtime.
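A plausible structure, sketched with an assumed Encoding interface keyed by content type:

```typescript
// Illustrative encoding registry; the interface and registry shape
// are assumptions, not Toolcog's actual API.
interface Encoding {
  /** How to present the schema to the LLM (the semantic view). */
  presentSchema(apiSchema: unknown): unknown;
  /** How to encode the LLM's value into the wire format. */
  encodeValue(value: unknown): string | FormData;
}

const encodings: Record<string, Encoding> = {
  // JSON: semantic schema matches the API schema; no transformation.
  "application/json": {
    presentSchema: (schema) => schema,
    encodeValue: (value) => JSON.stringify(value),
  },
  // Multipart: the schema gains the $content/$filename view, and
  // values are decoded and packed behind a multipart boundary.
  "multipart/form-data": {
    presentSchema: (schema) => schema, // stub: would rewrite binary fields
    encodeValue: (value) => {
      const form = new FormData();
      for (const [key, val] of Object.entries(value as Record<string, unknown>)) {
        form.append(key, String(val)); // stub: would build real file parts
      }
      return form;
    },
  },
};

// The execution engine stays generic: it just looks up the encoding.
function selectEncoding(contentType: string): Encoding {
  const encoding = encodings[contentType];
  if (!encoding) throw new Error(`Unsupported content type: ${contentType}`);
  return encoding;
}
```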
When AI calls learn_api, it gets TypeScript declarations:
```typescript
/**
 * Create a new customer
 */
interface CreateCustomerRequest {
  /**
   * Customer's email address
   */
  email: string;

  /**
   * Customer's full name
   */
  name?: string;

  /**
   * Arbitrary key-value pairs for storing additional information
   */
  metadata?: Record<string, string>;

  /**
   * Customer's payment source (card token or source ID)
   */
  source?: string;
}
```

The interface includes field descriptions as JSDoc comments, ? markers for optional parameters, and native TypeScript constructs like Record<string, string> for open-ended maps.
Toolcog synthesizes more than types. For each operation, the system produces:
| Component | Purpose |
|---|---|
| Request schema | Complete input structure for LLM tool use |
| Request template | Parameter expansion rules (path, query, header, cookie) |
| Content encoders | Body serialization (JSON, form, multipart, binary) |
| Content decoders | Response parsing |
| Response schema | Output structure |
OpenAPI has a notoriously complex encoding model—multiple parameter styles, explode modes, deep object serialization, multipart with per-field encoding. Toolcog implements the full model. AI provides values; the system handles the HTTP details correctly.
This is all data, not generated code. The templates and encoders are data structures interpreted at runtime—Tools as Data.
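For illustration, a request template might look like the data structure below, with a generic interpreter expanding it at call time; the field names are assumptions, not Toolcog’s actual format:

```typescript
// Illustrative request template: pure data, interpreted at call time.
interface RequestTemplate {
  method: "GET" | "POST" | "PUT" | "PATCH" | "DELETE";
  path: string; // e.g. "/v1/customers/{id}"
  parameters: Array<{
    name: string;
    location: "path" | "query" | "header" | "cookie";
    style?: "simple" | "form" | "deepObject";
    explode?: boolean;
  }>;
  contentType?: string; // selects the body encoder
}

// A generic interpreter expands path parameters against AI's values.
function expandPath(template: RequestTemplate, values: Record<string, string>): string {
  return template.path.replace(
    /\{(\w+)\}/g,
    (_, name: string) => encodeURIComponent(values[name] ?? "")
  );
}
```

Given { id: "cus_123" }, expandPath turns /v1/customers/{id} into /v1/customers/cus_123; query, header, and cookie expansion follow the same data-driven pattern.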
By default, learn_api returns request types. Response types are optional:
```typescript
// Request types only (default)
learn_api({ operation: "stripe/customers.create" });

// Include response types
learn_api({ operation: "stripe/customers.create", response: true });
```

AI typically skips response types unless it needs to understand the output for downstream processing. This keeps context usage minimal—types are generated on demand, only what’s needed.
When AI uses Toolcog to call an API:
1. find_api locates the operation by intent
2. learn_api returns the interface in LLM-native format
3. call_api receives AI’s parameters

Behind the scenes, call_api expands the request template, encodes the body for the declared content type, executes the HTTP request, and decodes the response.
AI sees clean TypeScript interfaces. The machinery underneath handles the HTTP details.