When AI needs to understand an API operation, what format should the interface be in?
The obvious answer is JSON Schema. It’s what OpenAPI uses. It’s the standard for describing JSON structures. Most LLM tool-calling systems use it directly.
We generate TypeScript types instead. Here’s why.
LLMs are trained on TypeScript
Language models learn from training data. The training data contains vastly more TypeScript than JSON Schema.
GitHub alone has millions of TypeScript files. Every npm package with types. Every TypeScript tutorial. Every Stack Overflow answer. LLMs have seen TypeScript interfaces thousands of times more often than JSON Schema definitions.
This matters because LLMs interpret based on pattern recognition. Formats they’ve seen more often are formats they understand better. TypeScript is more native to how LLMs think.
Token efficiency
Consider the same structure in both formats:
JSON Schema (67 tokens):

```json
{
  "type": "object",
  "properties": {
    "email": { "type": "string" },
    "name": { "type": "string" },
    "metadata": {
      "type": "object",
      "additionalProperties": { "type": "string" }
    }
  },
  "required": ["email"]
}
```

TypeScript (28 tokens):

```typescript
interface CreateCustomerRequest {
  email: string;
  name?: string;
  metadata?: Record<string, string>;
}
```

TypeScript uses roughly half as many tokens. This matters when context windows are limited and every token counts.
But it’s not just quantity. TypeScript marks optional properties with ? and required ones by the absence of it; JSON Schema needs a separate required array. TypeScript writes maps as Record<K, V>; JSON Schema needs nested additionalProperties. The TypeScript patterns are more compact and more familiar.
Interpretation accuracy
In our testing, LLMs more reliably produce correct arguments when given TypeScript interfaces than JSON Schema.
This isn’t surprising. TypeScript interfaces look like the code LLMs write. When an LLM sees:

```typescript
interface CreateCustomerRequest {
  email: string;
  name?: string;
}
```

it recognizes the pattern immediately. It knows email is required and name is optional. It knows what types to use. The format matches its training.
JSON Schema works, but requires more interpretation. The LLM has to parse the schema structure, understand the required array, navigate nested properties objects. It’s an extra cognitive step.
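That extra step can be made concrete: answering “is this property required?” from JSON Schema means cross-referencing two places. A small illustrative sketch (the helper names here are hypothetical, for demonstration only):

```typescript
// The indirection a JSON Schema reader (human or model) performs:
// optionality is not stated on the property itself, it lives in a
// sibling `required` array that must be consulted separately.
type JsonSchema = {
  type: string;
  properties?: Record<string, JsonSchema>;
  required?: string[];
};

function isRequired(schema: JsonSchema, prop: string): boolean {
  // One question, two places to look.
  return (schema.required ?? []).includes(prop);
}

const schema: JsonSchema = {
  type: "object",
  properties: { email: { type: "string" }, name: { type: "string" } },
  required: ["email"],
};

const emailRequired = isRequired(schema, "email"); // true
const nameRequired = isRequired(schema, "name"); // false
```

In a TypeScript interface the same facts are readable in place: email: string; name?: string;.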
The synthesis pipeline
Generating good TypeScript from JSON Schema isn’t trivial. We built a complete synthesis pipeline:
Schema extraction: Pull the relevant schemas from the operation spec, following $ref chains to include dependencies.
AST construction: Build a TypeScript AST representation of the types. This isn’t string templating—it’s proper AST manipulation.
Algebraic optimization: Simplify complex type expressions. Flatten nested unions. Eliminate impossible types. Normalize ordering for consistency.
Code emission: Render clean, readable TypeScript with proper formatting and JSDoc comments.
The result is types a human would write, not mechanical transliteration. Clean interfaces, sensible names, consistent style.
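To make the shape of the problem concrete, here is a deliberately naive sketch of the emission step: a hypothetical emitter that turns a small JSON Schema into an interface string. The real pipeline works over a proper AST, resolves $ref chains, and optimizes types; none of that is shown here.

```typescript
// Naive sketch only: handles string/number/boolean plus string maps.
// No $ref resolution, no union flattening, no JSDoc emission.
type Schema = {
  type: string;
  properties?: Record<string, Schema>;
  additionalProperties?: Schema;
  required?: string[];
};

function toTsType(s: Schema): string {
  if (s.type === "object" && s.additionalProperties) {
    return `Record<string, ${toTsType(s.additionalProperties)}>`;
  }
  return s.type; // string, number, boolean map directly
}

function emitInterface(name: string, s: Schema): string {
  const req = new Set(s.required ?? []);
  const lines = Object.entries(s.properties ?? {}).map(
    ([prop, sub]) => `  ${prop}${req.has(prop) ? "" : "?"}: ${toTsType(sub)};`
  );
  return `interface ${name} {\n${lines.join("\n")}\n}`;
}

const out = emitInterface("CreateCustomerRequest", {
  type: "object",
  properties: {
    email: { type: "string" },
    name: { type: "string" },
    metadata: { type: "object", additionalProperties: { type: "string" } },
  },
  required: ["email"],
});
```

Running this on the example schema from earlier produces the same interface shown above, with email required and name and metadata optional.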
What AI receives
When AI calls learn_api, it gets something like:
```typescript
/**
 * Create a customer
 *
 * Creates a new customer object.
 */
interface CreateCustomerRequest {
  body: {
    /** Customer's email address */
    email: string;

    /** Customer's full name */
    name?: string;

    /** Set of key-value pairs for additional information */
    metadata?: Record<string, string>;

    /** ID of the payment method to attach */
    payment_method?: string;
  };
}
```

The interface includes:
- Operation description
- Parameter names and types
- Required vs. optional distinction
- JSDoc comments from the API spec
- Nested structures fully expanded
AI reads this and knows exactly how to construct valid arguments.
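Concretely, constructing valid arguments is then just writing an object literal the compiler would accept. A sketch (the field values here are made up):

```typescript
// Minimal copy of the interface shown above, so this snippet stands alone.
interface CreateCustomerRequest {
  body: {
    email: string;
    name?: string;
    metadata?: Record<string, string>;
    payment_method?: string;
  };
}

// Arguments an LLM might construct from the interface: the required
// field is present, optional fields appear only when relevant.
const args: CreateCustomerRequest = {
  body: {
    email: "ada@example.com",
    metadata: { plan: "pro" },
  },
};
```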
Response types
By default, we only send request types. Response types are optional because AI often doesn’t need them—it just needs to make the call.
But when AI is processing results for downstream use, it can request response types:
```typescript
interface Customer {
  id: string;
  email: string | null;
  name: string | null;
  created: number;
  metadata: Record<string, string>;
}
```

This enables AI to understand what it will receive and plan accordingly.
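For example, seeing email typed as string | null tells the model up front that the null case must be handled rather than discovered at runtime. A hedged sketch of downstream code (the function is hypothetical, not part of the product):

```typescript
// Minimal copy of the response type above, so this snippet stands alone.
interface Customer {
  id: string;
  email: string | null;
  name: string | null;
  created: number;
  metadata: Record<string, string>;
}

// Because the type declares email as nullable, a downstream step
// can plan for the missing-email case explicitly.
function contactLine(c: Customer): string {
  return c.email ?? `customer ${c.id} (no email on file)`;
}
```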
The lesson
The formats we choose for AI communication matter. JSON Schema is standard but not optimal. TypeScript is more familiar, more compact, and more reliably interpreted.
This is a small example of a broader principle: design for how LLMs actually work, not for what seems theoretically correct. Training data matters. Token efficiency matters. Pattern recognition matters.
When you’re building infrastructure for AI, these details compound.
