Changelog


Type Generation and Intent Indexing

We’re building out the new approach: fixed meta-tools, unlimited operations, interfaces presented as TypeScript. This month: the type generator and the semantic search infrastructure.

TypeScript Type Generation

The schema compiler generates TypeScript declarations from JSON Schema—not mechanical transliteration, but the types a human would write. Discriminated unions from oneOf with discriminator properties. Elimination of impossible intersections from allOf combinations. Proper handling of optional properties vs. nullable types. Clean inline types for simple schemas, named references for complex ones.

Type AST Optimization

The generator operates on an intermediate AST before emitting TypeScript. Three passes run to a fixed point: Expand inlines type references so LLMs see actual structure; Reduce simplifies expressions—flattens nested unions, eliminates redundant constraints, removes never branches, deduplicates members; Normalize establishes consistent ordering for deterministic output.

Naive schema-to-type mapping produces technically correct but practically useless types. The optimization pipeline produces what a human would write: {name: string} | null instead of ({name?: string} & {name: string}) & ({} | null).
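A minimal sketch of what the Reduce pass might look like, using a toy AST. The node names are invented for illustration; the real generator's types are richer:

```typescript
// Minimal hypothetical type AST: unions, the never type, and named primitives.
type TypeNode =
  | { kind: "union"; members: TypeNode[] }
  | { kind: "never" }
  | { kind: "prim"; name: string };

// One Reduce step: flatten nested unions, drop `never` branches,
// deduplicate members, and collapse single-member unions.
function reduce(node: TypeNode): TypeNode {
  if (node.kind !== "union") return node;
  const flat: TypeNode[] = [];
  for (const m of node.members.map(reduce)) {
    if (m.kind === "never") continue;                 // never | T  →  T
    if (m.kind === "union") flat.push(...m.members);  // (A | B) | C  →  A | B | C
    else flat.push(m);
  }
  // Deduplicate by structural identity.
  const seen = new Set<string>();
  const uniq = flat.filter(m => {
    const key = JSON.stringify(m);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
  if (uniq.length === 0) return { kind: "never" };
  if (uniq.length === 1) return uniq[0];
  return { kind: "union", members: uniq };
}
```

Running passes like this repeatedly until the tree stops changing is what "to a fixed point" means in practice.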

Keyword-Driven Type Generation

The keyword-centric architecture from December pays off. Adding type generation required no changes to core schema processing. Each keyword learned to generate its own TypeScript representation—type emits primitives, properties emits object members, allOf emits intersections. The generator traverses schemas exactly like validation does, but each keyword contributes types instead of errors.
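A hedged sketch of the keyword-driven idea: each keyword registers its own emitter, and the traversal walks them the way validation does. All names here are invented for illustration:

```typescript
// Hypothetical: each JSON Schema keyword contributes its own TypeScript text.
type Schema = Record<string, any>;
type KeywordEmitter = (value: any, emit: (s: Schema) => string) => string;

const emitters: Record<string, KeywordEmitter> = {
  // `properties` emits object members (required-ness omitted for brevity).
  properties: (v, emit) =>
    "{ " + Object.entries(v).map(([k, s]) => `${k}: ${emit(s as Schema)}`).join("; ") + " }",
  // `allOf` emits intersections.
  allOf: (v, emit) => (v as Schema[]).map(emit).join(" & "),
  // `type` emits primitives.
  type: v =>
    ({ string: "string", number: "number", integer: "number", boolean: "boolean" } as Record<string, string>)[v] ?? "unknown",
};

// The traversal mirrors validation: walk keywords, take the first contribution.
function emitType(schema: Schema): string {
  for (const [kw, fn] of Object.entries(emitters)) {
    if (kw in schema) return fn(schema[kw], emitType);
  }
  return "unknown";
}
```

The point of the architecture is visible even at this scale: adding type generation means adding emitters, not touching the traversal.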

Intent Phrase Generation

Operations indexed by intent—what they accomplish, not just their names. “Send an email” finds the right operation across different APIs even if they call it different things. A frontier model sees each operation’s synthesized TypeScript interface, documentation, and related operations, then generates natural language phrases capturing distinct reasons an agent might search for that operation. Semantic distillation: expensive comprehension amortized into phrases optimized for fast similarity matching.

Vector Indexing

Intent phrases embed into Cloudflare Vectorize for semantic similarity search. Query deduplication retrieves the highest-scoring intent per operation. Metadata indexing covers service name, operation name, intent ID, and phrase.
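The per-operation deduplication can be sketched as follows; the record shape is a hypothetical simplification of the actual match metadata:

```typescript
// A vector match: one intent phrase hit with its similarity score.
interface Match {
  operation: string; // e.g. "gmail.messages.send" (hypothetical name)
  intentId: string;
  score: number;
}

// Keep only the highest-scoring intent per operation, returned best-first.
function dedupeByOperation(matches: Match[]): Match[] {
  const best = new Map<string, Match>();
  for (const m of matches) {
    const cur = best.get(m.operation);
    if (!cur || m.score > cur.score) best.set(m.operation, m);
  }
  return [...best.values()].sort((a, b) => b.score - a.score);
}
```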

Batch Processing

OpenAI Batch API for cost-efficient intent generation. Two-stage pipeline: HTTP endpoint validates and enqueues, Durable Object processes in chunks. Lifecycle tracking with scheduled polling and exponential backoff.
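A sketch of the chunking and backoff math, with invented parameters (the actual chunk sizes and polling intervals are not specified here):

```typescript
// Split enqueued requests into fixed-size chunks for the worker to process.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Hypothetical polling schedule: exponential backoff with a ceiling.
// Returns the delay (ms) before poll attempt `n` (0-based).
function backoffDelay(n: number, baseMs = 30_000, maxMs = 15 * 60_000): number {
  return Math.min(baseMs * 2 ** n, maxMs);
}
```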

Operation Metadata

Per-operation metadata indexed for efficient retrieval: method, path, servers, authentication schemes, tags. TypeScript signature generation per operation. Full-text search via FTS5.

Execution Pipeline

A different approach. Instead of one tool per operation with vendor-specific schema normalization, use fixed meta-tools that can call any operation. No schema normalization per vendor. No dynamic tool registration. LLMs learn interfaces on demand through the tools themselves.

Pipeline Integration

The pieces from previous months—request templates, parameter encoding, credential application, response decoding—compose into a single execution flow. Validate parameters against their schemas. Expand URI templates. Encode the request body. Apply credentials per security scheme. Execute. Decode the response. Return structured results.

Each stage is spec-driven. The OpenAPI spec controls encoding. The security scheme controls authentication. User input flows through designated channels—path parameters into path templates, query parameters into query strings, credentials into their declared locations. Nothing leaks across boundaries.
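The composition described above can be sketched as a chain of stages. This is an illustrative reduction, not the actual implementation; the stage bodies are stand-ins:

```typescript
// Hypothetical shape of the execution flow; stage names follow the text.
interface ExecRequest { url: string; headers: Record<string, string>; body?: string }

type Stage = (req: ExecRequest) => ExecRequest;

// Compose stages left-to-right into a single execution flow.
const pipeline = (...stages: Stage[]): Stage =>
  req => stages.reduce((r, stage) => stage(r), req);

// Example stages: expand a URI template, then apply an API-key credential.
const expandPath = (params: Record<string, string>): Stage =>
  req => ({ ...req, url: req.url.replace(/\{(\w+)\}/g, (_, k) => encodeURIComponent(params[k])) });

const applyApiKey = (header: string, key: string): Stage =>
  req => ({ ...req, headers: { ...req.headers, [header]: key } });

const run = pipeline(expandPath({ id: "42" }), applyApiKey("X-Api-Key", "secret"));
```

Each stage only touches its designated channel, which is how the boundary guarantee falls out of the structure.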

Credential Application

Credentials bind to security schemes by name and only apply to operations that declare that scheme. API keys go where the scheme specifies—header, query, or cookie. OAuth tokens become Bearer headers. HTTP Basic credentials get base64-encoded. The LLM provides the operation name and parameters; credential resolution happens entirely outside its view.
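A hedged sketch of scheme-driven credential application, with simplified shapes. The Basic encoding follows RFC 7617; everything else is illustrative:

```typescript
// Simplified security scheme shapes, loosely following OpenAPI.
type SecurityScheme =
  | { type: "apiKey"; name: string; in: "header" | "query" | "cookie" }
  | { type: "http"; scheme: "bearer" | "basic" }
  | { type: "oauth2" };

interface Outgoing { headers: Record<string, string>; query: Record<string, string> }

function applyCredential(scheme: SecurityScheme, credential: string, req: Outgoing): Outgoing {
  const out = { headers: { ...req.headers }, query: { ...req.query } };
  switch (scheme.type) {
    case "apiKey":
      // API keys go exactly where the scheme declares.
      if (scheme.in === "header") out.headers[scheme.name] = credential;
      else if (scheme.in === "query") out.query[scheme.name] = credential;
      else out.headers["Cookie"] = `${scheme.name}=${credential}`;
      break;
    case "oauth2":
      out.headers["Authorization"] = `Bearer ${credential}`;
      break;
    case "http":
      out.headers["Authorization"] = scheme.scheme === "bearer"
        ? `Bearer ${credential}`
        // HTTP Basic: credential is "user:pass", base64-encoded per RFC 7617.
        : `Basic ${Buffer.from(credential).toString("base64")}`;
      break;
  }
  return out;
}
```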

Corpus Validation

Testing against APIs Guru’s corpus—4,100+ specs, 116,000+ operations—surfaces edge cases no synthetic test suite would find. Malformed specs that violate their declared OpenAPI version. Security schemes declared but never referenced. Parameters with content instead of schema. The corpus is the test suite.

Content Pipeline

Two efforts in parallel: schema normalization for native LLM tools, and completing the content pipeline with response decoding.

Schema Normalization

Native LLM tool generation requires transforming JSON Schemas into whatever subset each vendor accepts. No vendor documents which keywords they support. No two vendors support the same subset.

We’re building normalizers for each vendor’s quirks. Every hundred operations surfaces another valid construct that some vendor rejects. additionalProperties works in some contexts but not others. oneOf supported by one vendor, ignored by another. The transformation pipeline keeps growing.

MCP dynamic tool updates would help—register tools on demand as agents discover operations. But no clients support it yet. We’re evaluating whether to wait for ecosystem support or find a different approach.

Response Decoding

The content pipeline needs its other half. Request encoding handles values going out; response decoding handles what comes back.

OpenAPI defines multiple possible responses per operation—success codes, error codes, ranges like 2XX, default fallback. Each response can declare multiple media types. The decoder matches status codes using OpenAPI precedence, selects media types with wildcard support, applies the appropriate decoder, and preserves the matched definitions for downstream use.

Same pluggable architecture as request encoding, inverted. The content pipeline is now symmetric.
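The status-code and media-type matching described above can be sketched like this; both functions are simplified illustrations of the precedence rules:

```typescript
// OpenAPI response matching precedence: exact status code,
// then a range key like "2XX", then "default".
function matchResponse(status: number, responses: Record<string, unknown>): string | undefined {
  const exact = String(status);
  if (exact in responses) return exact;
  const range = `${Math.floor(status / 100)}XX`;
  if (range in responses) return range;
  if ("default" in responses) return "default";
  return undefined;
}

// Media type selection with wildcard support, most specific first.
function matchMediaType(actual: string, declared: string[]): string | undefined {
  const [type] = actual.split(";"); // strip parameters like charset
  return declared.find(d => d === type)
      ?? declared.find(d => d === type.split("/")[0] + "/*")
      ?? declared.find(d => d === "*/*");
}
```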

MCP Dynamic Tools

MCP recently announced dynamic tool updates—adding and removing tools mid-session. We prototyped dynamic tool registration using the new primitives.

Dynamic Tool Prototype

Tools register on demand when agents discover operations via semantic search. The flow works: agent searches, system registers the tool, agent calls it immediately.

Client Support

No MCP clients support dynamic tool updates yet. Claude Desktop, Cursor, and other major clients use the tool list from session initialization and don’t respond to update notifications. Without client support, we’d need to pre-register tools—but a universal bridge can’t know in advance which APIs an agent will need.

Prompt DSL

Type-safe, composable prompt builder generating clean Markdown. Node types: section, list, code, yaml, json, link, image, strong, emphasis, strikethrough, thunk, block, inline. Thunks enable lazy evaluation with context awareness.
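A miniature, hypothetical version of the builder idea, covering only a few of the node kinds listed above:

```typescript
// Typed prompt nodes rendered to Markdown. Illustrative subset only.
type Node =
  | { kind: "section"; title: string; children: Node[] }
  | { kind: "list"; items: string[] }
  | { kind: "code"; lang: string; text: string }
  | { kind: "thunk"; build: () => Node }; // lazy, context-aware evaluation

function render(node: Node, depth = 1): string {
  switch (node.kind) {
    case "section":
      return `${"#".repeat(depth)} ${node.title}\n\n` +
        node.children.map(c => render(c, depth + 1)).join("\n\n");
    case "list":
      return node.items.map(i => `- ${i}`).join("\n");
    case "code": {
      const fence = "`".repeat(3);
      return fence + node.lang + "\n" + node.text + "\n" + fence;
    }
    case "thunk":
      return render(node.build(), depth); // evaluated only at render time
  }
}
```

The thunk case is where the "lazy evaluation with context awareness" lives: the node's content isn't computed until render, at the depth where it lands.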

Document Fetching

Retrieves web pages with realistic browser headers and converts HTML to Markdown. URL fragment support enables section-specific extraction. Useful for fetching the supplementary documentation that OpenAPI specs reference.

Credential Application and LLM Tool Generation

With request encoding complete, we added the authentication layer and started generating native LLM tools from OpenAPI operations.

Security Scheme Binding

Credentials apply to requests based on declared security requirements. An operation requiring OAuth gets a Bearer token; one requiring API key gets the key in its declared location (header, query, or cookie). The security scheme type determines how credentials transform into request headers or parameters.

Scheme-Bound Isolation

A credential is bound to a security scheme by name. The execution engine only applies it to operations that explicitly declare that scheme. Even with valid credentials, they cannot leak to operations that don’t request them. This is a structural guarantee, not a policy.

Server URL Templating

Server URL templates support variables with enum constraints. Variable schema generation enables validation. Server override at template creation or request encoding time supports multi-environment specifications.
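A sketch of template expansion with enum validation and overrides, assuming a simplified variable shape:

```typescript
// Simplified OpenAPI server variable: a default plus an optional enum.
interface ServerVariable { default: string; enum?: string[] }

function expandServerUrl(
  url: string,
  variables: Record<string, ServerVariable>,
  overrides: Record<string, string> = {},
): string {
  return url.replace(/\{(\w+)\}/g, (_, name) => {
    const v = variables[name];
    if (!v) throw new Error(`undeclared server variable: ${name}`);
    const value = overrides[name] ?? v.default;
    // Enum constraint doubles as validation for overrides.
    if (v.enum && !v.enum.includes(value)) {
      throw new Error(`value "${value}" not in enum for ${name}`);
    }
    return value;
  });
}
```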

LLM Tool Generation

Tool generator transforms OpenAPI operations into native LLM tool definitions. Takes an operation’s parameters and request body, generates a JSON Schema for function calling, registers tools dynamically.
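The operation-to-tool mapping might look roughly like this; field names follow common function-calling conventions, and the actual shapes are vendor-specific:

```typescript
// Simplified operation shape: parameters only, request body omitted.
interface Operation {
  operationId: string;
  description?: string;
  parameters: { name: string; required?: boolean; schema: object }[];
}

interface ToolDefinition {
  name: string;
  description: string;
  // JSON Schema for the tool's arguments, as function calling expects.
  input_schema: { type: "object"; properties: Record<string, object>; required: string[] };
}

function toToolDefinition(op: Operation): ToolDefinition {
  const properties: Record<string, object> = {};
  const required: string[] = [];
  for (const p of op.parameters) {
    properties[p.name] = p.schema;
    if (p.required) required.push(p.name);
  }
  return {
    name: op.operationId,
    description: op.description ?? "",
    input_schema: { type: "object", properties, required },
  };
}
```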

Schema Normalization

LLM vendors each implement an unspecified subset of JSON Schema for function calling. No vendor documents which keywords they support. No two vendors support the same subset. additionalProperties works in some contexts but not others. oneOf supported by one vendor, silently ignored by another. Nested $ref resolution varies.

We’re building normalizers to transform schemas into each vendor’s accepted subset. Every hundred operations in our corpus surfaces another valid construct that some vendor rejects.
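A hedged sketch of the keyword-stripping half of a normalizer; the unsupported-keyword set is per-vendor and purely illustrative here:

```typescript
type Schema = { [k: string]: any };

// Recursively drop keywords a vendor's function-calling subset rejects.
function normalizeForVendor(schema: Schema, unsupported: Set<string>): Schema {
  const out: Schema = {};
  for (const [key, value] of Object.entries(schema)) {
    if (unsupported.has(key)) continue; // drop rejected keywords
    if (Array.isArray(value)) {
      out[key] = value.map(v => (v && typeof v === "object" ? normalizeForVendor(v, unsupported) : v));
    } else if (value && typeof value === "object") {
      out[key] = normalizeForVendor(value, unsupported);
    } else {
      out[key] = value;
    }
  }
  return out;
}
```

Real normalizers also have to rewrite rather than drop (e.g. approximating oneOf), which is why the pipeline keeps growing.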

Authentication Header Normalization

Analysis of 4,138 OpenAPI specifications from the APIs Guru corpus revealed that only 39% properly declare security schemes. The remaining 61% use direct header parameters, creating challenges for generic clients and AI agents that need clean security boundaries.

Among APIs using direct auth headers, 62% use the generic Authorization header. Pattern analysis shows predictable distribution: 66% are Bearer tokens (detectable via “bearer”, “jwt”, or “token” in descriptions), 32% are OAuth 1.0 signatures, and only 3% are ambiguous. Description-based detection is sufficient.

The normalizer detects common auth headers—Authorization, X-Api-Key, Auth-Token, Access-Token, and variants—and converts them to proper security schemes. Type inference uses parameter descriptions. Merge strategies control how normalized schemes inject into operations: always, when-missing, or when-undeclared.
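A sketch of the detection and conversion step, with a trimmed header list and the description heuristic from above; details are simplified (OAuth 1.0 signature detection, merge strategies, and header variants are omitted):

```typescript
// Common auth headers, matched case-insensitively. Illustrative subset.
const AUTH_HEADERS = /^(authorization|x-api-key|auth-token|access-token)$/i;

interface Param { name: string; in: string; description?: string }

// Convert a direct auth-header parameter into a security scheme, or
// return undefined if the parameter isn't an auth header.
function toSecurityScheme(p: Param): object | undefined {
  if (p.in !== "header" || !AUTH_HEADERS.test(p.name)) return undefined;
  const desc = (p.description ?? "").toLowerCase();
  // Description-based detection: "bearer", "jwt", or "token" implies Bearer.
  if (/bearer|jwt|token/.test(desc)) {
    return { type: "http", scheme: "bearer" };
  }
  return { type: "apiKey", in: "header", name: p.name };
}
```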