OpenAI's platform surfaces a broad set of resources grouped around projects, files/uploads, assistants/threads/runs, vector stores, fine-tuning, batches, and organization-level admin objects (API keys, certificates, users). The following narrative explains how those pieces fit together, what to call first to get the IDs you will need, common multi-step workflows users request, and the non-obvious behaviors that cause mistakes.

Core entities and how they relate

  • Projects vs Organization

    • Projects are the primary scoping unit for most operational resources: project-scoped API keys, service accounts, rate limits, and project membership live under a project_id. Many management operations require a project_id. Organization-level resources include organization admin API keys and organization certificates (there are also project-level certificates).
  • Files, Uploads, and Parts

    • A large-file upload is a multi-step flow: create an Upload object (createUpload), add Parts (each ≤ 64 MB) with addUploadPart, then call completeUpload with the ordered list of Part IDs to produce a File object. Uploads expire after an hour and can be canceled with cancelUpload. For single files within the size limit, createFile produces a File directly.
    • Files are referenced by file_id across other features: fine-tuning, batches, vector stores, and assistants' file-search tool.
  • Vector stores and files

    • Create a vector store (createVectorStore) to get a vector_store_id. Add data by attaching a File via createVectorStoreFile or batch with createVectorStoreFileBatch. Use searchVectorStore to query. You can fetch parsed content with retrieveVectorStoreFileContent and update file attributes with updateVectorStoreFileAttributes.
    • Deleting a vector store file (via deleteVectorStoreFile) removes it from the store but does not delete the underlying File object; to remove the file from the system, call deleteFile separately.
  • Assistants, Threads, Runs, Messages

    • Assistants are configuration objects (createAssistant, getAssistant, modifyAssistant, deleteAssistant), used by higher-level assistant tooling.
    • Conversations are modeled as thread objects. Create a thread (createThread) and then create runs (createRun) to execute assistant logic. You can create a thread and run it in one call (createThreadAndRun).
    • Run objects contain steps (sequence of operations the assistant performed) and may generate required_action states. Use listRunSteps/getRunStep to inspect steps and listRuns/getRun to inspect runs.
    • Messages belong to threads (createMessage, listMessages, getMessage, modifyMessage, deleteMessage). A run may create messages; messages can be filtered by run_id when listing.
    • When a run enters requires_action with required_action.type == submit_tool_outputs, you must submit all tool outputs in a single submitToolOutputsToRun call for that run to continue.
  • Responses and Chat Completions

    • Newer flows use createResponse (Responses) which supports richer behavior (tooling, structured outputs, multimodal inputs). Legacy chat-style generation still exists as createChatCompletion. Both can return streaming or non-streaming types — check the returned object type to decide whether you received a stream event versus a final object.
    • Stored chat completions are only accessible for retrieval/modification/deletion if they were created with store=true. Only stored completions appear in listChatCompletions, getChatCompletion, updateChatCompletion, and deleteChatCompletion.
  • Fine-tuning and checkpoints

    • Fine-tuning jobs are created via createFineTuningJob. Jobs have checkpoints and events you can list (listFineTuningJobCheckpoints, listFineTuningEvents). Sharing a checkpoint across projects and deleting/inspecting checkpoint permissions requires an organization admin API key (admin-only endpoints are explicitly noted).
  • Batches

    • Batches accept a .jsonl file of requests and produce outputs. You create a batch (createBatch) from an uploaded file and can cancel it (cancelBatch); cancellation is asynchronous and a batch may be cancelling for minutes before becoming cancelled with partial results available.
  • API keys, service accounts, users, invites

    • Organization admin API keys are separate from project-level API keys. Creating organization admin keys and some checkpoint-permission operations require the admin-level key. Creating a project service account returns an unredacted API key — treat it as a secret immediately after creation. Invites can be deleted only while pending; once accepted they cannot be deleted.
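The Upload flow described above (create an Upload, add ordered Parts of at most 64 MB each, then complete) can be sketched as a chunking helper plus three endpoint calls. The chunking function below is runnable; the endpoint calls in the trailing comments use the operation names from this guide and are indicative only.

```python
# Sketch of the large-file Upload flow: chunk the payload into ordered
# Parts (each <= 64 MB, per the limit above), then the three endpoint
# calls shown in comments produce the final File object.

PART_LIMIT = 64 * 1024 * 1024  # per-Part byte ceiling

def chunk_parts(data: bytes, part_size: int = PART_LIMIT) -> list[bytes]:
    """Split a payload into ordered parts no larger than part_size bytes."""
    if not 0 < part_size <= PART_LIMIT:
        raise ValueError("part_size must be positive and within the 64 MB limit")
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]

# upload_id = createUpload(filename=..., bytes=len(data), ...).id
# part_ids  = [addUploadPart(upload_id, part).id for part in chunk_parts(data)]
# file_id   = completeUpload(upload_id, part_ids=part_ids).file.id  # order matters
```

Parts can be uploaded in parallel, but the part_ids list passed to completeUpload must preserve the original order.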

Recommended entry points (what to call first)

  • To operate on a particular project: call list-projects (or retrieve-project if you already have a project id) to obtain project_id. Many project-scoped endpoints require this (service accounts, project API keys, rate limits, users, certificates).

  • To work with files and data:

    • If the user will upload >512 MB or needs parallel part uploads, start with createUpload then addUploadPart and finish with completeUpload to get a File object.
    • For smaller uploads (up to 512 MB), use createFile directly.
    • Use listFiles or retrieveFile to locate existing file_id values.
  • To run assistant behavior or reproduce a conversation:

    • Create (or retrieve) the thread with createThread/getThread and then run it with createRun (or createThreadAndRun for one-shot). Inspect progress via listRuns, getRun, listRunSteps, and getRunStep.
  • To build a vector-based retrieval flow:

    • Call createVectorStore to get vector_store_id, attach data via createVectorStoreFile / createVectorStoreFileBatch, then searchVectorStore to query. Use listVectorStores and listVectorStoreFiles to monitor ingestion status.
  • To generate text/embeddings/images/audio:

    • For new development prefer createResponse for richest features, or createChatCompletion/createCompletion for chat/completion flows. Use createEmbedding for vector embeddings. Check listModels if you need model capabilities or availability.
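The size-based choice between createFile and the Upload flow can be captured in a small helper. The 512 MB and 8 GB figures are the limits noted in the gotchas section below; the returned strings name the entry-point operations from this guide.

```python
# Decide which upload entry point to use, per the size limits in this guide:
# createFile for single files up to 512 MB; the Upload + Parts flow for
# anything larger (up to 8 GB total) or when parallel part uploads are wanted.

FILE_LIMIT = 512 * 1024 * 1024   # createFile single-file ceiling
UPLOAD_LIMIT = 8 * 1024 ** 3     # Upload flow total ceiling

def choose_upload_method(size_bytes: int, parallel: bool = False) -> str:
    if size_bytes > UPLOAD_LIMIT:
        raise ValueError("payload exceeds the 8 GB Upload limit")
    if size_bytes > FILE_LIMIT or parallel:
        return "createUpload"    # then addUploadPart and completeUpload
    return "createFile"          # single-call upload
```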

Common multi-step workflows

  • Large-file upload then use elsewhere:

    1. createUpload to get upload_id.
    2. Parallel addUploadPart calls with each part's bytes and part index.
    3. completeUpload with the ordered list of Part IDs; the response includes a nested File object with file_id.
    4. Use that file_id with createFineTuningJob, createBatch, createVectorStoreFile, or assistant file-search tools.
  • Create a vector retrieval pipeline:

    1. createVectorStore → obtain vector_store_id.
    2. Attach a file with createVectorStoreFile or create a batch with createVectorStoreFileBatch for larger/parallel ingestion.
    3. Monitor ingestion with listVectorStoreFiles / getVectorStoreFile / getVectorStoreFileBatch.
    4. Query with searchVectorStore and optionally fetch parsed content with retrieveVectorStoreFileContent.
  • Run a thread that requires tool outputs:

    1. Start or identify a thread_id and run it (createRun).
    2. If the run returns status: "requires_action" with required_action.type == "submit_tool_outputs", gather all tool outputs and submit them in a single submitToolOutputsToRun call. Partial submissions will not resume the run; all outputs must be included together.
  • Fine-tune a model:

    1. Prepare a .jsonl file in the fine-tuning format and upload it (use createFile for files under the allowed limit or the Upload flow for larger ones).
    2. createFineTuningJob referencing the file_id.
    3. Monitor job events and checkpoints with listFineTuningEvents and listFineTuningJobCheckpoints.
    4. Sharing or deleting checkpoint permissions across projects requires an organization admin API key.
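The all-at-once requirement in the tool-output workflow above can be made concrete with a helper that collects every pending tool call before a single submit. The dict shapes below are simplified stand-ins for the real run and required_action objects, and handlers is a hypothetical name-to-function map.

```python
import json

def build_tool_outputs(run: dict, handlers: dict) -> list[dict]:
    """Collect one output per pending tool call so they can all be
    submitted together in a single submitToolOutputsToRun call
    (partial submissions will not resume the run)."""
    if run.get("status") != "requires_action":
        return []
    action = run["required_action"]
    if action["type"] != "submit_tool_outputs":
        return []
    outputs = []
    for call in action["submit_tool_outputs"]["tool_calls"]:
        fn = call["function"]
        result = handlers[fn["name"]](**json.loads(fn["arguments"]))
        outputs.append({"tool_call_id": call["id"], "output": str(result)})
    return outputs

# submitToolOutputsToRun(thread_id, run_id,
#                        tool_outputs=build_tool_outputs(run, handlers))
```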

Non-obvious behaviors and gotchas

  • Upload expiry and sizes

    • Upload objects expire after one hour. Parts can be uploaded in parallel, but you must call completeUpload before expiry and the total bytes on completion must match the original upload object.
    • Individual Parts are ≤ 64 MB and an upload totals ≤ 8 GB. The single-call createFile accepts files up to 512 MB (and the overall organization storage limit still applies).
    • Assistants' file tools and some flows accept much larger tokenized sizes (Assistants support files up to ~2M tokens); check the target feature before choosing the upload method.
  • Streaming vs final responses

    • Several generation endpoints can produce either a streaming response type or a final object (createChatCompletion, createResponse, createTranscription etc.). Do not assume the shape; inspect the returned result to decide whether it's a stream event or the final resource.
  • Stored chat completions

    • Only completions created with store=true are returned by list/retrieve/update/delete chat-completion endpoints. If a user asks to modify or delete a stored chat, first ensure that the completion was stored.
  • Admin-only operations

    • Some permission and checkpoint operations explicitly require an organization admin API key (organization-level admin privileges). Attempts to call those with a project-level key will fail authorization.
  • Deletion constraints and semantics

    • delete-invite only works for pending invites. Once accepted, an invite cannot be deleted.
    • deleteCertificate requires that the certificate be inactive at both the organization and project levels before deletion.
    • Deleting a vector store file removes it from the vector store but does not delete the underlying File object from the organization's file list.
    • Deleting a fine-tuned model requires Owner role in the organization.
  • Cancellation is often asynchronous

    • Cancelling a batch (cancelBatch) or some other long-running work sets an in-flight status such as cancelling for a period; expect partial results to be available after the cancel completes.
  • Service accounts and API keys

    • Creating a project service account returns the unredacted API key once — store or surface it securely to the user immediately. Project API keys vs organization admin API keys have different scopes and capabilities.
  • Usage endpoints

    • Usage endpoints require a start_time and optional end_time and support grouping and paging. Pass the returned next_page cursor as the page parameter to retrieve additional buckets.
  • Response includes additional fields via include parameters

    • Several endpoints accept include query parameters (for example include[] when creating a run or fetching steps) to fetch nested or large fields like file-search result content. Only request these when you need them, as they can enlarge the response.
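The streaming-versus-final check described above amounts to branching on the returned shape before reading content. A minimal sketch, where the dict and iterator shapes are simplified stand-ins for the real response object and stream events:

```python
from typing import Iterable, Union

def assemble_text(result: Union[dict, Iterable[dict]]) -> str:
    """Normalize either a final object or a stream of delta events to text.
    Shapes are illustrative: a final object carries complete content, while
    a stream yields incremental delta fragments to concatenate."""
    if isinstance(result, dict):                          # final object
        return result["content"]
    return "".join(ev.get("delta", "") for ev in result)  # stream of events
```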

Quick checklist for common user requests

  • Generate text or conversation: createResponse (preferred; richer features) or createChatCompletion / createCompletion.
  • Get embeddings: createEmbedding.
  • Upload a large file and get a file_id: createUpload → addUploadPart → completeUpload (or createFile for small files).
  • Fine-tune a model: upload .jsonl file → createFineTuningJob → monitor with retrieveFineTuningJob and listFineTuningEvents.
  • Run an assistant flow for a saved thread: createRun (or createThreadAndRun) → monitor listRuns/getRun → inspect steps with listRunSteps/getRunStep.
  • Build a vector search: createVectorStore → createVectorStoreFile / createVectorStoreFileBatch → searchVectorStore.
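Most of the monitoring steps in this checklist (fine-tuning jobs, runs, vector store ingestion, batch cancellation) reduce to polling a getter until a terminal status. A generic sketch, where get_status and the status strings are placeholders for whichever resource you poll:

```python
import time

TERMINAL = {"succeeded", "failed", "cancelled", "completed", "expired"}

def poll_until_terminal(get_status, interval: float = 2.0,
                        max_tries: int = 100) -> str:
    """Poll get_status() until it returns a terminal value. The status
    strings above are illustrative; check the resource you are polling."""
    for _ in range(max_tries):
        status = get_status()
        if status in TERMINAL:
            return status
        time.sleep(interval)
    raise TimeoutError("resource did not reach a terminal status in time")
```

Remember that cancellation is asynchronous: a batch may sit in a cancelling state for minutes before reaching cancelled.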

Use these patterns as composable building blocks: first acquire the canonical ID the downstream call requires (project_id, file_id, upload_id, vector_store_id, thread_id, run_id, response_id), then perform the operation that consumes that ID. When an endpoint warns of admin-only access or special size/format constraints, prefer the specialized endpoint (e.g., Upload + Parts for large files, admin endpoints for checkpoint permissions).

When something looks incomplete

  • If a listing response is truncated, request the next page with the provided cursor parameter (after or page).
  • If a run stops with requires_action, inspect required_action to decide whether to call submitToolOutputsToRun (note: all tool outputs must be submitted in one request) or other follow-ups.
  • If a creation endpoint returns an ephemeral secret (service account or realtime session), return that secret to the caller immediately—these values are unredacted only on creation.
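Truncated listings (and the usage buckets mentioned earlier) follow the same drain-the-cursor pattern. In this sketch, fetch stands in for any list endpoint, and the page shape ({"data": [...], "has_more": ...} with the last item's id as the next after cursor) is a simplified stand-in for real list responses:

```python
def paginate(fetch):
    """Yield every item from a cursor-paged list endpoint by passing the
    last item's id as the `after` cursor until has_more is false."""
    after = None
    while True:
        page = fetch(after=after)
        yield from page["data"]
        if not page.get("has_more") or not page["data"]:
            break
        after = page["data"][-1]["id"]
```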

This guidance highlights the platform patterns that most frequently block multi-step flows: where to get the IDs you need, which objects are project-scoped vs organization-scoped, upload/file size and expiry constraints, and the special semantics around stored completions, admin-only permissions, and run-required-actions. Follow the entity-first pattern (acquire the resource ID, then call the consumer operation) and prefer the Upload flow for large files, createFile for smaller files, and createResponse for newer generation features.