Create an Azure OpenAI inference endpoint
`PUT /_inference/{task_type}/{azureopenai_inference_id}`

Create an inference endpoint to perform an inference task with the `azureopenai` service.
The list of chat completion models that you can choose from in your Azure OpenAI deployment includes:

The list of embeddings models that you can choose from in your deployment can be found in the Azure models documentation.
Required authorization
- Cluster privileges: `manage_inference`
Parameters
Path parameters

| Name | Type | Description |
|---|---|---|
| `task_type` (required) | `InferenceTypesAzureOpenAITaskType` = `"completion" \| "chat_completion" \| "text_embedding"` | The type of the inference task that the model will perform. |
| `azureopenai_inference_id` (required) | `TypesId` = `string` | The unique identifier of the inference endpoint. |
Query parameters

| Name | Type | Description |
|---|---|---|
| `timeout` | `TypesDuration` = `string \| "-1" \| "0"` | Specifies the amount of time to wait for the inference endpoint to be created. |
Request body (`application/json`, required)
```
{
  chunking_settings?: InferenceTypesInferenceChunkingSettings;
  service: InferenceTypesAzureOpenAIServiceType;
  service_settings: InferenceTypesAzureOpenAIServiceSettings;
  task_settings?: InferenceTypesAzureOpenAITaskSettings;
}
```
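As an illustration of how the path parameters and request body fit together, a `text_embedding` endpoint might be created as sketched below. All identifiers are placeholders, and the `service_settings` keys shown (`api_key`, `resource_name`, `deployment_id`, `api_version`) are assumptions about typical Azure OpenAI connection settings; the authoritative field list is defined by `InferenceTypesAzureOpenAIServiceSettings`.

```console
PUT _inference/text_embedding/azure_openai_embeddings
{
  "service": "azureopenai",
  "service_settings": {
    "api_key": "<azure-openai-api-key>",
    "resource_name": "<azure-resource-name>",
    "deployment_id": "<deployment-id>",
    "api_version": "<api-version>"
  }
}
```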
Responses
200 application/json
```
type InferenceTypesInferenceEndpointInfoAzureOpenAI =
  interface InferenceTypesInferenceEndpoint {
    chunking_settings?: InferenceTypesInferenceChunkingSettings;
    service: string;
    service_settings: InferenceTypesServiceSettings;
    task_settings?: InferenceTypesTaskSettings;
  } & {
    inference_id: string;
    task_type: InferenceTypesTaskTypeAzureOpenAI;
  }
```
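A successful response echoes the endpoint configuration with the `inference_id` and `task_type` added. The body below is a hypothetical sketch of that shape, with placeholder values rather than real output:

```json
{
  "inference_id": "azure_openai_embeddings",
  "task_type": "text_embedding",
  "service": "azureopenai",
  "service_settings": {
    "resource_name": "<azure-resource-name>",
    "deployment_id": "<deployment-id>",
    "api_version": "<api-version>"
  }
}
```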