Create an Amazon Bedrock inference endpoint

PUT /_inference/{task_type}/{amazonbedrock_inference_id}

Create an inference endpoint to perform an inference task with the amazonbedrock service.

Info: You need to provide the access and secret keys only once, during inference endpoint creation. The get inference API does not retrieve your access or secret keys. After creating the inference endpoint, you cannot change the associated key pair. If you want to use a different access and secret key pair, delete the inference endpoint and recreate it with the same name and the updated keys.

Required authorization

  • Cluster privileges: manage_inference

Parameters

Path parameters

  • task_type (required): The type of the inference task that the model will perform.
    type InferenceTypesAmazonBedrockTaskType = "chat_completion" | "completion" | "text_embedding"

  • amazonbedrock_inference_id (required): The unique identifier of the inference endpoint.
    type TypesId = string

Query parameters

  • timeout (optional): The amount of time to wait for the inference endpoint to be created.
    type TypesDuration = string | "-1" | "0"
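The path and query parameters above combine into the request URL. A minimal sketch, assuming a local cluster address and an illustrative endpoint id (neither is part of this reference):

```python
from urllib.parse import urlencode

base_url = "https://localhost:9200"   # hypothetical cluster address
task_type = "text_embedding"          # one of the allowed task types
inference_id = "bedrock-embeddings"   # caller-chosen endpoint identifier

# timeout is a Duration string such as "30s"; it is optional
query = urlencode({"timeout": "30s"})
url = f"{base_url}/_inference/{task_type}/{inference_id}?{query}"
print(url)
```

The endpoint id is chosen by the caller and is later used to address the endpoint in inference requests.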

Request Body

application/json (required)

{
  chunking_settings?: InferenceTypesInferenceChunkingSettings;  // Chunking configuration object
  service: InferenceTypesAmazonBedrockServiceType;
  service_settings: InferenceTypesAmazonBedrockServiceSettings;
  task_settings?: InferenceTypesAmazonBedrockTaskSettings;
}

type InferenceTypesAmazonBedrockServiceType = "amazonbedrock"

interface InferenceTypesInferenceChunkingSettings {
  max_chunk_size?: number;
  overlap?: number;
  sentence_overlap?: number;
  separator_group?: string;
  separators?: string[];
  strategy?: string;
}

interface InferenceTypesAmazonBedrockServiceSettings {
  access_key: string;
  model: string;
  provider?: string;
  region: string;
  rate_limit?: InferenceTypesRateLimitSetting;
  secret_key: string;
}

interface InferenceTypesAmazonBedrockTaskSettings {
  max_new_tokens?: number;
  temperature?: number;
  top_k?: number;
  top_p?: number;
}
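A request body following the schema above might look like the sketch below, here for a text_embedding endpoint. The credentials, model id, and region are placeholders, not values this reference prescribes:

```python
import json

body = {
    "service": "amazonbedrock",                # required; the only allowed value
    "service_settings": {                      # required
        "access_key": "AWS_ACCESS_KEY",        # placeholder credential
        "secret_key": "AWS_SECRET_KEY",        # placeholder credential
        "model": "amazon.titan-embed-text-v2:0",  # placeholder Bedrock model id
        "region": "us-east-1",                 # placeholder AWS region
    },
    "chunking_settings": {                     # optional
        "strategy": "sentence",
        "max_chunk_size": 250,
    },
}
print(json.dumps(body, indent=2))
```

Note that task_settings fields such as temperature and top_p apply to generation tasks, so they are omitted here.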

Responses

200 application/json
type InferenceTypesInferenceEndpointInfoAmazonBedrock =
  InferenceTypesInferenceEndpoint & {
    inference_id: string;
    task_type: InferenceTypesTaskTypeAmazonBedrock;
  }

interface InferenceTypesInferenceEndpoint {
  chunking_settings?: InferenceTypesInferenceChunkingSettings;  // Chunking configuration object
  service: string;
  service_settings: InferenceTypesServiceSettings;
  task_settings?: InferenceTypesTaskSettings;
}

interface InferenceTypesInferenceChunkingSettings {
  max_chunk_size?: number;
  overlap?: number;
  sentence_overlap?: number;
  separator_group?: string;
  separators?: string[];
  strategy?: string;
}

interface InferenceTypesServiceSettings {}

interface InferenceTypesTaskSettings {}

type InferenceTypesTaskTypeAmazonBedrock = "chat_completion" | "completion" | "text_embedding"
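Per the response schema, a successful creation returns the endpoint info extended with the endpoint id and task type. A sketch of the shape, with illustrative values rather than actual server output:

```python
# Illustrative 200 response body; field values are placeholders.
response = {
    "inference_id": "bedrock-embeddings",   # echoes the id from the request path
    "task_type": "text_embedding",          # echoes the task type from the path
    "service": "amazonbedrock",
    "service_settings": {},                 # secret/access keys are never returned
    "task_settings": {},
}

# task_type is constrained to the AmazonBedrock task-type union
assert response["task_type"] in ("chat_completion", "completion", "text_embedding")
print(response["inference_id"])
```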