Create an Amazon Bedrock inference endpoint

PUT /_inference/{task_type}/{amazonbedrock_inference_id}

Create an inference endpoint to perform an inference task with the amazonbedrock service.

Info: You need to provide the access and secret keys only once, during inference endpoint creation. The get inference API does not retrieve your access or secret keys. After creating the inference endpoint, you cannot change the associated key pair. If you want to use a different access and secret key pair, delete the inference endpoint and recreate it with the same name and the updated keys.

Required authorization

  • Cluster privileges: manage_inference

Parameters

Path parameters

  • task_type (required): The type of the inference task that the model will perform.
    type InferenceTypesAmazonBedrockTaskType = "chat_completion" | "completion" | "text_embedding"

  • amazonbedrock_inference_id (required): The unique identifier of the inference endpoint.
    type TypesId = string

Query parameters

  • timeout (optional): The amount of time to wait for the inference endpoint to be created.
    type TypesDuration = string | "-1" | "0"
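The path and query parameters above combine into the request URL. A minimal sketch, assuming a local cluster address and an illustrative endpoint id (neither is part of this reference):

```python
from urllib.parse import urlencode

base_url = "https://localhost:9200"   # hypothetical cluster address
task_type = "text_embedding"          # one of the allowed task types
inference_id = "bedrock-embeddings"   # caller-chosen endpoint identifier

# timeout is a Duration string such as "30s"; it is optional
query = urlencode({"timeout": "30s"})
url = f"{base_url}/_inference/{task_type}/{inference_id}?{query}"
print(url)
```

The endpoint id is chosen by the caller and is later used to address the endpoint in inference requests.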

Request Body

application/json (required)

{
  chunking_settings?: InferenceTypesInferenceChunkingSettings;  // Chunking configuration object
  service: InferenceTypesAmazonBedrockServiceType;
  service_settings: InferenceTypesAmazonBedrockServiceSettings;
  task_settings?: InferenceTypesAmazonBedrockTaskSettings;
}

type InferenceTypesAmazonBedrockServiceType = "amazonbedrock"

interface InferenceTypesInferenceChunkingSettings {
  max_chunk_size?: number;
  overlap?: number;
  sentence_overlap?: number;
  separator_group?: string;
  separators?: string[];
  strategy?: string;
}

interface InferenceTypesAmazonBedrockServiceSettings {
  access_key: string;
  model: string;
  provider?: string;
  region: string;
  rate_limit?: InferenceTypesRateLimitSetting;
  secret_key: string;
}

interface InferenceTypesAmazonBedrockTaskSettings {
  max_new_tokens?: number;
  temperature?: number;
  top_k?: number;
  top_p?: number;
}
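A request body following the schema above might look like the sketch below, here for a text_embedding endpoint. The credentials, model id, and region are placeholders, not values this reference prescribes:

```python
import json

body = {
    "service": "amazonbedrock",                # required; the only allowed value
    "service_settings": {                      # required
        "access_key": "AWS_ACCESS_KEY",        # placeholder credential
        "secret_key": "AWS_SECRET_KEY",        # placeholder credential
        "model": "amazon.titan-embed-text-v2:0",  # placeholder Bedrock model id
        "region": "us-east-1",                 # placeholder AWS region
    },
    "chunking_settings": {                     # optional
        "strategy": "sentence",
        "max_chunk_size": 250,
    },
}
print(json.dumps(body, indent=2))
```

Note that task_settings fields such as temperature and top_p apply to generation tasks, so they are omitted here.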

Responses

200 application/json
type InferenceTypesInferenceEndpointInfoAmazonBedrock =
  InferenceTypesInferenceEndpoint & {
    inference_id: string;
    task_type: InferenceTypesTaskTypeAmazonBedrock;
  }

interface InferenceTypesInferenceEndpoint {
  chunking_settings?: InferenceTypesInferenceChunkingSettings;  // Chunking configuration object
  service: string;
  service_settings: InferenceTypesServiceSettings;
  task_settings?: InferenceTypesTaskSettings;
}

interface InferenceTypesInferenceChunkingSettings {
  max_chunk_size?: number;
  overlap?: number;
  sentence_overlap?: number;
  separator_group?: string;
  separators?: string[];
  strategy?: string;
}

interface InferenceTypesServiceSettings {}

interface InferenceTypesTaskSettings {}

type InferenceTypesTaskTypeAmazonBedrock = "chat_completion" | "completion" | "text_embedding"
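Per the response schema, a successful creation returns the endpoint info extended with the endpoint id and task type. A sketch of the shape, with illustrative values rather than actual server output:

```python
# Illustrative 200 response body; field values are placeholders.
response = {
    "inference_id": "bedrock-embeddings",   # echoes the id from the request path
    "task_type": "text_embedding",          # echoes the task type from the path
    "service": "amazonbedrock",
    "service_settings": {},                 # secret/access keys are never returned
    "task_settings": {},
}

# task_type is constrained to the AmazonBedrock task-type union
assert response["task_type"] in ("chat_completion", "completion", "text_embedding")
print(response["inference_id"])
```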