Generative AI Models and Regions for Agentic API

This page lists the subset of pretrained models and regions supported for agentic features in OCI Generative AI.

Agentic features include:

  • Agentic inference: chat calls that agents make at runtime.
  • Project memory: models used when you add long-term memory extraction and short-term memory compaction to an OCI Generative AI project.

1. Agentic Inference (Runtime) Models

Available Regions

You can access agentic inference models in one or more of the following OC1 regions:

North America
  • US East (Ashburn)
  • US Midwest (Chicago)
  • US West (Phoenix)
South America
  • Brazil East (Sao Paulo)
Europe (EU)
  • Germany Central (Frankfurt)
  • UK South (London)
Middle East (ME)
  • Saudi Arabia Central (Riyadh)
  Note: Agentic API isn't available in UAE East (Dubai).
Asia Pacific (AP)
  • India South (Hyderabad)
  • Japan Central (Osaka)
Important

Not every model is available in every region in the preceding list. For per-model supported regions and deployment details, see the Models by Region page.

2. Project Memory Models (Project Settings)

When you create a Project and enable memory features, you select models for:

  • Short-term memory compaction (conversation history compaction)
  • Long-term memory extraction (extracts key information from conversations)
  • Long-term memory embeddings (stores extracted memories as searchable vectors)

2.1 Short-Term Memory Compaction (Conversation History Compaction)

Projects can use the following models for short-term memory compaction:

Region | Region Code | Model Providers
Brazil East (Sao Paulo) | sa-saopaulo-1 | Meta, OpenAI Open Source
Germany Central (Frankfurt) | eu-frankfurt-1 | Google Vertex AI Platform, Meta, OpenAI Open Source
UK South (London) | uk-london-1 | Google Vertex AI Platform, Meta, OpenAI Open Source
India South (Hyderabad) | ap-hyderabad-1 | Google Vertex AI Platform, Meta, OpenAI Open Source
US East (Ashburn) (cross region to US Midwest (Chicago)) | us-ashburn-1 (cross region to us-chicago-1) | Google Vertex AI Platform, Meta, OpenAI Open Source
Japan Central (Osaka) | ap-osaka-1 | Google Vertex AI Platform, Meta, OpenAI Open Source
Saudi Arabia Central (Riyadh) | me-riyadh-1 | Meta, OpenAI Open Source
US Midwest (Chicago) | us-chicago-1 | Google Vertex AI Platform, Meta, OpenAI Open Source
US West (Phoenix) (cross region to US Midwest (Chicago)) | us-phoenix-1 (cross region to us-chicago-1) | Google Vertex AI Platform, Meta, OpenAI Open Source
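The per-region availability above can be captured as a small lookup, for example to validate a Project's region before choosing a compaction model. The following is a minimal sketch, not an OCI SDK API: the constant and function names are hypothetical, the data is transcribed from the table, and the Ashburn code is written with the standard spelling `us-ashburn-1`.

```python
# Region code -> provider families listed for short-term memory compaction.
# Transcribed from the table above; all names here are hypothetical helpers.
GOOGLE = "Google Vertex AI Platform"
META = "Meta"
OPENAI_OSS = "OpenAI Open Source"

ALL_THREE = frozenset({GOOGLE, META, OPENAI_OSS})
NO_GOOGLE = frozenset({META, OPENAI_OSS})

COMPACTION_PROVIDERS = {
    "sa-saopaulo-1": NO_GOOGLE,
    "eu-frankfurt-1": ALL_THREE,
    "uk-london-1": ALL_THREE,
    "ap-hyderabad-1": ALL_THREE,
    "us-ashburn-1": ALL_THREE,   # listed as cross region to us-chicago-1
    "ap-osaka-1": ALL_THREE,
    "me-riyadh-1": NO_GOOGLE,
    "us-chicago-1": ALL_THREE,
    "us-phoenix-1": ALL_THREE,   # listed as cross region to us-chicago-1
}


def providers_for(region_code: str) -> frozenset:
    """Return the provider families listed for a region, or raise ValueError."""
    if region_code not in COMPACTION_PROVIDERS:
        raise ValueError(
            f"{region_code!r} is not listed for short-term memory compaction"
        )
    return COMPACTION_PROVIDERS[region_code]


def regions_without(provider: str) -> list:
    """Return the sorted region codes where a provider family is not listed."""
    return sorted(
        region
        for region, providers in COMPACTION_PROVIDERS.items()
        if provider not in providers
    )
```

For example, `regions_without(GOOGLE)` returns `['me-riyadh-1', 'sa-saopaulo-1']`, the two regions whose rows omit Google Vertex AI Platform.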

2.2 Long-Term Memory

Extraction Model (All Supported Regions)
OpenAI gpt-oss-120b
Embedding Model

The embedding model used to store extracted memories as searchable vectors depends on the Project region:

Region | Region Code | Embedding Model
Brazil East (Sao Paulo) | sa-saopaulo-1 | Cohere Embed Multilingual 3
Germany Central (Frankfurt) | eu-frankfurt-1 | Cohere Embed Multilingual 3
UK South (London) | uk-london-1 | Cohere Embed Multilingual 3
India South (Hyderabad) | ap-hyderabad-1 | Cohere Embed Multilingual Image 3
US East (Ashburn) (cross region to US Midwest (Chicago)); see External Calls to Google Models | us-ashburn-1 (cross region to us-chicago-1) | Cohere Embed 4
Japan Central (Osaka) | ap-osaka-1 | Cohere Embed 4
Saudi Arabia Central (Riyadh) | me-riyadh-1 | Cohere Embed 4
US Midwest (Chicago) | us-chicago-1 | Cohere Embed 4
US West (Phoenix) (cross region to US Midwest (Chicago)); see External Calls to Google Models | us-phoenix-1 (cross region to us-chicago-1) | Cohere Embed 4
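Because the embedding model depends on the Project region, and two US regions are served cross region from US Midwest (Chicago), it can help to resolve both facts in one place. The sketch below is illustrative only: `EmbeddingRoute` and `embedding_route` are hypothetical names, the values are display names copied from the table rather than API model identifiers, and the Ashburn code uses the standard spelling `us-ashburn-1`.

```python
from typing import NamedTuple


class EmbeddingRoute(NamedTuple):
    """Embedding model for a Project region and the region that serves it."""
    model: str
    serving_region: str


# Project region -> embedding model (display names from the table above).
_EMBED_MODEL = {
    "sa-saopaulo-1": "Cohere Embed Multilingual 3",
    "eu-frankfurt-1": "Cohere Embed Multilingual 3",
    "uk-london-1": "Cohere Embed Multilingual 3",
    "ap-hyderabad-1": "Cohere Embed Multilingual Image 3",
    "us-ashburn-1": "Cohere Embed 4",
    "ap-osaka-1": "Cohere Embed 4",
    "me-riyadh-1": "Cohere Embed 4",
    "us-chicago-1": "Cohere Embed 4",
    "us-phoenix-1": "Cohere Embed 4",
}

# Regions the table marks as "cross region to us-chicago-1".
_CROSS_REGION = {
    "us-ashburn-1": "us-chicago-1",
    "us-phoenix-1": "us-chicago-1",
}


def embedding_route(project_region: str) -> EmbeddingRoute:
    """Look up the embedding model and serving region for a Project region."""
    model = _EMBED_MODEL.get(project_region)
    if model is None:
        raise ValueError(f"{project_region!r} is not a listed Project region")
    # Regions without a cross-region entry are served locally.
    serving = _CROSS_REGION.get(project_region, project_region)
    return EmbeddingRoute(model=model, serving_region=serving)
```

With this mapping, `embedding_route("us-phoenix-1")` yields the model `Cohere Embed 4` with serving region `us-chicago-1`, while `embedding_route("uk-london-1")` is served in `uk-london-1` itself.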

Notes and Known Limitations

  • UAE East (Dubai): Agentic API isn’t available in this region.
  • Availability: Regions listed for agentic inference models on this page indicate where agentic features are supported. Individual model availability might still vary within those regions.

External Calls to Google Models

Important

External Calls to Google Gemini 2.5 Pro for US Regions

The Google Gemini 2.5 Pro model that can be accessed through the OCI Generative AI service in US regions is hosted externally by Google. Therefore, a call to the Google Gemini 2.5 Pro model (through the OCI Generative AI service) results in a call to a Google location. For Google Gemini 2.5 Pro, a Google Americas regional location is used, which routes the request only to a Google Americas location. Machine Learning Processing takes place within a Google Americas location.

Important

External Calls to Gemini 2.5 Flash for US Regions

The Gemini 2.5 Flash model that can be accessed through the OCI Generative AI service in US regions is hosted externally by Google. Therefore, a call to the Gemini 2.5 Flash model (through the OCI Generative AI service) results in a call to a Google location. For Gemini 2.5 Flash, a Google Americas regional location is used, which routes the request only to a Google Americas location. Machine Learning Processing takes place within a Google Americas location.

Important

External Calls to Gemini 2.5 Flash-Lite for US Regions

The Gemini 2.5 Flash-Lite model that can be accessed through the OCI Generative AI service in US regions is hosted externally by Google. Therefore, a call to the Gemini 2.5 Flash-Lite model (through the OCI Generative AI service) results in a call to a Google location. For Gemini 2.5 Flash-Lite, a Google Americas regional location is used, which routes the request only to a Google Americas location. Machine Learning Processing takes place within a Google Americas location.
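For data-residency reviews, the three notices above reduce to one rule: a listed Gemini 2.5 model called from a US region is processed in a Google Americas location. The following sketch encodes only that rule. The function name, the use of display names as keys, and the choice to return `None` for combinations this page does not describe are all assumptions made for illustration.

```python
from typing import Optional

# US region codes that appear on this page.
US_REGIONS = frozenset({"us-ashburn-1", "us-chicago-1", "us-phoenix-1"})

# Models the notices above describe as hosted externally by Google.
# These are display names, not API model identifiers.
EXTERNALLY_HOSTED = frozenset({
    "Gemini 2.5 Pro",
    "Gemini 2.5 Flash",
    "Gemini 2.5 Flash-Lite",
})


def external_processing_location(model_name: str, region_code: str) -> Optional[str]:
    """Return the external processing location described by the notices.

    Returns None when the model and region combination is not covered by
    the notices above. None means "not documented here", not "local".
    """
    if model_name in EXTERNALLY_HOSTED and region_code in US_REGIONS:
        return "Google Americas"
    return None
```

A caller might treat a non-`None` result as a signal to confirm that external processing is acceptable for the workload before enabling the model.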