Generative AI Models and Regions for Agentic API
This page lists the subset of pretrained models and regions supported for agentic features in OCI Generative AI.
Agentic features include:
- Agentic inference (runtime chat calls) used by agents during runtime.
- Project memory models used when you add long-term memory extraction and short-term memory compaction to an OCI Generative AI project.
1. Agentic Inference (Runtime) Models
Available Chat Models for Agents
Agents can call the following chat models for agentic inference use cases:
- Google Vertex AI Platform
- OpenAI Open Source
- xAI Platform
Available Regions
You can access agentic inference models in one or more of the following OC1 regions:
- North America
  - US East (Ashburn)
  - US Midwest (Chicago)
  - US West (Phoenix)
- South America
  - Brazil East (Sao Paulo)
- Europe (EU)
  - Germany Central (Frankfurt)
  - UK South (London)
- Middle East (ME)
  - Saudi Arabia Central (Riyadh)

  Note: Agentic API isn't available in UAE East (Dubai).
- Asia Pacific (AP)
  - India South (Hyderabad)
  - Japan Central (Osaka)
Not every model is available in every region listed above. For per-model supported regions and deployment details, see the Models by Region page.
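The region list above can be encoded as a small lookup for a pre-flight check in client code. The following is a minimal sketch, not part of any OCI SDK: the names `AGENTIC_REGIONS` and `is_agentic_region` are hypothetical, and the keys are the standard OCI region identifiers for the regions named on this page. It checks region-level support only, so a model might still be unavailable in a region that passes the check.

```python
# Hypothetical helper for illustration; this is not an OCI SDK API.
# Keys are OCI region identifiers for the regions listed on this page.
AGENTIC_REGIONS = {
    "us-ashburn-1": "US East (Ashburn)",
    "us-chicago-1": "US Midwest (Chicago)",
    "us-phoenix-1": "US West (Phoenix)",
    "sa-saopaulo-1": "Brazil East (Sao Paulo)",
    "eu-frankfurt-1": "Germany Central (Frankfurt)",
    "uk-london-1": "UK South (London)",
    "me-riyadh-1": "Saudi Arabia Central (Riyadh)",
    "ap-hyderabad-1": "India South (Hyderabad)",
    "ap-osaka-1": "Japan Central (Osaka)",
}


def is_agentic_region(region_code: str) -> bool:
    """Return True if the region appears in this page's agentic region list.

    This says nothing about per-model availability within the region.
    """
    return region_code.strip().lower() in AGENTIC_REGIONS


if __name__ == "__main__":
    for code in ("eu-frankfurt-1", "me-dubai-1"):
        status = "listed" if is_agentic_region(code) else "not listed"
        print(f"{code}: {status}")
```

UAE East (Dubai), identified as `me-dubai-1`, is deliberately absent from the mapping, so the check returns `False` for it.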
2. Project Memory Models (Project Settings)
When you create a Project and enable memory features, you select models for:
- Short-term memory compaction (conversation history compaction)
- Long-term memory extraction (aims to extract key information from conversations)
- Long-term memory embeddings (stores extracted memories as searchable vectors)
2.1 Short-Term Memory Compaction (Conversation History Compaction)
Projects can use the following models for short-term memory compaction:
| Region | Region Code | Model |
|---|---|---|
| Brazil East (Sao Paulo) | sa-saopaulo-1 | |
| Germany Central (Frankfurt) | eu-frankfurt-1 | |
| UK South (London) | uk-london-1 | |
| India South (Hyderabad) | ap-hyderabad-1 | |
| US East (Ashburn) (cross region to US Midwest (Chicago)) | us-ashburn-1 (cross region to us-chicago-1) | |
| Japan Central (Osaka) | ap-osaka-1 | |
| Saudi Arabia Central (Riyadh) | me-riyadh-1 | |
| US Midwest (Chicago) | us-chicago-1 | |
| US West (Phoenix) (cross region to US Midwest (Chicago)) | us-phoenix-1 (cross region to us-chicago-1) | |
2.2 Long-Term Memory
- Extraction Model (All Supported Regions)
  - OpenAI gpt-oss-120b
- Embedding Model

The embedding model used to store extracted memories as searchable vectors depends on the Project region:

| Region | Region Code | Embed Model |
|---|---|---|
| Brazil East (Sao Paulo) | sa-saopaulo-1 | Cohere Embed Multilingual 3 |
| Germany Central (Frankfurt) | eu-frankfurt-1 | Cohere Embed Multilingual 3 |
| UK South (London) | uk-london-1 | Cohere Embed Multilingual 3 |
| India South (Hyderabad) | ap-hyderabad-1 | Cohere Embed Multilingual Image 3 |
| US East (Ashburn) (cross region to US Midwest (Chicago)). See external calls. | us-ashburn-1 (cross region to us-chicago-1) | Cohere Embed 4 |
| Japan Central (Osaka) | ap-osaka-1 | Cohere Embed 4 |
| Saudi Arabia Central (Riyadh) | me-riyadh-1 | Cohere Embed 4 |
| US Midwest (Chicago) | us-chicago-1 | Cohere Embed 4 |
| US West (Phoenix) (cross region to US Midwest (Chicago)). See external calls. | us-phoenix-1 (cross region to us-chicago-1) | Cohere Embed 4 |
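The region-to-model listing above can be expressed as a lookup that also records the cross-region routing for US East (Ashburn) and US West (Phoenix). This is a minimal sketch for illustration only: `EMBED_MODEL_BY_REGION`, `CROSS_REGION_TARGET`, and `resolve_embedding` are hypothetical names, not OCI SDK APIs, and the model names are display names copied from this page rather than API model identifiers.

```python
# Hypothetical lookup built from the listing on this page; not an OCI SDK API.
# Values are display names from the page, not API model identifiers.
EMBED_MODEL_BY_REGION = {
    "sa-saopaulo-1": "Cohere Embed Multilingual 3",
    "eu-frankfurt-1": "Cohere Embed Multilingual 3",
    "uk-london-1": "Cohere Embed Multilingual 3",
    "ap-hyderabad-1": "Cohere Embed Multilingual Image 3",
    "us-ashburn-1": "Cohere Embed 4",
    "ap-osaka-1": "Cohere Embed 4",
    "me-riyadh-1": "Cohere Embed 4",
    "us-chicago-1": "Cohere Embed 4",
    "us-phoenix-1": "Cohere Embed 4",
}

# Project regions whose requests are routed cross region to US Midwest (Chicago).
CROSS_REGION_TARGET = {
    "us-ashburn-1": "us-chicago-1",
    "us-phoenix-1": "us-chicago-1",
}


def resolve_embedding(project_region: str) -> tuple[str, str]:
    """Return (embedding model, serving region) for a Project region.

    Raises ValueError for regions that are not in the listing.
    """
    model = EMBED_MODEL_BY_REGION.get(project_region)
    if model is None:
        raise ValueError(f"No embedding model listed for region {project_region!r}")
    serving_region = CROSS_REGION_TARGET.get(project_region, project_region)
    return model, serving_region


if __name__ == "__main__":
    for region in ("uk-london-1", "us-phoenix-1"):
        model, serving = resolve_embedding(region)
        print(f"{region}: {model} (served from {serving})")
```

Keeping the cross-region routing in a separate mapping makes it easy to flag Projects whose memory data leaves the Project region, which can matter for data residency reviews.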
Notes and Known Limitations
- UAE East (Dubai): Agentic API isn’t available in this region.
- Availability: Regions listed for agentic inference models on this page indicate where agentic features are supported. Individual model availability might still vary within those regions.
External Calls to Google Models
External Calls to Google Gemini 2.5 Pro for US Regions
The Google Gemini 2.5 Pro model that can be accessed through the OCI Generative AI service in US regions is hosted externally by Google. Therefore, a call to the Google Gemini 2.5 Pro model (through the OCI Generative AI service) results in a call to a Google location. For Google Gemini 2.5 Pro, a Google Americas regional location is used, which routes the request only to a Google Americas location. Machine learning processing takes place within a Google Americas location.
External Calls to Gemini 2.5 Flash for US Regions
The Gemini 2.5 Flash model that can be accessed through the OCI Generative AI service in US regions is hosted externally by Google. Therefore, a call to the Gemini 2.5 Flash model (through the OCI Generative AI service) results in a call to a Google location. For Gemini 2.5 Flash, a Google Americas regional location is used, which routes the request only to a Google Americas location. Machine learning processing takes place within a Google Americas location.
External Calls to Gemini 2.5 Flash-Lite for US Regions
The Gemini 2.5 Flash-Lite model that can be accessed through the OCI Generative AI service in US regions is hosted externally by Google. Therefore, a call to the Gemini 2.5 Flash-Lite model (through the OCI Generative AI service) results in a call to a Google location. For Gemini 2.5 Flash-Lite, a Google Americas regional location is used, which routes the request only to a Google Americas location. Machine learning processing takes place within a Google Americas location.