Concepts for Generative AI Agents
Here are some concepts and terms related to the OCI Generative AI Agents service.
Generative AI model
A large language model (LLM), trained on large amounts of data, that takes inputs that it hasn't been trained on before and generates new content. The Generative AI Agents service uses an LLM while processing requests and generating responses.
Agent
An LLM-based autonomous system that understands and generates human-like text, enabling natural-language processing interactions. The Generative AI Agents supports retrieval-augmented generation (RAG) agents. A RAG agent connects to a data source, retrieves data, and augments model responses with the information from the data sources to generate more relevant responses. Examples for other AI agents are agents that can dynamically invoke APIs, such as agents addressing customer support inquiries in a conversation interface or agents that can automatically buy items on behalf of the customer.
When using RAG agents, models need to perform with high answerability and groundedness:
- Answerability
-
The model can generate relevant responses to user queries.
- Groundedness
- The model's generated responses can be tracked to data sources.
Knowledge base, data source, and data store
How an agent gets access to data. An agent connects to a knowledge base, which is a vector-based storage that can connect to and ingest data from a data source. Data sources provide connection information to the data stores that an agent uses to generate responses. For example, for an Object Storage data store that contains a bucket with hundreds of data files, the data source can connect to the data store and ingest the files.
Depending on the data store, knowledge bases can be service-managed or customer-managed.
- Service-managed knowledge base
- The user specifies the data source and Generative AI Agents ingests data from that data source to a knowledge base to be used by agents.
- Customer-managed knowledge base
- The user manages the indexing of the data and then provides the indexes to Generative AI Agents to be used by agents.
Data Ingestion
A process that extracts data from data source documents, converts it into a structured format suitable for analysis, and then stores it in a knowledge base.
Chat
Having a conversation with a Large Language Model (LLM) by asking questions and having the model generate answers such as text or code, and continuing the conversation while the model keeps the context of the conversation. When the LLM is enabled with a Retrieval-Augmented Generation (RAG) agent, you can ask questions related to the data to which the agent has access to and the model can generate outputs with reference to the knowledge base.
Session
Represents an interactive conversation started by a user through an API to engage with the agent. It involves a series of exchanges where the user sends queries or prompts, and the agent responds with relevant information, actions, or help based on the user's input. The session persists during the interaction, maintaining context and continuity to provide coherent and meaningful responses throughout the conversation.
Agent endpoint
Specific points of access in a network or system that agents use to interact with other systems or services. Endpoints are used primarily to enable communication and data exchange between an agent and external systems, in order for agents to retrieve or send information as needed to perform their functions.
Trace
Tracking a chat conversation.
In OCI Generative AI Agents, the trace feature tracks and displays the conversation history, including both the original prompt and the generated response, during a chat conversation. You can enable this feature when you create an endpoint for an agent.
Citation
The source of information for the agent's response.
In OCI Generative AI Agents, the RAG agent outputs the citation of every response, which includes the title, external path, document id, and the page numbers for the source of information. You can enable citation for an agent, when you create an endpoint for that agent.
In OCI Generative AI Agents, in addition to the existing relevant information, the citation feature returns document title and pages, providing more context. For citation management, you can override Object Storage citation links to custom URLs, by adding metadata to objects. See Assigning a Custom URL to an Object Storage Object.
Content Moderation
A feature designed to help detect or filter out certain toxic, violent, abusive, hateful, threatening, insulting, and harassing phrases from generated responses or user prompts in large language models (LLMs). In OCI Generative AI Agents, this feature is associated with a categorization of the following four types of harm:
- Hate and Harassment, for example, identity attacks, insults, threats of violence, and sexual aggression.
- Self-Inflicted Harm, for example, self-harm and eating-disorder promotion.
- Ideological Harm, for example, extremism, terrorism, and organized crime.
- Exploitation, for example, scams and sexual abuse.
Enabling Content Moderation
To activate content moderation, you must:
- Enable it when creating an endpoint for an agent.
- Specify whether moderation is applied to:
- User prompt (input)
- Generated response (output)
- Both input and output
Learn about Creating an Endpoint in Generative AI Agents.
How Content Moderation Works
When applied to input, the agent doesn't look for answers if it detects harmful content in the input. When applied to output, the agent will search for answers but will not display them if it finds harmful content in the source.