OpenSearch Guidelines for Generative AI Agents

Note

To make your data available to OCI Generative AI Agents, you have two options:
  1. Object Storage data: You can upload your data files to OCI Object Storage and let Generative AI Agents automatically ingest the data. Skip this topic if your data files are in Object Storage.
  2. OpenSearch data: You can bring your own (BYO) ingested and indexed OCI Search with OpenSearch data for the agents to use. This topic provides guidelines for this option, assuming you are already familiar with OCI Search with OpenSearch.
Creating a Cluster with a Management Instance

Contact the Beta program for a link and instructions to download an OCI Resource Manager Terraform stack that creates an OCI Search with OpenSearch cluster with a public management instance. Use the following guidelines:

  1. Select Terraform version 1.2.x.
  2. If you're using an OCI identity domain for authentication, in the Console, navigate to the domain section and copy the domain URL. For example, https://idcs-xxx.identity.oraclecloud.com:443

  3. If you're using a federation-based tenancy, in the Console, navigate to Federation and under Identity, select your identity provider. Get the OpenID URL by copying the IDCS URL. For example, https://idcs-xxxx.identity.oraclecloud.com
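The URL copied in either case is later used as the base of the openid_connect_url value in the OpenSearch security configuration, in the form <IDCS-URL>/.well-known/openid-configuration. A minimal sketch of that composition follows; openid_config_url is a hypothetical helper name, not part of any OCI tool.

```shell
# openid_config_url BASE_URL
# Hypothetical helper: appends the OpenID discovery path to an
# identity domain (IDCS) URL, tolerating trailing slashes.
openid_config_url() {
  local base="$1"
  # Strip trailing slashes so the resulting path has exactly one.
  while [ "${base%/}" != "$base" ]; do
    base="${base%/}"
  done
  printf '%s/.well-known/openid-configuration\n' "$base"
}

openid_config_url "https://idcs-xxx.identity.oraclecloud.com:443"
```

The printed value is what you would paste into the openid_connect_url field when configuring OpenID Connect for the cluster.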
Creating Indexes for OCI OpenSearch

After running the stack in the previous section, create indexes and add your data to OpenSearch with the following guidelines:

Important

To ingest large documents into OpenSearch, you must first chunk those documents into files with fewer than 10,000 tokens each.
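As a rough illustration of chunking, the sketch below splits a plain-text file into numbered files by word count. It rests on two assumptions that are not from this guide: the input is plain text (not pdf, html, or docx), and whitespace-separated words are only a stand-in for tokens. A tokenizer usually produces more tokens than words, so choose a word limit well below 10,000. chunk_by_words is a hypothetical helper name.

```shell
# chunk_by_words INPUT MAX_WORDS PREFIX
# Hypothetical helper: writes PREFIX-0001.txt, PREFIX-0002.txt, ... with at
# most MAX_WORDS whitespace-separated words in each file.
# ASSUMPTION: words approximate tokens; this does not run a real tokenizer.
chunk_by_words() {
  tr -s '[:space:]' '\n' < "$1" | awk -v max="$2" -v prefix="$3" '
    $0 == "" { next }                 # skip the empty record left by leading whitespace
    count % max == 0 {                # current chunk is full (or none exists yet)
      if (out != "") { printf "\n" > out; close(out) }
      out = sprintf("%s-%04d.txt", prefix, count / max + 1)
      sep = ""
    }
    { printf "%s%s", sep, $0 > out; sep = " "; count++ }
    END { if (out != "") printf "\n" > out }
  '
}

# Example (not executed here): chunk_by_words report.txt 5000 report-chunk
```

Line breaks inside a chunk are collapsed to single spaces, which is acceptable for a sketch but loses paragraph structure.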
  1. In the OCI Console, navigate to the details page of your stack. Click Stack resources, and copy the values for the following resources:
    • public_ip: The public IP address for the VM that was created by the stack.
    • opendashboard_private_ip: The private IP to use for the dashboard endpoint.
    • opensearch_private_ip: The private IP to use for the API endpoint.
    • private_key_pem: The private key to use for SSH access to the VM.
  2. Save the private_key_pem value to the management-instance-pk.pem file.
  3. Format the management-instance-pk.pem file:
    sed -i "" 's/\"//g' management-instance-pk.pem
    sed -i "" 's/\\n/\n/g' management-instance-pk.pem
    chmod 600 management-instance-pk.pem
  4. Sign in to the VM that was created with the stack in the previous section. Use SSH with the private key that was created during the Terraform stack creation. For example:
    ssh -C -N -v -t -L 127.0.0.1:5601:<your_OpenSearch_dashboards_private_IP>:5601 -L 127.0.0.1:9200:<your_OpenSearch_private_IP>:9200 opc@<your_VM_instance_public_IP> -i <path_to_your_private_key>
    

    Reference: Search and visualize data using OCI Search Service with OpenSearch

  5. Format the private key file:
    # Step 1: Create a private key file
    touch <<private-key-file-name>>.pem
     
    # Step 2: Edit File
    vim <<private-key-file-name>>.pem
    # Paste copied private key
    # Press Esc and type :wq to save the file
     
    # Step 3: Format Private Key File
    #To remove "
    sed -i "" 's/\"//g' <<private-key-file-name>>.pem
     
    #To replace \n with new line:
    sed -i "" 's/\\n/\n/g' <<private-key-file-name>>.pem
     
    # Step 4: check file content if formatted properly:
    cat <<private-key-file-name>>.pem
     
    # Step 5: Change Private key file access
    chmod 600 <<private-key-file-name>>.pem
  6. Ingest data into OpenSearch:
    # Step 1: SSH to the management instance that was created by the stack:
    ssh -i private_key.pem opc@<<management-instance-ip>>
     
    # Step 2: Create index
    curl -XPUT https://<<OpenSearch-cluster-ip>>:9200/<<index-name>> -u <<OpenSearch-cluster-username>>:<<password>> --insecure
    # e.g.: curl -XPUT https://254.125.0.0:9200/iaas -u pocuser:Poc@1234 --insecure
     
     
    # Step 3: Dump Data
     
    #copy files to management instance
    scp -i <<private_key.pem>> <<local-files-path>> opc@<<management-instance-ip>>:<<path>>
    # e.g.: scp -r -i private_key.pem ~/Documents/Setup/ opc@207.211.175.225:data   (-r is needed to copy a directory)
     
    # Using Script, file format supported: pdf, html, docx
    ingestor -f <<path-to-files-directory>> -ip <<OpenSearch-cluster-ip>> -po 9200 -u <<OpenSearch-cluster-username>> -pw <<password>> -i <<index-name>>
     
    ingestor -p <<object-storage-par-URL>> -ip <<OpenSearch-cluster-ip>> -po 9200 -u <<OpenSearch-cluster-username>> -pw <<password>> -i <<index-name>>
     
    # Using curl, file format supported: x-ndjson
    curl -H 'Content-Type: application/x-ndjson' -XPOST "https://<<OpenSearch-cluster-ip>>:9200/<<index-name>>/_bulk?pretty" --data-binary @<<file-name>>.json \
      -u <<OpenSearch-cluster-username>>:<<password>> --insecure
     
     
    # Step 4: Print Count
    curl -XGET https://<<OpenSearch-cluster-ip>>:9200/<<index-name>>/_count?pretty -u <<OpenSearch-cluster-username>>:<<password>> --insecure
     
     
    # Step 5 : Search data (Optional)
    curl -XGET https://<<OpenSearch-cluster-ip>>:9200/<<index-name>>/_search -u <<OpenSearch-cluster-username>>:<<password>> --insecure
    
  7. Configure OpenID Connect for the OpenSearch cluster:
    curl -XPUT "https://<<OpenSearch-cluster-ip>>:9200/_plugins/_security/api/securityconfig/config" -u <<OpenSearch-cluster-username>>:<<password>> --insecure -H 'Content-Type: application/json' -d'
    {
      "dynamic": {
        "security_mode": "ENFORCING",
        "http": {
          "anonymous_auth_enabled": false,
          "xff": {
            "enabled": false
          }
        },
        "authc": {
          "openid_auth_domain": {
                        "http_enabled": true,
                        "transport_enabled": true,
                        "order": 1,
                        "http_authenticator": {
                            "challenge": true,
                            "type": "openid",
                            "config": {
                                "subject_key": "sub",
                                "roles_key": "scope",
                                "openid_connect_url": "<<IDCS-URL>>/.well-known/openid-configuration"
                            }
                        },
                        "authentication_backend": {
                            "type": "noop",
                            "config": {}
                        },
                        "description": "Authenticate using OpenID connect"
                    },
                    "basic_internal_auth_domain": {
                        "http_enabled": true,
                        "transport_enabled": true,
                        "order": 0,
                        "http_authenticator": {
                            "challenge": false,
                            "type": "basic",
                            "config": {}
                        },
                        "authentication_backend": {
                            "type": "intern",
                            "config": {}
                        },
                        "description": "Authenticate via HTTP Basic against internal users database"
                    }
        },
        "authz": null
      }
    }'
    Note

    Ensure that you add the correct openid_connect_url value.
  8. Reset the OpenSearch cluster password:

    In the OCI Console, navigate to the list of OpenSearch clusters and select your cluster. In the Security Information tab, click Update security information and update the password.
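The key-formatting commands in steps 3 and 5 use the `sed -i ""` form, which is BSD/macOS syntax; GNU sed on Linux rejects it, and handling of `\n` in a sed replacement also varies between implementations. The following is a sketch of an alternative that avoids in-place editing by writing to a temporary file. format_pem_key is a hypothetical helper name.

```shell
# format_pem_key FILE
# Hypothetical helper: removes double quotes, turns each literal "\n"
# sequence into a real line break, and restricts the file to its owner.
format_pem_key() {
  local key_file="$1" tmp_file
  tmp_file="$(mktemp)" || return 1
  # sed drops the quotes; awk replaces backslash-n pairs with newlines.
  if sed 's/"//g' "$key_file" | awk '{ gsub(/\\n/, "\n"); print }' > "$tmp_file"; then
    mv "$tmp_file" "$key_file" && chmod 600 "$key_file"
  else
    rm -f "$tmp_file"
    return 1
  fi
}

# Example (not executed here): format_pem_key management-instance-pk.pem
```

After running it, `cat` the file to confirm that the key spans multiple lines, as in step 5.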

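For the curl-based ingestion in step 6, the _bulk endpoint expects newline-delimited JSON in which every document line is preceded by an action line, and the body must end with a newline. The sketch below builds such a file from an input that holds one JSON document per line; that input layout and the to_bulk_ndjson name are assumptions for illustration.

```shell
# to_bulk_ndjson DOCS_FILE
# Hypothetical helper: prints a bulk request body on stdout. Each non-empty
# input line (one JSON document) is preceded by an "index" action line.
# The index name is omitted from the action because it is in the URL path.
to_bulk_ndjson() {
  awk 'NF > 0 { print "{\"index\":{}}"; print }' "$1"
}

# Example (not executed here):
#   to_bulk_ndjson docs.jsonl > bulk.json
```

The resulting file can be passed to the curl command with `--data-binary @bulk.json`; unlike `-d`, that option keeps the line breaks the bulk format depends on.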
Creating a Secret

Before you add your Search with OpenSearch data to a knowledge base in Generative AI Agents, you must create a secret for OpenSearch in the OCI Vault service.

To create a secret for basic authentication and then use that secret for the knowledge base, perform the steps in the first of the following two sections. To create a secret for Identity Cloud Service (IDCS), use the guidelines in the second section. Follow only one of the two sections.

Creating a Vault Secret for Basic Authentication Scenario
  1. In the OCI Console, create a vault.
  2. After the vault is active, create a key for the vault.
  3. For the vault, create a secret with the following specifics:
    • Select the key that you created in the previous step.
    • Manually enter the username and password for the OpenSearch cluster with the following format:
      • Secret Type Template: Plain-Text
      • Secret Contents: <OpenSearch-username>:<OpenSearch-password>
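A small sketch of composing the plain-text secret contents in the username:password format described above. The check that rejects a colon in the username is an assumption (a colon there would make the split ambiguous), and basic_secret_contents is a hypothetical helper name.

```shell
# basic_secret_contents USERNAME PASSWORD
# Hypothetical helper: prints the secret contents as USERNAME:PASSWORD.
# ASSUMPTION: the username must not contain ':' so the value splits cleanly.
basic_secret_contents() {
  local username="$1" password="$2"
  case "$username" in
    ''|*:*)
      echo "username must be non-empty and must not contain ':'" >&2
      return 1
      ;;
  esac
  printf '%s:%s' "$username" "$password"
}

# Example (not executed here): basic_secret_contents "<OpenSearch-username>" "<OpenSearch-password>"
```

Paste the printed value into the Secret Contents field; avoid leaving it in shell history or log files.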
Creating a Vault Secret for Identity Cloud Service (IDCS)

Creating a confidential application

Create a confidential application if you don't have one:

  1. In the IDCS Console, navigate to Applications, click Add application, and select Confidential Application.
  2. Create a resource server application with agent-endpoint as the primary audience by adding the following values in the Configure OAuth step:
    • Select Configure this application as a resource server now.
    • Access token expiration: 3600
    • Primary audience: https://agent-endpoint/
  3. Create a client application with agent-endpoint as the redirect URL by selecting the following options in the Authorization section:
    • Resource Owner
    • Client Credentials
    • JWT assertion
    • Refresh token
    • Authorization code
    • TLS client authentication

    For Redirect URL, enter https://agent-endpoint/.

  4. Click Finish to create the application.
  5. After the application is created, in the application's details page, click Activate.
  6. For the resource server application, edit the OAuth configuration and select Add scope. Then add the scope, genaiagent.
  7. Edit the client application and select Add resources. Under Resources, click Add scope and select the resource server application that you created. The Scope field displays https://agent-endpoint/genaiagent

Setting Up the OpenSearch OpenID

Reference: Configuring OpenID Connect for an OpenSearch Cluster

# Step 1: Add OpenID Config
curl -XPUT "https://<opensearch-ip>:9200/_plugins/_security/api/securityconfig/config" -u username:password --insecure -H 'Content-Type: application/json' -d'
{
  "dynamic": {
    "security_mode": "ENFORCING",
    "http": {
      "anonymous_auth_enabled": false,
      "xff": {
        "enabled": false
      }
    },
    "authc": {
      "openid_auth_domain": {
                    "http_enabled": true,
                    "transport_enabled": true,
                    "order": 0,
                    "http_authenticator": {
                        "challenge": false,
                        "type": "openid",
                        "config": {
                            "subject_key": "sub",
                            "roles_key": "scope",
                            "openid_connect_url": "https://<openid-domain-host>/.well-known/openid-configuration"
                        }
                    },
                    "authentication_backend": {
                        "type": "noop",
                        "config": {}
                    },
                    "description": "Authenticate using OpenId connect"
                },
                "basic_internal_auth_domain": {
                    "http_enabled": true,
                    "transport_enabled": true,
                    "order": 1,
                    "http_authenticator": {
                        "challenge": true,
                        "type": "basic",
                        "config": {}
                    },
                    "authentication_backend": {
                        "type": "intern",
                        "config": {}
                    },
                    "description": "Authenticate via HTTP Basic against internal users database"
                }
    },
    "authz": null
  }
}'
  
  
# Step 5: Create Readonly Role
curl -XPUT "https://<opensearch-ip>:9200/_plugins/_security/api/roles/genaiagent_readall" -u username:password --insecure -H 'Content-Type: application/json' -d'{
  "description": "Role to be used by Generative AI Agent having read only permission to all Indexes",
  "cluster_permissions": [
    "cluster_composite_ops_ro"
  ],
  "index_permissions": [{
    "index_patterns": [
      "*"
    ],
    "fls": [],
    "masked_fields": [],
    "allowed_actions": [
      "read"
    ]
  }],
  "tenant_permissions": []
 }'
  
# Step 6: Add role mapping for genaiagent_readall
curl -XPUT "https://<opensearch-ip>:9200/_plugins/_security/api/rolesmapping/genaiagent_readall" -u username:password --insecure -H 'Content-Type: application/json' -d'{
    "backend_roles" : [ "genaiagent" ],
    "hosts" : [],
    "users" : []
  }'
  
# Step 7: Test Access
curl --location 'https://<domain-host>/oauth2/v1/token' \
--header 'authorization: Basic <Base64 clientId:clientSecret>' \
--header 'content-type: application/x-www-form-urlencoded; charset=utf-8' \
--data 'grant_type=client_credentials&scope=<scope>'
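
The placeholders in the test request can be filled in as sketched below. The scope is the primary audience followed by the scope name, as shown in the Scope field earlier, and the authorization header carries the Base64 encoding of clientId:clientSecret. All three function names are hypothetical helpers, and request_token is only defined here, not called.

```shell
# full_scope AUDIENCE SCOPE_NAME
# Joins the primary audience and the scope name into the full scope.
full_scope() {
  printf '%s%s' "$1" "$2"
}

# basic_auth_value CLIENT_ID CLIENT_SECRET
# Prints the Base64 form of CLIENT_ID:CLIENT_SECRET on a single line.
basic_auth_value() {
  # base64 may wrap long output, so strip any line breaks it adds.
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# request_token DOMAIN_HOST CLIENT_ID CLIENT_SECRET SCOPE
# Sends the client-credentials request from Step 7 (needs network access).
request_token() {
  curl --silent --show-error --location "https://$1/oauth2/v1/token" \
    --header "authorization: Basic $(basic_auth_value "$2" "$3")" \
    --header 'content-type: application/x-www-form-urlencoded; charset=utf-8' \
    --data-urlencode 'grant_type=client_credentials' \
    --data-urlencode "scope=$4"
}

full_scope "https://agent-endpoint/" "genaiagent"; echo
```

A call such as `request_token <domain-host> <clientId> <clientSecret> "$(full_scope https://agent-endpoint/ genaiagent)"` should return a JSON body that contains an access token if the applications and scope are set up as described.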

Creating a Vault secret for IDCS client credential

  1. In the OCI Console, create a vault.
  2. After the vault is active, create a key for the vault.
  3. For the vault, create a secret with the following specifics:
    • Select the key that you created in the previous step.
    • Manually enter the IDCS client secret for the OpenSearch cluster with the following format:
      • Secret Type Template: Plain-Text
      • Secret Contents: clientSecret