Creating a Private Endpoint

Create a private endpoint for a custom or pretrained model on a hosting dedicated AI cluster in OCI Generative AI.

Note

Before you create a private endpoint, complete the Prerequisites for Private Endpoints and have the following details ready:

  • The name of the Virtual Cloud Network (VCN)
  • The name of the private subnet in the VCN
  • (Optional, for Zero Trust Packet Routing (ZPR)) The security attribute namespace, key, and value that you plan to assign to the endpoint, and a ZPR policy that allows traffic to the endpoint.

Caution

If you assign a ZPR security attribute to the private endpoint, access to the endpoint requires an explicit ZPR policy allow rule. Otherwise, traffic can be blocked even if your route rules, NSGs, and security lists allow it.

By default, a tenancy is limited to 5 private endpoints. To create more Generative AI private endpoints, request an increase to the private-endpoint-count limit for the Generative AI service.

  • On the Private Endpoints list page, select Create private endpoint. If you need help finding the list page, see Listing Private Endpoints.

    General Information

    1. Select a compartment to create the private endpoint in. The default is the compartment that's selected on the list page, but you can select any compartment that you have permission to work in.
      We recommend that you create the private endpoint in the same compartment as the model that will use this endpoint.
    2. (Optional) Enter a name for the endpoint. Start the name with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be 1 to 255 characters. If you don't enter a name, the system generates a name that you can change later.
      The generated name has the format generativeaiprivateendpoint<timestamp>. Example: generativeaiprivateendpoint20250929212918
    3. (Optional) Enter a description for the private endpoint.
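
The naming rule in step 2 can be checked before you open the create dialog. The following is a minimal sketch, not the service's own validation. The function names are hypothetical, "letters" is assumed to mean ASCII letters, and the timestamp layout (YYYYMMDDHHMMSS) is inferred from the single example above.

```python
import re
from datetime import datetime

# Rule from step 2: the first character is a letter or underscore, the rest are
# letters, numbers, hyphens, or underscores, and the total length is 1 to 255.
# Assumption: "letters" means ASCII letters only.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_endpoint_name(name: str) -> bool:
    """Return True if the name satisfies the documented naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None


def default_endpoint_name(now: datetime) -> str:
    """Mimic the generated name format generativeaiprivateendpoint<timestamp>.

    Assumption: the timestamp is YYYYMMDDHHMMSS, inferred from the example
    generativeaiprivateendpoint20250929212918.
    """
    return "generativeaiprivateendpoint" + now.strftime("%Y%m%d%H%M%S")
```

For example, `is_valid_endpoint_name("1endpoint")` is rejected because the name starts with a number, while a name produced by `default_endpoint_name` passes the same check.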

    VCN and Subnet

    Select the following information:

    • VCN compartment
    • VCN
    • Private subnet compartment
    • Private subnet

    DNS and Network Security Groups

    1. Enter a Domain Name System (DNS) prefix for the fully qualified domain name (FQDN).
      A preview displays the FQDN with this DNS prefix. An FQDN is the complete, unique name for a network resource and resolves to a specific IP address. For example:
      <DNS-prefix>.pe.inference.generativeai.us-chicago-1.oci.oraclecloud.com
    2. (Optional) To add one or more network security groups, select Add network security group for each group that you want to add.
      Learn about Security Rules.
    3. (Optional) Select a network security group from the list.
    4. (Optional) Add more network security groups.
    5. (Optional) Expand Show security attributes, and then expand the Tags option that appears for the security attributes.
    6. Select Add security attribute.
    7. Enter the following information:
      • Security attribute namespace
      • Security attribute key
      • Security attribute value
    8. Select Add security attribute to add more attributes (up to 3 total).

      If you have permissions to create a resource, then you might also have permissions to add security attributes to that resource. To add a security attribute, you must have permissions to use the security attribute namespace. For more information about security attributes and security attribute namespaces, see Zero Trust Packet Routing. If you're not sure whether to add security attributes, skip this option or ask an administrator. You can add security attributes later.

      Note

      To avoid unintentionally blocking access, ensure the ZPR policies are defined to allow the intended traffic flow to the endpoint before using the endpoint in production. See Prerequisites.

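
Because each security attribute needs a namespace, key, and value, and at most 3 attributes can be added, it can help to check the planned attributes before entering them. The following is a minimal sketch of such a local check. The class and function names are hypothetical and aren't part of any OCI SDK, and the check doesn't verify that a matching ZPR policy exists.

```python
from dataclasses import dataclass

# Limit stated in the steps above: up to 3 security attributes in total.
MAX_SECURITY_ATTRIBUTES = 3


@dataclass(frozen=True)
class SecurityAttribute:
    """One planned attribute (hypothetical helper type, not an SDK model)."""
    namespace: str
    key: str
    value: str


def check_security_attributes(attrs: list[SecurityAttribute]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(attrs) > MAX_SECURITY_ATTRIBUTES:
        problems.append(
            f"{len(attrs)} attributes given, at most {MAX_SECURITY_ATTRIBUTES} allowed"
        )
    for position, attr in enumerate(attrs, start=1):
        for field_name in ("namespace", "key", "value"):
            if not getattr(attr, field_name).strip():
                problems.append(f"attribute {position}: {field_name} is empty")
    return problems
```

An empty result only means that the entries are complete and within the limit. Traffic to the endpoint still depends on a ZPR policy that explicitly allows it.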

    Use this Endpoint for On-Demand Models

    By default, this private endpoint is available for models hosted on dedicated AI clusters. If you want this endpoint to also be available for on-demand models that are offered in the Generative AI service, perform these steps:
    1. Select Allow Usage In On-Demand Mode.
    2. See the Tip at the end of this section for how to reach the on-demand model.
    Important

    To access a Generative AI model through this private endpoint, see Adding a Model to a Private Endpoint.

    Create the Endpoint

    1. (Optional) Select Add tag and assign tags to this private endpoint. See Resource Tags.
    2. Select Create.
    Tip

    To reach an on-demand model through this private endpoint, create a Compute instance in the private subnet that's allocated for the private endpoint, add your code to that instance, and access the model from the instance by using the FQDN for the private endpoint.
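
Code that runs on the Compute instance needs the FQDN of the private endpoint. The following is a minimal sketch that builds the FQDN from the DNS prefix, following the pattern shown in the Console preview. The function name is hypothetical, and substituting a region other than us-chicago-1 into the pattern is an assumption, so confirm the actual FQDN in the Console.

```python
def private_endpoint_fqdn(dns_prefix: str, region: str = "us-chicago-1") -> str:
    """Build the FQDN in the form shown in the Console preview.

    Pattern from the documentation example:
        <DNS-prefix>.pe.inference.generativeai.us-chicago-1.oci.oraclecloud.com
    Assumption: other regions follow the same pattern with their own
    region identifier in place of us-chicago-1.
    """
    if not dns_prefix:
        raise ValueError("dns_prefix must not be empty")
    return f"{dns_prefix}.pe.inference.generativeai.{region}.oci.oraclecloud.com"


if __name__ == "__main__":
    # "myprefix" is a placeholder DNS prefix used only for illustration.
    print(private_endpoint_fqdn("myprefix"))
```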
  • Use the generative-ai-private-endpoint create command and required parameters to create a private endpoint:

    oci generative-ai generative-ai-private-endpoint create [OPTIONS]

    For a complete list of parameters and values for CLI commands, see the CLI Command Reference.

    Note

    For pretrained models, instead of an OCID, you can use the model name exactly as listed in the Console's playground. You can also find this OCI model name on the model's detail page in Offered Pretrained Foundational Models in Generative AI.
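
As a rough illustration of how the create command might be assembled in a script, the following sketch builds the argument list without running it. The option names (--compartment-id, --subnet-id, --dns-prefix, --display-name) are assumptions based on the details that the Console asks for, and the function name is hypothetical. Verify every option against the CLI Command Reference before use.

```python
import shlex
from typing import Optional


def build_create_command(
    compartment_id: str,
    subnet_id: str,
    dns_prefix: str,
    display_name: Optional[str] = None,
) -> list[str]:
    """Assemble the CLI arguments for creating a private endpoint.

    Assumption: the option names below mirror the Console fields; they are
    not taken from this page and must be checked in the CLI Command Reference.
    """
    command = [
        "oci", "generative-ai", "generative-ai-private-endpoint", "create",
        "--compartment-id", compartment_id,
        "--subnet-id", subnet_id,
        "--dns-prefix", dns_prefix,
    ]
    if display_name is not None:
        command += ["--display-name", display_name]
    return command


if __name__ == "__main__":
    # Placeholder values; replace them with your own OCIDs and DNS prefix.
    args = build_create_command("<compartment-OCID>", "<subnet-OCID>", "<DNS-prefix>")
    print(shlex.join(args))
```

Building the command as a list, rather than as one string, keeps each value a separate argument if you later pass the list to a process launcher.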
  • Run the CreateGenerativeAiPrivateEndpoint operation to create a private endpoint.