Use Custom Networking

Create a model deployment with the custom networking option.

The workload is attached to a customer-managed VCN and subnet by using a secondary VNIC. The subnet can be configured for egress to the public internet through a NAT or internet gateway.

To use custom egress, you must add a policy that gives the Data Science service access to the subnet:

    allow service datascience to use virtual-network-family in compartment <subnet_compartment>

For custom egress, the subnet must have at least 127 IP addresses available.

You can create and run custom networking model deployments using the Console, the OCI Python SDK, the OCI CLI, or the Data Science API.

    1. Use the Console to sign in to a tenancy with the necessary policies.
    2. Open the navigation menu and click Analytics & AI. Under Machine Learning, click Data Science.
    3. Select the compartment that contains the project that you want to create the model deployment in.

      All projects in the compartment are listed.

    4. Click the name of the project.

      The project details page opens and lists the notebook sessions.

    5. Under Resources, click Model deployments.

      A tabular list of model deployments in the project is displayed.

    6. Click Create model deployment.
    7. (Optional) Enter a unique name for the model deployment (limit of 255 characters). If you don't provide a name, a name is generated automatically.

      For example, modeldeployment20200108222435.

    8. (Optional) Enter a description (limit of 400 characters) for the model deployment.
    9. (Optional) Under Default configuration, enter a custom environment variable key and corresponding value. Click + Additional custom environment key to add more environment variables.
    10. In the Models section, click Select to select an active model to deploy from the model catalog.
      1. Find a model by using the default compartment and project, or by clicking Using OCID and searching for the model by entering its OCID.
      2. Select the model.
      3. Click Submit.
      Important

      Model artifacts that exceed 400 GB aren't supported for deployment. Select a smaller model artifact for deployment.
    11. (Optional) Change the Compute shape by clicking Change shape. Then, follow these steps in the Select compute panel.
      1. Select an instance type.
      2. Select a shape series.
      3. Select one of the supported Compute shapes in the series.
      4. Select the shape that best suits how you want to use the resource. For the AMD shape, you can use the default or set the number of OCPUs and memory.

        For each OCPU, select up to 64 GB of memory and a maximum total of 512 GB. The minimum amount of memory allowed is either 1 GB or a value matching the number of OCPUs, whichever is greater.

      5. Click Select shape.
    12. Enter the number of instances to replicate the model across.
    13. Select Custom networking to configure the network type.

      Select the VCN and subnet that you want to use for the model deployment.

      If you don't see the VCN or subnet that you want to use, click Change Compartment, and then select the compartment that contains the VCN or subnet.
      Note

      You can change from default networking to custom networking. However, after custom networking is selected, you can't switch back to default networking.
    14. (Optional) If you configured access or predict logging, in the Logging section, click Select and then follow these steps:
      1. For access logs, select a compartment, log group, and log name.
      2. For predict logs, select a compartment, log group, and log name.
      3. Click Submit.
    15. (Optional) Click Show Advanced Options to configure the serving mode, load balancing bandwidth, a custom container, and tags.
      1. (Optional) Select the serving mode for the model deployment, either as an HTTPS endpoint or using a Streaming service stream.
      2. (Optional) Select the load balancing bandwidth in Mbps or use the 10 Mbps default.

        Tips for load balancing:

        If you know the common payload size and the frequency of requests per second, you can use the following formula to estimate the bandwidth of the load balancer that you need. We recommend that you add an extra 20% to account for estimation errors and sporadic peak traffic.

        (Payload size in KB) * (Estimated requests per second) * 8 / 1024

        For example, if the payload is 1,024 KB and you estimate 120 requests per second, then the recommended load balancer bandwidth would be (1024 * 120 * 8 / 1024) * 1.2 = 1152 Mbps.

        Remember that the maximum supported payload size is 10 MB when dealing with image payloads.

        If a request exceeds the allocated load balancer bandwidth, it's rejected with a 429 (Too Many Requests) status code.

      3. (Optional) Select Use a custom container image and enter the following:
        • Repository in <tenancy>: The repository that contains the custom image.

        • Image: The custom image to use in the model deployment at runtime.

        • CMD: More commands to run when the container starts. Add one instruction per text box. For example, if CMD is ["--host", "0.0.0.0"], enter --host in one text box and 0.0.0.0 in another. Don't use quotation marks.

        • Entrypoint: One or more entry point files to run when the container starts. For example, /opt/script/entrypoint.sh. Don't use quotation marks.

        • Server port: The port on which the web server serving the inference is running. The default is 8080. The port can be any value between 1024 and 65535. Don't use ports 24224, 8446, or 8447.

        • Health check port: The port that the container HEALTHCHECK listens on. Defaults to the server port. The port can be any value between 1024 and 65535. Don't use ports 24224, 8446, or 8447.

      4. (Optional) Click the Tags tab, and then enter the tag namespace (for a defined tag), key, and value to assign tags to the resource.

        To add more than one tag, click Add tag.

        Tagging describes the various tags that you can use to organize and find resources, including cost-tracking tags.

    16. Click Create.
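The sizing rules in the steps above (load balancer bandwidth, flexible shape memory, and container ports) can be sketched as simple checks. This is an illustrative sketch, not part of any OCI SDK; the function names are hypothetical, and the limits come directly from the steps above:

```python
# Illustrative helpers for the sizing rules in the console steps above.
# Function names are hypothetical; the limits come from the documentation text.

RESERVED_PORTS = {24224, 8446, 8447}  # ports that can't be used by the container


def estimate_bandwidth_mbps(payload_kb: float, requests_per_second: float,
                            headroom: float = 1.2) -> float:
    """Load balancer bandwidth estimate with 20% headroom by default."""
    return payload_kb * requests_per_second * 8 / 1024 * headroom


def valid_flex_memory(ocpus: int, memory_gb: int) -> bool:
    """Flexible-shape memory rules: at most 64 GB per OCPU and 512 GB total;
    the minimum is 1 GB or one GB per OCPU, whichever is greater."""
    return max(1, ocpus) <= memory_gb <= min(64 * ocpus, 512)


def valid_container_port(port: int) -> bool:
    """Server and health check ports must be 1024-65535 and not reserved."""
    return 1024 <= port <= 65535 and port not in RESERVED_PORTS


# Worked example from the tips: 1,024 KB payload at 120 requests per second.
print(estimate_bandwidth_mbps(1024, 120))  # prints 1152.0
```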
  • You can use the OCI CLI to create a model deployment, as in the following example.

    1. Deploy the model with:
      oci data-science model-deployment create \
      --compartment-id <MODEL_DEPLOYMENT_COMPARTMENT_OCID> \
      --model-deployment-configuration-details file://<MODEL_DEPLOYMENT_CONFIGURATION_FILE> \
      --project-id <PROJECT_OCID> \
      --category-log-details file://<OPTIONAL_LOGGING_CONFIGURATION_FILE> \
      --display-name <MODEL_DEPLOYMENT_NAME>
    2. Use this model deployment JSON configuration file:
      {
        "deploymentType": "SINGLE_MODEL",
        "modelConfigurationDetails": {
          "bandwidthMbps": <YOUR_BANDWIDTH_SELECTION>,
          "instanceConfiguration": {
            "subnetId": "<YOUR_SUBNET_ID>",
            "instanceShapeName": "<YOUR_VM_SHAPE>"
          },
          "modelId": "<YOUR_MODEL_OCID>",
          "scalingPolicy": {
            "instanceCount": <YOUR_INSTANCE_COUNT>,
            "policyType": "FIXED_SIZE"
          }
        }
      }

      If you're specifying an environment configuration, you must include the environmentConfigurationDetails object as in this example:

      
      {
        "modelDeploymentConfigurationDetails": {
          "deploymentType": "SINGLE_MODEL",
          "modelConfigurationDetails": {
            "modelId": "ocid1.datasciencemodel.oc1.iad........",
            "instanceConfiguration": {
              "subnetId": <YOUR_SUBNET_ID>,
              "instanceShapeName": "VM.Standard.E4.Flex",
              "modelDeploymentInstanceShapeConfigDetails": {
                "ocpus": 1,
                "memoryInGBs": 16
              }
            },
            "scalingPolicy": {
              "policyType": "FIXED_SIZE",
              "instanceCount": 1
            },
            "bandwidthMbps": 10
          },
          "environmentConfigurationDetails" : {
            "environmentConfigurationType": "OCIR_CONTAINER",
            "image": "iad.ocir.io/testtenancy/image_name:1.0.0",
            "entrypoint": [
              "python",
              "/opt/entrypoint.py"
            ],
            "serverPort": "5000",
            "healthCheckPort": "5000"
          },
          "streamConfigurationDetails": {
            "inputStreamIds": null,
            "outputStreamIds": null
          }
        }
      }
    3. (Optional) Use this logging JSON configuration file:
      {
          "access": {
            "logGroupId": "<YOUR_LOG_GROUP_OCID>",
            "logId": "<YOUR_LOG_OCID>"
          },
          "predict": {
            "logGroupId": "<YOUR_LOG_GROUP_OCID>",
            "logId": "<YOUR_LOG_OCID>"
          }
      }
    4. (Optional) To deploy with a custom container, run the same create command with a model deployment configuration file that includes the environmentConfigurationDetails object:
      oci data-science model-deployment create \
      --compartment-id <MODEL_DEPLOYMENT_COMPARTMENT_OCID> \
      --model-deployment-configuration-details file://<MODEL_DEPLOYMENT_CONFIGURATION_FILE> \
      --project-id <PROJECT_OCID> \
      --category-log-details file://<OPTIONAL_LOGGING_CONFIGURATION_FILE> \
      --display-name <MODEL_DEPLOYMENT_NAME>
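The configuration file passed with --model-deployment-configuration-details can also be generated programmatically. A minimal Python sketch, mirroring the JSON shown above; every OCID and the shape name are placeholders you must replace:

```python
import json

# Build the model deployment configuration shown above.
# All OCID values and the shape name are placeholders; substitute your own.
config = {
    "deploymentType": "SINGLE_MODEL",
    "modelConfigurationDetails": {
        "bandwidthMbps": 10,
        "instanceConfiguration": {
            "subnetId": "ocid1.subnet.oc1..<unique_id>",  # custom networking subnet
            "instanceShapeName": "VM.Standard.E4.Flex",
        },
        "modelId": "ocid1.datasciencemodel.oc1..<unique_id>",
        "scalingPolicy": {"instanceCount": 1, "policyType": "FIXED_SIZE"},
    },
}

# Write the file referenced by --model-deployment-configuration-details file://...
with open("model_deployment_config.json", "w") as f:
    json.dump(config, f, indent=2)
```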
  • Use the CreateModelDeployment operation to create a model deployment with custom networking. Set the subnet ID as described in the Instance Configuration API documentation.
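A CreateModelDeployment request body can be assembled from the same fields. This is a sketch assuming the camelCase field names used in the JSON examples above; all OCIDs and the display name are placeholders, and a real request must be signed with OCI request signing (for example, via one of the OCI SDKs):

```python
import json

# Sketch of a CreateModelDeployment request body with custom networking.
# Field names follow the JSON configuration examples above; every OCID
# and the display name below are placeholders.
body = {
    "displayName": "my-custom-networking-deployment",
    "projectId": "ocid1.datascienceproject.oc1..<unique_id>",
    "compartmentId": "ocid1.compartment.oc1..<unique_id>",
    "modelDeploymentConfigurationDetails": {
        "deploymentType": "SINGLE_MODEL",
        "modelConfigurationDetails": {
            "modelId": "ocid1.datasciencemodel.oc1..<unique_id>",
            "instanceConfiguration": {
                # Setting subnetId selects custom networking for the deployment.
                "subnetId": "ocid1.subnet.oc1..<unique_id>",
                "instanceShapeName": "VM.Standard.E4.Flex",
            },
            "scalingPolicy": {"policyType": "FIXED_SIZE", "instanceCount": 1},
            "bandwidthMbps": 10,
        },
    },
}

payload = json.dumps(body, indent=2)  # serialized request body
```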