Updating an Existing Model Deployment with Autoscaling

Learn how to enable autoscaling for an existing model deployment or update any existing autoscaling configuration.

For a model deployment in an Active state, changes to the Autoscaling Scaling Policy fields must be submitted on their own, without simultaneous changes to other configuration. However, updates to fields such as Display name, Description, Tags, and other fields unrelated to infrastructure are allowed.

In contrast, when the model deployment is Inactive, you can change all options at the same time.
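The rule above can be sketched as a small pre-submit check. This is a minimal illustration only: the field labels (`scaling_policy`, `display_name`, and so on) are hypothetical names chosen for this sketch, not API identifiers, and it assumes that the metadata fields may accompany a scaling policy change on an Active deployment. Other non-infrastructure fields mentioned above are not modeled.

```python
# Hypothetical field labels for this sketch; they are not API identifiers.
METADATA_FIELDS = {"display_name", "description", "tags"}
SCALING_POLICY_FIELD = "scaling_policy"


def scaling_update_conflicts(state, changed_fields):
    """Return the changed fields that should not accompany a scaling policy change.

    An empty set means this sketch sees no conflict with the scaling policy rule.
    """
    changed = set(changed_fields)
    # An Inactive deployment accepts all option changes at once.
    if state.upper() == "INACTIVE":
        return set()
    # The rule only concerns updates that touch the scaling policy.
    if SCALING_POLICY_FIELD not in changed:
        return set()
    # On an Active deployment, only metadata may ride along (assumption).
    return changed - METADATA_FIELDS - {SCALING_POLICY_FIELD}
```

A non-empty result suggests splitting the update into separate submissions, or making the change while the deployment is Inactive.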

    1. On the Projects list page, select the project that contains the model deployments that you want to work with. If you need help finding the list page or the project, see Listing Projects.
    2. On the project details page, select Model deployments.
      All model deployments in the selected project are displayed in a table.
    3. Find the model deployment that you want to work with.
    4. From the Actions menu (three dots) for the model deployment, select Edit.
    5. On the Edit model deployment page, enter the following information.
      Important

      Submit each option change before changing another option, repeating these steps for each change.

      For field descriptions, see Creating a Model Deployment.

      • Autoscaling configuration (Optional): Select Enable autoscaling and enter the following information.
        • Minimum number of instances
        • Maximum number of instances
        • Cooldown period in seconds
        • Scaling metric type

          To use the custom scaling metric option, select Custom, and then specify the scale-in and scale-out queries.

          Important

          Include the following text in each MQL query to reference the resource OCID: {resourceId = "MODEL_DEPLOYMENT_OCID"}
        • Scale-in threshold in percentage
        • Scale-out threshold in percentage
        • Advanced options (Optional): Autoscale the load balancer. Set the maximum bandwidth to a value greater than the minimum bandwidth and no more than twice the minimum bandwidth.
          • Scale-in instance count step
          • Scale-out instance count step
    6. Select Submit.
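Two of the constraints in the steps above lend themselves to a quick check before submitting: the load balancer bandwidth range and the resource OCID clause required in custom MQL queries. The following is a minimal sketch with hypothetical helper names. The OCID clause check is an exact substring match, so a query with different spacing inside the braces would be reported as missing the clause.

```python
def bandwidth_range_ok(minimum_mbps, maximum_mbps):
    """Check that maximum bandwidth is more than the minimum and at most twice it."""
    return minimum_mbps < maximum_mbps <= 2 * minimum_mbps


def has_resource_filter(query, deployment_ocid):
    """Check that an MQL query contains the clause {resourceId = "<OCID>"}."""
    expected = '{resourceId = "' + deployment_ocid + '"}'
    return expected in query
```

For example, a minimum bandwidth of 10 with a maximum of 20 passes the range check, while a maximum of 21 does not.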
  • Use the oci data-science model-deployment update command and required parameters to edit (update) a model deployment:

    oci data-science model-deployment update --model-deployment-id <model-deployment-id>... [OPTIONS]

    For example, update a deployment with:

    oci data-science model-deployment update \
    --model-deployment-id <MODEL_DEPLOYMENT_OCID> \
    --model-deployment-configuration-details file://<MODEL_DEPLOYMENT_CONFIGURATION_FILE>

    Use a model deployment JSON configuration file such as the following for the update, changing the fields under the AUTOSCALING scaling policy as appropriate:

    {
      "deploymentType": "SINGLE_MODEL",
      "modelConfigurationDetails": {
        "modelId": "ocid1.datasciencemodel....",
        "scalingPolicy": {
          "policyType": "AUTOSCALING",
          "coolDownInSeconds": 650,
          "isEnabled": true,
          "autoScalingPolicies": [
            {
              "autoScalingPolicyType": "THRESHOLD",
              "initialInstanceCount": 1,
              "maximumInstanceCount": 2,
              "minimumInstanceCount": 1,
              "rules": [
                {
                  "metricExpressionRuleType": "PREDEFINED_EXPRESSION",
                  "metricType": "CPU_UTILIZATION",
                  "scaleInConfiguration": {
                    "scalingConfigurationType": "THRESHOLD",
                    "threshold": "10"
                  },
                  "scaleOutConfiguration": {
                    "scalingConfigurationType": "THRESHOLD",
                    "threshold": "65"
                  }
                }
              ]
            }
          ]
        },
        "bandwidthMbps": 10,
        "maximumBandwidthMbps": 20
      }
    }

    For a complete list of parameters and values for CLI commands, see the CLI Command Reference.
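If the configuration file is generated rather than edited by hand, a script along the following lines could produce it. This is a sketch using only the Python standard library: the function names and parameters are hypothetical, the output mirrors the shape of the JSON example above (including thresholds written as strings and the CPU_UTILIZATION metric), and setting `initialInstanceCount` to the minimum instance count is an assumption of this sketch.

```python
import json


def build_autoscaling_config(model_id, min_instances, max_instances,
                             cooldown_seconds, scale_in_threshold,
                             scale_out_threshold, bandwidth_mbps,
                             max_bandwidth_mbps):
    """Build a configuration dict shaped like the JSON example above."""
    rule = {
        "metricExpressionRuleType": "PREDEFINED_EXPRESSION",
        "metricType": "CPU_UTILIZATION",
        "scaleInConfiguration": {
            "scalingConfigurationType": "THRESHOLD",
            # Thresholds are written as strings to match the example file.
            "threshold": str(scale_in_threshold),
        },
        "scaleOutConfiguration": {
            "scalingConfigurationType": "THRESHOLD",
            "threshold": str(scale_out_threshold),
        },
    }
    policy = {
        "autoScalingPolicyType": "THRESHOLD",
        # Assumption: start at the minimum instance count.
        "initialInstanceCount": min_instances,
        "maximumInstanceCount": max_instances,
        "minimumInstanceCount": min_instances,
        "rules": [rule],
    }
    return {
        "deploymentType": "SINGLE_MODEL",
        "modelConfigurationDetails": {
            "modelId": model_id,
            "scalingPolicy": {
                "policyType": "AUTOSCALING",
                "coolDownInSeconds": cooldown_seconds,
                "isEnabled": True,
                "autoScalingPolicies": [policy],
            },
            "bandwidthMbps": bandwidth_mbps,
            "maximumBandwidthMbps": max_bandwidth_mbps,
        },
    }


def write_config(path, config):
    """Write the configuration to a file that the CLI can read with file://."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
```

The resulting file path can then be passed to the `--model-deployment-configuration-details` parameter with the `file://` prefix, as in the command above.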

  • Use the UpdateModelDeployment operation to edit (update) a model deployment.