Creating a Model Deployment with Autoscaling Using a Custom Metric
Learn how to create a model deployment with an autoscaling policy using a custom metric.
Use the oci data-science model-deployment create command and required parameters to create a model deployment:

    oci data-science model-deployment create --required-param-name variable-name ... [OPTIONS]

For example, deploy a model:

    oci data-science model-deployment create \
      --compartment-id <MODEL_DEPLOYMENT_COMPARTMENT_OCID> \
      --model-deployment-configuration-details file://<MODEL_DEPLOYMENT_CONFIGURATION_FILE> \
      --project-id <PROJECT_OCID> \
      --display-name <MODEL_DEPLOYMENT_NAME>

Use this model deployment JSON configuration file:

    {
      "deploymentType": "SINGLE_MODEL",
      "modelConfigurationDetails": {
        "modelId": "ocid1.datasciencemodel.oc1.iad.amaaaaaav66vvnias2wuzfkwmkkmxficse3pty453vs3xtwlmwvsyrndlx2q",
        "instanceConfiguration": {
          "instanceShapeName": "VM.Standard.E4.Flex",
          "modelDeploymentInstanceShapeConfigDetails": {
            "ocpus": 1,
            "memoryInGBs": 16
          }
        },
        "scalingPolicy": {
          "policyType": "AUTOSCALING",
          "coolDownInSeconds": 650,
          "isEnabled": true,
          "autoScalingPolicies": [
            {
              "autoScalingPolicyType": "THRESHOLD",
              "initialInstanceCount": 1,
              "maximumInstanceCount": 2,
              "minimumInstanceCount": 1,
              "rules": [
                {
                  "metricExpressionRuleType": "CUSTOM_EXPRESSION",
                  "scaleInConfiguration": {
                    "scalingConfigurationType": "QUERY",
                    "pendingDuration": "PT5M",
                    "instanceCountAdjustment": 1,
                    "query": "MemoryUtilization[1m]{resourceId = 'MODEL_DEPLOYMENT_OCID'}.grouping().mean() < 10"
                  },
                  "scaleOutConfiguration": {
                    "scalingConfigurationType": "QUERY",
                    "pendingDuration": "PT3M",
                    "instanceCountAdjustment": 1,
                    "query": "MemoryUtilization[1m]{resourceId = 'MODEL_DEPLOYMENT_OCID'}.grouping().mean() > 65"
                  }
                }
              ]
            }
          ]
        },
        "bandwidthMbps": 10,
        "maximumBandwidthMbps": 20
      }
    }

For a complete list of parameters and values for CLI commands, see the CLI Command Reference.
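The scale-in and scale-out queries in the configuration file above differ only in the comparison operator and the threshold, so the rule can be generated rather than typed by hand. The following is a minimal Python sketch using only the standard library. The helper names `build_query` and `build_custom_metric_rule` are hypothetical and not part of any OCI tool; the field names, pending durations, and query shape are copied from the example configuration.

```python
import json


def build_query(metric, resource_id, comparison, threshold, window="1m"):
    """Return a metric query string in the same shape as the example configuration."""
    return (
        f"{metric}[{window}]{{resourceId = '{resource_id}'}}"
        f".grouping().mean() {comparison} {threshold}"
    )


def build_custom_metric_rule(resource_id, scale_in_below, scale_out_above):
    """Build one CUSTOM_EXPRESSION rule (hypothetical helper, mirrors the example JSON)."""
    return {
        "metricExpressionRuleType": "CUSTOM_EXPRESSION",
        "scaleInConfiguration": {
            "scalingConfigurationType": "QUERY",
            "pendingDuration": "PT5M",
            "instanceCountAdjustment": 1,
            "query": build_query("MemoryUtilization", resource_id, "<", scale_in_below),
        },
        "scaleOutConfiguration": {
            "scalingConfigurationType": "QUERY",
            "pendingDuration": "PT3M",
            "instanceCountAdjustment": 1,
            "query": build_query("MemoryUtilization", resource_id, ">", scale_out_above),
        },
    }


# Same placeholder and thresholds as the example configuration file.
rule = build_custom_metric_rule("MODEL_DEPLOYMENT_OCID", 10, 65)
print(json.dumps(rule, indent=2))
```

The printed object can be pasted into the `rules` array of the configuration file before passing the file to the CLI command.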
Use the CreateModelDeployment operation to create a model deployment using the custom scaling metric type.
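Before sending a request body to the API, it can help to run a quick local check on the autoscaling policy. The sketch below is illustrative only: `check_threshold_policy` is a hypothetical helper, and the conditions it checks (instance counts ordered as minimum ≤ initial ≤ maximum, and a non-empty query on each side of a custom rule) are assumed sanity conditions, not a statement of the service's validation rules.

```python
from typing import List


def check_threshold_policy(policy: dict) -> List[str]:
    """Return a list of human-readable problems found in one autoscaling policy."""
    problems = []

    minimum = policy["minimumInstanceCount"]
    initial = policy["initialInstanceCount"]
    maximum = policy["maximumInstanceCount"]
    # Assumed sanity condition: minimum <= initial <= maximum.
    if not minimum <= initial <= maximum:
        problems.append(
            f"instance counts out of order: min={minimum}, initial={initial}, max={maximum}"
        )

    for index, rule in enumerate(policy.get("rules", [])):
        if rule.get("metricExpressionRuleType") != "CUSTOM_EXPRESSION":
            continue
        for side in ("scaleInConfiguration", "scaleOutConfiguration"):
            query = rule.get(side, {}).get("query", "")
            if not query.strip():
                problems.append(f"rule {index}: {side} has no query")

    return problems


# Policy values taken from the example configuration (queries shortened to the placeholder form).
example_policy = {
    "autoScalingPolicyType": "THRESHOLD",
    "initialInstanceCount": 1,
    "maximumInstanceCount": 2,
    "minimumInstanceCount": 1,
    "rules": [
        {
            "metricExpressionRuleType": "CUSTOM_EXPRESSION",
            "scaleInConfiguration": {
                "query": "MemoryUtilization[1m]{resourceId = 'MODEL_DEPLOYMENT_OCID'}.grouping().mean() < 10"
            },
            "scaleOutConfiguration": {
                "query": "MemoryUtilization[1m]{resourceId = 'MODEL_DEPLOYMENT_OCID'}.grouping().mean() > 65"
            },
        }
    ],
}

example_problems = check_threshold_policy(example_policy)
print(example_problems)
```

An empty list means none of the assumed conditions were violated; the service may still reject the request for other reasons.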