Creating a Pipeline

Create a Data Science pipeline to run a task. You can create pipelines by using the ADS SDK, the OCI Console, or the OCI SDK. Using ADS to create pipelines makes it easier to develop the pipeline, its steps, and their dependencies. ADS supports reading and writing the pipeline to and from a YAML file, and you can use ADS to view a visual representation of the pipeline. We recommend that you use ADS to create and manage pipelines using code.

Ensure that you have created the necessary policies, authentication, and authorization for pipelines.

Important

For proper operation of script steps, ensure that you have added the following rule to a dynamic group policy:
all {resource.type='datasciencepipelinerun', resource.compartment.id='<pipeline-run-compartment-ocid>'}
    1. On the Projects list page, select the project that contains the pipelines that you want to work with. If you need help finding the list page or the project, see Listing Projects.
    2. On the project details page, select Pipelines.
    3. Select Create pipeline.
    4. On the Create pipeline page, enter the following information.
      • Compartment: Select the compartment to store the pipeline in.
      • Name (Optional): Enter a name for the pipeline (limit of 255 characters). If you don't provide a name, a name is automatically generated. Example: pipeline2022808222435
      • Description (Optional): Enter a description for the pipeline.
      • Pipeline steps: For each pipeline step that you want to add to the pipeline, select Add pipeline steps to open the Add pipeline step panel and then follow the procedure for the type of pipeline step that you want.

        Job: To create a pipeline step from a job, select From jobs and enter the following information.

        Note

        Optionally, create a default pipeline configuration that's used when the pipeline runs by entering environment variables, command line arguments, and maximum runtime options.
        • Step name: Enter a unique name for the step. You can't repeat a step name in a pipeline.
        • Step description (Optional): Enter a step description, which can help you find step dependencies.
        • Step run name
        • Depends on (Optional): If this step depends on another step, select one or more steps to run before this step.
        • Select a job compartment: Select the compartment containing the job that you want to use as a pipeline step.
        • Select a job: Select the job that you want to use as a pipeline step.
        • Parameters (Optional):
          Note

          The step must ensure that the specified file (for example, /home/datascience/output.json) is populated with valid JSON that defines the specified variables. For example:
          { "message":"Hello John!", "ocpu": 2, "memory": 10 }
          • Custom environment variable key (Optional): The environment variables for this pipeline step.
          • Value (Optional): The key's value.
        • Command line arguments (Optional): Enter the command line arguments that you want to use for running the pipeline step.
        • Maximum runtime (in minutes) (Optional): The maximum number of minutes that the pipeline step is allowed to run. The service cancels the pipeline run if its runtime exceeds the specified value. The maximum runtime is 30 days (43,200 minutes). We recommend that you configure a maximum runtime on all pipeline runs to prevent runaway pipeline runs.
        • Output parameters (Optional):
          • Output parameter type: Select JSON.
          • Parameter name: Enter a parameter name.
          • Output file name: Select the output file name in which the step stores the output parameters. For example: /home/datascience/output.json.
        • Save: Select to save the step.

          The Create pipeline page reopens with the step added.
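The output parameters note above says the step must leave valid JSON in the configured output file. As a minimal sketch, a step script might end by writing its output parameters like this (the function name and parameter values are illustrative, not part of any SDK):

```python
import json

def write_output_parameters(path, message, ocpu, memory):
    """Write the step's output parameters as valid JSON to the configured
    output file (for example, /home/datascience/output.json) so that
    downstream steps can read them."""
    params = {"message": message, "ocpu": ocpu, "memory": memory}
    with open(path, "w") as f:
        json.dump(params, f)
    return params
```

Calling this at the end of the step with the same file name entered in the Output file name field produces JSON matching the example shown in the note.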

        Script: To create a pipeline step from a script, select From script and enter the following information.

        • Step name: Enter a unique name for the step. You can't repeat a step name in a pipeline.
        • Step description (Optional): Enter a step description, which can help you find step dependencies.
        • Depends on (Optional): If this step depends on another step, select one or more steps to run before this step.
        • Upload job artifact: Drag a job step file into the box, or select the box to navigate to the file for selection.
        • Entrypoint (Optional): Select a file to be the entry run point of the step. This is useful when you have many files.
        • Parameters (Optional):
          Note

          The step must ensure that the specified file (for example, /home/datascience/output.json) is populated with valid JSON that defines the specified variables. For example:
          { "message":"Hello John!", "ocpu": 2, "memory": 10 }
          • Custom environment variable key (Optional): The environment variables for this pipeline step.
          • Value (Optional): The key's value.
        • Command line arguments (Optional): Enter the command line arguments that you want to use for running the pipeline step.
        • Maximum runtime (in minutes) (Optional): The maximum number of minutes that the pipeline step is allowed to run. The service cancels the pipeline run if its runtime exceeds the specified value. The maximum runtime is 30 days (43,200 minutes). We recommend that you configure a maximum runtime on all pipeline runs to prevent runaway pipeline runs.
        • Output parameters (Optional):
          • Output parameter type: Select JSON.
          • Parameter name: Enter a parameter name.
          • Output file name: Select the output file name in which the step stores the output parameters. For example: /home/datascience/output.json.
        • Change the Compute shape by selecting Change shape. Then, follow these steps in the Select compute shape panel.
          Note

          For the AMD shape, you can use the default or set the number of OCPUs and memory.
          • Select an instance type.
          • Select a shape series.
          • Select one of the supported Compute shapes in the series. Select the shape that best suits how you want to use the resource.
          • Expand the selected shape to configure OCPUs and memory.
            • Number of OCPUs
            • Amount of memory (GB): For each OCPU, select up to 64 GB of memory and a maximum total of 512 GB. The minimum amount of memory allowed is either 1 GB or a value matching the number of OCPUs, whichever is greater.
            • Enable Burstable Shape: Select if using burstable VMs, and then for Baseline utilization per OCPU, select the percentage of OCPUs that you usually want to use. The supported values are 12.5% and 50%. (For model deployments, only the value of 50% is supported.)
          • Select Select shape.
        • Compute shape parameterized
          • Shape parameterized
          • Ocpus parameterized
          • MemoryInGBs parameterized
        • Block Storage: Enter the amount of storage that you want to use, between 50 GB and 10,240 GB (10 TB). You can change the value in 1 GB increments. The default value is 100 GB.
        • Networking resources: Select the relevant option.
          • Default Networking: Restricts traffic to Oracle services only. The system uses the existing service-managed network. The workload is attached by using a secondary VNIC to a preconfigured, service-managed VCN and subnet. This subnet allows egress to the public internet through a NAT gateway, and access to other Oracle Cloud services through a service gateway.

            If you need access only to the public internet and OCI services, we recommend using this option. It doesn't require you to create networking resources or write policies for networking permissions.

          • Default networking with internet: Allows outbound internet access through the Data Science NAT gateway.
            Note

            You can't use Default networking with internet in disconnected realms and Oracle development tenancies. If your tenancy or compartment has a Data Science security zone policy that denies public network access (for example, deny model_deploy_public_network—see Data Science security zone policy), the service-managed public internet access option is disabled. If you try to use this option, you receive a 404 NotAuthorizedOrNotFound error.
          • Custom Networking: Select the VCN and subnet (by compartment) that you want to use.

            For egress access to the public internet, use a private subnet with a route to a NAT gateway.

            Note

            • You must use custom networking to use a file storage mount.
            • Switching from custom networking to managed networking isn't supported after creation.
            • If you see the banner The specified subnet is not accessible. Select a different subnet., then create a policy that allows Data Science to use custom networking. See Policies.
        • Storage mounts (Optional):
          • File storage mounts (Optional): Select Add file storage mount and enter the following information.
            • Compartment: Select the compartment that contains the target that you want to mount.
            • Mount target: The mount target that you want to use.
            • Export path: The export path that you want to use.
            • Destination path and directory: Enter the path to use for mounting the storage.

              The path must start with an alphanumeric character. The destination directory must be unique across the storage mounts provided. The allowed characters are alphanumerics, hyphen ( - ) and underscore ( _ ).

              You can specify the full path, such as /opc/storage-directory. If only a directory is specified, such as /storage-directory, then it's mounted under the default /mnt directory. You can't specify OS specific directories, such as /bin or /etc.

          • Object storage mounts (Optional): Select Add object storage mount and enter the following information.
            • Compartment: Select the compartment that contains the bucket that you want to mount.
            • Bucket: Select the bucket that you want to use.
            • Object name prefix (Optional): Enter the object name prefix. The prefix must start with an alphanumeric character. The allowed characters are alphanumerics, slash ( / ), hyphen ( - ) and underscore ( _ ).
            • Destination path and directory: Enter the path to use for mounting the storage.

              The path must start with an alphanumeric character. The destination directory must be unique across the storage mounts provided. The allowed characters are alphanumerics, hyphen ( - ) and underscore ( _ ).

              You can specify the full path, such as /opc/storage-directory. If only a directory is specified, such as /storage-directory, then it's mounted under the default /mnt directory. You can't specify OS specific directories, such as /bin or /etc.

            Note

            If using custom networking:
            1. Create the service gateway in the VCN.
            2. For the route table configurations in the private subnet, add the service gateway.
            3. Change the egress rules of the security list of the required subnet to allow traffic to all services in the network.
        • Save: Select to save the step.

          The Create pipeline page reopens with the step added.
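The destination path rules for storage mounts above can be sketched as a small validator. This is a hypothetical helper, not a service API; it encodes only the rules and examples stated in this section:

```python
import re

# Directory names start with an alphanumeric character and use only
# alphanumerics, hyphens, and underscores (per the rules above).
_DIR_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_-]*$")
# The text gives /bin and /etc as examples; the real blocked list is longer.
_OS_DIRS = {"bin", "etc"}

def resolve_mount_path(destination):
    """Validate a destination path and resolve where it's mounted."""
    parts = [p for p in destination.split("/") if p]
    if not parts or not all(_DIR_NAME.match(p) for p in parts):
        raise ValueError("invalid destination path: " + destination)
    if parts[0] in _OS_DIRS:
        raise ValueError("OS specific directories are not allowed")
    if len(parts) == 1:
        # Only a directory was given: mounted under the default /mnt.
        return "/mnt/" + parts[0]
    # A full path was specified, such as /opc/storage-directory.
    return "/" + "/".join(parts)
```

For example, a bare directory such as /storage-directory resolves under /mnt, while a full path such as /opc/storage-directory is kept as given.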

        Container: To create a pipeline step from a container, select From container and enter the following information.

        Optionally, when defining pipeline steps, you can select to use Bring Your Own Container. For more information, see Bring Your Own Container (BYOC) for Pipelines.

        • Step name: Enter a unique name for the step. You can't repeat a step name in a pipeline.
        • Step description (Optional): Enter a step description, which can help you find step dependencies.
        • Depends on (Optional): If this step depends on another step, select one or more steps to run before this step.
        • Configure container environment: Select Configure to open the Configure container environment panel and then enter the following information.
          • Repository compartment
          • Repository
          • Image
          • Entrypoint (Optional)
          • CMD (Optional): Use CMD as arguments to the ENTRYPOINT or the only command to run in the absence of an ENTRYPOINT.
          • Image digest (Optional)
          • Signature ID (Optional): If using signature verification, enter the OCID of the image signature. Example: ocid1.containerimagesignature.oc1.iad.aaaaaaaaab....
        • Upload job artifact: Drag a step artifact into the box, or select the box to navigate to the file for selection.

          This field is optional only if BYOC is configured.

        • Parameters (Optional):
          Note

          The step must ensure that the specified file (for example, /home/datascience/output.json) is populated with valid JSON that defines the specified variables. For example:
          { "message":"Hello John!", "ocpu": 2, "memory": 10 }
          • Custom environment variable key (Optional): The environment variables for this pipeline step.
          • Value (Optional): The key's value.
        • Command line arguments (Optional): Enter the command line arguments that you want to use for running the pipeline step.
        • Maximum runtime (in minutes) (Optional): The maximum number of minutes that the pipeline step is allowed to run. The service cancels the pipeline run if its runtime exceeds the specified value. The maximum runtime is 30 days (43,200 minutes). We recommend that you configure a maximum runtime on all pipeline runs to prevent runaway pipeline runs.
        • Output parameters (Optional):
          • Output parameter type: Select JSON.
          • Parameter name: Enter a parameter name.
          • Output file name: Select the output file name in which the step stores the output parameters. For example: /home/datascience/output.json.
        • Change the Compute shape by selecting Change shape. Then, follow these steps in the Select compute shape panel.
          Note

          For the AMD shape, you can use the default or set the number of OCPUs and memory.
          • Select an instance type.
          • Select a shape series.
          • Select one of the supported Compute shapes in the series. Select the shape that best suits how you want to use the resource.
          • Expand the selected shape to configure OCPUs and memory.
            • Number of OCPUs
            • Amount of memory (GB): For each OCPU, select up to 64 GB of memory and a maximum total of 512 GB. The minimum amount of memory allowed is either 1 GB or a value matching the number of OCPUs, whichever is greater.
            • Enable Burstable Shape: Select if using burstable VMs, and then for Baseline utilization per OCPU, select the percentage of OCPUs that you usually want to use. The supported values are 12.5% and 50%. (For model deployments, only the value of 50% is supported.)
          • Select Select shape.
        • Compute shape parameterized
          • Shape parameterized
          • Ocpus parameterized
          • MemoryInGBs parameterized
        • Block Storage: Enter the amount of storage that you want to use, between 50 GB and 10,240 GB (10 TB). You can change the value in 1 GB increments. The default value is 100 GB.
        • Networking resources: Select the relevant option.
          • Default Networking: Restricts traffic to Oracle services only. The system uses the existing service-managed network. The workload is attached by using a secondary VNIC to a preconfigured, service-managed VCN and subnet. This subnet allows egress to the public internet through a NAT gateway, and access to other Oracle Cloud services through a service gateway.

            If you need access only to the public internet and OCI services, we recommend using this option. It doesn't require you to create networking resources or write policies for networking permissions.

          • Default networking with internet: Allows outbound internet access through the Data Science NAT gateway.
            Note

            You can't use Default networking with internet in disconnected realms and Oracle development tenancies. If your tenancy or compartment has a Data Science security zone policy that denies public network access (for example, deny model_deploy_public_network—see Data Science security zone policy), the service-managed public internet access option is disabled. If you try to use this option, you receive a 404 NotAuthorizedOrNotFound error.
          • Custom Networking: Select the VCN and subnet (by compartment) that you want to use.

            For egress access to the public internet, use a private subnet with a route to a NAT gateway.

            Note

            • You must use custom networking to use a file storage mount.
            • Switching from custom networking to managed networking isn't supported after creation.
            • If you see the banner The specified subnet is not accessible. Select a different subnet., then create a policy that allows Data Science to use custom networking. See Policies.
        • Storage mounts (Optional):
          • File storage mounts (Optional): Select Add file storage mount and enter the following information.
            • Compartment: Select the compartment that contains the target that you want to mount.
            • Mount target: The mount target that you want to use.
            • Export path: The export path that you want to use.
            • Destination path and directory: Enter the path to use for mounting the storage.

              The path must start with an alphanumeric character. The destination directory must be unique across the storage mounts provided. The allowed characters are alphanumerics, hyphen ( - ) and underscore ( _ ).

              You can specify the full path, such as /opc/storage-directory. If only a directory is specified, such as /storage-directory, then it's mounted under the default /mnt directory. You can't specify OS specific directories, such as /bin or /etc.

          • Object storage mounts (Optional): Select Add object storage mount and enter the following information.
            • Compartment: Select the compartment that contains the bucket that you want to mount.
            • Bucket: Select the bucket that you want to use.
            • Object name prefix (Optional): Enter the object name prefix. The prefix must start with an alphanumeric character. The allowed characters are alphanumerics, slash ( / ), hyphen ( - ) and underscore ( _ ).
            • Destination path and directory: Enter the path to use for mounting the storage.

              The path must start with an alphanumeric character. The destination directory must be unique across the storage mounts provided. The allowed characters are alphanumerics, hyphen ( - ) and underscore ( _ ).

              You can specify the full path, such as /opc/storage-directory. If only a directory is specified, such as /storage-directory, then it's mounted under the default /mnt directory. You can't specify OS specific directories, such as /bin or /etc.

            Note

            If using custom networking:
            1. Create the service gateway in the VCN.
            2. For the route table configurations in the private subnet, add the service gateway.
            3. Change the egress rules of the security list of the required subnet to allow traffic to all services in the network.
        • Save: Select to save the step.

          The Create pipeline page reopens with the step added.
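The CMD and ENTRYPOINT fields above follow standard container semantics. As an illustration in plain Python (not a Data Science API), the command a container step effectively runs can be derived like this:

```python
def effective_command(entrypoint, cmd):
    """Combine ENTRYPOINT and CMD the way container runtimes do:
    CMD supplies arguments to the ENTRYPOINT, or is the full command
    when no ENTRYPOINT is set."""
    if entrypoint:
        return list(entrypoint) + list(cmd or [])
    return list(cmd or [])
```

So an ENTRYPOINT of ["python", "step.py"] with a CMD of ["--epochs", "5"] runs python step.py --epochs 5, while the same CMD alone would be the whole command.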

        Data Flow application: To create a pipeline step from a Data Flow application, select From Data Flow applications and enter the following information.

        • Step name: Enter a unique name for the step. You can't repeat a step name in a pipeline.
        • Step description (Optional): Enter a step description, which can help you find step dependencies.
        • Depends on (Optional): If this step depends on another step, select one or more steps to run before this step.
        • Select a Data Flow application compartment
        • Select a Data Flow application
        • Parameters (Optional):
          Note

          The step must ensure that the specified file (for example, /home/datascience/output.json) is populated with valid JSON that defines the specified variables. For example:
          { "message":"Hello John!", "ocpu": 2, "memory": 10 }
          • Custom environment variable key (Optional): The environment variables for this pipeline step.
          • Value (Optional): The key's value.
        • Command line arguments (Optional): Enter the command line arguments that you want to use for running the pipeline step.
        • Maximum runtime (in minutes) (Optional): The maximum number of minutes that the pipeline step is allowed to run. The service cancels the pipeline run if its runtime exceeds the specified value. The maximum runtime is 30 days (43,200 minutes). We recommend that you configure a maximum runtime on all pipeline runs to prevent runaway pipeline runs.
        • Data Flow configuration: Select Configure to open the Configure Data Flow configuration panel and then enter the following information.
          • Driver shape
          • Driver OCPUs
          • Driver Memory (GB)
          • Executor shape
          • Executor OCPUs
          • Executor Memory (GB)
          • Number of executors
          • Enter the bucket path manually
            • Logs bucket URI
          • Object storage bucket name compartment
          • Object storage bucket name
          • Key
          • Value
          • Warehouse bucket URI
          • Configure: Select to save entered information and go back to the Add pipeline step page.
        • Save: Select to save the step.

          The Create pipeline page reopens with the step added.

      • Parameters (Optional):
        Note

        The step must ensure that the specified file (for example, /home/datascience/output.json) is populated with valid JSON that defines the specified variables. For example:
        { "message":"Hello John!", "ocpu": 2, "memory": 10 }
        • Custom environment variable key (Optional): The environment variables for this pipeline step.
        • Value (Optional): The key's value.
      • Command line arguments (Optional): Enter the command line arguments that you want to use for running the pipeline step.
      • Maximum runtime (in minutes) (Optional): The maximum number of minutes that the pipeline step is allowed to run. The service cancels the pipeline run if its runtime exceeds the specified value. The maximum runtime is 30 days (43,200 minutes). We recommend that you configure a maximum runtime on all pipeline runs to prevent runaway pipeline runs.
      • Custom parameter key
      • Value
      • Change the Compute shape by selecting Change shape. Then, follow these steps in the Select compute shape panel.
        Note

        For the AMD shape, you can use the default or set the number of OCPUs and memory.
        • Select an instance type.
        • Select a shape series.
        • Select one of the supported Compute shapes in the series. Select the shape that best suits how you want to use the resource.
        • Expand the selected shape to configure OCPUs and memory.
          • Number of OCPUs
          • Amount of memory (GB): For each OCPU, select up to 64 GB of memory and a maximum total of 512 GB. The minimum amount of memory allowed is either 1 GB or a value matching the number of OCPUs, whichever is greater.
          • Enable Burstable Shape: Select if using burstable VMs, and then for Baseline utilization per OCPU, select the percentage of OCPUs that you usually want to use. The supported values are 12.5% and 50%. (For model deployments, only the value of 50% is supported.)
        • Select Select shape.
      • Compute shape parameterized
        • Shape parameterized
        • Ocpus parameterized
        • MemoryInGBs parameterized
      • Block Storage: Enter the amount of storage that you want to use, between 50 GB and 10,240 GB (10 TB). You can change the value in 1 GB increments. The default value is 100 GB.
      • Networking resources: Select the relevant option.
        • Default Networking: Restricts traffic to Oracle services only. The system uses the existing service-managed network. The workload is attached by using a secondary VNIC to a preconfigured, service-managed VCN and subnet. This subnet allows egress to the public internet through a NAT gateway, and access to other Oracle Cloud services through a service gateway.

          If you need access only to the public internet and OCI services, we recommend using this option. It doesn't require you to create networking resources or write policies for networking permissions.

        • Default networking with internet: Allows outbound internet access through the Data Science NAT gateway.
          Note

          You can't use Default networking with internet in disconnected realms and Oracle development tenancies. If your tenancy or compartment has a Data Science security zone policy that denies public network access (for example, deny model_deploy_public_network—see Data Science security zone policy), the service-managed public internet access option is disabled. If you try to use this option, you receive a 404 NotAuthorizedOrNotFound error.
        • Custom Networking: Select the VCN and subnet (by compartment) that you want to use.

          For egress access to the public internet, use a private subnet with a route to a NAT gateway.

          Note

          • You must use custom networking to use a file storage mount.
          • Switching from custom networking to managed networking isn't supported after creation.
          • If you see the banner The specified subnet is not accessible. Select a different subnet., then create a policy that allows Data Science to use custom networking. See Policies.
      • Enable logging (Optional): Select to enable logging of messages from pipeline runs.
        • Log group compartment: Select the compartment that contains the log group.
        • Log group: Select the log group.
      • Storage mounts (Optional):
        • File storage mounts (Optional): Select Add file storage mount and enter the following information.
          • Compartment: Select the compartment that contains the target that you want to mount.
          • Mount target: The mount target that you want to use.
          • Export path: The export path that you want to use.
          • Destination path and directory: Enter the path to use for mounting the storage.

            The path must start with an alphanumeric character. The destination directory must be unique across the storage mounts provided. The allowed characters are alphanumerics, hyphen ( - ) and underscore ( _ ).

            You can specify the full path, such as /opc/storage-directory. If only a directory is specified, such as /storage-directory, then it's mounted under the default /mnt directory. You can't specify OS specific directories, such as /bin or /etc.

        • Object storage mounts (Optional): Select Add object storage mount and enter the following information.
          • Compartment: Select the compartment that contains the bucket that you want to mount.
          • Bucket: Select the bucket that you want to use.
          • Object name prefix (Optional): Enter the object name prefix. The prefix must start with an alphanumeric character. The allowed characters are alphanumerics, slash ( / ), hyphen ( - ) and underscore ( _ ).
          • Destination path and directory: Enter the path to use for mounting the storage.

            The path must start with an alphanumeric character. The destination directory must be unique across the storage mounts provided. The allowed characters are alphanumerics, hyphen ( - ) and underscore ( _ ).

            You can specify the full path, such as /opc/storage-directory. If only a directory is specified, such as /storage-directory, then it's mounted under the default /mnt directory. You can't specify OS specific directories, such as /bin or /etc.

          Note

          If using custom networking:
          1. Create the service gateway in the VCN.
          2. For the route table configurations in the private subnet, add the service gateway.
          3. Change the egress rules of the security list of the required subnet to allow traffic to all services in the network.
      • Tags (under Advanced options): Add tags to the pipeline. If you have permissions to create a resource, then you also have permissions to apply free-form tags to that resource. To apply a defined tag, you must have permissions to use the tag namespace. For more information about tagging, see Resource Tags. If you're not sure whether to apply tags, skip this option or ask an administrator. You can apply tags later.
    5. Select Create.

      After the pipeline is in an active state, you can use pipeline runs to repeatedly run the pipeline.


    You can use the OCI SDK to create a pipeline, as in this Python example:

    1. Create a pipeline:

      The following parameters are available to use in the payload:

      Pipeline (top level)

      • projectId (Required): The project OCID to create the pipeline in.
      • compartmentId (Required): The compartment OCID to create the pipeline in.
      • displayName (Optional): The name of the pipeline.
      • infrastructureConfigurationDetails (Optional): Default infrastructure (Compute) configuration to use for all the pipeline steps; see infrastructureConfigurationDetails for details on the supported parameters. Can be overridden by the pipeline run configuration.
      • logConfigurationDetails (Optional): Default log configuration to use for all the pipeline steps; see logConfigurationDetails for details on the supported parameters. Can be overridden by the pipeline run configuration.
      • configurationDetails (Optional): Default configuration for the pipeline run; see configurationDetails for details on the supported parameters. Can be overridden by the pipeline run configuration.
      • freeformTags (Optional): Tags to add to the pipeline resource.

      stepDetails

      • stepName (Required): Name of the step. Must be unique in the pipeline.
      • description (Optional): Free text description for the step.
      • stepType (Required): CUSTOM_SCRIPT or ML_JOB.
      • jobId (Required for ML_JOB steps): The job OCID to use for the step run.
      • stepInfrastructureConfigurationDetails (Optional*): Infrastructure (Compute) configuration to use for this step; see infrastructureConfigurationDetails for details on the supported parameters. Can be overridden by the pipeline run configuration.
      • stepConfigurationDetails (Optional*): Configuration for the step run; see configurationDetails for details on the supported parameters. Can be overridden by the pipeline run configuration.
      • dependsOn (Optional): List of steps that must be completed before this step begins. This creates the pipeline workflow dependencies graph.

      *Must be defined on at least one level. Precedence, from highest to lowest: 1. pipeline run, 2. step, 3. pipeline.

      infrastructureConfigurationDetails

      • shapeName (Required): Name of the Compute shape to use. For example, VM.Standard2.4.
      • blockStorageSizeInGBs (Required): Size in GB of the block storage attached to the VM.

      logConfigurationDetails

      • enableLogging (Required): Whether to use logging.
      • logGroupId (Required): Log group OCID to use for the logs. The log group must be created and available when the pipeline runs.
      • logId (Optional): Log OCID to use for the logs when not using the enableAutoLogCreation parameter.
      • enableAutoLogCreation (Optional): If set to True, a log for each pipeline run is created.

      configurationDetails

      • type (Required): Only DEFAULT is supported.
      • maximumRuntimeInMinutes (Optional): Time limit in minutes for the pipeline to run.
      • environmentVariables (Optional): Environment variables to provide for the pipeline step runs. For example:

        "environmentVariables": {
            "CONDA_ENV_TYPE": "service"
        }

        Review the list of service supported environment variables.
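      The precedence rules for the starred parameters can be sketched as a small helper: the first level that defines a configuration wins, checking the pipeline run first, then the step, then the pipeline. The `resolve_config` function below is a hypothetical illustration of that rule, not part of the OCI SDK.

      ```python
      def resolve_config(pipeline_run=None, step=None, pipeline=None):
          """Illustrative only: return the configuration from the
          highest-precedence level that defines one (run > step > pipeline)."""
          for level in (pipeline_run, step, pipeline):
              if level is not None:
                  return level
          raise ValueError("configuration must be defined on at least one level")

      # The step-level value applies when no run-level override exists:
      resolve_config(step={"maximumRuntimeInMinutes": 90},
                     pipeline={"maximumRuntimeInMinutes": 30})
      ```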

      pipeline_payload = {
          "projectId": "<project_id>",
          "compartmentId": "<compartment_id>",
          "displayName": "<pipeline_name>",
          "pipelineInfrastructureConfigurationDetails": {
              "shapeName": "VM.Standard2.1",
              "blockStorageSizeInGBs": "50"
          },
          "pipelineLogConfigurationDetails": {
              "enableLogging": True,
              "logGroupId": "<log_group_id>",
              "logId": "<log_id>"
          },
          "pipelineDefaultConfigurationDetails": {
              "type": "DEFAULT",
              "maximumRuntimeInMinutes": 30,
              "environmentVariables": {
                  "CONDA_ENV_TYPE": "service",
                  "CONDA_ENV_SLUG": "classic_cpu"
              }
          },
          "stepDetails": [
              {
                  "stepName": "preprocess",
                  "description": "Preprocess step",
                  "stepType": "CUSTOM_SCRIPT",
                  "stepInfrastructureConfigurationDetails": {
                      "shapeName": "VM.Standard2.4",
                      "blockStorageSizeInGBs": "100"
                  },
                  "stepConfigurationDetails": {
                      "type": "DEFAULT",
                      "maximumRuntimeInMinutes": 90
                      "environmentVariables": {
                          "STEP_RUN_ENTRYPOINT": "preprocess.py",
                          "CONDA_ENV_TYPE": "service",
                          "CONDA_ENV_SLUG": "onnx110_p37_cpu_v1"
                  }
              },
              {
                  "stepName": "postprocess",
                  "description": "Postprocess step",
                  "stepType": "CUSTOM_SCRIPT",
                  "stepInfrastructureConfigurationDetails": {
                      "shapeName": "VM.Standard2.1",
                      "blockStorageSizeInGBs": "80"
                  },
                  "stepConfigurationDetails": {
                      "type": "DEFAULT",
                      "maximumRuntimeInMinutes": 60
                  },
                  "dependsOn": ["preprocess"]
              },
          ],
          "freeformTags": {
              "freeTags": "cost center"
          }
      }
      pipeline_res = dsc.create_pipeline(pipeline_payload)
      pipeline_id = pipeline_res.data.id

      The pipeline remains in the CREATING state until all pipeline step artifacts are uploaded.
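      The dependsOn lists in stepDetails form a directed acyclic graph that determines step ordering. As an illustration only (the service schedules steps itself), a valid execution order for the payload above can be derived with Python's standard graphlib:

      ```python
      from graphlib import TopologicalSorter  # Python 3.9+

      def execution_order(step_details):
          """Sketch: derive one valid run order from the dependsOn graph."""
          ts = TopologicalSorter({s["stepName"]: set(s.get("dependsOn", []))
                                  for s in step_details})
          return list(ts.static_order())

      steps = [
          {"stepName": "preprocess"},
          {"stepName": "postprocess", "dependsOn": ["preprocess"]},
      ]
      execution_order(steps)  # preprocess runs before postprocess
      ```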

    2. Upload a step artifact:

      After an artifact is uploaded, it can't be changed.

      with open(<file_name>, "rb") as fstream:
          dsc.create_step_artifact(pipeline_id, step_name, fstream, content_disposition=f"attachment; filename={<file_name>}")
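      Because the pipeline only leaves the CREATING state once every step artifact is uploaded, it can be convenient to poll until the pipeline becomes ACTIVE before updating or running it. A minimal polling sketch follows; the `wait_for_state` helper is hypothetical, and with the OCI SDK the state would come from something like `dsc.get_pipeline(pipeline_id).data.lifecycle_state` (an assumption shown here, not executed).

      ```python
      import time

      def wait_for_state(get_state, target="ACTIVE", timeout_s=600, poll_s=10):
          """Poll a zero-argument callable until it returns the target state.

          Example usage (assumption):
              wait_for_state(lambda: dsc.get_pipeline(pipeline_id).data.lifecycle_state)
          """
          deadline = time.monotonic() + timeout_s
          while time.monotonic() < deadline:
              state = get_state()
              if state == target:
                  return state
              time.sleep(poll_s)
          raise TimeoutError(f"resource did not reach {target} within {timeout_s}s")
      ```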
    3. Update a pipeline:

      You can update a pipeline only when it's in an ACTIVE state.

      update_pipeline_details = {
          "displayName": "pipeline-updated"
      }
      dsc.update_pipeline(<pipeline_id>, update_pipeline_details)
    4. Start a pipeline run:
      pipeline_run_payload = {
          "projectId": project_id,
          "displayName": "pipeline-run",
          "pipelineId": <pipeline_id>,
          "compartmentId": <compartment_id>,
      }
      dsc.create_pipeline_run(pipeline_run_payload)
  • The ADS SDK is also a publicly available Python library that you can install with this command:

    pip install oracle-ads

    You can use the ADS SDK to create and run pipelines.

Creating Pipelines with Custom Networking Using APIs

You can select custom networking when creating a pipeline. Using a custom network that you've already created gives you extra flexibility over the pipeline's networking.

Provide subnet-id in infrastructure-configuration-details to use a custom subnet at the pipeline level. For example:

"infrastructure-configuration-details": {
      "block-storage-size-in-gbs": 50,
      "shape-config-details": {
        "memory-in-gbs": 16.0,
        "ocpus": 1.0
      },
      "shape-name": "VM.Standard.E4.Flex",
      "subnet-id": "ocid1.subnet.oc1.iad.aaaaaaaa5lzzq3fyypo6x5t5egplbfyxf2are6k6boop3vky5t4h7g35xkoa"
}

Or provide it in step-infrastructure-configuration-details to use a custom subnet for a particular step. For example:

"step-infrastructure-configuration-details": {
          "block-storage-size-in-gbs": 50,
          "shape-config-details": {
            "memory-in-gbs": 16.0,
            "ocpus": 1.0
          },
          "shape-name": "VM.Standard.E4.Flex",
          "subnet-id": "ocid1.subnet.oc1.iad.aaaaaaaa5lzzq3fyypo6x5t5egplbfyxf2are6k6boop3vky5t4h7g35xkoa"
}