Updating the Task Run Configuration for OCI Data Flow Service

Only integration tasks that you create and publish in Data Integration can be configured to run in the OCI Data Flow service.

Before you update a task's run configuration for running in the OCI Data Flow service, ensure that you have created the required resources and policies. See Required Setup and Policies for OCI Data Flow Service to Run Tasks.

In Data Integration, you can update a task's OCI Data Flow service run configuration when you create the task or when you edit it.

    1. Do one of the following:
      • Open the project or folder in which you want to create a task.
      • Open the project or folder that contains the task that you want to edit.

      For the steps to open the details page of a project or folder, see Viewing the Details of a Project or Viewing the Details of a Folder.

    2. On the project or folder details page, click Tasks.
    3. In the Tasks section, do one of the following:
      • To create a task, click Create task and select Integration.

      • To edit a task, select View details from the Actions menu of the task.

      Currently, only integration tasks can be configured to run in the OCI Data Flow service.

    4. On the create or edit page that opens, go to the Run configuration section.

      By default, every task that you create in Data Integration is initially configured to run in the OCI Data Integration service, as indicated by the label Task run service: OCI Data Integration service.

      If a task is configured to run in the OCI Data Flow service, the label is Task run service: OCI Data Flow service.

    5. In the Run configuration section, click Edit.
    6. On the Update task run configuration page, select one of the following options:
      • OCI Data Integration service: Run this task in Data Integration using fixed default resources. No other configuration is needed. Skip ahead to step 8.
      • OCI Data Flow service: Run this task in the Data Flow service using scalable resources and dynamic allocation. Proceed to step 7 to specify the run configuration.
    7. Complete the following selections to update or parameterize the run properties for using the OCI Data Flow service.
      1. Select the pool in OCI Data Flow to run this task.
      2. (Optional) Select the private endpoint in OCI Data Flow.
      3. For Log bucket path, select the Object Storage bucket to use for OCI Data Flow application run logs.

        If this is the first time you're editing the task's OCI Data Flow service run configuration, and the bucket dis-df-system-bucket already exists in Object Storage, Data Integration automatically selects that bucket, as indicated by oci://dis-df-system-bucket@<tenancy-name> in the selection field.

      4. For Artifact bucket path, select the Object Storage bucket to use for Data Integration run job artifacts such as jar and zip files.

        If this is the first time you're editing the task's OCI Data Flow service run configuration, and the bucket dis-df-system-bucket already exists in Object Storage, Data Integration automatically selects that bucket, as indicated by oci://dis-df-system-bucket@<tenancy-name> in the selection field.

      5. (Optional) For Application compartment, select the compartment for the OCI Data Flow application that's created when Data Integration service tasks are run in the Data Flow service.

        If an application compartment is not specified, the Data Integration application compartment is used.

      6. Select or enter the minimum number of workers (or executors) to use for OCI Data Flow jobs.

        The default is 1. If the value for Maximum number of workers is also 1, then dynamic allocation for OCI Data Flow jobs is not used.

      7. Select or enter the maximum number of workers (or executors) to use for OCI Data Flow jobs.

        The default is 1, which indicates that dynamic allocation is not used. If you want to use dynamic allocation for OCI Data Flow jobs, specify a larger value. This value should be greater than or equal to the value for Minimum number of workers.

      8. (Optional) For OCI Data Flow Spark configuration properties, enter one or more Spark properties to use for running this task.

        A Spark property is a key-value pair. Click Another property to add more key-value pairs, as needed.

        For the Spark configuration properties that you can add, see Supported Spark Properties.

      9. (Optional) After configuring any task run property (substeps 1 to 8 of this step), click Parameterize below the configured property value to assign a parameter to that property.

        Upon parameterizing, Data Integration adds a parameter of type String and sets the default parameter value to the value that's currently configured for that property. The label Parameterized followed by a parameter name is displayed. For example: Parameterized: OCI_DF_POOL

        The parameter names for the OCI Data Flow service run configuration properties are as follows:

        Task run property                     Parameter name
        Pool                                  OCI_DF_POOL
        Private endpoint                      OCI_DF_PRIVATE_ENDPOINT
        Log bucket path                       OCI_DF_LOG_BUCKET
        Artifact bucket path                  OCI_DF_ARTIFACT_BUCKET
        Application compartment               OCI_DF_APP_COMPARTMENT
        Minimum number of workers             OCI_DF_MIN_WORKERS
        Maximum number of workers             OCI_DF_MAX_WORKERS
        Custom OCI Data Flow configuration    OCI_DF_CUSTOM_OCI_DF_SPARK_CONFIG

        The actions for a parameter are:

        • Click Edit to add or edit a parameter description. The parameter name and type cannot be edited. A parameter description, if added, is shown as a tip in the panel for changing parameter values at design time or runtime.
        • Click Remove if you no longer want a property to be parameterized.
      10. Click Save.
    8. On the create or edit page of the task, click the appropriate button to create and save the task.
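
The rules that step 7 describes for bucket paths, worker counts, and parameterization can be sketched in Python. This is a minimal illustration only, not part of Data Integration or any OCI SDK: the names bucket_path, WorkerSettings, and parameterize, and the dictionary keys that parameterize returns, are hypothetical. The sketch also assumes that dynamic allocation applies whenever the maximum number of workers is greater than 1, which is how the defaults in step 7 are described.

```python
from dataclasses import dataclass

# Parameter names from the table in step 7, keyed by task run property.
PARAMETER_NAMES = {
    "Pool": "OCI_DF_POOL",
    "Private endpoint": "OCI_DF_PRIVATE_ENDPOINT",
    "Log bucket path": "OCI_DF_LOG_BUCKET",
    "Artifact bucket path": "OCI_DF_ARTIFACT_BUCKET",
    "Application compartment": "OCI_DF_APP_COMPARTMENT",
    "Minimum number of workers": "OCI_DF_MIN_WORKERS",
    "Maximum number of workers": "OCI_DF_MAX_WORKERS",
    "Custom OCI Data Flow configuration": "OCI_DF_CUSTOM_OCI_DF_SPARK_CONFIG",
}


def bucket_path(bucket: str, tenancy_name: str) -> str:
    """Format a bucket selection as oci://<bucket>@<tenancy-name>."""
    return f"oci://{bucket}@{tenancy_name}"


@dataclass
class WorkerSettings:
    """Hypothetical holder for the two worker counts; both default to 1."""

    minimum: int = 1
    maximum: int = 1

    def validate(self) -> None:
        # The maximum should be greater than or equal to the minimum.
        if self.maximum < self.minimum:
            raise ValueError(
                "Maximum number of workers must be >= Minimum number of workers"
            )

    @property
    def uses_dynamic_allocation(self) -> bool:
        # Assumption: a maximum larger than 1 turns on dynamic allocation.
        return self.maximum > 1


def parameterize(task_run_property: str, current_value) -> dict:
    """Model the parameter added when a property is parameterized.

    The parameter is of type String, and its default value is the value
    currently configured for the property. The dictionary keys here are
    illustrative, not an actual Data Integration payload.
    """
    name = PARAMETER_NAMES[task_run_property]
    return {
        "name": name,
        "type": "String",
        "default_value": str(current_value),
        "label": f"Parameterized: {name}",
    }


workers = WorkerSettings(minimum=1, maximum=4)
workers.validate()
print(workers.uses_dynamic_allocation)  # True
print(bucket_path("dis-df-system-bucket", "<tenancy-name>"))
print(parameterize("Pool", "my-pool")["label"])  # Parameterized: OCI_DF_POOL
```

Keeping the property-to-parameter mapping in one dictionary mirrors the table in step 7, so an unknown property name fails immediately with a KeyError instead of producing an unrecognized parameter name.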
  • Use the oci data-integration task update-integration-task command and required parameters to update an integration task:

    oci data-integration task update-integration-task [OPTIONS]

    Use the oci data-integration task update-data-loader-task command and required parameters to update a data loader task:

    oci data-integration task update-data-loader-task [OPTIONS]

    For a complete list of flags and variable options for CLI commands, see the Command Line Reference.

  • Run the UpdateTask operation with the appropriate resource subtype to update a task.