Required Setup and Policies for OCI Data Flow Service to Run Tasks

Only integration tasks that you create and publish in Data Integration can be configured to run in the OCI Data Flow service.

To run the tasks in the OCI Data Flow service, ensure that you have set up the following resources and policies.

Data assets used in the tasks
  • Must be configured to use OCI Vault secrets for the passwords to connect to the data sources. This is required for passing credentials securely across OCI services. See OCI Vault Secrets and Oracle Wallets.
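    As background for the Vault requirement: OCI Vault stores secret content base64-encoded, so a database password must be encoded before it is uploaded as a secret. A minimal sketch (the helper name and sample password are illustrative, not from the OCI SDK):

    ```python
    import base64

    # Illustrative helper: OCI Vault secret content is stored base64-encoded,
    # so encode the database password before creating the secret.
    def encode_secret_content(password: str) -> str:
        """Return the base64 form of a password for use as secret content."""
        return base64.b64encode(password.encode("utf-8")).decode("ascii")

    # Example (placeholder password, not a real credential):
    encoded = encode_secret_content("examplePassword123")
    print(encoded)
    ```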

  • Must be specified using the fully qualified domain name (FQDN) for the database hosts. The OCI Data Flow service does not allow connections through direct IP addresses.
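    For illustration only (the hostname, domain, and IP address below are placeholders), a database host in a data asset must take the FQDN form, not the direct-IP form:

      Allowed:      dbhost.sub1.examplevcn.oraclevcn.com   (fully qualified domain name)
      Not allowed:  10.0.0.25                              (direct IP address)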

OCI Object Storage

  • Object Storage buckets are required for:
    • The OCI Data Flow service to upload Data Flow application run logs.
    • The Data Integration service to upload artifacts for run jobs, such as JAR and ZIP files.

    When you edit a task's OCI Data Flow service run configuration for the first time, the Data Integration service automatically selects the bucket dis-df-system-bucket if it already exists. Otherwise, you must select a log bucket and an artifact bucket when you update the task run configuration to use the Data Flow service.

  • The relevant permissions and IAM policies to access Object Storage, as described in Policy Examples to Enable Access to OCI Object Storage.
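    As an illustrative sketch only (the linked topic is authoritative), a resource-principal policy of the following form is commonly used to let a Data Integration workspace read and write objects in the log and artifact buckets; the compartment name and workspace OCID are placeholders:

    allow any-user to manage objects in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}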

Note

Different types of policies (resource principal and on behalf of) are required for using Object Storage. The policies required also depend on whether the Object Storage instance and the Data Integration workspace are in the same tenancy or different tenancies, and whether you create the policies at the compartment level or tenancy level. More examples are available in the blog Policies in Oracle Cloud Infrastructure (OCI) Data Integration to help you identify the policies for specific needs.

OCI Data Flow service

  • A pool. See Creating a Pool in the OCI Data Flow documentation.

    To run Data Integration service tasks in the OCI Data Flow service, the pool must have a single configuration with at least two compute shapes.

  • A private endpoint. See Creating a Private Endpoint in the OCI Data Flow documentation.

    If the Data Integration service tasks access data sources that are only available using private IPs, a private endpoint is required to give OCI Data Flow access to a private network in the tenancy for working with those data sources.

  • Relevant policies that allow the Data Integration service to publish tasks that have the Data Flow service run configuration enabled, and to run those tasks on the Data Flow service (with or without private endpoints).

    For Data Integration to run tasks on the Data Flow service:

    allow any-user to manage dataflow-family in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}

    For users to access the Data Flow service directly:

    allow group <group-name> to read dataflow-application in compartment <compartment-name>
    allow group <group-name> to manage dataflow-run in compartment <compartment-name>

    For the Data Integration workspace to read Data Flow private endpoints and secret bundles:

    allow any-user to read dataflow-private-endpoint in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
    allow any-user to read secret-bundles in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}

    For the Data Flow service to read the application run logs from the Object Storage bucket that's specified in the Data Integration task's run configuration for Data Flow:

    ALLOW SERVICE dataflow TO READ objects IN tenancy WHERE target.bucket.name = '<log-bucket-name>'

    For non-administrator users, these policies are required:

    allow group <group-name> to inspect dataflow-private-endpoint in compartment <compartment-name>
    allow group <group-name> to read secret-bundles in compartment <compartment-name>

After meeting the prerequisite resource and policy requirements, edit the run configuration of the task that you want to run in the OCI Data Flow service. See Updating the Task Run Configuration for OCI Data Flow Service.

Note

After you add IAM components (for example, dynamic groups and policy statements), don't perform the associated tasks immediately. New IAM policies can take five to 10 minutes to take effect.