Bulk Operations for Files in the Cloud

The PL/SQL package DBMS_CLOUD provides parallel execution support for bulk file upload, download, copy, and move operations, simplifying these tasks and improving their performance.

The package DBMS_CLOUD supports uploading files to and downloading files from the following cloud services:

  • Oracle Cloud Infrastructure Object Storage

  • Azure Blob Storage or Azure Data Lake Storage

  • Amazon S3

  • Amazon S3-Compatible, including Wasabi Hot Cloud Storage

  • Google Cloud Storage

See DBMS_CLOUD Package File URI Formats for more information.

About Bulk File Operations

DBMS_CLOUD bulk file operations support downloading and uploading files on Autonomous Database, as well as copying, moving, and deleting files across Cloud Object Stores, either within the same Object Store provider or between providers.

The bulk file operations support parallelism and provide optimal performance for uploading, downloading, copying, and moving files. Parallelism for bulk file operations is handled by specifying a priority for an operation. The supported priorities are:

  • HIGH: Use the Autonomous Database instance CPU compute count to determine the number of files processed in parallel.
  • MEDIUM: Use the concurrency limit for the Medium service to determine the parallelism.
  • LOW: Process the files in serial order (no concurrency).

Running bulk operations with a higher priority uses more database resources, and operations complete faster when parallelism can speed them up. With a lower priority, operations consume fewer database resources and take longer to complete. When a bulk operation involves a small number of files containing little data, specifying a higher priority may not change the performance.

To increase parallelism for bulk file operations, use HIGH priority and increase the number of CPUs assigned to your Autonomous Database instance. The maximum number of concurrent file operations is limited to 64.

The default priority is MEDIUM, which specifies that bulk file operations use the concurrency limit defined for the MEDIUM consumer group. See Manage Concurrency and Priorities on Autonomous Database for more information.

See DBMS_CLOUD for Bulk File Management for details on using the format parameter to specify the priority with bulk file operations.
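For example, a minimal sketch that runs a bulk download at HIGH priority by adding the priority option to the format parameter (it assumes a credential named OCI_CRED and a directory object named BULK_TEST already exist; both are described in the sections that follow):

BEGIN
DBMS_CLOUD.BULK_DOWNLOAD (
      credential_name => 'OCI_CRED',
      location_uri    => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o',
      directory_name  => 'BULK_TEST',
      format          => JSON_OBJECT ('priority' value 'HIGH')
);
END;
/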

Bulk Copy Files in Cloud Object Storage

Use the DBMS_CLOUD.BULK_COPY procedure to bulk copy files from one Object Store bucket or folder to another bucket or folder.

  1. Create a credential object to access the source location.

    The source credential name, as specified with the source_credential_name parameter, is by default also used as the credential for the target location.

    See CREATE_CREDENTIAL Procedure for more information.
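
    For example, a minimal sketch that creates the OCI_CRED credential used in the examples in this chapter (the username and the authentication token value are placeholders):

    BEGIN
    DBMS_CLOUD.CREATE_CREDENTIAL (
          credential_name => 'OCI_CRED',
          username        => 'oci_user@example.com',
          password        => 'auth_token_value'
    );
    END;
    /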

  2. When the source and target are in distinct Object Stores or belong to different accounts with the same cloud provider, create a credential to access the target location and include the target_credential_name parameter.
  3. Run the DBMS_CLOUD.BULK_COPY procedure to copy files in parallel from one Object Store bucket or folder to another bucket or folder, which can be across cloud providers, accounts, and buckets. To run the procedure, you must be logged in as the ADMIN user or have the EXECUTE privilege on DBMS_CLOUD.
    BEGIN 
    DBMS_CLOUD.BULK_COPY (
          source_credential_name => 'OCI_CRED',
          source_location_uri    => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname1/o',
          target_location_uri    => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname2/o',
          format                 => JSON_OBJECT ('logretention' value 7, 'logprefix' value 'BULKOP')
    );
    END;
    /

    This example bulk copies files from one Oracle Cloud Infrastructure Object Storage bucket to another.

    See BULK_COPY Procedure for more information.

    See DBMS_CLOUD URI Formats for more information.

For example, use DBMS_CLOUD.BULK_COPY to copy files from Amazon S3 to Oracle Cloud Infrastructure Object Storage.

BEGIN
DBMS_CLOUD.BULK_COPY(      
      source_credential_name  => 'AWS_CRED',
      source_location_uri     => 'https://bucketname.s3-us-west-2.amazonaws.com/',
      target_credential_name  => 'OCI_CRED',
      target_location_uri     => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname2/o',
      format                  =>  JSON_OBJECT ('logretention' value 7, 'logprefix' value 'BULKOP')
);
END;
/

Bulk Move Files Across Cloud Object Storage

Use the DBMS_CLOUD.BULK_MOVE procedure to bulk move files from one Cloud Object Storage location to another.

The first step in moving files is copying them to the target location. After the files are successfully copied, they are deleted from the source location.

The files are renamed rather than copied if the Object Store allows renaming operations between the source and target locations.

  1. Create a credential object to access the source location.

    The source credential name, as specified with the source_credential_name parameter, is by default also used as the credential for the target location.

    See CREATE_CREDENTIAL Procedure for more information.

  2. When the source and target are in distinct Object Stores or belong to different accounts with the same cloud provider, create a credential to access the target location and include the target_credential_name parameter.
  3. Run the DBMS_CLOUD.BULK_MOVE procedure to bulk move files from one Cloud Object Storage location to another. To run the procedure, you must be logged in as the ADMIN user or have the EXECUTE privilege on DBMS_CLOUD.
    BEGIN 
    DBMS_CLOUD.BULK_MOVE (    
         source_credential_name => 'OCI_CRED',
         source_location_uri    => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname1/o',
         target_location_uri    => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname2/o',
         format                 => JSON_OBJECT ('logretention' value 7, 'logprefix' value 'BULKMOVE')
    );
    END;
    /

    This example bulk moves files from one Oracle Cloud Infrastructure Object Storage location to another.

    See BULK_MOVE Procedure for more information.

    See DBMS_CLOUD URI Formats for more information.

For example, use DBMS_CLOUD.BULK_MOVE to move files from Amazon S3 to Oracle Cloud Infrastructure Object Storage.

BEGIN
DBMS_CLOUD.BULK_MOVE(      
      source_credential_name  => 'AWS_CRED',
      source_location_uri     => 'https://bucketname.s3-us-west-2.amazonaws.com/',
      target_credential_name  => 'OCI_CRED',
      target_location_uri     => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname2/o',
      format                  =>  JSON_OBJECT ('logretention' value 7, 'logprefix' value 'BULKOP')
);
END;
/

Bulk Download Files from Cloud Object Storage

Use the DBMS_CLOUD.BULK_DOWNLOAD procedure to bulk download files from a Cloud Object Store location to a directory object in an Autonomous Database.

  1. Create a credential to access your Cloud Object Storage.

    The credential name is specified with the credential_name parameter.

    See CREATE_CREDENTIAL Procedure for more information.

  2. Run the DBMS_CLOUD.BULK_DOWNLOAD procedure to download files from Cloud Object Storage into an Autonomous Database directory. To run the procedure, you must be logged in as the ADMIN user or have the EXECUTE privilege on DBMS_CLOUD.
    BEGIN
    DBMS_CLOUD.BULK_DOWNLOAD (
         credential_name => 'OCI_CRED',
         location_uri    => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o',
         directory_name  => 'BULK_TEST',
         format          => JSON_OBJECT ('logretention' value 7, 'logprefix' value 'BULKOP')
     );
    END;
    /

    This example bulk downloads files from the Oracle Cloud Infrastructure Object Store location URI to the directory object in an Autonomous Database.

    Note

    To write the files in the target directory object, you must have the WRITE privilege on the directory object.
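
    For example, a sketch of the grant the ADMIN user can issue (DB_USER is a placeholder schema name):

    GRANT WRITE ON DIRECTORY BULK_TEST TO DB_USER;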

    See BULK_DOWNLOAD Procedure for more information.

    See DBMS_CLOUD URI Formats for more information.

Bulk Upload Files to Cloud Object Storage

Use the DBMS_CLOUD.BULK_UPLOAD procedure to bulk upload files from a directory object in the database to a Cloud Object Store location.

  1. Create a credential to access your Cloud Object Storage.

    The credential name is specified with the credential_name parameter.

    See CREATE_CREDENTIAL Procedure for more information.

  2. Run the DBMS_CLOUD.BULK_UPLOAD procedure to copy files into Cloud Object Storage from a database directory on your Autonomous Database instance. To run the procedure, you must be logged in as the ADMIN user or have the EXECUTE privilege on DBMS_CLOUD.
    BEGIN
    DBMS_CLOUD.BULK_UPLOAD (    
         credential_name => 'OCI_CRED',
         location_uri    => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o',
         directory_name  => 'BULK_TEST',
         format          => JSON_OBJECT ('logretention' value 5, 'logprefix' value 'BULKUPLOAD')
     );
    END;
    /

    This example bulk uploads files from a directory object, as specified with the directory_name parameter, to the Oracle Cloud Infrastructure Object Store location URI.

    Note

    To read the source files in the directory object, you must have the READ privilege on the directory object.

    See BULK_UPLOAD Procedure for more information.

Bulk Delete Files from Cloud Object Storage

Use the DBMS_CLOUD.BULK_DELETE procedure to bulk delete files from Cloud Object Storage.

  1. Create a credential to access your Cloud Object Storage.

    The credential name is specified with the credential_name parameter.

    The credential_name can be NULL for a public, pre-authenticated, or pre-signed bucket URI.

    See CREATE_CREDENTIAL Procedure for more information.

  2. Run the DBMS_CLOUD.BULK_DELETE procedure to delete files from the Cloud Object Store. To run the procedure, you must be logged in as the ADMIN user or have the EXECUTE privilege on DBMS_CLOUD.
    BEGIN
    DBMS_CLOUD.BULK_DELETE (    
         credential_name => 'OCI_CRED',
         location_uri    => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o',
         format          => JSON_OBJECT ('logretention' value 5, 'logprefix' value 'BULKDEL')
    );
    END;
    /

    This example bulk deletes files from the Oracle Cloud Infrastructure Object Store.

    See BULK_DELETE Procedure for more information.

    See DBMS_CLOUD URI Formats for more information.

Monitor and Troubleshoot Bulk File Loads

All DBMS_CLOUD data load operations are logged in the dba_load_operations and user_load_operations views.

You can use the following views to monitor and troubleshoot bulk file loads:

  • dba_load_operations: shows all load operations.

  • user_load_operations: shows the load operations in your schema.

Query these views to see information about ongoing and completed bulk file operations. For example, the following SELECT statement with a WHERE clause predicate on the TYPE column shows DOWNLOAD operations:


SELECT owner_name, type, status, start_time, update_time, status_table, rows_loaded 
   FROM user_load_operations WHERE type = 'DOWNLOAD';

OWNER_NAME TYPE     STATUS    START_TIME               UPDATE_TIME              STATUS_TABLE ROWS_LOADED 
---------- -------- --------- ------------------------ ------------------------ ------------ ----------- 
"ADMIN"    DOWNLOAD COMPLETED 2022-10-17T20:42:19.498Z 2022-10-17T20:42:21.054Z DWN$2_STATUS           4 
"ADMIN"    DOWNLOAD FAILED    2022-10-17T20:40:54.348Z 2022-10-17T20:40:55.679Z DWN$1_STATUS             

The STATUS_TABLE column shows the name of the table you can query to look at detailed logging information for the bulk download operation. For example:

DESCRIBE DWN$2_STATUS
Name          Null?    Type                        
------------- -------- --------------------------- 
ID            NOT NULL NUMBER                      
NAME                   VARCHAR2(4000)              
BYTES                  NUMBER                      
CHECKSUM               VARCHAR2(128)               
LAST_MODIFIED          TIMESTAMP(6) WITH TIME ZONE 
STATUS                 VARCHAR2(30)                
ERROR_CODE             NUMBER                      
ERROR_MESSAGE          VARCHAR2(4000)              
START_TIME             TIMESTAMP(6) WITH TIME ZONE 
END_TIME               TIMESTAMP(6) WITH TIME ZONE 
SID                    NUMBER                      
SERIAL#                NUMBER                      
ROWS_LOADED            NUMBER                      

SELECT id, name, bytes, status, error_message, start_time, end_time FROM DWN$2_STATUS;
ID NAME       BYTES STATUS    ERROR_MESSAGE START_TIME               END_TIME                 
-- ---------- ----- --------- ------------- ------------------------ ------------------------ 
 1 trees.txt     58 COMPLETED               2022-10-17T20:42:19.998Z 2022-10-17T20:42:20.421Z 
 2 trees1.txt    58 COMPLETED               2022-10-17T20:42:20.425Z 2022-10-17T20:42:20.533Z 
 3 trees2.txt    58 COMPLETED               2022-10-17T20:42:20.535Z 2022-10-17T20:42:20.894Z 
 4 trees3.txt    58 COMPLETED               2022-10-17T20:42:20.896Z 2022-10-17T20:42:21.043Z

The status table shows each file name and its status for the bulk operation.

The relevant error number and message are recorded in the status table if an operation on a specific file fails.

For completed operations, the time needed for each file can be calculated from the reported START_TIME and END_TIME values.
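
For example, a sketch of a query against the status table from the download example above that lists each file with its computed elapsed time and surfaces any per-file errors:

SELECT name, status, error_code, error_message,
       (end_time - start_time) AS elapsed_time
  FROM DWN$2_STATUS
 ORDER BY start_time;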

The file operation STATUS column can have one of the following values:

File Status  Description
-----------  ---------------------------------------------------------
COMPLETED    File operation completed successfully.
FAILED       File operation failed; it may be retried up to two times.
PENDING      File operation has not yet started.
RUNNING      File operation is currently in progress.
SKIPPED      File operation was skipped.

If any file operation fails after two retry attempts, then the bulk operation is marked as failed and an error is raised. For example:

ORA-20003: Operation failed, please query table DOWNLOAD$2_STATUS for error details

When you use a DBMS_CLOUD bulk file operation, the following format parameter options control the status tables:

  • logretention: Specifies an integer value that determines the duration in days that the status table is retained. The default value is 2 days.

  • logprefix: Specifies a string value that determines the bulk operation status table's name prefix.

    Each bulk operation has its own default value for the logprefix option:

    Procedure                 Default Value for logprefix
    ------------------------  ---------------------------
    DBMS_CLOUD.BULK_COPY      COPYOBJ
    DBMS_CLOUD.BULK_DELETE    DELETE
    DBMS_CLOUD.BULK_DOWNLOAD  DOWNLOAD
    DBMS_CLOUD.BULK_MOVE      MOVE
    DBMS_CLOUD.BULK_UPLOAD    UPLOAD

See DELETE_ALL_OPERATIONS Procedure for information on clearing the user_load_operations table.
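
For example, a minimal sketch that clears the logged download operations, assuming the type value matches the TYPE column shown in user_load_operations:

BEGIN
DBMS_CLOUD.DELETE_ALL_OPERATIONS (type => 'DOWNLOAD');
END;
/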

Notes for Bulk File Operations

The following notes apply to the credentials you use with bulk file operations.

  • You may specify principals as the credential_name, source_credential_name, or target_credential_name parameter to access cloud resources securely without storing user credentials. The supported principals are:

    • Oracle Cloud Infrastructure Resource Principals
    • AWS Amazon Resource Names (ARNs)
    • Azure service principals
    • Google service accounts

    See Configure Policies and Roles to Access Resources for more information.
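
    For example, with Oracle Cloud Infrastructure Resource Principal enabled for your database, a sketch of a bulk delete that uses the OCI$RESOURCE_PRINCIPAL credential instead of a stored credential (the bucket URI is a placeholder):

    BEGIN
    DBMS_CLOUD.BULK_DELETE (
          credential_name => 'OCI$RESOURCE_PRINCIPAL',
          location_uri    => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o'
    );
    END;
    /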

  • The credential_name, source_credential_name, or target_credential_name parameter can be NULL for a public, pre-authenticated, or pre-signed bucket URI.
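
    For example, a sketch of a bulk download from a pre-authenticated request (PAR) URI that passes NULL as the credential (the PAR token and bucket URI are placeholders):

    BEGIN
    DBMS_CLOUD.BULK_DOWNLOAD (
          credential_name => NULL,
          location_uri    => 'https://objectstorage.us-phoenix-1.oraclecloud.com/p/par-token/n/namespace-string/b/bucketname/o',
          directory_name  => 'BULK_TEST'
    );
    END;
    /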