DBMS_CLOUD_PIPELINE Package
The DBMS_CLOUD_PIPELINE package allows you to create data pipelines for loading and exporting data in the cloud. The package supports continuous incremental data load of files in object store into the database, and continuous incremental export of table data or query results from the database to object store based on a timestamp column.
- Prerequisites
- Summary of DBMS_CLOUD_PIPELINE Subprograms
  This table summarizes the subprograms included in the DBMS_CLOUD_PIPELINE package.
- CREATE_PIPELINE Procedure
  The procedure creates a new data pipeline.
- DROP_PIPELINE Procedure
  The procedure drops an existing data pipeline. If a pipeline has been started, then it must be stopped before it can be dropped.
- RESET_PIPELINE Procedure
  Resets the tracking state of a data pipeline. Use reset pipeline to restart the pipeline from the initial state of data load or export. Optionally, reset pipeline can purge data in the database or in object store, depending on the type of pipeline. A data pipeline must be in stopped state to reset it.
- RUN_PIPELINE_ONCE Procedure
  This procedure performs an on-demand run of the pipeline in the current foreground session, instead of in a scheduled job. Use DBMS_CLOUD_PIPELINE.RUN_PIPELINE_ONCE to test a pipeline before you start the pipeline as a continuous job.
- SET_ATTRIBUTE Procedure
  This procedure sets pipeline attributes. There are two overloaded procedures, one to set a single attribute and another to set multiple attributes using a JSON document of attribute name/value pairs.
- START_PIPELINE Procedure
  The procedure starts the data pipeline. When a pipeline is started, the pipeline operation runs continuously in a scheduled job according to the interval configured with the pipeline attributes.
- STOP_PIPELINE Procedure
  The procedure stops the data pipeline. When a pipeline is stopped, no future jobs are scheduled for the pipeline.
- DBMS_CLOUD_PIPELINE Attributes
  Attributes help to control and configure the behavior of a data pipeline.
Prerequisites
As a developer, you can use DBMS_CLOUD procedures with Autonomous Databases deployed on Oracle Public Cloud or Exadata Cloud@Customer.
Depending on the deployment choice, the following prerequisites must be met to use the DBMS_CLOUD procedures with Amazon S3, Azure Blob Storage, and Google Cloud Storage service providers.
- Create a NAT gateway in the Virtual Cloud Network (VCN) where your Autonomous Database resources reside by following the instructions in Create a NAT Gateway in Oracle Cloud Infrastructure Documentation.
- After creating the NAT gateway, add a route rule and an egress security rule to each subnet (in the VCN) where Autonomous Database resources reside so that these resources can use the gateway to reach the cloud service provider:
  - Go to the Subnet Details page for the subnet.
  - In the Subnet Information tab, click the name of the subnet's Route Table to display its Route Table Details page.
  - In the table of existing Route Rules, check whether there is already a rule with the following characteristics:
    - Destination: 0.0.0.0/0
    - Target Type: NAT Gateway
    - Target: The name of the NAT gateway you just created in the VCN

    If such a rule does not exist, click Add Route Rules and add a route rule with these characteristics.
  - Return to the Subnet Details page for the subnet.
  - In the subnet's Security Lists table, click the name of the subnet's security list to display its Security List Details page.
  - In the side menu, under Resources, click Egress Rules.
  - In the table of existing Egress Rules, check whether there is already a rule with the following characteristics:
    - Destination Type: CIDR
    - Destination: 0.0.0.0/0
    - IP Protocol: TCP
    - Source Port Range: 443
    - Destination Port Range: All

    If such a rule does not exist, click Add Egress Rules and add an egress rule with these characteristics.
The HTTP Proxy settings in your environment must allow the database to access the cloud service provider.
The network configuration, including the HTTP Proxy, can be edited only while the Exadata Infrastructure is in the Requires Activation state. Once it is activated, you cannot edit those settings.
Setting up an HTTP Proxy for an already provisioned Exadata Infrastructure requires a Service Request (SR) in My Oracle Support. See Create a Service Request in My Oracle Support for details.
Summary of DBMS_CLOUD_PIPELINE Subprograms
This table summarizes the subprograms included in the DBMS_CLOUD_PIPELINE package.
Subprogram | Description |
---|---|
CREATE_PIPELINE Procedure | Creates a new data pipeline. |
DROP_PIPELINE Procedure | Drops an existing data pipeline. |
RESET_PIPELINE Procedure | Resets the tracking state of a data pipeline. Use reset pipeline to restart the pipeline from the initial state of data load or export. Optionally, reset pipeline can purge data in the database or in object store, depending on the type of pipeline. |
RUN_PIPELINE_ONCE Procedure | Performs an on-demand run of the pipeline in the current foreground session, instead of in a scheduled job. |
SET_ATTRIBUTE Procedure | Sets pipeline attributes. There are two overloaded procedures, one to set a single attribute and another to set multiple attributes using a JSON document of attribute name/value pairs. |
START_PIPELINE Procedure | Starts the data pipeline. When a pipeline is started, the pipeline operation runs continuously in a scheduled job according to the interval configured in the pipeline attributes. |
STOP_PIPELINE Procedure | Stops the data pipeline. When a pipeline is stopped, no future jobs are scheduled for the pipeline. |
CREATE_PIPELINE Procedure
The procedure creates a new data pipeline.
Syntax
DBMS_CLOUD_PIPELINE.CREATE_PIPELINE(
    pipeline_name  IN VARCHAR2,
    pipeline_type  IN VARCHAR2,
    attributes     IN CLOB     DEFAULT NULL,
    description    IN VARCHAR2 DEFAULT NULL
);
Parameters
Parameter | Description |
---|---|
pipeline_name | Specifies a name for the pipeline. The pipeline name must follow the naming rules of Oracle SQL identifiers. See Identifiers for more information. This parameter is mandatory. |
pipeline_type | Specifies the pipeline type. Valid values: LOAD, EXPORT. This parameter is mandatory. |
attributes | Pipeline attributes in JSON format. Default value: NULL. See DBMS_CLOUD_PIPELINE Attributes for more information. |
description | Description for the pipeline. Default value: NULL |
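For example, the following anonymous block is a minimal sketch that creates a load pipeline; the pipeline name and description are hypothetical:

BEGIN
  DBMS_CLOUD_PIPELINE.CREATE_PIPELINE(
      pipeline_name => 'MY_LOAD_PIPE',
      pipeline_type => 'LOAD',
      description   => 'Load CSV files from object store into a table'
  );
END;
/

Attributes such as the object store location can be supplied in the attributes parameter at creation time, or set later with the SET_ATTRIBUTE Procedure.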
DROP_PIPELINE Procedure
The procedure drops an existing data pipeline. If a pipeline has been started, then it must be stopped before it can be dropped.
Syntax
DBMS_CLOUD_PIPELINE.DROP_PIPELINE(
    pipeline_name IN VARCHAR2,
    force         IN BOOLEAN DEFAULT FALSE
);
Parameters
Parameter | Description |
---|---|
pipeline_name | Specifies a pipeline name. This parameter is mandatory. |
force | Forcibly drop a pipeline, even if it is in started state. Valid values: TRUE, FALSE. Default value: FALSE |
Usage Note

- To drop a pipeline that is in started state, set the force parameter to TRUE.
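For example, the following sketch force-drops a pipeline that is still in started state; the pipeline name is hypothetical:

BEGIN
  DBMS_CLOUD_PIPELINE.DROP_PIPELINE(
      pipeline_name => 'MY_LOAD_PIPE',
      force         => TRUE
  );
END;
/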
RESET_PIPELINE Procedure
Resets the tracking state of a data pipeline. Use reset pipeline to restart the pipeline from the initial state of data load or export. Optionally reset pipeline can purge data in the database or in object store, depending on the type of pipeline. A data pipeline must be in stopped state to reset it.
Syntax
DBMS_CLOUD_PIPELINE.RESET_PIPELINE(
    pipeline_name IN VARCHAR2,
    purge_data    IN BOOLEAN DEFAULT FALSE
);
Parameters
Parameter | Description |
---|---|
pipeline_name | Specifies a name for the pipeline. This parameter is mandatory. |
purge_data | Purge data applies for either a load pipeline or an export pipeline: for a load pipeline, TRUE truncates the data in the pipeline's database table; for an export pipeline, TRUE deletes the existing files in the object store location. Valid values: TRUE, FALSE. Default value: FALSE |
Usage Notes

- A data pipeline must be in stopped state to reset it. See STOP_PIPELINE Procedure for more information.
- For a load pipeline, resetting the pipeline clears the record of the files being loaded by the pipeline. When you call either START_PIPELINE or RUN_PIPELINE_ONCE after resetting a load pipeline, the pipeline repeats the data load and includes all the files present in the object store location.

  When purge_data is set to TRUE, DBMS_CLOUD_PIPELINE.RESET_PIPELINE does the following:
  - Truncates the data in the pipeline's database table specified with the table_name attribute.
  - Drops the pipeline's status table, and the pipeline's bad file table and error table (if they exist).
- For an export pipeline, resetting the pipeline clears the last tracked data in the database table. When you call either START_PIPELINE or RUN_PIPELINE_ONCE after resetting an export pipeline, the pipeline repeats exporting data from the table or query.

  When purge_data is set to TRUE, DBMS_CLOUD_PIPELINE.RESET_PIPELINE deletes the existing files in the object store location specified with the location attribute.
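For example, the following sketch resets a stopped load pipeline and purges the data already loaded into its database table; the pipeline name is hypothetical:

BEGIN
  DBMS_CLOUD_PIPELINE.RESET_PIPELINE(
      pipeline_name => 'MY_LOAD_PIPE',
      purge_data    => TRUE
  );
END;
/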
RUN_PIPELINE_ONCE Procedure
This procedure performs an on-demand run of the pipeline in the current foreground session, instead of in a scheduled job. Use DBMS_CLOUD_PIPELINE.RUN_PIPELINE_ONCE to test a pipeline before you start the pipeline as a continuous job.
Syntax
DBMS_CLOUD_PIPELINE.RUN_PIPELINE_ONCE(
    pipeline_name IN VARCHAR2
);
Parameters
Parameter | Description |
---|---|
pipeline_name | Specifies a name for the pipeline to run. This parameter is mandatory. |
Usage Notes

- After you perform a test run of a pipeline, you can reset the pipeline state using DBMS_CLOUD_PIPELINE.RESET_PIPELINE. This allows you to reset the pipeline state before you start the pipeline in a scheduled job.
- If a pipeline is in the started state, then it cannot be run in the foreground session.
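For example, the following sketch runs a pipeline once in the current session to verify its configuration; the pipeline name is hypothetical:

BEGIN
  DBMS_CLOUD_PIPELINE.RUN_PIPELINE_ONCE(
      pipeline_name => 'MY_LOAD_PIPE'
  );
END;
/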
SET_ATTRIBUTE Procedure
This procedure sets pipeline attributes. There are two overloaded procedures, one to set a single attribute and another to set multiple attributes using a JSON document of attribute name/value pairs.
Syntax
DBMS_CLOUD_PIPELINE.SET_ATTRIBUTE(
    pipeline_name   IN VARCHAR2,
    attribute_name  IN VARCHAR2,
    attribute_value IN CLOB
);

DBMS_CLOUD_PIPELINE.SET_ATTRIBUTE(
    pipeline_name IN VARCHAR2,
    attributes    IN CLOB
);
Parameters
Parameter | Description |
---|---|
pipeline_name | Specifies the name of the pipeline for which to set attributes. This parameter is mandatory. |
attribute_name | Specifies the name of the attribute to be set. See DBMS_CLOUD_PIPELINE Attributes for more information. |
attribute_value | Specifies the value for the pipeline attribute to set. See DBMS_CLOUD_PIPELINE Attributes for more information. |
attributes | Specifies a JSON document containing attribute names and values. See DBMS_CLOUD_PIPELINE Attributes for more information. |
Usage Note

- When you use DBMS_CLOUD_PIPELINE.SET_ATTRIBUTE to set multiple attributes with the attributes parameter, all the existing attributes are deleted and overwritten with the specified attributes from the JSON document.
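For example, the first call below sets a single attribute, and the second replaces all attributes with a JSON document; the pipeline name, credential, location URL, and table name are hypothetical placeholders:

BEGIN
  -- Set a single attribute
  DBMS_CLOUD_PIPELINE.SET_ATTRIBUTE(
      pipeline_name   => 'MY_LOAD_PIPE',
      attribute_name  => 'interval',
      attribute_value => '20'
  );

  -- Set multiple attributes; this deletes and overwrites all existing attributes
  DBMS_CLOUD_PIPELINE.SET_ATTRIBUTE(
      pipeline_name => 'MY_LOAD_PIPE',
      attributes    => JSON_OBJECT(
          'credential_name' VALUE 'OBJECT_STORE_CRED',
          'location'        VALUE 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace/b/bucket/o/',
          'table_name'      VALUE 'EMPLOYEES',
          'format'          VALUE '{"type": "csv"}',
          'interval'        VALUE '20')
  );
END;
/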
START_PIPELINE Procedure
The procedure starts the data pipeline. When a pipeline is started, the pipeline operation runs continuously in a scheduled job according to the interval configured with the pipeline attributes.
Syntax
DBMS_CLOUD_PIPELINE.START_PIPELINE(
    pipeline_name IN VARCHAR2,
    start_date    IN TIMESTAMP WITH TIME ZONE DEFAULT NULL
);
Parameters
Parameter | Description |
---|---|
pipeline_name | Specifies a name for the pipeline. This parameter is mandatory. |
start_date | Specifies the starting date for the pipeline job. Default value: NULL |
Usage Notes

- By default, a pipeline job begins immediately, as soon as the pipeline is started. To start a pipeline job at a later time, specify a valid date or timestamp using the start_date parameter.
- See DBMS_CLOUD_PIPELINE Attributes for information on the pipeline interval and other pipeline attributes.
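For example, the following sketch starts a pipeline whose first scheduled job begins one day from now; the pipeline name is hypothetical:

BEGIN
  DBMS_CLOUD_PIPELINE.START_PIPELINE(
      pipeline_name => 'MY_LOAD_PIPE',
      start_date    => SYSTIMESTAMP + INTERVAL '1' DAY
  );
END;
/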
STOP_PIPELINE Procedure
The procedure stops the data pipeline. When a pipeline is stopped, no future jobs are scheduled for the pipeline.
Syntax
DBMS_CLOUD_PIPELINE.STOP_PIPELINE(
    pipeline_name IN VARCHAR2,
    force         IN BOOLEAN DEFAULT FALSE
);
Parameters
Parameter | Description |
---|---|
pipeline_name | Specifies a name for the pipeline. This parameter is mandatory. |
force | If the force parameter is passed as TRUE, then any running jobs for the pipeline are terminated. Valid values: TRUE, FALSE. Default value: FALSE |
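For example, the following sketch stops a pipeline and terminates any job currently running for it; the pipeline name is hypothetical:

BEGIN
  DBMS_CLOUD_PIPELINE.STOP_PIPELINE(
      pipeline_name => 'MY_LOAD_PIPE',
      force         => TRUE
  );
END;
/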
DBMS_CLOUD_PIPELINE Attributes
Attributes help to control and configure the behavior of a data pipeline.
Attributes
As indicated in the Pipeline Type column, depending on the pipeline type, LOAD or EXPORT, a pipeline supports a different set of attributes.
Attribute Name | Description | Pipeline Type | Modifiable After Pipeline Starts |
---|---|---|---|
credential_name | The name of the credential to access the source Cloud Object Storage. Default value: NULL. If you do not supply a credential_name value, credential_name is set to NULL. | LOAD, EXPORT | Yes |
field_list | Identifies the fields in the source files and their data types. This argument's syntax is the same as the field_list clause in regular Oracle external tables. Default value: NULL. The default value specifies that the fields and their data types are determined by the columns in the table specified in the table_name attribute. | LOAD | Yes |
format | The options describing the format for the type of pipeline. Datapump format is not supported. This attribute is mandatory for both LOAD and EXPORT pipelines. | LOAD, EXPORT | Yes |
interval | The time interval in minutes between consecutive executions of the scheduled pipeline job. Default value: 15 minutes | LOAD, EXPORT | Yes |
key_column | A timestamp or date column in the specified table_name or query. Default value: NULL. If key_column is not supplied, the entire contents of the table or query are exported to object store in each pipeline job execution. | EXPORT | No |
location | Specifies a URI that points to an Object Storage location. The format of the URI depends on the Cloud Object Storage service. See Cloud Object Storage URI Formats for more information. This attribute is mandatory for both LOAD and EXPORT pipelines. | LOAD, EXPORT | No |
priority | Specifies a string value that determines the number of parallel operations performed for the pipeline. An operation with a higher priority consumes more database resources and is completed sooner. Valid values: HIGH, MEDIUM, LOW. Default value: MEDIUM. The maximum number of concurrent file operations is limited to 64. | LOAD, EXPORT | Yes |
query | Specifies a SELECT statement so that only the required data is exported. For example: SELECT warehouse_id, quantity FROM inventories. Default value: NULL. For an export pipeline, either table_name or query is mandatory. | EXPORT | No |
table_name | Specifies the name of the target table for loading or exporting data. For a load pipeline, table_name is mandatory. For an export pipeline, either table_name or query is mandatory. | LOAD, EXPORT | No |
table_owner | The name of the schema where the target table resides for loading or exporting data. Default value: NULL. With a NULL value, the target table is in the same schema as the user running the procedure. | LOAD, EXPORT | No |
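For example, the following sketch configures an export pipeline that incrementally exports new rows based on a timestamp column; the pipeline name, credential, location URL, table, and column are hypothetical placeholders:

BEGIN
  DBMS_CLOUD_PIPELINE.SET_ATTRIBUTE(
      pipeline_name => 'MY_EXPORT_PIPE',
      attributes    => JSON_OBJECT(
          'credential_name' VALUE 'OBJECT_STORE_CRED',
          'location'        VALUE 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace/b/bucket/o/',
          'table_name'      VALUE 'SALES_EVENTS',
          'key_column'      VALUE 'EVENT_TS',  -- only rows newer than the tracked timestamp are exported
          'format'          VALUE '{"type": "json"}',
          'priority'        VALUE 'MEDIUM',
          'interval'        VALUE '30')
  );
END;
/

Because key_column, location, table_name, and query are not modifiable after the pipeline starts, set them before calling START_PIPELINE.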