Query External Data

To query data in files in the Cloud, you need to first store your object storage credentials in your Autonomous Database, and then create an external table using the PL/SQL procedure DBMS_CLOUD.CREATE_EXTERNAL_TABLE.

You can also use the procedure DBMS_CLOUD.CREATE_EXTERNAL_TABLE to query external data in attached file systems or in the local file system.

The procedure DBMS_CLOUD.CREATE_EXTERNAL_TABLE supports external files in the supported cloud object storage services, including:

  • Oracle Cloud Infrastructure Object Storage

  • Azure Blob Storage

  • Amazon S3

  • Amazon S3-Compatible, including: Oracle Cloud Infrastructure Object Storage, Google Cloud Storage, and Wasabi Hot Cloud Storage.

  • GitHub Repository

The source file in this example, channels.txt, has the following data:

S,Direct Sales,Direct
T,Tele Sales,Direct
C,Catalog,Indirect
I,Internet,Indirect
P,Partners,Others
  1. Store your object store credentials using the procedure DBMS_CLOUD.CREATE_CREDENTIAL.

    For example:

    BEGIN
      DBMS_CLOUD.CREATE_CREDENTIAL(
        credential_name => 'DEF_CRED_NAME',
        username => 'adb_user@example.com',
        password => 'password' );
    END;
    /
    

    Creating a credential to access Oracle Cloud Infrastructure Object Store is not required if you enable resource principal credentials. See Use Resource Principal to Access Oracle Cloud Infrastructure Resources for more information.

    This operation stores the credentials in the database in an encrypted format. You can use any name for the credential name. Note that this step is required only once unless your object store credentials change. Once you store the credentials you can then use the same credential name for creating external tables.

    See CREATE_CREDENTIAL Procedure for information about the username and password parameters for different object storage services.

  2. Create an external table on top of your source files using the procedure  DBMS_CLOUD.CREATE_EXTERNAL_TABLE.

    The procedure DBMS_CLOUD.CREATE_EXTERNAL_TABLE supports external files in the supported cloud object storage services. The credential is a table level property; therefore, the external files must be on the same object store.

    For example:

    BEGIN
       DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
        table_name =>'CHANNELS_EXT',
        credential_name =>'DEF_CRED_NAME',
        file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/channels.txt',
        format => json_object('delimiter' value ','),
        column_list => 'CHANNEL_ID VARCHAR2(2), CHANNEL_DESC VARCHAR2(20), CHANNEL_CLASS VARCHAR2(20)' );
    END;
    /
    

    The parameters are:

    • table_name: is the external table name.

    • credential_name: is the name of the credential created in the previous step. The credential_name parameter must conform to Oracle object naming conventions, which do not allow spaces or hyphens.

      The credential_name parameter is not used when you specify a directory with file_uri_list. See Creating and Managing Directories on Autonomous Database for more information.

    • file_uri_list: is a comma delimited list of the source files you want to query.

    • format: defines the options you can specify to describe the format of the source file.

      If the data in your source files is encrypted, decrypt the data by specifying the format parameter with the encryption option. See Decrypt Data While Importing from Object Storage for more information on decrypting data.

    • column_list: is a comma delimited list of the column definitions in the source files.

    In this example, namespace-string is the Oracle Cloud Infrastructure object storage namespace and bucketname is the bucket name. See Understanding Object Storage Namespaces for more information.

    Note

    Autonomous Database supports a variety of source file formats, including compressed data formats. See DBMS_CLOUD Package Format Options and the DBMS_CLOUD compression format option to see the supported compression types.

    You can now run queries on the external table you created in the previous step. For example:

    SELECT count(*) FROM channels_ext;

    By default the database expects all rows in the external data file to be valid and match both the target data type definitions as well as the format definition of the files. If there are any rows in the source files that do not match the format options you specified, the query reports an error. You can use DBMS_CLOUD parameters, like rejectlimit, to suppress these errors. As an alternative, you can also validate the external table you created to see the error messages and the rejected rows so that you can change your format options accordingly. See Validate External Data for more information.

    For detailed information about the parameters, see CREATE_EXTERNAL_TABLE Procedure.

    See DBMS_CLOUD URI Formats for more information on the supported cloud object storage services.

External Table Metadata Columns

The external table metadata helps you determine where data is coming from when you perform a query.

The external tables you create with DBMS_CLOUD.CREATE_EXTERNAL_TABLE, DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE, or DBMS_CLOUD.CREATE_HYBRID_PART_TABLE include two invisible columns file$path and file$name. These columns help identify which file a record is coming from.

  • file$path: Specifies the file path text up to the beginning of the object name.

  • file$name: Specifies the object name, including all the text that follows the final "/".

For example:

SELECT genre_id, name, file$name, file$path FROM ext_genre
     WHERE rownum <= 2;

genre_id      name        file$name     file$path
--------      ---------   -----------   ----------------------
1             Action      genre.csv     https://objectstorage.us-ashburn-1.oraclecloud.com/n/namespace-string/b/moviestream_gold/o/genre
2             Adventure   genre.csv     https://objectstorage.us-ashburn-1.oraclecloud.com/n/namespace-string/b/moviestream_gold/o/genre

See Invisible Columns for more information on invisible columns.