How to use external table preprocessors to unzip files on-the-fly
We are using external tables for reading the text files. Now instead of ASCII files, if we may get compressed files as input.
Here we have two options to handle this situation
1. Uncompress files and use the same external table definition for reading them
2. Use 11gR2 preprocessor option to dynamically unzip the files and read them
So what preprocessor offers?
It offers the functionality, which is normally not offered by external tables and which is normally available as operating system utility.
As part of existing setup, we will have a directory object where the input text files will be stored. For preprocessor, we can add another directory object and store the program (normally a shell script), which handles the requirement like unzip. Although this is not mandatory, it is recommended to separate it from the input data file area.
SQL> create directory exttab_exec as '<some path>'; Directory created.
Then we need to grant the read & execute (new privilege) to the user, who will be using this directory.
SQL> grant execute, read on directory exttab_exec to scott; Grant succeeded
Create a shell file
$ cat uncomp.sh /usr/bin/uncompress -c $1 OR /bin/gunzip -c $1
where -c write output on standard output; keep original files unchanged. This will act as a pipe.
Add the execute permission to this script
$ ls -al uncomp.sh -rwxr--r-- 1 oracle oinstall 15 2011-03-09 20:53 uncomp.sh
Compress the input data file
$ gzip test.txt OR $ compress test.txt
Then re-create the external table with preprocessor clause.
SQL> create table test 2 ( 3 id number, 4 name varchar2(28) 5 ) 6 organization external 7 ( 8 type oracle_loader 9 default directory mydir 10 access parameters 11 ( 12 records delimited by newline 13 preprocessor exttab_exec:'uncomp.sh' 14 fields terminated by '|' 15 ) 16 location ('test.txt.Z') 17 );
Check if we can read from the table.
SQL> select * from test; ID NAME ---------- ---------------------------- 1001 name1 1002 name2 1003 name3 1004 name4 1005 name5
Works as expected. This is one simple example, but we can achieve many other things using preprocessor.