Sunday, May 4, 2008

Optimize Informatica ETL Incremental Mappings

Some of the ETL mapping for Incremental run with EBSO as Source database really runs very slow. The SELECT statement against the source database itself is very slow and with huge database, rows fetching started in 8-10 hours.

Based on our experience we have seen that Incremental SQL queries use the OR operator. Oracle Optimizer perform much better with UNION operator than OR.

In order to optimize Incremental ETL queries against the huge database, test the queries with UNION operator instead of OR.

If UNION also does not perform well, test UNION ALL. UNION ALL will pull the duplicates but you can always remove the duplicates from staging table after the ETL run is done. An example of the SQL to delete the duplicate rows are


DELETE FROM W_SALES_INVOICE_LINE_FS A WHERE ROWID > (SELECT MIN(ROWID) FROM W_SALES_INVOICE_LINE_FS B WHERE A.INTEGRATION_ID = B.INTEGRATION_ID AND A.DATASOURCE_NUM_ID = B.DATASOURCE_NUM_ID)

No comments: