Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
bjsellers57
Contributor II
Contributor II

QlikView Oracle Optimization

 

I have an Oracle database of a lot of data to pull data per month (approx. 2.5 million records per month) to pull from Oracle to individual monthly files (six years’ worth).  I have two ways I was testing the pull of the Oracle data to QVDs: 1) I pulled all the data with one SQL Select statement with 6 sub-queries (different selection criteria in each sub-query) 55 rows of data per month filter; 2) I pulled the main data (48 rows) separately with one SQL Select statement filter by a given month, then created separate single pull (7 additional rows) from the Oracle table and joined back to the resident table prior to saving the QVD.

 

The single pull (1) from the Oracle table took 4.5 hours, whereas the multiple pulls (2) with joins took between 20 min. and 25 min.  What is going on behind scenes/optimization that makes the single pull of data with the sub-queries take so long?  I noticed a comment in the community suggests pulling large about of data through SQL-Plus to text was very fast, then they pull the text file back in to QlikView.   I ran the SQL Select with the sub-queries in SQL Developer and it took 3 min. to run and pull the data.

 

7 Replies
petter
Partner - Champion III
Partner - Champion III

I have an Oracle database of a lot of data to pull data per month (approx. 2.5 million records per month) to pull from Oracle to individual monthly files (six years’ worth).

  • So is that 2.5 million source rows that result in 2.5 million result rows? x 12 x 6 ? = 360 million result rows?



I have two ways I was testing the pull of the Oracle data to QVDs: 1) I pulled all the data with one SQL Select statement with 6 sub-queries (different selection criteria in each sub-query) 55 rows of data per month filter;

  • Is this a single compound Oracle SQL statement with 6 sub-queries as a part of the SQL? What do you mean by 55 rows of data per month filter? Is that a joined table that filters the resulting rows?


2) I pulled the main data (48 rows) separately with one SQL Select statement filter by a given month, then created separate single pull (7 additional rows) from the Oracle table and joined back to the resident table prior to saving the QVD.

The single pull (1) from the Oracle table took 4.5 hours, whereas the multiple pulls (2) with joins took between 20 min. and 25 min.  What is going on behind scenes/optimization that makes the single pull of data with the sub-queries take so long?  I noticed a comment in the community suggests pulling large about of data through SQL-Plus to text was very fast, then they pull the text file back in to QlikView.   I ran the SQL Select with the sub-queries in SQL Developer and it took 3 min. to run and pull the data.


  • If you run a single SQL query with multiple subqueries QlikView does not interfer or interact with what is going on at the Oracle Server - the slow-down from 3 minutes down to 4.5 hours seems to be due to the inefficiency or configuration of the ODBC / OLEDB drivers and or your infrastructure between the QlikView Server and Oracle Server

bjsellers57
Contributor II
Contributor II
Author

Peter,

Yes there are 2.5 million records per month of data. I keep in separate QVD 72 in all allowing the reporting application to pick and choose which months are needing for reporting on. As or Rows, I should have said Columns of data thinking in terms of a database

What I wanted to do in lieu of hitting the database several times is make one pass on the database, thus in scenario 2 I took the sub queries from the separate data pulls used in scenario 1 and added them to the main pull making one pass to the Oracle Database.

I pass the parameters for the vRepCycle and vOutputFile to the function in a Do while building 72 separate QVDs, one for each month

Unfortunately scenario 2 takes way too long to execute

Scenario 1

LET vQVDPath = '..\Internal\DataSources\CDB\';

LET vRepCycle = 20150701;

Let vFileName = Peek('FileName',$(i),'LOAN_INFO_QVD_LIST');

DATA_DYNAMIC:

LOAD

LDI_KEY,

RepCycle,

REPORTED_ACTION_CODE,

Field1,

Field2,

Field3,

......,

Field48

;

SQL SELECT

LDI.LDI_KEY,

LDI.REP_CYCLE,

LDI_REPORTED_ACTION_CODE,

LDI.Field1,

LDI.Field2,

LDI.Field3,

LDI. ... ,

LDI.Field48,

FROM DATA_DYNAMIC LDI

JOIN DATA_STATIC on DATA_STATIC.INTERNAL_ID = LDI.INTERNAL_ID

JOIN POOL_DATA POI on POI.POOL_NBR = LDI.POOL_NBR

JOIN SERIES_DATA SRI on SRI.SERIES_NBR = POI.SERIES_NBR

WHERE LDI.REP_CYCLE = $(vRepCycle)

;

Join (DATA_DYNAMIC)

LOAD

LDI_KEY,

Field49,

Field50

;

SQL SELECT

LDI_KEY,

RepCycle,

(SELECT BK_CODE FROM (SELECT internal_id as ilid, BK_FILED_DATE, BK_EXIT_DATE, BK_CODE FROM BK order by BK_FILED_DATE DESC)

WHERE ilid=LDI.INTERNAL_ID AND TRUNC(BK_FILED_DATE,'MM') <= LDI.CYCLE_DATE AND (TRUNC(BK_EXIT_DATE,'MM') >= LDI.CYCLE_DATE or BK_EXIT_DATE IS NULL) and ROWNUM=1) as “Field49”,

(SELECT BK_FILED_DATE FROM (SELECT internal_id as ilid, BK_FILED_DATE, BK_EXIT_DATE, BK_CHAPTER_CODE FROM BK order by BK_FILED_DATE DESC)

WHERE ilid=LDI.INTERNAL_ID AND TRUNC(BK_FILED_DATE,'MM') <= LDI.CYCLE_DATE AND (TRUNC(BK_EXIT_DATE,'MM') >= LDI.CYCLE_DATE or BK_EXIT_DATE IS NULL) and ROWNUM=1) as “Field50”

FROM DATA_DYNAMIC LDI

WHERE LDI.REPORTED_ACTION_CODE='51' AND LDI.REP_CYCLE = $(vRepCycle)

;

JOIN (DATA_DYNAMIC)

LOAD

LDI_KEY,

Field51

;

SQL SELECT

LDI.LDI_KEY,

LDI.CYCLE_DATE,

(SELECT trunc(ATT_REFERRAL_DATE) FROM (SELECT internal_id as ilid, ATT_REFERRAL_DATE, FRCLSR_EXIT_DATE FROM FORE order by ATT_REFERRAL_DATE DESC)

WHERE ilid=ldi.INTERNAL_ID AND TRUNC(ATT_REFERRAL_DATE,'MM') <= ldi.CYCLE_DATE AND (TRUNC(FRC_EXIT_DATE,'MM') >= ldi.CYCLE_DATE or FRC_EXIT_DATE IS NULL) and ROWNUM=1) as "Field51"

FROM DATA_DYNAMIC LDI

WHERE LDI.REPORTED_ACTION_CODE in ('3','17','27') AND LDI.REP_CYCLE = $(vRepCycle)

;

Join (DATA_DYNAMIC)

LOAD LDI_KEY,

Field52

;

SQL SELECT

LDI.LDI_KEY,

LDI.CYCLE_DATE,

(SELECT trunc(FRC_SALE_DATE) FROM (SELECT internal_id as ilid, ATT_REFERRAL_DATE, FRC_SALE_DATE FROM FORE order by FRC_SALE_DATE DESC)

WHERE ilid=ldi.INTERNAL_ID AND TRUNC(FRC_SALE_DATE,'MM') <= ldi.CYCLE_DATE AND FRC_SALE_DATE is not null and ROWNUM=1) as "Field52"

FROM DATA_DYNAMIC LDI

WHERE LDI.REPORTED_ACTION_CODE='7' AND LDI.REP_CYCLE = $(vRepCycle)

;

STORE DATA_DYNAMIC into $(vQVDPath)$(vOutputFile);

DROP Table DATA_DYNAMIC;

Scenario 2

LET vQVDPath = '..\Internal\DataSources\CDB\';

LET vRepCycle = 20150701;

Let vFileName = Peek('FileName',$(i),'LOAN_INFO_QVD_LIST');

DATA_DYNAMIC:

LOAD

LDI_KEY,

RepCycle,

REPORTED_ACTION_CODE,

Field1,

Field2,

Field3,

......,

Field48

;

SQL SELECT

LDI.LDI_KEY,

LDI.REP_CYCLE,

LDI_REPORTED_ACTION_CODE,

LDI.Field1,

LDI.Field2,

LDI.Field3,

LDI. ,

LDI.Field48,

(SELECT BK_CODE FROM (SELECT internal_id as ilid, BK_FILED_DATE, BK_EXIT_DATE, BK_CODE FROM BK order by BK_FILED_DATE DESC)

WHERE LDI.REPORTED_ACTION_CODE=51 AND ilid=LDI.INTERNAL_ID AND TRUNC(BK_FILED_DATE,'MM') <= LDI.CYCLE_DATE AND (TRUNC(BK_EXIT_DATE,'MM') >= LDI.CYCLE_DATE or BK_EXIT_DATE IS NULL) and ROWNUM=1) as “Field49”,

(SELECT BK_FILED_DATE FROM (SELECT internal_id as ilid, BK_FILED_DATE, BK_EXIT_DATE, BK_CHAPTER_CODE FROM BK order by BK_FILED_DATE DESC)

WHERE LDI.REPORTED_ACTION_CODE=51 AND ilid=LDI.INTERNAL_ID AND TRUNC(BK_FILED_DATE,'MM') <= LDI.CYCLE_DATE AND (TRUNC(BK_EXIT_DATE,'MM') >= LDI.CYCLE_DATE or BK_EXIT_DATE IS NULL) and ROWNUM=1) as “Field50”,

(SELECT trunc(ATT_REFERRAL_DATE) FROM (SELECT internal_id as ilid, ATT_REFERRAL_DATE, FRC_EXIT_DATE FROM FORE order by ATT_REFERRAL_DATE DESC)

WHERE LDI.REPORTED_ACTION_CODE in ('3','17','27')AND ilid=ldi.INTERNAL_ID AND TRUNC(ATT_REFERRAL_DATE,'MM') <= ldi.CYCLE_DATE AND (TRUNC(FRCLSR_EXIT_DATE,'MM') >= ldi.CYCLE_DATE or FRC_EXIT_DATE IS NULL) and ROWNUM=1) as "Field51",

(SELECT trunc(FRC_SALE_DATE) FROM (SELECT internal_id as ilid, ATT_REFERRAL_DATE, FRC_SALE_DATE FROM FORE order by FRC_SALE_DATE DESC)

WHERE LDI.REPORTED_ACTION_CODE=7 ilid=ldi.INTERNAL_ID AND TRUNC(FRC_SALE_DATE,'MM') <= ldi.CYCLE_DATE AND FRC_SALE_DATE is not null and ROWNUM=1) as "Field52"

FROM DATA_DYNAMIC LDI

JOIN DATA_STATIC on DATA_STATIC.INTERNAL_ID = LDI.INTERNAL_ID

JOIN POOL_DATA POI on POI.POOL_NBR = LDI.POOL_NBR

JOIN SERIES_DATA SRI on SRI.SERIES_NBR = POI.SERIES_NBR

WHERE LDI.REP_CYCLE = $(vRepCycle)

;

STORE DATA_DYNAMIC into $(vQVDPath)$(vOutputFile);

DROP Table DATA_DYNAMIC;

Thank you for taking a look at it and your response

Bernie Sellers

Analytical Consultant

(p)410.884.2182

petter
Partner - Champion III
Partner - Champion III

The 4.5 hour run-time is that for all the six years or just for a single month?

bjsellers57
Contributor II
Contributor II
Author

Single Month

Bernie Sellers

Analytical Consultant

petter
Partner - Champion III
Partner - Champion III

My best advice is to run this very efficient query (for Oracle) on the Oracle server and store the result set in an Oracle table and then read this single table from QlikView and see what kind of performance you then get.

maxgro
MVP
MVP

I would do these test

1) in sqlplus run the single query (1 month, If I understood the 4.5 hour query) and wait until you get all records

Why? Because (in sql developer, toad, etc, .....) you get the first rows from the db after some seconds but you should measure the time for all the rows

This is your WAY 1

2) in sqlplus run the 6-7 query (WAY 2), the join doesn't matter

Compare the time of 1) and 2) with the time you get in Qlik (4.5 hours, 20-25 minutes).

I think you get the same time or just a little better in SqlPlus vs Qlik (in 2, sqlplus doesn't join)

If you get the same time, the way to get better is ask your Oracle dba to optimize the query (it seems there are some order by you can skip).

If not, let we know.

P.S. its not so strange that when the query becomes complex (single query), the Oracle optimizer doesn't choose the best execution plan and the exec time increases