Hi,
I have a very simple query in my load script that is pulling a lot of rows (300+ million) from a table in Oracle. It was working fine for a while (so there's nothing wrong with the query's syntax), but it started timing out recently. I've spoken to my DBA, and he said that I need to chunk my data so that I'm not going after all the rows at one time.
Here's the original query:
PartnerTerritoryOrderHistory:
LOAD
    [MI Country KY],
    [Partner ID],
    [MI Product Key],
    [Action Type],
    [Status Type]
;
SQL
SELECT
    hst.mi_country_ky AS "MI Country KY",
    hst.partner_ky AS "Partner ID",
    hst.mi_product_ky AS "MI Product Key",
    hst.latest_ofr_action_type AS "Action Type",
    hst.latest_ofr_status AS "Status Type"
FROM
    rpt_dw.dcr_order_history_product hst
WHERE
hst.PARTNER_KY IN (183,13245,1733,1089,28209,25683,28449,27969,27611,26487,25565,25563,25005,23881,23761,27649,26847,26567,25885,24563,24443,3103,546,27651,27327,26767,26003,25883,25083,8337,23921,27249,25443,24843,24765,24763,24723,24403,24203,27489,26967,26647,26609,26607,26485,26045,25643,25363,25283,25123,1433,22803,21319,21081,20685,20519,12631,22125,21921,21483,21441,21159,16199,2111,16519,5257,2879,23321,21721,21521,1331,16001,19281,2899,22721,7579,21199,2705,6497,20041,6657,23521,2237,10657,22601,22561,21601,2695,20279,18959,22481,21641,20959,20319,3701,15441,14603,21279,21121,21039,15839,12227,2416,2839,6777,22321,21361,20799,20559,13801,20399,14919,20199,20159,18919,18439,17399,16999,16039,15999,13799,18319,17439,16479,16159,13679,18489,18279,17279,15279,14363,18079,14999,18359,17999,17519,17239,16439,14843,14361,14359,13681,13605,15759,15639,15439,4737,11027,13959,18533,13167,13247,12999,11819,12143,11541,11381,2709,2699,2691,10699,2823,8581,8819,9137,2552,10577,9819,8699,7777,3463,3977,3337,10021,2707,4057,7977,6737,8977,6857,3617,2829,2847,3079,7711,5459,2254,2236,4937,10937,7057,7337,7177,3741,1894,1893,11021,1714,1655,1614,4217,1151,368,277,8457,4897,3857,7539,648,3073,3067,9737,10457,192,9459,9423,2821);
STORE * FROM PartnerTerritoryOrderHistory INTO '..\qvds\PartnerTerritoryOrderHistory.qvd' (QVD);
DROP TABLES PartnerTerritoryOrderHistory;
In the end, I want the result to be the same (i.e., all the data written to a single QVD). Going forward, I want the load script to automatically query this table by every unique country key (the mi_country_ky column). In other words, say there are 200 unique country keys under mi_country_ky: the script should loop through and perform 200 individual queries, one per country key (thus "chunking" my results into 200 separate buckets), but still write the data from all 200 queries to the same QVD (PartnerTerritoryOrderHistory.qvd). I don't know how to modify my script to do this, though. Can anyone provide the correct syntax for this looping/concatenation?
Thanks,
Kenny
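The reply being referenced in the next post isn't quoted in the thread, but the pattern in question — loop over the distinct country keys, run one SELECT per key, and let Qlik auto-concatenate loads with identical field lists — looks roughly like the sketch below. The CountryKeys pull and the variable name are assumptions; the field list and QVD path mirror the original query.

// Sketch only: chunk the big SELECT by country key, one QVD at the end.
// 1) Pull the distinct country keys to drive the loop.
CountryKeys:
LOAD [MI Country Key];
SQL
SELECT DISTINCT mi_country_ky AS "MI Country Key"
FROM rpt_dw.dcr_order_history_product;

// 2) One SELECT per key; identical field lists auto-concatenate
//    into a single Qlik table across iterations.
For i = 0 to NoOfRows('CountryKeys') - 1
    LET vCountryKey = Peek('MI Country Key', i, 'CountryKeys');

    PartnerTerritoryOrderHistory:
    LOAD
        [MI Country KY],
        [Partner ID],
        [MI Product Key],
        [Action Type],
        [Status Type];
    SQL
    SELECT
        hst.mi_country_ky AS "MI Country KY",
        hst.partner_ky AS "Partner ID",
        hst.mi_product_ky AS "MI Product Key",
        hst.latest_ofr_action_type AS "Action Type",
        hst.latest_ofr_status AS "Status Type"
    FROM rpt_dw.dcr_order_history_product hst
    WHERE hst.mi_country_ky = '$(vCountryKey)'
      AND hst.PARTNER_KY IN (183, 13245 /* ...rest of the ID list... */);
Next

// 3) Everything lands in one QVD, same as before.
STORE * FROM PartnerTerritoryOrderHistory INTO '..\qvds\PartnerTerritoryOrderHistory.qvd' (QVD);
DROP TABLES PartnerTerritoryOrderHistory, CountryKeys;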
Hi Clever...I think I'm 99% there. I tweaked your code slightly for my situation, and it is cycling through each individual country key like I want. However, the end of the QVW's script log looks like this:
[screenshot of the script log, with two areas highlighted in red]
I don't understand why it runs that last query with a blank country key (the second red rectangle) or why it shows "127 lines fetched" twice. The country key in the top red rectangle (14077248) is the last country key in the cycle. Do you know why it's doing that? Thanks again!
Would you mind posting the Inline table and the For part?
Here's the code I'm using. I'm not loading the country key table as an inline table.
CountryKeys:
LOAD
    [MI Country Key]
;
SQL
SELECT DISTINCT
    mi_country_ky AS "MI Country Key"
FROM
    rpt_dw.dcr_order_history_product
WHERE
    mi_product_ky = '1381605760' AND
    partner_ky = '11381'
;

For i = 0 to NoOfRows('CountryKeys');
    LET vCountryKeys = Peek('MI Country Key', i, 'CountryKeys');

    // Each pass auto-concatenates onto PartnerTerritoryOrderHistory
    // because the field list is identical.
    PartnerTerritoryOrderHistory:
    LOAD
        [MI Country KY],
        [Partner ID],
        [MI Product Key],
        [Action Type],
        [Status Type]
    ;
    SQL
    SELECT
        hst.mi_country_ky AS "MI Country KY",
        hst.partner_ky AS "Partner ID",
        hst.mi_product_ky AS "MI Product Key",
        hst.latest_ofr_action_type AS "Action Type",
        hst.latest_ofr_status AS "Status Type"
    FROM
        rpt_dw.dcr_order_history_product hst
    WHERE
        hst.mi_country_ky = '$(vCountryKeys)' AND
        hst.PARTNER_KY IN (11381) AND
        hst.mi_product_ky = '1381605760';
Next

STORE * FROM PartnerTerritoryOrderHistory INTO '..\qvds\DataChunkingTest.qvd' (QVD);
DROP TABLES PartnerTerritoryOrderHistory, CountryKeys;
For i = 0 to NoOfRows('CountryKeys')
should be
For i = 0 to NoOfRows('CountryKeys') - 1
Peek() is zero-based, so the last valid row index is NoOfRows() - 1. The extra iteration makes Peek() return NULL, which is what produces that final query with the blank country key.
Thanks for all your help. You went above and beyond. Really appreciate it.
Good to read it's working now
Be Qlik, have fun
Clever...when I apply the code to my real, full set of data, it is way too slow. In the end, I need to load 310+ million rows. In the past, when I ran my original simple query (before it started timing out), it took about 2.5 hrs to pull back 300+ million rows. Now, with the FOR...NEXT looping, it has been running for 3+ hours and has only pulled back ~20 million rows. Is there a way to optimize this significantly? If not, I'll have to look for another solution. The code works as I want it to...I just don't know why it's so slow.
Please ask your DBA team if it's possible to create indexes on these columns:
hst.mi_country_ky
hst.PARTNER_KY
hst.mi_product_ky
I'm guessing they are not indexed
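For example, a single composite index covering the three filter columns might look like the following sketch (the index name here is made up; the DBA would choose the actual name, column order, and storage options):

-- Hypothetical composite index on the three columns in the WHERE clause
CREATE INDEX dcr_ord_hist_filter_ix
    ON rpt_dw.dcr_order_history_product (mi_country_ky, partner_ky, mi_product_ky);

Leading with mi_country_ky matters for the chunked load: each loop iteration filters on one country key, so an index led by that column lets Oracle range-scan just that chunk instead of full-scanning the table on every pass.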