Not applicable

Chunk data returned from a query

Hi,

I have a very simple query in my load script that is pulling a lot of rows (300+ million) from a table in Oracle. It was working fine for a while (so there's nothing wrong with the query's syntax), but it started timing out recently. I've spoken to my DBA, and he said that I need to chunk my data so that I'm not going after all the rows at one time.

Here's the original query:

PartnerTerritoryOrderHistory:
LOAD
    [MI Country KY],
    [Partner ID],
    [MI Product Key],
    [Action Type],
    [Status Type]
;
SQL
SELECT
    hst.mi_country_ky AS "MI Country KY",
    hst.partner_ky AS "Partner ID",
    hst.mi_product_ky AS "MI Product Key",
    hst.latest_ofr_action_type AS "Action Type",
    hst.latest_ofr_status AS "Status Type"
FROM
    rpt_dw.dcr_order_history_product hst
WHERE
    hst.PARTNER_KY IN (183,13245,1733,1089,28209,25683,28449,27969,27611,26487,25565,25563,25005,23881,23761,27649,26847,26567,25885,24563,24443,3103,546,27651,27327,26767,26003,25883,25083,8337,23921,27249,25443,24843,24765,24763,24723,24403,24203,27489,26967,26647,26609,26607,26485,26045,25643,25363,25283,25123,1433,22803,21319,21081,20685,20519,12631,22125,21921,21483,21441,21159,16199,2111,16519,5257,2879,23321,21721,21521,1331,16001,19281,2899,22721,7579,21199,2705,6497,20041,6657,23521,2237,10657,22601,22561,21601,2695,20279,18959,22481,21641,20959,20319,3701,15441,14603,21279,21121,21039,15839,12227,2416,2839,6777,22321,21361,20799,20559,13801,20399,14919,20199,20159,18919,18439,17399,16999,16039,15999,13799,18319,17439,16479,16159,13679,18489,18279,17279,15279,14363,18079,14999,18359,17999,17519,17239,16439,14843,14361,14359,13681,13605,15759,15639,15439,4737,11027,13959,18533,13167,13247,12999,11819,12143,11541,11381,2709,2699,2691,10699,2823,8581,8819,9137,2552,10577,9819,8699,7777,3463,3977,3337,10021,2707,4057,7977,6737,8977,6857,3617,2829,2847,3079,7711,5459,2254,2236,4937,10937,7057,7337,7177,3741,1894,1893,11021,1714,1655,1614,4217,1151,368,277,8457,4897,3857,7539,648,3073,3067,9737,10457,192,9459,9423,2821);

STORE * FROM PartnerTerritoryOrderHistory INTO '..\qvds\PartnerTerritoryOrderHistory.qvd' (QVD);

DROP TABLES PartnerTerritoryOrderHistory;

In the end, I want the result to be the same (i.e., all the data written to a single QVD). What I want going forward is for the load script to automatically query this table once per unique country key (the mi_country_ky column). In other words, if there are 200 unique country keys, the script should loop through and perform 200 individual queries, one per key (thus "chunking" my results into 200 separate buckets), but still write the data from all 200 queries to the same QVD (PartnerTerritoryOrderHistory.qvd). I don't know how to modify my script to do this, though. Can anyone provide the correct syntax for this looping/concatenation?
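One way to sketch the loop Kenny is asking for (a sketch only, using the table and field names from the post; in Qlik script, successive LOADs with identical field lists auto-concatenate into the existing table, so a single STORE at the end writes everything to one QVD):

```
// Drive the loop from the distinct country keys
CountryKeys:
LOAD [MI Country Key];
SQL SELECT DISTINCT mi_country_ky AS "MI Country Key"
FROM rpt_dw.dcr_order_history_product;

// Peek() is zero-based, so the last valid index is NoOfRows()-1
FOR i = 0 TO NoOfRows('CountryKeys') - 1
  LET vCountryKey = Peek('MI Country Key', i, 'CountryKeys');

  // Identical field list => each chunk auto-concatenates
  PartnerTerritoryOrderHistory:
  LOAD [MI Country KY], [Partner ID], [MI Product Key], [Action Type], [Status Type];
  SQL SELECT
      hst.mi_country_ky AS "MI Country KY",
      hst.partner_ky AS "Partner ID",
      hst.mi_product_ky AS "MI Product Key",
      hst.latest_ofr_action_type AS "Action Type",
      hst.latest_ofr_status AS "Status Type"
  FROM rpt_dw.dcr_order_history_product hst
  WHERE hst.mi_country_ky = '$(vCountryKey)'
    AND hst.PARTNER_KY IN (183,13245 /* ...rest of the partner list from the original query... */);
NEXT i

STORE * FROM PartnerTerritoryOrderHistory INTO '..\qvds\PartnerTerritoryOrderHistory.qvd' (QVD);
DROP TABLES PartnerTerritoryOrderHistory, CountryKeys;
```

If mi_country_ky is numeric in Oracle, the quotes around $(vCountryKey) may need to be dropped.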

Thanks,

Kenny

17 Replies
Not applicable
Author

Hi Clever...I think I'm 99% there. I tweaked your code slightly for my situation, and it is cycling through each individual country key like I want. However, in the log for the QVW, the end looks like:

[Attached screenshot: QVW Log.jpg]

I don't understand why it's doing that last query with the blank country key (the second red rectangle) and why it has "127 lines fetched" twice. The country key in the top red rectangle (14077248) is the last country key in the cycle. Do you know why it's doing that? Thanks again!

Clever_Anjos
Employee

Would you mind posting the Inline table and the For part?

Not applicable
Author

Here's the code I'm using. I'm not loading the country key table as an inline table.

CountryKeys:
LOAD
    [MI Country Key]
;
SQL
SELECT
    DISTINCT mi_country_ky AS "MI Country Key"
FROM
    rpt_dw.dcr_order_history_product
WHERE
    mi_product_ky='1381605760' AND
    partner_ky='11381'
;

For i = 0 to NoOfRows('CountryKeys');
  LET vCountryKeys = Peek('MI Country Key',i,'CountryKeys');

PartnerTerritoryOrderHistory:
  LOAD
     [MI Country KY],
     [Partner ID],
     [MI Product Key],
     [Action Type],
     [Status Type]
  ;
  SQL
  SELECT
     hst.mi_country_ky AS "MI Country KY",
     hst.partner_ky AS "Partner ID",
     hst.mi_product_ky AS "MI Product Key",
     hst.latest_ofr_action_type AS "Action Type",
     hst.latest_ofr_status AS "Status Type"
  FROM
     rpt_dw.dcr_order_history_product hst
  WHERE
  hst.mi_country_ky = '$(vCountryKeys)' AND
  hst.PARTNER_KY IN (11381) AND
  hst.mi_product_ky='1381605760';

Next

STORE * FROM PartnerTerritoryOrderHistory INTO '..\qvds\DataChunkingTest.qvd' (QVD);

DROP TABLES PartnerTerritoryOrderHistory, CountryKeys;

Clever_Anjos
Employee

For i = 0 to NoOfRows('CountryKeys')

should be

For i = 0 to NoOfRows('CountryKeys') -1
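This also explains the blank-key query and the duplicated "127 lines fetched" in the log (a note based on standard Qlik script behavior: Peek() is zero-based, so with N rows the valid indices are 0 through N-1, and Peek() at index N returns NULL, which the dollar-sign expansion turns into an empty string):

```
// With N = NoOfRows('CountryKeys') rows, Peek('MI Country Key', N, 'CountryKeys')
// is out of range and returns NULL, so the final iteration ran
//   WHERE hst.mi_country_ky = ''
// -- the extra query with the blank country key seen in the log.
For i = 0 to NoOfRows('CountryKeys') - 1
  LET vCountryKeys = Peek('MI Country Key', i, 'CountryKeys');
  // ...per-key LOAD / SQL SELECT as before...
Next
```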

Not applicable
Author

Thanks for all your help. You went above and beyond. Really appreciate it.

Clever_Anjos
Employee

Good to read it's working now.

Be Qlik, have fun

Not applicable
Author

Clever...when I apply the code to my real and full set of data, it is way too slow. In the end, I need to load about 310+ million rows. In the past, when I would run my original simple query (before I started getting issues with it timing out), it would take about 2.5 hrs to pull back 300+ million rows. Now, with the FOR...NEXT looping, it's been running for 3+ hours and it has only pulled back ~20 million rows. Is there a way to optimize this significantly? If not, I think I'll have to look for another solution. The code works as I want it to...I just don't know why it's so slow.

Clever_Anjos
Employee

Please ask your DBA team if it's possible to create indexes on these columns:

   hst.mi_country_ky
   hst.PARTNER_KY
   hst.mi_product_ky

I'm guessing they are not indexed.
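For reference, a composite index covering the three filter columns might look like the following (a sketch only; the index name is hypothetical, and the best column order depends on selectivity, so the actual DDL should come from the DBA team):

```
-- Hypothetical composite index on the columns used in the WHERE clause
CREATE INDEX idx_dcr_hist_chunk
  ON rpt_dw.dcr_order_history_product (mi_country_ky, partner_ky, mi_product_ky);
```

Since the loop queries one country key at a time, leading with mi_country_ky lets each iteration use an index range scan instead of a full table scan.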