Consider this data as an example:

CustNo | Customer  | Type | Country | Year
1000   | customerA | A    | USA     | 2008
2000   | customerB | A    | USA     | 2009
3000   | customerC | B    | USA     | 2008
4000   | customerA | A    | USA     | 2008
5000   | customerA | A    | USA     | 2008
6000   | customerA | A    | Mexico  | 2009
7000   | customerA | A    | Mexico  | 2008
8000   | customerA | A    | USA     | 2007
Load * Inline [
Country
USA
];

Load * Inline [
Type
A
];
CUST:
Load
    CustNo,
    Customer,
    Type,
    Country,
    Year
From Customertable
Where exists(Country);
CUST_1:
Load
    CustNo,
    Customer,
    Type,
    Country,
    Year
Resident CUST
Where Type = 'A' and Year = '2008';
Drop Table CUST;
I am dealing with millions of records in the CUST_1 table. The CUST load is QVD-optimized, but CUST_1 is taking a long time. CUST_1 needs to load data based on two different conditions. How can I optimize it for better loading speed?

Please advise on optimizing the CUST_1 load. Thanks in advance for your help.
To get an optimized load, I believe you only get to use exists(), and can only apply it to a single field, which is what you're already doing with Country. So in a sense, you can't do any better than you already are.
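As a sketch of that rule (the QVD file name here is just a placeholder), a WHERE clause consisting of a single bare exists() on one field keeps the load optimized, while anything more falls back to a standard load:

```
// Stays QVD-optimized: the WHERE clause is a single exists() on one field.
CUST:
LOAD CustNo, Customer, Type, Country, Year
FROM Customertable.qvd (qvd)
WHERE exists(Country);

// Falls back to a standard (much slower) load: any extra condition,
// or exists() combined with and/or, disables the optimized path.
CUST_SLOW:
LOAD CustNo, Customer, Type, Country, Year
FROM Customertable.qvd (qvd)
WHERE exists(Country) and Type = 'A';
```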
However, there's nothing to say that your single field can't be a compound field. So for maximum performance, you could add a new field to your QVD, such as "Optimized Key", defined as "Country" & ':' & "Type" & ':' & "Year".
Then do this:
[Good Optimized Keys]:
LOAD * INLINE [
Optimized Key
USA:A:2008
]
;
[Customer]:
LOAD
CustNo
,Customer
,Type
,Country
,Year
,"Optimized Key"
FROM Customertable.qvd (qvd)  // assuming your QVD now includes the key field
WHERE EXISTS("Optimized Key")
;
DROP FIELD "Optimized Key"
;
Now, that would get impractical if you have a lot of different countries, types and years that you want to load. But if the number is small, I bet it would give you very good load performance at the cost of a little complexity.
(edit: Even if the number of combinations is large, you could probably generate the table of the combinations from the separate field values, and thus keep the high-performance load without having to manually list all of the possible combinations of field values that you want.)
(edit: Continuing a little further with the idea of generating the table of keys, maybe something like this example for putting together a 4 field key with multiple values for each field:
[Good Optimized Keys]:
LOAD pick(mod(ceil(recno()/(4*5*6*1)),3)+1,'USA','Mexico','Canada')
&':'&pick(mod(ceil(recno()/( 5*6*1)),4)+1,'A','B','C','D')
&':'&pick(mod(ceil(recno()/( 6*1)),5)+1,'2001','2002','2004','2007','2008')
&':'&pick(mod(ceil(recno()/( 1)),6)+1,'L','M','N','O','P','Q')
AS "Optimized Key"
AUTOGENERATE 3*4*5*6
;
Alternatively, just have a loop for each. The above is basically mathematically simulating that without actually having all the loops. Yes, it could be shrunk down somewhat, but I wanted the structure of it to be as clear as possible, even if it isn't particularly clear.)
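For reference, the loop version could be sketched like this (same sample field values as the pick() example; since every LOAD produces the identical single field, QlikView auto-concatenates the rows into one table):

```
// Build every combination of the sample field values with nested loops,
// one "Optimized Key" row per combination.
FOR EACH vCountry IN 'USA', 'Mexico', 'Canada'
  FOR EACH vType IN 'A', 'B', 'C', 'D'
    FOR EACH vYear IN '2001', '2002', '2004', '2007', '2008'
      FOR EACH vGroup IN 'L', 'M', 'N', 'O', 'P', 'Q'
        [Good Optimized Keys]:
        LOAD '$(vCountry)' & ':' & '$(vType)' & ':' & '$(vYear)' & ':' & '$(vGroup)'
             AS "Optimized Key"
        AUTOGENERATE 1;
      NEXT vGroup
    NEXT vYear
  NEXT vType
NEXT vCountry
```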
Following up on my post, there seems to be a MUCH simpler approach as long as you have enough RAM. Do your MOST restrictive selection with an exists(). In this case, I'm assuming that the country will narrow down the retrieved data more than the type or the year. Then do inner joins to inline tables to further restrict the data.
[Countries]:
LOAD * INLINE [
Country
USA
]
;
[Customers]:
LOAD
CustNo
,Customer
,Type
,Country
,Year
FROM Customertable.qvd (qvd)  // assuming your QVD source
WHERE EXISTS("Country")
;
INNER JOIN LOAD * INLINE [
Type
A
];
INNER JOIN LOAD * INLINE [
Year
2008
];
DROP TABLE [Countries]
;
Your main load will be optimized, and the inner joins are very fast. That's definitely the approach I'd recommend over my previous one unless there is simply not enough RAM to pull in the data like that.