Consider this data as an example:

CustNo | Customer  | Type | Country | Year
1000   | customerA | A    | USA     | 2008
2000   | customerB | A    | USA     | 2009
3000   | customerC | B    | USA     | 2008
4000   | customerA | A    | USA     | 2008
5000   | customerA | A    | USA     | 2008
6000   | customerA | A    | Mexico  | 2009
7000   | customerA | A    | Mexico  | 2008
8000   | customerA | A    | USA     | 2007
Load * Inline [
Country
USA
];

Load * Inline [
Type
A
];
CUST:
Load
    CustNo,
    Customer,
    Type,
    Country,
    Year
From Customertable
Where exists(Country);
CUST_1:
Load
    CustNo,
    Customer,
    Type,
    Country,
    Year
Resident CUST
Where Type = 'A' and Year = '2008';
Drop Table CUST;
I am dealing with millions of records in the CUST_1 table. The CUST load is QVD-optimized, but CUST_1 is taking a long time. CUST_1 needs to load data based on two different conditions. How can I optimize it for better loading speed?

Please advise on optimizing the CUST_1 load. Thanks in advance for your help.
To get an optimized load, I believe you only get to use exists(), and can only apply it to a single field, which is what you're already doing with Country. So in a sense, you can't do any better than you already are.
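As a sketch of that rule (the QVD file name here is just a placeholder), a WHERE clause consisting of a single bare exists() on one field keeps the load optimized, while anything more falls back to a standard load:

```
// Stays QVD-optimized: the WHERE clause is a single exists() on one field.
CUST:
LOAD CustNo, Customer, Type, Country, Year
FROM Customertable.qvd (qvd)
WHERE exists(Country);

// Falls back to a standard (much slower) load: any extra condition,
// or exists() combined with and/or, disables the optimized path.
CUST_SLOW:
LOAD CustNo, Customer, Type, Country, Year
FROM Customertable.qvd (qvd)
WHERE exists(Country) and Type = 'A';
```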
However, there's nothing to say that your single field can't be a compound field. So for maximum performance, you could add a new field to your QVD, such as "Optimized Key", defined as "Country" & ':' & "Type" & ':' & "Year".
Then do this:
[Good Optimized Keys]:
LOAD * INLINE [
Optimized Key
USA:A:2008
]
;
[Customer]:
LOAD
CustNo
,Customer
,Type
,Country
,Year
,"Optimized Key"
FROM Customertable.qvd (qvd)  // assuming your QVD now includes the key field
WHERE EXISTS("Optimized Key")
;
DROP FIELD "Optimized Key"
;
Now, that would get impractical if you have a lot of different countries, types and years that you want to load. But if the number is small, I bet it would give you very good load performance at the cost of a little complexity.
(edit: Even if the number of combinations is large, you could probably generate the table of the combinations from the separate field values, and thus keep the high-performance load without having to manually list all of the possible combinations of field values that you want.)
(edit: Continuing a little further with the idea of generating the table of keys, maybe something like this example for putting together a 4 field key with multiple values for each field:
[Good Optimized Keys]:
LOAD pick(mod(ceil(recno()/(4*5*6*1)),3)+1,'USA','Mexico','Canada')
&':'&pick(mod(ceil(recno()/( 5*6*1)),4)+1,'A','B','C','D')
&':'&pick(mod(ceil(recno()/( 6*1)),5)+1,'2001','2002','2004','2007','2008')
&':'&pick(mod(ceil(recno()/( 1)),6)+1,'L','M','N','O','P','Q')
AS "Optimized Key"
AUTOGENERATE 3*4*5*6
;
Alternatively, just have a loop for each. The above is basically mathematically simulating that without actually having all the loops. Yes, it could be shrunk down somewhat, but I wanted the structure of it to be as clear as possible, even if it isn't particularly clear.)
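For reference, the loop version could be sketched like this (same sample field values as the pick() example; since every LOAD produces the identical single field, QlikView auto-concatenates the rows into one table):

```
// Build every combination of the sample field values with nested loops,
// one "Optimized Key" row per combination.
FOR EACH vCountry IN 'USA', 'Mexico', 'Canada'
  FOR EACH vType IN 'A', 'B', 'C', 'D'
    FOR EACH vYear IN '2001', '2002', '2004', '2007', '2008'
      FOR EACH vGroup IN 'L', 'M', 'N', 'O', 'P', 'Q'
        [Good Optimized Keys]:
        LOAD '$(vCountry)' & ':' & '$(vType)' & ':' & '$(vYear)' & ':' & '$(vGroup)'
             AS "Optimized Key"
        AUTOGENERATE 1;
      NEXT vGroup
    NEXT vYear
  NEXT vType
NEXT vCountry
```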
Following up on my post, there seems to be a MUCH simpler approach as long as you have enough RAM. Do your MOST restrictive selection with an exists(). In this case, I'm assuming that the country will narrow down the retrieved data more than the type or the year. Then do inner joins to inline tables to further restrict the data.
[Countries]:
LOAD * INLINE [
Country
USA
]
;
[Customers]:
LOAD
CustNo
,Customer
,Type
,Country
,Year
FROM Customertable.qvd (qvd)  // assuming your QVD source
WHERE EXISTS("Country")
;
INNER JOIN LOAD * INLINE [
Type
A
];
INNER JOIN LOAD * INLINE [
Year
2008
];
DROP TABLE [Countries]
;
Your main load will be optimized, and the inner joins are very fast. That's definitely the approach I'd recommend over my previous one unless there is simply not enough RAM to pull in the data like that.