Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello All
I have a Qlikview project where a Table A QVD contains about 1.5 million rows. i have already loaded this QVD into the memory.
I want to load another table Table B which is a subset of Table A using a where clause.
e.g.
Table B:
Load
AID as BID,
.....
resident load Table A
where Purchase Year = Last Purchase Year
In such a case is an Non Optimized QVD Load (because of where clause) better or a Resident Load?
Any help from your experience appreciated
Hi Dilip,
there is a ton of information on the subject of optimised loads and resident loads in the forum so I'm not sure why you've raised this question. Fundamentally, whether a load from QVD, or a Resident load will be quicker depends on your specific circumstances. This includes things like your environment, whether you're dealing with lots of text fields or not, how unique your data is, etc, etc, etc.
First off, are you dealing with 1.5 million or 1.7 billion rows? If it's 1.7 billion, are these all in a single QVD? If so, have you not thought about segmenting this file?
Incidentally, a WHERE EXISTS clause WILL be optimised. You could try something like:
TempFilter:
LOAD * INLINE [
FilterYear
2015];
Table B:
Load
*
FROM [Your QVD]
WHERE EXISTS(FilterYear, Year);
What will break optimisation though are things like calculated fields, renamed fields, and pretty much any manipulation in the initial load.
Marcus
same test, about the same result (SSD disk)
it seems the best one is qvd optimized and if you can't do an optimized load, resident and not optimized qvd are similar
t << AUTOGENERATE(10000000) 10.000.000 lines fetched
***** read resident 23/10/2015 10:12:56
tr << t 1.665.621 lines fetched
00:00:06
Dim2 << INLB829 1 lines fetched
***** read qvd optimized 23/10/2015 10:13:02
t << T (qvd optimized) 1.665.621 lines fetched
00:00:02
***** read qvd non optimized 23/10/2015 10:13:04
tq << T 1.665.621 lines fetched
00:00:06
@marcus
My understanding is that your example would not be optimised. Only the single parameter Exists will keep the load optimised. Like
Table B:
Load
*
FROM [Your QVD]
WHERE EXISTS(FilterYear);
Or has this changed in a recent SR?
Aah my bad, you're quite right.
Dilip R
I suggest you confirm one of the many responses saying pretty much the same thing as correct so we can close this thread. It's not particularly helpful to have lots of versions of almost the same response, I guess that's just the nature of forums - concluding with a correct answer should help future readers find what is most relevant in an answered thread.
Ps. It does not seem possible to 'mention' your name for some reason, just so you know. ( moderator care to comment on how to 'mention' Dilip R? Or is it privacy settings maybe preventing it?) Ds.
Cheers
Jonas
If i can do optimized load i dont doubt that would be better.
the where clause has to compare 2 dates columns within the table hence results in non optimized load. i cant find any documentation comparing non optimized qvd load and resident load.
judging from the answers and the size of the date handled within the application and qvd residing on a shared drive resident load seem to be the way to go.
once i get a chance to test in pre production i will update the thread
... Just a note as you mentioned you had a two field criteria:
it may sometimes be faster to do an optimised load of all data and then a RIGHT KEEP to reduce down the loaded data to what you need if you can't easily design a single field WHERE EXISTS (<field>) clause.
tmp:
RIGHT KEEP(bigTable)
LOAD * INLINE [
Field
A
];
DROP TABLE tmp;
Also, remember that you can construct a surrogate unique field from two independent field. FIELD1 & '|' & FIELD2 as FIELD3
Either method should allow you to benefit from an optimised QVD load
Cheers
Jonas