Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi
I have large QVD of 200m records.
How do I extract the min and max dates from this large table in the most efficient way possible. Is there a way to keep it an optimized load?
Thanks
Hi,
I'd use this:
temp:
LOAD Date
FROM [YourQVD.qvd] (qvd);
MinMaxDate:
Load min(Date) as MinDate,
max(Date) as MaxDate;
Load FieldValue('Date',IterNo()) as Date
autogenerate(1)
while not IsNull(FieldValue('Date',Iterno()));
Drop Table temp;
Not 100% if this is the best way, but it is the fastest I've seen in some of my tests.
Hope this helps
Fernando
Hi,
I'd use this:
temp:
LOAD Date
FROM [YourQVD.qvd] (qvd);
MinMaxDate:
Load min(Date) as MinDate,
max(Date) as MaxDate;
Load FieldValue('Date',IterNo()) as Date
autogenerate(1)
while not IsNull(FieldValue('Date',Iterno()));
Drop Table temp;
Not 100% if this is the best way, but it is the fastest I've seen in some of my tests.
Hope this helps
Fernando
Haven't tested it but taking Fernando's suggestions into consideration I would try
temp:
LOAD Date
FROM [YourQVD.qvd] (qvd);
temp2:
load
max(Date) as MaxDate,
min(Date) as MinDate
resident temp;
drop table temp;
To add one more point to Daniel: Since it is a large table reading the whole table will take time while loading the script the best bet is when querying from your database. Create a separate QVD for the dates. Lets say like below:
temp:
LOAD Date;
Select Date
FROM [Database_table] group by Date ;// this will give you distinct and less number of Dates so you dont need to read the complete QVD which is of 200 million records//
below script would be the same..
temp2:
load
max(Date) as MaxDate,
min(Date) as MinDate
resident temp;
drop table temp;
I disagree, it will depend on weather the database has an index on that field or not, the optimized load of a single field into memory will be faster in my opinion.
I did a test with 50 million rows.
This code took 2 mins to complete
LOAD max(OrderDate) as MaxDate,
min(OrderDate) as MinDate
FROM
[R00.QVD]
(qvd);
This code took slightly under a minute
temp:
LOAD distinct OrderDate
FROM
[R00.QVD]
(qvd);
temp2:
LOAD max(OrderDate) as MaxDate,
min(OrderDate) as MinDate
resident temp;
drop table temp;
Fernando's solution took 6 seconds (Fernando you're the man!) and there is no way to beat that time since just loading the field takes 5 secs.
temp:
LOAD OrderDate
FROM
[R00.QVD]
(qvd);
MinMaxDate:
Load min(OrderDate) as MinDate,
max(OrderDate) as MaxDate;
Load FieldValue('OrderDate',IterNo()) as OrderDate
autogenerate(1)
while not IsNull(FieldValue('OrderDate',Iterno()));
Drop Table temp;
This took 13 secs, kinda makes you wonder how qlikview does things
temp:
LOAD OrderDate
FROM
[R00.QVD]
(qvd);
store temp into orderdate.qvd;
drop table temp;
Directory;
LOAD max(OrderDate) as MaxDate,
min(OrderDate) as MinDate
FROM
orderdate.qvd
(qvd);
Hi Fernando
Is there any benefit in using IterNo in your statement? The MinMaxDate table only has 1 record, so FieldValue would have worked on its own. Or, is there a performance beneifit of using IterNo?
Thanks
-----------
MinMaxDate:
Load min(Date) as MinDate,
max(Date) as MaxDate;
Load FieldValue('OrderDate',IterNo()) as OrderDate
autogenerate(1)
while not IsNull(FieldValue('OrderDate',IterNo()));
Drop Table temp;
Hi Daniel
Are the times you specify including the "Load OrderDate" statement too, as this alone (on my machine) takes about 55 seconds using Fernando's approach. How do you achieve 6 seconds - I assume you are running this directly on the server?
Thanks
Andrew
Times include the entire script execution time, I'm not using the same data as you are and my machine specifications may defer from yours.
Make sure the date load from the QVD is optimized.
Please mark Fernando's answer as the correct one.
Although you have resolved your problem, there may be a quicker way of storing QVD info and retrieving it through the use of field comments (also table comments).
e.g. If you identify the range values in the QVD creation routine and comment the field as this ...
fName:
Load min(OrderDate) & '|' & max(OrderDate) as dRange resident SampleData;
let v = peek('dRange');
Comment OrderDate with '$(v)';
store SampleData into SampleData.qvd (qvd);
... the comment is stored in the QVD and can be retrieved via XML without the need to load any data - you just might need to parse the comment in the QVD reading document ...
QvdFieldHeader:
Load date(subfield(Comment,'|',1)) as minOrderDate,
date(subfield(Comment,'|',-1)) as maxOrderDate;
LOAD Comment
FROM [SampleData.qvd] (XmlSimple, Table is [QvdTableHeader/Fields/QvdFieldHeader])
where FieldName = 'OrderDate';
flipside