Qlik Community

QlikView Scripting

Discussion Board for collaboration on QlikView Scripting.

Not applicable

Extracting Min(Date) and Max(Date) from large QVD

Hi

I have large QVD of 200m records.

How do I extract the min and max dates from this large table in the most efficient way possible. Is there a way to keep it an optimized load?

Thanks

1 Solution

Accepted Solutions
fosuzuki
Valued Contributor II

Re: Extracting Min(Date) and Max(Date) from large QVD

Hi,

I'd use this:

temp:

LOAD Date

FROM [YourQVD.qvd] (qvd);

MinMaxDate:

Load min(Date) as MinDate,

        max(Date) as MaxDate;

Load FieldValue('Date',IterNo()) as Date

autogenerate(1)

while not IsNull(FieldValue('Date',Iterno()));

Drop Table temp;

Not 100% if this is the best way, but it is the fastest I've seen in some of my tests.

Hope this helps

Fernando

8 Replies
fosuzuki
Valued Contributor II

Re: Extracting Min(Date) and Max(Date) from large QVD

Hi,

I'd use this:

temp:

LOAD Date

FROM [YourQVD.qvd] (qvd);

MinMaxDate:

Load min(Date) as MinDate,

        max(Date) as MaxDate;

Load FieldValue('Date',IterNo()) as Date

autogenerate(1)

while not IsNull(FieldValue('Date',Iterno()));

Drop Table temp;

Not 100% if this is the best way, but it is the fastest I've seen in some of my tests.

Hope this helps

Fernando

danielrozental
Honored Contributor II

Re: Extracting Min(Date) and Max(Date) from large QVD

Haven't tested it but taking Fernando's suggestions into consideration I would try

temp:
LOAD Date
FROM [YourQVD.qvd] (qvd);

temp2:

load

    max(Date) as MaxDate,

    min(Date) as MinDate

resident temp;  

drop table temp;

vinay_bangari
Valued Contributor III

Re: Extracting Min(Date) and Max(Date) from large QVD

To add one more point to Daniel: Since it is a large table reading the whole table will take time while loading the script the best bet is when querying from your database. Create a separate QVD for the dates. Lets say like below:

temp:
LOAD Date;

Select Date
FROM [Database_table] group by Date ;// this will give you distinct and less number of Dates so you dont need to read the complete QVD which is of 200 million records//

below script would be the same..

temp2:

load

    max(Date) as MaxDate,

    min(Date) as MinDate

resident temp; 

drop table temp;

danielrozental
Honored Contributor II

Re: Extracting Min(Date) and Max(Date) from large QVD

I disagree, it will depend on weather the database has an index on that field or not, the optimized load of a single field into memory will be faster in my opinion.

I did a test with 50 million rows.

This code took 2 mins to complete

LOAD max(OrderDate) as MaxDate,     

          min(OrderDate) as MinDate

FROM

[R00.QVD]

(qvd);

This code took slightly under a minute

temp:

LOAD distinct OrderDate

FROM

[R00.QVD]

(qvd);

temp2:

LOAD max(OrderDate) as MaxDate,     

          min(OrderDate) as MinDate

resident temp;

drop table temp;

Fernando's solution took  6 seconds (Fernando you're the man!) and there is no way to beat that time since just loading the field takes 5 secs.

temp:

LOAD OrderDate

FROM

[R00.QVD]

(qvd);

MinMaxDate:

Load min(OrderDate) as MinDate,

        max(OrderDate) as MaxDate;

Load FieldValue('OrderDate',IterNo()) as OrderDate

autogenerate(1)

while not IsNull(FieldValue('OrderDate',Iterno()));

Drop Table temp;

This took 13 secs, kinda makes you wonder how qlikview does things

temp:

LOAD OrderDate

FROM

[R00.QVD]

(qvd);

store temp into orderdate.qvd;

drop table temp;

Directory;

LOAD max(OrderDate) as MaxDate,

min(OrderDate) as MinDate

FROM

orderdate.qvd

(qvd);

Not applicable

Re: Extracting Min(Date) and Max(Date) from large QVD

Hi Fernando

Is there any benefit in using IterNo in your statement? The MinMaxDate table only has 1 record, so FieldValue would have worked on its own. Or, is there a performance beneifit of using IterNo?

Thanks

-----------

MinMaxDate:

Load min(Date) as MinDate,

        max(Date) as MaxDate;

Load FieldValue('OrderDate',IterNo()) as OrderDate

autogenerate(1)

while not IsNull(FieldValue('OrderDate',IterNo()));

Drop Table temp;

Not applicable

Re: Extracting Min(Date) and Max(Date) from large QVD

Hi Daniel

Are the times you specify including the "Load OrderDate" statement too, as this alone (on my machine) takes about 55 seconds using Fernando's approach. How do you achieve 6 seconds - I assume you are running this directly on the server?

Thanks

Andrew

danielrozental
Honored Contributor II

Re: Extracting Min(Date) and Max(Date) from large QVD

Times include the entire script execution time, I'm not using the same data as you are and my machine specifications may defer from yours.

Make sure the date load from the QVD is optimized.

Please mark Fernando's answer as the correct one.

flipside
Valued Contributor II

Re: Extracting Min(Date) and Max(Date) from large QVD

Although you have resolved your problem, there may be a quicker way of storing QVD info and retrieving it through the use of field comments (also table comments).

e.g. If you identify the range values in the QVD creation routine and comment the field as this ...

fName:

Load min(OrderDate) & '|' & max(OrderDate) as dRange resident SampleData;

let v = peek('dRange');

Comment OrderDate with '$(v)';


store SampleData into SampleData.qvd (qvd);

... the comment is stored in the QVD and can be retrieved via XML without the need to load any data - you just might need to parse the comment in the QVD reading document ...

QvdFieldHeader:

Load date(subfield(Comment,'|',1)) as minOrderDate,

    date(subfield(Comment,'|',-1)) as maxOrderDate;

LOAD Comment

FROM [SampleData.qvd] (XmlSimple, Table is [QvdTableHeader/Fields/QvdFieldHeader])

where FieldName = 'OrderDate';

flipside

Community Browser