Qlik Community

QlikView Documents

Documents for QlikView related information.

Announcements
QlikWorld 2020: Join us May 11 - 14, 2020 in Phoenix, AZ. Register early and save $400. Learn More

MIN AND MAX IN LARGE TABLES

santiago_respane
Valued Contributor

MIN AND MAX IN LARGE TABLES

Hi Experts,

this is a small document about performance issues when looking for min and max values in very large tables.

Lets imagine that we have a fact table called myData with more than 100 M rows and we need to obtain the min and max value for a date field called DateKey.

1-In my first try i did as follows (traditional way):

TempCalendar:
NOCONCATENATE LOAD DISTINCT
    MIN([DateKey]) AS MinDate,
    MAX([DateKey]) AS MaxDate
RESIDENT myData WHERE ([DateKey] <> NULL());

This took 15 minutes, only for doing this!

In order to achieve my goal i had to change my strategy.

2-After a few tries i came with this:

TempCalendar: 
LOAD MIN(DateAux  ) AS MinDate, MAX(DateAux  ) AS MaxDate

LOAD FIELDVALUE('DateKey',ITERNO()) AS DateAux 
AUTOGENERATE(1) 
WHILE NOT ISNULL(FIELDVALUE('DateKey',ITERNO()));


With this change the reload time got decreased a lot and this only took 10 seconds!

What happened:

1-In my first try i was looking for min and max values of a specific field in the context of the table it belongs to, this makes Qlikview to first determine which are the distinct values of this field in this table and then determine with are the min and max (it will look in all 100 millions rows). Adding or removing the distinct clause didn't make any difference.

2-First lets remember that Qlikview stores for each field only the distinct values and the in the tables there are bit stuffed pointers pointing to those values, so using a combination of fieldvalue function, autogenerate and a while loop we will be looping only through those distinct values of the field (10K only instead of 100M) and obtaining the min and max values of the field a lot faster than using the traditional way (1).

I've always solved this kind of requirements with the traditional way and never had a time performance issue, so for small to medium tables it is a good solution. It's good to know that Qlikview has great alternatives for very large data sets also.

This are the things that make me love QV!

Always discovering new things and better ways to them!

Hope this helps you guys.

Kind regards,

Santiago Respane

Comments
Partner
Partner

You may find that this is shorter and even a little better performing:

LOAD

     Max(FieldValue('Field Name',RecNo())) as [Max Field Name],

     Min(FieldValue('Field Name',RecNo())) as [Min Field Name]

AUTOGENERATE FieldValueCount('Field Name');

Thanks for sharing Santiago‌. Rob has also mentioned about this method in his post here: “Fastest” Method to Read max(field) From a QVD | Qlikview Cookbook

Partner
Partner

One comment though.

You may find that having a lot of distinct values, such as transactionId, a resident load will be slightly faster than the autogenerate function.

If you want to find the min/max dates in your dataset, the performance using this solution cannot be beaten.

Do you actually need to calculate the min and max date from your data to generate your calendar?

The min date is often known and can be hard-coded, whilst the max date can be calculated based on today's date and an expression - it may be today, yesterday, tomorrow, the last day of the current year, or even today +N years.

These dates can be quickly calculated rather than finding the min/max of a large dataset

santiago_respane
Valued Contributor

Hi colin_albert, in my case the calendar was for historical and forecasting data in huge amounts, we dont know the max date for the forcasting neither the min date for the historical because it gathers data from many sources, so hardcoding it may end outside the boundary or generating dates that without fact data in calendar.

How woud you create a calendar for historical and forcasting data that you dont know it max and min date and without generating dates without data?

Thanks a lot for your comments.

Kind regards,

0 Likes
clondono
Contributor III

Thanks for sharing.

0 Likes
Version history
Revision #:
1 of 1
Last update:
‎2016-05-26 09:09 AM
Updated by: