Qlik Community

QlikView Scripting

Discussion Board for collaboration on QlikView Scripting.

vlakeshr
Contributor

Reduce the reload time within 1 hoour or 45 min

Reduce the reload time within tne hour.

load taking more time by sum of measures while grouping field, anybody have solution for this or  how can we reduce the  reload time which has taking for group by clause.

In Group by using 8 fields for 5 measures. I use below option but it's not working properly.

Anybody help me to reduce reload time by optimizing group by

Option 1:

Orders:

LOAD EmployeeID,

          Year,

          Sales,

          ....

FROM Orders;

Left Join (Orders)

LOAD EmployeeID,

    Year,

    Sum(Sales)

Resident Orders

Group By EmployeeID, Year;

8 Replies
vlakeshr
Contributor

Re: Reduce the reload time within 1 hoour or 45 min

Please note that Data Volume is 20 Million

Re: Reduce the reload time within 1 hoour or 45 min

Perform grouping only on incremental data

brunomiguelsilv
New Contributor II

Re: Reduce the reload time within 1 hoour or 45 min

Hi,

20Million it doesn't seem that much. Where are you extracting that data from? Are you sure that you do not have a network performance problem?

Besides that, do you really need to group by Year? I don't know your requirements, but personally I don't see the need of that. Sum is a quite fast operation in frontend. So if you already have your information group by Employee, it should be fast to get the year sum of that data.

Please clarify your requirement and explain better how are you extracting the data so people can help in a more focused way.

Kind Regards.

Not applicable

Re: Reduce the reload time within 1 hoour or 45 min

Following..

vlakeshr
Contributor

Re: Reduce the reload time within 1 hoour or 45 min

Data volume is 200 Million ..Sry It's not 20 million..

Data loading fast from QVD and resident load but it's taking much time for sum of measures ( 6 Fields) using group by

( 8 fields), takes over 30 minutes and use of 90% of the available memory (group by)

We have to add a calculation on the date, because orders picked till 3 o'clock next day have to record on current day.

This will take even more time, what will increase the time out of the limits of the timeframe.

Has anyone an idea where to change and where to add the calculation for best practice.

- build up the where exist() statements ?

-- speed up group by?

I expect there will be multiple "little" solutions so every idea is helpful


No need for full new script, just tips and tricks then i will figure it out

Re: Reduce the reload time within 1 hoour or 45 min

200 million records are quite a lot f data for an aggregation-load with 6 measures and over 8 fields. This won't be never fast and your load-times from 30 - 60 minutes isn't really bad. You might save time if you could remove some of the fields, keeping only one/two key-fields and adding the other fields afterwards per joining/mapping. Also a vertical and/or horizontal split of the task and adding them later again might be useful especially if your system runs out of memory and starts swapping with the virtual RAM. Another point which might reduce the load-times of this task is if the data are already sorted by loading them.

But before playing with a varity of rather small improvements I suggest to use an incremental approach for loading and transforming the data. Here you will find various posting to this topic and how data could be loaded in a optimized mode: Advanced topics for creating a qlik datamodel.

- Marcus

Re: Reduce the reload time within 1 hoour or 45 min

Try with this code, to see whether and how much this reduces the time needed to calculate the two tables. Check with 100mio or even 25mio rows:

Orders:

LOAD EmployeeID,

     Year,

     EmployeeID & '-' & Year AS EmployeeYear,

     Sales,

     ....

FROM Orders;

EmployeeSales: // Leave aggr. data in separate table, no join

NOCONCATENATE

LOAD EmployeeYear,

     Sum(Sales) AS Sales

RESIDENT Orders

GROUP BY EmployeeYear;

brunomiguelsilv
New Contributor II

Re: Reduce the reload time within 1 hoour or 45 min

Just like Peter, I would suggest to create an Id for those 8 fields that you need to aggregate. Create it with Autonumber, so it can perform faster, and then use it in the aggregation. Later you can Left Join it to you initial table, or use that ID field to connect both tables.

Either way you will get a lot of different ID, and like it was said before 200mio rows will always take time to aggregate. Be sure to aggregate to the minimum fields possible.

Community Browser