Discussion Board for collaboration on QlikView Scripting.
I have a huge amount of data in my database: 500 million records in the fact table covering 2 years. The business needs at least 2 years of data in the report, down to the detailed sub-product level.
I tried different approaches, such as document chaining. The QVD size comes to 15 GB and takes more than 5 hours to load, and the QVW occupies 4 GB. I even tried an incremental load, but the final QVD and QVW sizes are still a concern. Can you please suggest any other approach that can handle this much data?
Thanks and Regards
I think document chaining is the only option.
Create a landing QVW (a "home" QVW) where users select a month, quarter, half-year, or year, and then, based on that selection, take them to the detail QVW they should land on.
This way you can run parallel reloads for the detail QVWs.
Perhaps you could reduce the number of distinct field values, and drop fields that aren't really required, to save space in the QVD and in RAM. For example, a timestamp field can be split into separate date and time fields; likewise, RowNo() values and combined keys take a lot of space.
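To illustrate the timestamp split: a minimal sketch, assuming a hypothetical field OrderTimestamp and a source QVD (names are illustrative, not from the original post):

```qlikview
// Splitting a timestamp into a date part and a time part dramatically
// reduces distinct values per field, which shrinks the QVD and RAM footprint.
Facts:
LOAD
    Date(Floor(OrderTimestamp)) as OrderDate,  // ~730 distinct values for 2 years
    Time(Frac(OrderTimestamp))  as OrderTime,  // distinct values bounded by time precision
    Amount
FROM [C:\Source\Facts.qvd] (qvd);
```

A raw timestamp can have one distinct value per row; after the split, the two fields share far fewer symbol-table entries.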
I tried document chaining at the monthly level, and also by subject area, such as Product.
I am loading only the necessary columns: 5 measures and 10 dimensions, including country, vendor, product, buyer, sale date, ship date, delivery date, and return date.
There are no timestamp columns.
Did you try an incremental load? All transformation functions should run before you store to the QVD; after that, you can load the data faster and present it to your users.
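A rough incremental-load sketch, assuming a ModifiedDate column in the source and illustrative names (Facts, vLastReload, dbo.FactSales, C:\QVD\Facts.qvd):

```qlikview
// Pull only rows changed since the last reload, then append the untouched
// history from the existing QVD.
Let vLastReload = '2024-01-01 00:00:00';  // normally read from a log or the old QVD

Facts:
SQL SELECT OrderID, Country, Product, Amount, ModifiedDate
FROM dbo.FactSales
WHERE ModifiedDate >= '$(vLastReload)';

// Historical rows not re-extracted above; Exists() drops updated duplicates
Concatenate (Facts)
LOAD OrderID, Country, Product, Amount, ModifiedDate
FROM [C:\QVD\Facts.qvd] (qvd)
WHERE Not Exists(OrderID);

STORE Facts INTO [C:\QVD\Facts.qvd] (qvd);
```

This keeps the nightly extract small, though as the poster notes, the final QVD still grows with total history.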
Handling large data:
In an organisation with masses of transactional data going back 2 years, you don't want to reload all 2 years, including the current year's data, every night.
One solution is below:
Create a QVD file for each year you wish to analyse. These can be created dynamically based on how many years you want to look at. Suppose you are looking at two years; the script below shows how to create a simple QVD file based on a moving date clause.
[Year-Two]:
LOAD *
FROM C:\Data.xls (biff, embedded labels, table is [Core Data$])
Where Year(Date) = Year(Today()) - 2;
Store [Year-Two] into C:\Year-Two.qvd;
Drop Table [Year-Two];
You're probably doing this already, but just in case - make sure the QVD load is optimized.
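For reference, a QVD load stays optimized (read straight from the QVD's internal format) only under narrow conditions; file and field names below are illustrative:

```qlikview
// Optimized: plain field list, no expressions (renaming with "as" is fine)
Facts:
LOAD Country, Product, Amount
FROM [C:\QVD\Facts.qvd] (qvd);

// Still optimized: a WHERE clause that is a single Exists() on one field
// LOAD * FROM [C:\QVD\Facts.qvd] (qvd) WHERE Exists(Country);

// NOT optimized: any calculation forces a slow record-by-record load
// LOAD *, Year(SaleDate) as SaleYear FROM [C:\QVD\Facts.qvd] (qvd);
```

The script log shows "(qvd optimized)" when the fast path is used; with 200M rows per year, losing it can add a large amount of reload time.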
And, on the requirements level... In a similar situation, a client was OK with aggregating data by month, except for the last three months where they needed daily granularity.
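That mixed-granularity idea can be scripted roughly like this (field and file names are illustrative assumptions):

```qlikview
// Daily grain for the last 3 months, monthly grain for everything older.
Sales:
LOAD SaleDate, Country, Product, Sum(Amount) as Amount
FROM [C:\QVD\Facts.qvd] (qvd)
WHERE SaleDate >= AddMonths(Today(), -3)
GROUP BY SaleDate, Country, Product;

// Older history collapsed to the first day of each month
Concatenate (Sales)
LOAD MonthStart(SaleDate) as SaleDate, Country, Product, Sum(Amount) as Amount
FROM [C:\QVD\Facts.qvd] (qvd)
WHERE SaleDate < AddMonths(Today(), -3)
GROUP BY MonthStart(SaleDate), Country, Product;
```

Whether this helps depends on how much of the 200M-row residue comes from the date dimension versus the other dimensions.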
I am already loading aggregated data, but since users need weekly or monthly aggregates, we were not able to reduce the volume (~200M records remain after aggregation because of the dimension-level detail). Is there any other approach you can suggest for handling data this large?
I have ~200M records per year and am already splitting the QVDs, but loading them into the QVW is a serial process, so even the split-QVD approach didn't reduce the reload time.