Not applicable

Reading Big Data (50 million +)

I have a table in Vertica with a little over 60 million rows. So far we have been able to use this data in QlikView by aggregating it (creating views) and loading the aggregates. Business users now want to see more detail in QlikView (for example, productid and some details of a particular product). Loading the table into QlikView without the aggregations takes a long time (10+ hours), which causes performance issues.

I have tried breaking the load into multiple QVDs by year. This reduces the load time a little, but when all the QVDs are combined the load still takes a long time and sometimes locks up. I know we can exclude the data for 2005 to 2008 and keep 2009 to the current year, but that would still be about 40 million records.

Any pointers on reading from a single large table like this? How can we go from the weighted view to a detail view without affecting performance? Suggestions welcome.

Kate

5 Replies
Clever_Anjos
Employee

What's your server configuration?

50 million rows is not a big deal for QlikView.

bbi_mba_76
Partner - Specialist

Hi,

With a preceding load you could have the CPUs working in parallel. The reload time also depends on RAM (as Clever Anjos said).

Is the data in the DB or in QVD files?

Not applicable
Author

60 million rows isn't really that much. I have loaded more than that on my laptop and it has only taken a few minutes. Load issues are often caused by inefficient data models or inadequate hardware. There is normally no need to pre-aggregate data.

Can you post a screenshot of the Table Viewer (Ctrl-T) to show us the data model? Common data modelling issues are:

- circular joins

- synthetic keys

- inefficient schema with too many hops (star or snowflake schemas are the best if dealing with multiple tables)

- too many left joins (use mapping loads instead)

What is your hardware spec?

Not applicable
Author

[Attached images: Model.PNG.png, Ttable.PNG.png]

There is no synthetic key, as I am only reading a single table, which is in Vertica.

Not applicable
Author

The table should load quickly without any prior aggregation. If you are loading the data directly from the database, it may be worth investigating bottlenecks, e.g. the database server, the QlikView Server, or network bandwidth. How fast can other systems pull data from the same database server, e.g. into Excel or Access? If they are just as slow, you can eliminate QlikView as the bottleneck.
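If Excel or Access is not convenient, a small script can serve as the "other system". The sketch below is only an illustration: `measure_fetch_rate` is a hypothetical helper, and it assumes a Python DB-API 2.0 cursor. The demo uses an in-memory SQLite table called `sales` as a stand-in, because the real Vertica connection details and table name are not given in this thread; you would pass a cursor from your own Vertica driver instead.

```python
import sqlite3
import time


def measure_fetch_rate(cursor, query, batch_size=10_000):
    """Run `query` and return (rows_fetched, seconds, rows_per_second).

    Works with any DB-API 2.0 cursor; rows are pulled in batches and
    discarded, so only the transfer speed is measured.
    """
    start = time.perf_counter()
    cursor.execute(query)
    rows = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        rows += len(batch)
    elapsed = time.perf_counter() - start
    rate = rows / elapsed if elapsed > 0 else float("inf")
    return rows, elapsed, rate


if __name__ == "__main__":
    # Stand-in database: replace with a connection to the real server.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (productid INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        ((i % 500, i * 0.5) for i in range(100_000)),
    )
    rows, seconds, rate = measure_fetch_rate(conn.cursor(), "SELECT * FROM sales")
    print(f"{rows} rows in {seconds:.2f} s ({rate:,.0f} rows/s)")
```

Run it from the QlikView box if possible, so the network path is the same one the reload uses. A rate close to what QlikView achieves would point at the database or the network rather than at QlikView.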

You say that loading the whole table takes 10 hours +. Try storing the whole table in a single QVD file (run overnight if need be) and try loading the QVD into QlikView. The reload should take mere minutes. If it does then there could be latency en route to the database server, or on the database server itself.
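To put the reported numbers in perspective, here is a rough back-of-the-envelope calculation using only the figures from this thread (60 million rows, 10+ hours, and the 40 million rows left after dropping 2005 to 2008). The function names are made up for the illustration, and since the load takes "10+" hours the computed rate is an upper bound.

```python
def rows_per_second(total_rows, hours):
    """Average throughput of a load that moved total_rows in the given hours."""
    return total_rows / (hours * 3600)


def hours_to_load(total_rows, rate):
    """Hours needed to load total_rows at `rate` rows per second."""
    return total_rows / rate / 3600


# Figures reported in the thread: 60 million rows in 10 hours (or more).
reported_rate = rows_per_second(60_000_000, 10)
print(round(reported_rate))  # 1667 rows per second at best

# Dropping 2005-2008 leaves about 40 million rows; at the same rate:
print(f"{hours_to_load(40_000_000, reported_rate):.1f}")  # 6.7 hours
```

So trimming the history alone would still leave a reload of several hours at the current rate, which is why it is worth finding where the time goes first.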

How big is the resulting QVD file on disk (with the 60 million rows of data)? And how much RAM does the box have that hosts QlikView?