Slow loading & large resource consumption of QV

Hi,


I'm evaluating QV as a front end to our analytical platform database.
I created a test case in which I load 18 tables that have some common dimensions
(you can see that 2 dimensions are shared across all tables),
plus some additional shared dimensions.
The biggest table has 37,760 rows, which translates to a 1,370KB QVD.
Another table has 33,299 rows, which translates to a 1,569KB QVD,
and the rest are small tables (no more than 1,400 rows).

QV's memory footprint after this load is 1.4GB.
The loading process itself was rather quick. Finding the synthetic keys across the tables started at a reasonable pace
at the first level, but got slower and slower with each level of indirection (when the $Syn keys started referencing other $Syn keys).

Can anybody tell me what I'm doing wrong (or is this the expected behaviour of the tool)?

You can find the loading log below.

Started at 11:13:52

METRIC_REPOSITORY << scheduled_entities 63 lines fetched
Music Categories Interest << view_results 35 lines fetched
view_results 35 lines fetched
view_results 35 lines fetched
view_results 35 lines fetched
Content By Offer << view_results 364 lines fetched
view_results 364 lines fetched
Navigation Paths << view_results 37,790 lines fetched
view_results 37,790 lines fetched
Offers Conversion Rate << view_results 28 lines fetched
view_results 28 lines fetched
view_results 28 lines fetched
view_results 28 lines fetched
view_results 28 lines fetched
Portal Categories << view_results 49 lines fetched
view_results 49 lines fetched
Handsets << view_results 616 lines fetched
view_results 616 lines fetched
view_results 616 lines fetched
view_results 616 lines fetched
Content Conversion Rate << view_results 455 lines fetched
view_results 455 lines fetched
view_results 455 lines fetched
view_results 455 lines fetched
Pages << view_results 182 lines fetched
view_results 182 lines fetched
view_results 182 lines fetched
view_results 182 lines fetched
Content Errors << view_results 33,299 lines fetched
view_results 33,299 lines fetched
view_results 33,299 lines fetched
view_results 33,299 lines fetched
Weekly Indicators << view_results 9 lines fetched
view_results 9 lines fetched
view_results 9 lines fetched
view_results 9 lines fetched
view_results 9 lines fetched
view_results 9 lines fetched
Daily Indicators << view_results 63 lines fetched
view_results 63 lines fetched
Golden Ratio With Segment << view_results 1,169 lines fetched
view_results 1,169 lines fetched
view_results 1,169 lines fetched
view_results 1,169 lines fetched
view_results 1,169 lines fetched
view_results 1,169 lines fetched
view_results 1,169 lines fetched
view_results 1,169 lines fetched
view_results 1,169 lines fetched
Golden Ratio << view_results 1,169 lines fetched
view_results 1,169 lines fetched
view_results 1,169 lines fetched
view_results 1,169 lines fetched
Handsets KPI << view_results 1,323 lines fetched
view_results 1,323 lines fetched
Mobile Ads << view_results 42 lines fetched
view_results 42 lines fetched
view_results 42 lines fetched
Error Codes KPI << view_results 119 lines fetched
view_results 119 lines fetched
Conversion Rates KPI << view_results 42 lines fetched
view_results 42 lines fetched
Errors KPI << view_results 42 lines fetched
view_results 42 lines fetched
$Syn 1 = Start Time+End Time
$Syn 2 = Start Time+End Time+Ad
$Syn 3 = Start Time+End Time+Device
$Syn 4 = Start Time+End Time+Category
$Syn 5 = Start Time+End Time+Category+Content
$Syn 6 = Start Time+End Time+Resource Name
$Syn 7 = Start Time+End Time+Resource Name+Previous Music Sub Category
$Syn 8 = Start Time+End Time+Offer

Slow ...
$Syn 9 = $Syn 1+$Syn 8
$Syn 10 = $Syn 1+$Syn 6
$Syn 11 = $Syn 1+$Syn 6+$Syn 7
$Syn 12 = $Syn 1+$Syn 4
$Syn 13 = $Syn 1+$Syn 4+$Syn 5
$Syn 14 = $Syn 1+$Syn 3
$Syn 15 = $Syn 1+$Syn 3+$Syn 6+$Syn 7
$Syn 16 = $Syn 1+$Syn 2

Very Slow ...
$Syn 17 = $Syn 12+$Syn 13
$Syn 18 = $Syn 10+$Syn 11
$Syn 19 = $Syn 10+$Syn 11+$Syn 14+$Syn 15

Very Very Slow ....
$Syn 20 = $Syn 18+$Syn 19

Ended at 11:25:20 ..

Thanks,
Avi

8 Replies
Anonymous

Avi,
The problem is that as soon as you have more than one common field in two logical tables, QlikView automatically creates synthetic keys ($Syn...). You first have to define how your data model should look (on paper), decide from that which fields to use as keys (with common names), and rename the fields that should not be used as keys.
To test the expected load time, you can use a QUALIFY * statement at the start of the script so that all field names are different and no keys are created at all, and see how long the load takes.
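For example, the test script could start roughly like this (the QVD names are just placeholders for your 18 tables):

// Reload-time test only - not for a real application.
// QUALIFY prefixes every field name with its table name, so no fields
// are shared between tables and no synthetic keys are built.
QUALIFY *;

NavigationPaths:
LOAD * FROM NavigationPaths.qvd (qvd);

ContentErrors:
LOAD * FROM ContentErrors.qvd (qvd);

// ...the remaining 16 tables...

UNQUALIFY *;   // switch qualification off again if more script follows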

Author

Thanks, Michael.

I understand this concept, and the common fields represent real relations that will be required in QlikView.

Is there any real limitation on how many relations can exist between the different tables?

This is a real concern, since in a real-life scenario the tables will have more rows and the same relationships.

I guess that using QUALIFY, which ignores the real relationships, would somehow miss the point of QlikView (or am I missing something in the way QlikView works...).

Thanks again,

Avi

Anonymous

Avi,
Both statements are correct:
- QUALIFY will ignore the real relationships. I recommend it only to test the expected reload time, not for use in a real application.
- You are missing something essential in the way QlikView works. Two tables in QlikView should be connected by only one common field. There can be exceptions, but don't go there unless you have a lot of experience with design.

The fact that you have synthetic keys is evidence of poor data model design. As I understand it, you haven't thought about this yet, and simply load the database tables as is to see what happens. In most cases that is not what you're supposed to do. You have to build the data model first, on paper or in your head. In most cases, though not always, it will be a star schema with facts (measures) in the middle and dimensions around it. And no loops, otherwise the reload fails.
You need to learn the ropes of data model design first. I think there are training presentations about it on the QlikView website.
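As a rough illustration of what "connected by one common field" can look like in the script (the field and file names below are invented, only the pattern matters):

// Keep the time fields themselves in only one table and link the other
// table through a single explicit key, so no $Syn tables are needed.
NavigationPaths:
LOAD
    [Start Time] & '|' & [End Time] AS %TimeKey,
    [Start Time],
    [End Time],
    [Resource Name],
    NavigationCount
FROM NavigationPaths.qvd (qvd);

ContentErrors:
LOAD
    [Start Time] & '|' & [End Time] AS %TimeKey,
    [Error Code],
    ErrorCount
FROM ContentErrors.qvd (qvd);

Now the two tables are associated only through %TimeKey instead of through a synthetic key on the Start Time / End Time pair.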

Edit:
To summarize, the problem is not slow loading and high resource consumption, but the data model design.

Oleg_Troyansky
Partner Ambassador/MVP

Avi,

You are struggling with complex data modeling - perhaps the most complex and demanding issue in any BI development process, and in QlikView in particular...

In a nutshell, experienced QlikView developers use a number of data modeling techniques to overcome these issues, but that might require too much specific knowledge from someone who is still evaluating QlikView (I assume no formal training and no prior experience...).

If you are serious about evaluating QlikView and understanding whether it's good for your organization, I'd recommend spending a little bit of money and letting an experienced QlikView developer do it for you - either request a SIB ("Seeing is Believing") from your local QlikView reseller, or hire an experienced QlikView consultant.

Metaphorically speaking - if you never played violin, you won't assess the quality of an instrument based on the sounds that YOU are able to produce with it, right? You'd invite a violinist to play for you...

If you still want to evaluate the violin by yourself - search this forum for "Link Tables" and "Concatenation" as data modeling techniques that we use to simplify complex data structures.


Author

First I would like to thank you all for taking the time to answer my posts.

Let me try to explain what I'm trying to do, and then you can comment on the model I'm using.

The base fact tables that I have are quite huge in terms of volume (we're talking about several tables, each of which can reach around 1B records a day). We have a platform that manages more manageable tables, which are derived from these fact tables and structured in the following way:

Dim1, Dim2, Dim3, Dim4, Time Dim, Measurement 1, Measurement 2 ... Measurement N

These are basically aggregation tables which simulate a simple OLAP (the actual tables are cubes with sub-aggregations, but I found that QlikView cannot handle these sub-aggregations very well, so I'm filtering those rows out and loading only the "clean" GROUP BY result set, not the GROUP BY CUBE result set).

The thing about these metric tables is that they share common dimension columns. One table can have dim1, dim2, m1, m2 and another can have dim1, dim3, dim4, m3, and so on.

My understanding (and you may correct me if I'm wrong) is that it is not feasible to load the fact tables (1B rows a day) and provide analytics for periods of months/years. Since we usually have around 40 different dimensions in the model, it is also not realistic to load a full GROUP BY over all ~40 dimensions, since that would not reduce the number of rows considerably.

So what we have is a set of aggregation tables, each with up to 4 dimensions + a time dimension + various measurements, which creates a limited space for the metric model (limited in the sense that it does not contain all dimension combinations for each metric).

I would appreciate any ideas on how we can use these tables in QlikView to support visualization of these various metrics, plus the ability to use QlikView's powerful contextual switching via the selection concept while navigating between different views.

Thanks again,

Avi

Anonymous

Avi,
You're certainly not the first with this sort of question. Oleg just recommended searching for "concatenation" - that is a typical solution in cases like this.
In real life, it is the rule rather than the exception to have multiple measures, with different measures using different sets of dimensions. The most common (although not the only) way to solve this is to concatenate all measures into one "fact" logical table in the QlikView application.
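In script terms, the pattern looks roughly like this (the table and field names are made up):

// One wide "fact" table; each source contributes the dimensions it has,
// and the missing dimensions are simply null for those rows.
Facts:
LOAD 'Metric A' AS MetricSource, Dim1, Dim2, [Time Dim], Measurement1
FROM Metric_A.qvd (qvd);

CONCATENATE (Facts)
LOAD 'Metric B' AS MetricSource, Dim1, Dim3, Dim4, [Time Dim], Measurement3
FROM Metric_B.qvd (qvd);

The MetricSource flag lets charts tell the metrics apart within the single fact table.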
I think you need help from an experienced QlikView consultant, from QlikTech or from one of the partners.
The best of luck,

Author

Thanks, will do.

Oleg_Troyansky
Partner Ambassador/MVP

Avi,

Yes, your understanding is correct - 1B rows a day over several months is a bit too much for any in-memory model. So, if you still want the advantages of in-memory processing, you need to reduce the granularity of your data.

To do this in QlikView, you need to find the "common level of detail" for the various fact tables and keep all the data at that level. It's hard to visualize without a specific example...

Let's say you are trying to compare visits to your web site with the volume of purchases. In your transactional data, you might have billions of website hits, timestamped to the millisecond, and perhaps thousands of purchases. If you determine that the acceptable level of detail is a daily summary, you need to summarize your transactions at the daily level and (unfortunately) aggregate your key measurements to the same level.
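A rough sketch of that summarization step in the load script (all names here are invented):

// Roll raw hits up to one row per day and device, so the in-memory
// model stays at the agreed level of detail.
RawHits:
LOAD
    Date(Floor(HitTimestamp)) AS [Visit Date],
    Device
FROM WebsiteHits.qvd (qvd);

DailyHits:
LOAD
    [Visit Date],
    Device,
    Count(Device) AS Hits
RESIDENT RawHits
GROUP BY [Visit Date], Device;

DROP TABLE RawHits;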

What you cannot do in QlikView is keep the same data at different levels of detail (for example, this table holds data by Dim1-Dim2-Dim3 and that table holds the same data by Dim1-Dim2).

If several fact tables have multiple dimensions in common, you can either concatenate them into a single fact table ("concatenation") or create a composite key if you can find a single set of dimensions that repeats itself ("Link Table").
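A very condensed sketch of the link-table variant (the names are invented, only the shape matters):

// Each fact keeps only a composite key plus its own measures; the shared
// dimensions live once in a link table that resolves the key.
FactA:
LOAD Dim1 & '|' & Dim2 AS %LinkKey, Measurement1
FROM Metric_A.qvd (qvd);

FactB:
LOAD Dim1 & '|' & Dim2 AS %LinkKey, Measurement2
FROM Metric_B.qvd (qvd);

// In practice the link table is built from the distinct key combinations
// of all the facts; one source is shown here to keep the sketch short.
LinkTable:
LOAD DISTINCT Dim1 & '|' & Dim2 AS %LinkKey, Dim1, Dim2
FROM Metric_A.qvd (qvd);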