Solved: Re: Nested If as a dimension - Qlik Community

Report Inappropriate Content · ‎2015-03-27

I have the below Nested IF statement as a DIMENSION expression in my straight table but now my QVW is VERY SLOW when on open. I am using this dimension expression to hide rows or values when the value is null because the rules don't apply. It seems I can not use the below IF statement in the script because all the fields in the conditions belong to different tables unless I combine all the tables into one table. It doesn't make sense to do this. I can't use set analysis because it is not a calculated expression, right? I also tried the PICK/MATCH functions but I can't seem to get the syntax working. Any help/suggestions are greatly appreciated.

=if([VARIANCE ID] = 14 AND [Management Line]='Structured Finance' AND [Business Group 1] <> 'Covered Bonds', Null(),

if([VARIANCE ID] = 15 AND [Management Line]<>'Structured Finance' AND [Business Group 1] <> 'Covered Bonds', Null(), 1))

Report Inappropriate Content · ‎2015-03-29

Hello, several dims for sure will produce more than 1 row. However, the one you defined may have only two possible values: NULL and 1. ValueList(1) would produce a synthetic dim and might be necessary in case your dimension is the only one.

since you do have some other dims - ValueList(1) is not needed. Try to move this logic into your expression. As a first step - remove the dimension and put your conditions into expression. Instead of count(dimension) you may use:

=sum(

if([VARIANCE ID] = 14 AND [Management Line]='Structured Finance' AND [Business Group 1] <> 'Covered Bonds', 0,

if([VARIANCE ID] = 15 AND [Management Line]<>'Structured Finance' AND [Business Group 1] <> 'Covered Bonds', 0, 1))

)

(+ as Marcus said this may be simplified). If this works (you see at least some performance improvement) - next step is replacing all the IFs with set analysis.

As a last resort - redesign your data structure to keep all the columns (that you use in this straight table) in 1 in-memory table (use table viewer to review how it looks).

The steps I provide all based on following simple observation:

1. Simple, field-based dims usually faster than calculated dims (in most cases)

2. Set analysis usually faster than any conditional operators (if, pick/match etc)

3. If any object definition is based on fields from ONE internal in-memory table then it works faster (calc time is lower) comparing to when it is based on TWO or more in-memory tables. This may require redesign data structure, change LOAD scripts (add joins, etc) and increase load/re-load a bit BUT it may save much calc. time.

View solution in original post

marcus_sommer · ‎2015-03-27

Your two conditions are quite similar. Therefore they could be reduced to one condition:

if(match([VARIANCE ID], 14, 15) AND [Management Line]<>'Structured Finance' AND [Business Group 1] <> 'Covered Bonds', Null(), 1)

Further you could speed up such calculations if the conditions didn't compare strings else numeric values, maybe like this:

if(match([VARIANCE ID], 14, 15) AND [Management Line ID]<> 50 AND [Business Group 1 ID] <> 5, Null(), 1)

- Marcus

Report Inappropriate Content · ‎2015-03-27

The straight table consts of 1 row, right? If I got the question correctly:

1. Use =ValueList(1) as a calculated dimension and move all the IF stuff into expression

2. Use Set Analysis instead of IFs

3. Why it doesn't make sense to create a single table? This is the best option from performance point of view. It worth trying if points 1,2 do not help to reach good performance

Report Inappropriate Content · ‎2015-03-28

Hi Andrei,

Thank you for your input.

I have a single straight table that consists of many rows with several dimensions and 1 expression with a count(dimension). One of my dimensions I use the expression above as a calculated dimension to hide the rows when it is null. I guess I don't know how I would use set analysis in either my dimension expression or calculated expression in this scenario. Any guidance is greatly appreciated.

Lastly, I am not understanding how I would use =ValueList(1) in my example. Can you explain in more detail?

Melinda

Report Inappropriate Content · ‎2015-03-29

Hello, several dims for sure will produce more than 1 row. However, the one you defined may have only two possible values: NULL and 1. ValueList(1) would produce a synthetic dim and might be necessary in case your dimension is the only one.

since you do have some other dims - ValueList(1) is not needed. Try to move this logic into your expression. As a first step - remove the dimension and put your conditions into expression. Instead of count(dimension) you may use:

=sum(

if([VARIANCE ID] = 14 AND [Management Line]='Structured Finance' AND [Business Group 1] <> 'Covered Bonds', 0,

if([VARIANCE ID] = 15 AND [Management Line]<>'Structured Finance' AND [Business Group 1] <> 'Covered Bonds', 0, 1))

)

(+ as Marcus said this may be simplified). If this works (you see at least some performance improvement) - next step is replacing all the IFs with set analysis.

As a last resort - redesign your data structure to keep all the columns (that you use in this straight table) in 1 in-memory table (use table viewer to review how it looks).

The steps I provide all based on following simple observation:

1. Simple, field-based dims usually faster than calculated dims (in most cases)

2. Set analysis usually faster than any conditional operators (if, pick/match etc)

3. If any object definition is based on fields from ONE internal in-memory table then it works faster (calc time is lower) comparing to when it is based on TWO or more in-memory tables. This may require redesign data structure, change LOAD scripts (add joins, etc) and increase load/re-load a bit BUT it may save much calc. time.

Report Inappropriate Content · ‎2015-04-01

Thanks for the responses! I've tried almost everything however, I am still having performance issues when opening the QVW. I can't use set analysis since this logic is on the dimensions and there are no expressions or metric calculations in the app. I am to the point of making ONE internal in-memory table but I am having some syntax issues trying to join all my tables. Can you confirm that the syntax below in my JOINS are correct? Basically I am trying to join 2 different tables to another. Please note, one of my tables is over 43 million rows so I am hoping I don't crash the QV dev server when doing this 1 in memory table.

Tables:

COMMITTEE – has 43 million rows
RATINGFACT – has 1.3 million rows
SERVICEDIM – has 81,000 rows

Here are what I think my JOINS should be but I am not 100%

COMMITTEE:
LOAD

RATING_COMMITTEEDIM_KEY,

etc…

ISSUERATINGFACT:
OUTER JOIN (COMMITTEE)
LOAD DISTINCT

RATING_COMMITTEEDIM_KEY,

SERVICEDIM_KEY

etc…

SERVICEDIM:

INNER join (ISSUERATINGFACT)

LOAD DISTINCT

SERVICEDIM_KEY,

etc…

NickHoff · ‎2015-04-01

To do a PICK you use the following.

RenameMe:

LOAD * INLINE [

%VarID, %VarDesc

1, VarianceID14Desc

2, VarianceID15Desc

];

Expression for PICK:

PICK(%VarID,

Expression for Variance14,

Expression for Variance 15))

NickHoff · ‎2015-04-01

In response to your joins, the logic is an issue. You should be joining to your fact table, not joining your fact to a dimension. Also, you might not need to have distinct on your load unless those tables are dirty data not normalized. Last of all, we run a lot more than 43 million rows "in memory" close to 500 million and we don't crash the server but we have a lot invested in our hardware 1TB of RAM each server and model our applications correctly where the final stage is in optimized qvd load format.

ISSUERATINGFACT:
LOAD

RATING_COMMITTEEDIM_KEY,

SERVICEDIM_KEY

etc…

LEFT JOIN(ISSUERATINGFACT)

LOAD

RATING_COMMITTEEDIM_KEY,

etc…

INNER join (ISSUERATINGFACT)

LOAD

SERVICEDIM_KEY,

etc…

marcus_sommer · ‎2015-04-02

To join all tables together is rather not a recommended approach. Often there are performance advantages in gui-calculations when fact-tables are matched but not by the dimension-tables.

Further important is the kind and amount of matching. In genereal you could use joining, concatenate and/or mapping whereat joining is a lot more difficult if you don't know the data-quality very good. Often fact-tables aren't really linked to eachother and could be simply concatenated.

Whereas mapping with applymap is very useful if cases of bad data-quality and/or if you only need to transfer a few fields or informations from one table to the other. This meant some fields will be transfered to another table and used for some calculations maybe to create flags but the datamodel itself don't need necessarily be changed.

To Join or not to Join

Don't join - use Applymap instead

Concatenate vs Link Table

Then you should check if you need really all fields and if you could optimize some fields and/or keys and that you have a proper datamodel.

The Importance Of Being Distinct‌

Symbol Tables and Bit-Stuffed Pointers‌

Synthetic Keys‌

Circular References.

Also you could use set analysis in a expression to filter out dimension-values then if the expression returned NULL dimension-values could be hidden (options by dimensions and within the tab presentations).

Beside these things you should check your server ressources (I think at least 16GB RAM for these app) and settings (cores disabled, working set - settings, ...) then if the system swap RAM into the pagefile it will be very slow.

- Marcus

Report Inappropriate Content · ‎2015-04-02

Hey,

Probably i am over simplifying here but is there a reason why you do not want to take your expression to the script and build a field out of it?

Regards

RL