Not applicable

Performance problem with WHERE clause

Hello everybody,

we have a performance problem while reloading an application. The data used, stored in a QVD file, is only 50 MB. The script consists of many statements like the following:

Load
    Jahr & Monat & Kostenstelle_Nr & Betrag_Plan & Konto as %DatensatzKey_PLAN,
    Betrag_Plan as [Rohertrag_Plan]
RESIDENT IST_Plan
Where Konto_Gruppe Like 'Kundenbeitrag'
   OR Konto_Gruppe Like 'Pflegeertrag'
   OR Konto_Gruppe Like 'Dienstleistungsertrag'
   OR Konto_Gruppe Like 'Warenertrag'
   OR Konto_Gruppe Like 'Erlösschmälerungen'
   OR Konto_Gruppe Like 'sonstige betriebliche Erträge'
   OR Konto_Gruppe Like 'Warenaufwand'
   OR Konto_Gruppe Like 'Dienstleistungsaufwand'
   OR Konto_Gruppe Like 'Hilfsmittel Pflege'
   OR Konto_Gruppe Like 'Aufwand für bezogene Leistungen'
   OR Konto_Gruppe Like 'Nachlässe';

Is there a better way to script this so that it performs better?

Thank you very much in advance!

Cheers

tobklug

1 Solution

Accepted Solutions
johnw
Champion III


Oleg Troyansky wrote: I haven't done any benchmark testing (disclaimer for all the purists out there 🙂 ) but I'd expect exists() to be much faster than any other form of "WHERE" condition.


I've done some performance comparisons. Exists() should be MUCH faster than match() in a case like this because it will allow you to get an optimized QVD load. Here's how I would set it up:

[MyTable]:
LOAD * INLINE [
Konto_Gruppe
Kundenbeitrag
Pflegeertrag
Dienstleistungsertrag
Warenertrag
Erlösschmälerungen
sonstige betriebliche Erträge
Warenaufwand
Dienstleistungsaufwand
Hilfsmittel Pflege
Aufwand für bezogene Leistungen
Nachlässe
];
INNER JOIN ([MyTable])
Load
Konto_Gruppe,
Jahr & Monat & Kostenstelle_Nr & Betrag_Plan & Konto as %DatensatzKey_PLAN,
Betrag_Plan as [Rohertrag_Plan]
FROM MyQVD.qvd (QVD)
WHERE EXISTS(Konto_Gruppe)
;

Now in your case, you were loading from a resident table and didn't show the part where you loaded from the QVD, so I suspect there are other aspects of your script that should be optimized to be able to pull this off. And I agree with autonumbering the key or revising your data model.

Edit: Looking at this again, it's possible that concatenating all those fields together into %DatensatzKey_PLAN might prevent an optimized reload. If so, I'm not sure if it would still be faster as shown above, or with a match, or perhaps with a second pass through the data resident to build the key.
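
For completeness, here is a rough sketch of that second-pass variant (untested; [PLAN_Final] is just a placeholder table name). The idea is to keep the QVD load down to plain field names so it has a chance to stay optimized, and to build the concatenated key afterwards from the resident table:

[MyTable]:
LOAD * INLINE [
Konto_Gruppe
Kundenbeitrag
Pflegeertrag
Dienstleistungsertrag
Warenertrag
Erlösschmälerungen
sonstige betriebliche Erträge
Warenaufwand
Dienstleistungsaufwand
Hilfsmittel Pflege
Aufwand für bezogene Leistungen
Nachlässe
];
INNER JOIN ([MyTable])
LOAD
    Konto_Gruppe,
    Jahr,
    Monat,
    Kostenstelle_Nr,
    Konto,
    Betrag_Plan
FROM MyQVD.qvd (QVD)
WHERE EXISTS(Konto_Gruppe);

// second pass: build the key in memory, where the optimized-load restriction no longer applies
[PLAN_Final]:
LOAD
    Jahr & Monat & Kostenstelle_Nr & Betrag_Plan & Konto as %DatensatzKey_PLAN,
    Betrag_Plan as [Rohertrag_Plan]
RESIDENT [MyTable];

DROP TABLE [MyTable];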

View solution in original post

8 Replies
biester
Specialist

Hi,

you could try the match() function instead of the multiple ORs, e.g.

where match(Konto_Gruppe,'Kundenbeitrag','Pflegeertrag','WasweißderTeufelsonstnoch') > 0
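
For example, with all the groups from your script (untested sketch; note that match() is case-sensitive, while mixmatch() is the case-insensitive variant):

Load
    Jahr & Monat & Kostenstelle_Nr & Betrag_Plan & Konto as %DatensatzKey_PLAN,
    Betrag_Plan as [Rohertrag_Plan]
RESIDENT IST_Plan
Where match(Konto_Gruppe,
    'Kundenbeitrag', 'Pflegeertrag', 'Dienstleistungsertrag', 'Warenertrag',
    'Erlösschmälerungen', 'sonstige betriebliche Erträge', 'Warenaufwand',
    'Dienstleistungsaufwand', 'Hilfsmittel Pflege',
    'Aufwand für bezogene Leistungen', 'Nachlässe') > 0;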

Perhaps it's worth a try,

Rgds,
Joachim

Not applicable
Author

Why do you use "like" if you don't use '%'? In SQL, '=' is faster than 'like'.

Anyway, if you really need the "like" behaviour, you can use "wildmatch" instead of "match".
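
For instance (the patterns below are purely illustrative, not your real group names), wildmatch() accepts * and ? wildcards and is case-insensitive:

Load
    Betrag_Plan as [Rohertrag_Plan]
RESIDENT IST_Plan
Where wildmatch(Konto_Gruppe, 'Kundenbeitrag', '*ertrag', 'Aufwand*') > 0;

If your values are exact, match() (or mixmatch() for a case-insensitive comparison) should be cheaper than wildmatch().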

Bye

Walter

Not applicable
Author

...thank you very much for the quick reply!

Using the match function speeds up the reload and was new to me. Unfortunately, the problem of 4 GB RAM not being enough to reload the whole application remains.

Not applicable
Author

When you are not loading from a QlikView table (resident), you could also try to include this WHERE condition in the SQL block when loading the data; this would also speed up your query.
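
Something along these lines (only a sketch: KONTEN_PLAN is a made-up source table name, and the column list has to match your database):

// preceding load: takes the rows exactly as the SQL statement returns them
[IST_Plan]:
LOAD *;
SQL SELECT Jahr, Monat, Kostenstelle_Nr, Konto, Konto_Gruppe, Betrag_Plan
FROM KONTEN_PLAN
WHERE Konto_Gruppe IN ('Kundenbeitrag', 'Pflegeertrag', 'Dienstleistungsertrag',
    'Warenertrag', 'Erlösschmälerungen', 'sonstige betriebliche Erträge',
    'Warenaufwand', 'Dienstleistungsaufwand', 'Hilfsmittel Pflege',
    'Aufwand für bezogene Leistungen', 'Nachlässe');

That way the database only ships the eleven groups you actually need, instead of filtering everything in QlikView afterwards.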

Best regards
Stefan

Oleg_Troyansky
Partner Ambassador/MVP

I'd recommend using the exists() function, especially within a QVD load - it is the only form of "where" that is allowed in an optimized load.

I haven't done any benchmark testing (disclaimer for all the purists out there 🙂 ) but I'd expect exists() to be much faster than any other form of "WHERE" condition.

Exists() doesn't handle wildcard searches - the values need to be spelled exactly. If performance is an issue, perhaps you could create a list of the relevant groups, spelling out each name completely.
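
A minimal sketch of that idea for your resident load (the field name GruppenFilter and the table name Relevante_Gruppen are just placeholders). Note that on a resident load you need the two-argument form of exists(), because every value of Konto_Gruppe is already in memory and the single-argument form would match every row:

[Relevante_Gruppen]:
LOAD * INLINE [
GruppenFilter
Kundenbeitrag
Pflegeertrag
Dienstleistungsertrag
Warenertrag
Erlösschmälerungen
sonstige betriebliche Erträge
Warenaufwand
Dienstleistungsaufwand
Hilfsmittel Pflege
Aufwand für bezogene Leistungen
Nachlässe
];

Load
    Jahr & Monat & Kostenstelle_Nr & Betrag_Plan & Konto as %DatensatzKey_PLAN,
    Betrag_Plan as [Rohertrag_Plan]
RESIDENT IST_Plan
Where Exists(GruppenFilter, Konto_Gruppe);

DROP TABLE [Relevante_Gruppen];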

Ask me about Qlik Sense Expert Class!
Oleg_Troyansky
Partner Ambassador/MVP

Another observation... You are creating a long concatenated key (out of 5 individual fields), and I bet this key takes a HUGE amount of memory. A few suggestions in this regard:

1. Try enclosing your concatenated values in the autonumber() function and see if it helps with your memory problem (see the sketch after this list).

2. Try revising your data model. It looks like you are building a large LinkTable. Maybe concatenating the data tables into one table could work better in your case, if it helps eliminate the huge concatenated key.

3. Generate Memory Statistics (Document Properties, General tab) and load the generated file into the QlikView Optimizer (one of the sample documents shipped with QlikView). Examine your largest "memory consumers" - fields, tables, charts, etc. - and try to eliminate the biggest ones.
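
For suggestion 1, a rough sketch (untested; the '|' separator and the 'PLAN' counter ID are arbitrary choices, but the same expression and counter ID have to be used on every table that shares this key within the same reload):

Load
    autonumber(Jahr & '|' & Monat & '|' & Kostenstelle_Nr & '|' & Betrag_Plan & '|' & Konto, 'PLAN')
        as %DatensatzKey_PLAN,
    Betrag_Plan as [Rohertrag_Plan]
RESIDENT IST_Plan
Where match(Konto_Gruppe, 'Kundenbeitrag', 'Pflegeertrag', 'Dienstleistungsertrag',
    'Warenertrag', 'Erlösschmälerungen', 'sonstige betriebliche Erträge', 'Warenaufwand',
    'Dienstleistungsaufwand', 'Hilfsmittel Pflege', 'Aufwand für bezogene Leistungen',
    'Nachlässe') > 0;

The separator avoids collisions between values like 12 & 3 and 1 & 23, and the integer key is far smaller in memory than the long concatenated string.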

I'm sure you can do a lot of memory saving before giving up on your 4GB RAM...

Cheers!

Ask me about Qlik Sense Expert Class!

Not applicable
Author

Thank you all for your quick and detailed help! All this advice was helpful and taught me that it is time to focus more on different ways of scripting. So far I consider myself a good developer who can achieve the required result. The next step is to do that with better performance!

Does QlikTech offer a class for that? Maybe the "advanced scripting"?

Thanks again

Cheers