Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Experts,
With the below scripts, I have 4 tables RAWDATA, Rule_1, Rule_2 & Rule_3 . But I want only RAWDATA with Flags created in different tables in it.
I have below script in place where RAWDATA table has all the dimensions along with Rule_4_Flag and Rule_5_Flag created from other tables(Rule_34 & Rule_5)...but I want to include Rule_1_Flag, Rule_2_Flag & Rule_3_Flag into RAW Data as dimensions.
RAWDATA:
LOAD *
FROM [lib:// Checks/ DATA_2019.QVD] (qvd);
///////////////////////////////////////////////Rule-1_/////////////////////////////////////////////
NoConcatenate
Rule_1:
LOAD *,
if((VALUE)>100,'Yes','No') as Rule-1_Flag
Resident RAWDATA
where match (DataID,'36950','42915');
///////////////////////////////////////////////Rule-2_////////////////////////////////////////////
NoConcatenate
Rule_2:
LOAD *,
if((VALUE)<0,'Yes','No') as Rule-2_Flag
Resident RAWDATA
where match (DataID,'36950');
///////////////////////////////////////////////Rule-3/////////////////////////////////////////////
NoConcatenate
Rule_3:
Load *,
//if(Frac(VALUE)>0,'WithDecimal','WithoutDecimal') as Decimal_Flag
if(SubStringCount(VALUE,'.')>=1,'WithDecimal','WithoutDecimal') as Decimal_Flag
Resident RAWDATA
where match (DataID,'60685');
///////////////////////////////////////////////Rule-4//////////////////////////////////////////////////////////
NoConcatenate
Rule_4:
LOAD CompanyID, [Time Series Year], VALUE As Value1
Resident RAWDATA
Where DataID=60634843285;
Left Join(Rule_4)
LOAD CompanyID, [Time Series Year], VALUE As Value2
Resident RAWDATA
Where DataID=60634843286;
Left Join(RAWDATA)
LOAD CompanyID, [Time Series Year], If(Value1<Value2, 'Yes','No') As Rule_4_Flag
Resident Rule_4;
Drop Table Rule_4;
///////////////////////////////////////////////Rule-5//////////////////////////////////////////////////////////
NoConcatenate
Rule_5:
LOAD CompanyID, [Time Series Year], VALUE As Value1
Resident RAWDATA
Where DataID=60634843285;
Left Join(Rule_5)
LOAD CompanyID, [Time Series Year], VALUE As Value2
Resident RAWDATA
Where DataID=60634843321;
Left Join(RAWDATA)
LOAD CompanyID, [Time Series Year], If(Value1<Value2, 'Yes','No') As Rule_5_Flag
Resident Rule_5;
Drop Table Rule_5;
Hi,
The only way you will make it so that data load is performant is to create the flags when you are creating the QVD, rather than after loading from it. The first load here is going to be an Optimised QVD Load, which will be super quick. All subsequent loads, from the RESIDENT tables will not be optimised.
Take a read of my blog post on optimised loads, as this goes into a fair bit of detail of what you can an cannot do with an optimised load:
https://www.quickintelligence.co.uk/qlikview-optimised-qvd-loads/
You are also going to have serious issues with how the code is written at present with synthetic keys. Even with the first two loads, if the QVD has columns A, B and C you are loading from these, and then creating a second table with columns A, B, C and Rule-1_Flag. Qlik will need to create a key which combines A, B and C in a synthetic key, this will be disastrous for performance, and if you have a high number of rows or columns it will crash things.
It will not be optimised, but you will get a much better result doing the following:
LOAD
*,
if(DataID = 36950 or DataID = 42915, if(VALUE)>100,'Yes','No'), null()) as Rule-1_Flag,
if(DataID = 36950, if((VALUE)<0,'Yes','No'), null()) as Rule-2_Flag,
if(DataID = 60685, if(SubStringCount(VALUE,'.')>=1,'WithDecimal','WithoutDecimal'), null()) as Decimal_Flag
FROM [lib:// Checks/ DATA_2019.QVD] (qvd);
I'm not sure what is going on with the joins in the subsequent checks, but you need to ensure that you don't create any synthetic keys in when you are creating those rules either. If it the checks can be done when you first load from the QVD (assuming you can't do it when you create the QVD) then that will be better.
Hope that helps.
Hi,
The only way you will make it so that data load is performant is to create the flags when you are creating the QVD, rather than after loading from it. The first load here is going to be an Optimised QVD Load, which will be super quick. All subsequent loads, from the RESIDENT tables will not be optimised.
Take a read of my blog post on optimised loads, as this goes into a fair bit of detail of what you can an cannot do with an optimised load:
https://www.quickintelligence.co.uk/qlikview-optimised-qvd-loads/
You are also going to have serious issues with how the code is written at present with synthetic keys. Even with the first two loads, if the QVD has columns A, B and C you are loading from these, and then creating a second table with columns A, B, C and Rule-1_Flag. Qlik will need to create a key which combines A, B and C in a synthetic key, this will be disastrous for performance, and if you have a high number of rows or columns it will crash things.
It will not be optimised, but you will get a much better result doing the following:
LOAD
*,
if(DataID = 36950 or DataID = 42915, if(VALUE)>100,'Yes','No'), null()) as Rule-1_Flag,
if(DataID = 36950, if((VALUE)<0,'Yes','No'), null()) as Rule-2_Flag,
if(DataID = 60685, if(SubStringCount(VALUE,'.')>=1,'WithDecimal','WithoutDecimal'), null()) as Decimal_Flag
FROM [lib:// Checks/ DATA_2019.QVD] (qvd);
I'm not sure what is going on with the joins in the subsequent checks, but you need to ensure that you don't create any synthetic keys in when you are creating those rules either. If it the checks can be done when you first load from the QVD (assuming you can't do it when you create the QVD) then that will be better.
Hope that helps.
Unless I am misunderstanding something, the code above is dealing with two DataIDs in this line:
if(DataID = 36950 or DataID = 42915,
I've removed the single quotes, as all your IDs seem to be numbers and comparing numbers is inherently quicker than comparing strings.
If I have missed something please explain.
Thanks.
that's right... I missed this point to include.. my bad.
for flag 2, i want to apply for certain DataIDs..
(DataID,'36950','42999','60471','60477','60475','60475','60473');
How can I handle if the more number of DataIDs to include ?
Hi,
You can still use the match statement as you had in your original code. I just used = as there were only two options in the example you gave. For three or more options match makes sense.
I would try it with and without quote marks, it may perform better without.
Steve