What is the best way to introduce a checksum in my load script, and to fail the load when the checksum is not as expected?
Until now I have used a "KPI" chart in my application with the expression if(sum(facts)-sum(reference)=0,'OK','Diff'). However, I would prefer the script to fail before saving a new version if the checksum is wrong: both to be notified immediately of errors while working on an application, and to ensure end users keep the last good load when this happens due to poor data quality in production. (Typically this happens with joins, where something expected to be unique suddenly isn't.)
Here's an example, where the last piece is for explanation only and is not expected to work as written.
Reference:
Load * Inline [
CatREF, ValueREF
a, 1
b, 2
c, 3
];
Facts:
Load
    CatREF as Cat,
    ValueREF as Value
Resident Reference;
//<--- adding stuff to the Facts table which may duplicate content by accident-->
If Sum(Value) - Sum(ValueREF) = 0 Then
    Trace Checksum OK 😄;
Else
    Trace Checksum Failed;
    Exit Script;
End If
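A working variant is possible if the sums are first materialized into variables, since aggregation functions like Sum() cannot be evaluated directly in a script control statement. A minimal sketch, reusing the Facts and Reference tables above (variable and temp-table names are illustrative):

```qlik
// Compute both checksums into a one-row temp table.
_check:
Load Sum(Value) as SumFacts Resident Facts;
Join (_check)
Load Sum(ValueREF) as SumRef Resident Reference;

// Read the values into variables and clean up.
Let vSumFacts = Peek('SumFacts', 0, '_check');
Let vSumRef   = Peek('SumRef', 0, '_check');
Drop Table _check;

// Now the comparison works in a control statement.
If $(vSumFacts) <> $(vSumRef) Then
    Trace Checksum Failed: facts=$(vSumFacts) reference=$(vSumRef);
    Exit Script;
Else
    Trace Checksum OK;
End If
```

Note that Exit Script stops execution but the reload itself still counts as successful; making the task actually fail requires a deliberate error.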
Thanks for your suggestion.
This is how I ended up solving it:
A) Creating a SUB
SUB CheckDiff (vRowTable)
    Let vNowRows = NoOfRows('$(vRowTable)');
    Let vDiff = vNowRows - vStartRows;
    Trace -------------- You started with $(vStartRows) rows, now you have $(vNowRows), diff is $(vDiff) --------------;
    If vDiff = 0 Then
        Trace Everything is fine;
    Else
        Trace -------------- Mismatch, aborting script --------------;
        // Deliberately invalid load: forces a script error so the reload fails
        Load [Aborting because the table has been duplicated];
    End If
End Sub;
B) Running this every time I manipulate the table, to check that nothing has been duplicated.
Let vStartRows = NoOfRows('MyTable'); //At first load
Call CheckDiff('MyTable'); //Every time I want to check consistency
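Putting A) and B) together, a usage sketch might look like this (the Left Join here is just a stand-in for any manipulation that could accidentally duplicate rows):

```qlik
MyTable:
Load * Inline [
Id, Amount
1, 10
2, 20
];

Let vStartRows = NoOfRows('MyTable');   // A) record the baseline row count

// ... a manipulation that should preserve the row count ...
Left Join (MyTable)
Load * Inline [
Id, Extra
1, x
2, y
];

Call CheckDiff('MyTable');              // B) verify nothing was duplicated
```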
I agree that reducing the risk of duplicates is a good thing, but verification still gives me comfort. I wear a safety belt although my main priority is not to crash in the first place. And I have experienced duplicates even when using mapping.
To Qlik I'll just mention that it's a mystery to me why there is no supported way of deliberately failing a load. Having to do this by making a deliberate mistake hurts my feelings. It also makes it harder to include an explanation in the error itself when it is triggered.
For this you will need two extra loads, like:

tx:
Load Sum(Field) as ReferenceX Resident X;
...
ty:
Load Sum(Field) as ReferenceY Resident Y;

If Sign(FieldValue('ReferenceX', 1) - FieldValue('ReferenceY', 1)) Then
    error ...
End If
But with just this you won't know what the error is or what caused it. You may get more information if you store this information before you break the script execution.
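One hedged way to do that, building on the tx/ty loads above: write the two reference values to a file before raising the error, so the failed reload at least leaves a trail. The log path is illustrative and must match a data connection in your environment:

```qlik
_log:
Load
    Now()                       as CheckTime,
    FieldValue('ReferenceX', 1) as ReferenceX,
    FieldValue('ReferenceY', 1) as ReferenceY
AutoGenerate 1;

Store _log Into [lib://Logs/checksum_failure.csv] (txt);
Drop Table _log;
```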
Personally, I would tend to avoid the mentioned risk by replacing the join-measures with mappings.
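As a sketch of that mapping alternative (field names borrowed from the opening example, the fact source is illustrative): unlike a join, ApplyMap() uses only the first matching key, so an accidental duplicate in the lookup source cannot multiply the fact rows.

```qlik
// A mapping table takes exactly two fields: the key, then the value.
Map_Value:
Mapping Load CatREF, ValueREF Resident Reference;

Facts:
Load
    Cat,
    // Third argument is the fallback when the key is not found.
    ApplyMap('Map_Value', Cat, Null()) as Value
Resident SomeFactSource;   // illustrative stand-in for the real fact table
```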
- Marcus