
deep2021
Creator III

Issue with string comparison while loading a large data volume.

Hi All,

I need to compare strings during the load. My calculation is as follows:

 

Bridge_Temp:
Load
    EU_Key,
    EU_Key_A_location,
    EU_Key_A_WO_Location
Resident U;

// Bring the entitlement fields from E onto the bridge via the shared EU_Key
Join (Bridge_Temp)
Load
    EU_Key,
    [E.EU_Key_E_location],
    [E.EU_Key_E_WO_Location],
    [E.entitlementGroup],
    [E.roleName]
Resident E;

// Keep group/role where the locations match exactly, or where the keys
// without location match and the 5th '_' segment of the E location is empty
Bridge:
Load
    EU_Key,
    If(EU_Key_A_location = [E.EU_Key_E_location], [E.entitlementGroup],
        If(EU_Key_A_WO_Location = [E.EU_Key_E_WO_Location] and Match(SubField([E.EU_Key_E_location], '_', 5), ''),
            [E.entitlementGroup])) as [E.entitlementGroup_New],

    If(EU_Key_A_location = [E.EU_Key_E_location], [E.roleName],
        If(EU_Key_A_WO_Location = [E.EU_Key_E_WO_Location] and Match(SubField([E.EU_Key_E_location], '_', 5), ''),
            [E.roleName])) as [E.roleName_New]

Resident Bridge_Temp;

Drop Table Bridge_Temp;

Drop Fields [E.EU_Key_E_location], [E.EU_Key_E_WO_Location], EU_Key_A_location, EU_Key_A_WO_Location;
 
 
 
For small data sets this works perfectly fine, but when loading a large data set it throws an error.
Can you please suggest the best way to perform these calculations on big data sets?
 
Thanks

Accepted Solutions
marcus_sommer

More interesting than the task message would be the document log, to see when and which error happens. If a script runs successfully with a smaller data set but breaks with a larger one, it could mean that the load simply takes too long and a timeout occurs, that a resource threshold is hit and the execution is terminated, or that the bigger data set contains invalid data which couldn't be handled.

Beside this, there might be potential for some performance optimization, for example by replacing the Bridge_Temp join with one or several mapping tables. Mappings can be nested horizontally and vertically - from the table point of view as well as from the ApplyMap() call - and they can be used directly, without the need for the subsequent loads that joins require. By loading the data in the right order within the mapping table, you may even avoid the nested if() checks.
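As a rough sketch of that idea, reusing the field names from the original script (the Where clause and the fallback order are my assumptions about the intended matching logic, so verify against your data before relying on it):

// Exact-location lookup: location -> entitlementGroup
Map_Entitlement_Location:
Mapping Load
    [E.EU_Key_E_location],
    [E.entitlementGroup]
Resident E;

// Fallback lookup on the key without location, restricted to rows where
// the 5th '_' segment of the location is empty (as in the original if())
Map_Entitlement_WO_Location:
Mapping Load
    [E.EU_Key_E_WO_Location],
    [E.entitlementGroup]
Resident E
Where Match(SubField([E.EU_Key_E_location], '_', 5), '');

// Nested ApplyMap(): try the exact match first, then the fallback,
// otherwise Null() - no join and no temporary bridge table needed
Bridge:
Load
    EU_Key,
    ApplyMap('Map_Entitlement_Location', EU_Key_A_location,
        ApplyMap('Map_Entitlement_WO_Location', EU_Key_A_WO_Location, Null())) as [E.entitlementGroup_New]
Resident U;

The same pattern with two more mapping tables would cover [E.roleName_New].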

Further, your script indicates that a bridge table is being created, which is quite often not the most suitable way to build a data model. The official recommendation is to develop the data model as a star schema ...


3 Replies
rwunderlich
Partner Ambassador/MVP

What error message does it display?

-Rob

deep2021
Creator III
Author

Hi Rob,

It shows the following message:

[screenshot of the error message: deep2021_0-1676935853338.png]

 

This happens every time I execute the above code against my large data set.

Thanks
