Alternative way to avoid duplicate records in qlik... - Qlik Community

deep2021 · ‎2023-04-26

Hi,

Can you please suggest the alternative way for where not exist clause using concatenation to avoid duplicate records,

I am doing the calculations like below,

Document:

Load

distinct

autonumber(CP_Key,'CP_Key') as CP_Key

,autonumber(CP_Key,'CP_Key_Temp') as CP_Key_Temp

,docType

,docPathPattern

,protocol

,location

,cutoff

,version

FROM [lib://MSS_MCA_XDS_Base_Folder/QVD/Extractor/2.1/Raw/teble1.qvd]

(qvd);

Concatenate(Document)

$(tmp_statement)

load

distinct

autonumber(CP_Key,'CP_Key') as CP_Key

,docType

,docPathPattern

,protocol

,location

,cutoff

,version

FROM [lib://MSS_MCA_XDS_Base_Folder/QVD/Extractor/2.1/Raw/teble2.qvd]

(qvd)

where not exists(CP_Key_Temp,autonumber(CP_Key,'CP_Key'))

Can you please suggest on this.

Thanks

Digvijay_Singh · ‎2023-04-26

Is there any problem in the way you are trying? I normally use this way only to avoid duplicates. Sample data would help to understand the problem better.

marcus_sommer · ‎2023-04-27

IMO the use of autonumber() is rather not expedient - means no direct benefit but adding more complexity. Further it depends on the kind of the merging-requirements if a field-duplication is needed to define the wanted exists-statement.

Be aware that exists() isn't related to any table else checks against all already loaded field-values which means the order of load-statements could impact the results and it might be sensible to change them and/or dividing the task into multiple applications - especially because by using exists() with two parameters the load couldn't be performed optimize anymore.

deep2021 · ‎2023-04-27

Hi,

It is showing me out of memory issue when I m trying for the larger datasets.

My reload is failing. My error is as per the below,

20230426T170745.516+0100 9 fields found: CP_Key, CP_Key_Temp, docType, docPathPattern, protocol, location, cutoff, version, ccyIndexInstrument,
20230426T175509.029+0100 -129 =
20230426T175534.758+0100 Execution Failed
20230426T175534.764+0100 Execution finished.

Sometimes the task is failing without giving any error msg.

Can you please suggest on this.

Thanks

marcus_sommer · ‎2023-04-27

There are various reasons possible. The simplest is it are really to much data for your current environment. But more likely are any mistakes within the script, for example further loads before and after your shown snippet, for example the $(tmp_statement) before the second load.

Helpful in such cases is to look within the document-log (it should be in general enabled and set to the instant writing mode to get log-information in the case the load breaks) and/or commenting some load-parts and looking what happens and which results the tables get and/or reducing the number of records with further where-clauses and/or first statements.

deep2021 · ‎2023-04-27

I have removed $(tmp_statement). My concern is how can I replace where not exist clause to remove duplicates.

Is there any alternative way.

I have huge data sets.

I am unable to preform the below calculation with where condition.

where not exists(CP_Key_Temp,autonumber(CP_Key,'CP_Key'))

Thanks

Alternative way to avoid duplicate records in qlik script

Developers

filter

General Question

Script