Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Deduplicating by few fields and not 1 in the load script in data load editor

Hi Experts,

I have ~10 fields in my table and I have records that I want to dedup by 4 fields i.e Owner, Subject, Date and Account only.  Below is the syntax that I started to put together that gives me error.  Can someone help with the code?

Thanks,

Monica

Unique_Activity:

NoConcatenate

LOAD *

RESIDENT Activity

Where Exists(Owner, Subject, [Date], [Account]);

Drop Table Activity;

11 Replies
sunny_talwar

Would you be able to share a sample of what you are wanting to do?

rupamjyotidas
Specialist
Specialist

Where Exists(Owner, Subject, [Date], [Account]); wont work,

Exists function compare filed already loaded with value in the field in the table currently loading

For Ex: Where Exists(Owner)  will works

Owner should be present in both tables

Not applicable
Author

Here is the sample of what I want to do -

I have 10 fields in my table- Owner, Subject, Date, Account, Field 5, Field 6 and so on to Field 10.  I have several records in my table that are duplicate by Owner, Subject, Date and Account and not duplicate by all the 10 fields.  I need only 1 distinct record by Owner, Subject, Date and Account instead of all getting all records. Since other field values are different it does not matter to me that it brings back only first record because I am doing count of the records in visualization

Here is my script that errors and does not work in data load editor. 

Thanks in advance for your help

Monica

Unique_Activity:

NoConcatenate

LOAD *

RESIDENT Activity

Where Exists(Owner, Subject, [Date], [Account]);

Drop Table Activity;

Not applicable
Author

Rewriting the request to be more clear:

Here is the sample of what I want to do -

I have 10 fields in my table- Owner, Subject, Date, Account, Field 5, Field 6 and so on to Field 10.  I have several records in my table that are duplicate by Owner, Subject, Date and Account and does not have same values for other 6 fields.  I need only 1 distinct record by Owner, Subject, Date and Account instead of all getting all records. Since other field values are different it does not matter to me that it brings back only first record because I am doing count of the records in visualization

Here is my script that errors and does not work in data load editor.

Thanks in advance for your help

Monica

Unique_Activity:

NoConcatenate

LOAD *

RESIDENT Activity

Where Exists(Owner, Subject, [Date], [Account]);

Drop Table Activity;

johnca
Specialist
Specialist

If you aren't concerned for the data in fields other than the 4 you mention, why not just bring those distinct fields in? Assuming the 4 fields are named the same thing in a previously loaded table not Activity.

Unique_Activity:

NoConcatenate

LOAD

     Owner,

     Subject,

     Date,

     Account

RESIDENT Activity

Where Exists(Owner)

And Exists(Subject)

And Exists(Date)

And Exists(Account);

Drop Table Activity;


If the field names are similar but different then the script would appear them like this;


Unique_Activity:

NoConcatenate

LOAD

     Owner,

     Subject,

     Date,

     Account

RESIDENT Activity

Where Exists(OwnerPreviousTable,Owner)

And Exists(SubjectPreviousTable,Subject)

And Exists(DatePreviousTable,Date)

And Exists(AccountPreviousTable,Account);

Drop Table Activity;

antoniotiman
Master III
Master III

Hi Monica,

try

Temp_Activity:
NoConcatenate
LOAD *,AutoNumber(RowNo(),Owner&Subject&Date&Account) as A
Resident Activity;
Drop Table Activity;

Unique_Activity:
NoConcatenate
LOAD *
Resident Temp_Activity
Where A=1;
Drop Field A;
Drop Table
Temp_Activity;

Regards,

Antonio

maxgro
MVP
MVP

add the other f6...f10 fields


U:

load * inline [

Owner, Subject, Date, Account, f5

1,1,1,1,a1

1,1,1,1,a2

4,4,4,4,c1

5,5,5,5,d

9,9,9,9,e1

8,8,8,8,f1

9,9,9,9,e2

8,8,8,8,f2

4,4,4,4,c2

];

UFinal:

NoConcatenate load

  *

Resident U

Where

  Owner & '|' & Subject & '|' & Date & '|' & Account <> peek('Owner')  & '|' & peek('Subject') & '|' & peek('Date') & '|' & peek('Account')

order by Owner, Subject, Date, Account

DROP Table U;

1.png

nagaiank
Specialist III
Specialist III

You may use the following script also.

U:

load * inline [

Owner, Subject, Date, Account, f5

1,1,1,1,a1

1,1,1,1,a2

4,4,4,4,c1

5,5,5,5,d

9,9,9,9,e1

8,8,8,8,f1

9,9,9,9,e2

8,8,8,8,f2

4,4,4,4,c2

];

Result:

NoConcatenate

load Owner, Subject, Date, Account,

AutoNumberHash256(Owner,Subject,Date,Account) as Key, f5 Resident U

where not Exists (Key,AutoNumberHash256(Owner,Subject,Date,Account));

DROP Field Key From Result;

DROP Table U;

Not applicable
Author

Thank you Antonio it worked