Qlik Community

QlikView Scripting

Discussion Board for collaboration on QlikView Scripting.

Not applicable

De-duplicating Simple Data Table in Load Script

Hello QV,

I have a very simple (but large) Staff table that gets updated more frequently than I would like it to be.  Because of this, I am trying to put in into QV to help automate some reports.

The issue: for most of my reports, I only care about my Employee ID Number (EIN) and the Manager that is associated with them for this duplicate identificaiton.  My list though includes the various Name variations we have on file for any given employee.  I want to recreate this list, but remove any duplicates.  Here is some basic sample.

   

EINNameLoginManager
1234A SmithAS34Manager1
1234Al SmithAS34Manager1
5678J CoopJC78Manager2
9123L ParkLP23Manager3

I thought I could create a "mapping" script to de-dup, but apparently I am still not good enough at this as none of the versions I have tried actually remove the duplicates.

Here is what I thought would work but does not as I still have duplicates:

Staff_Mapping:
Mapping
LOAD Distinct
EIN
,
Manager
FROM

(
ooxml, embedded labels, table is UpdatedUserList);

Staff_List:
LOAD
Login,
UserName,
Title,
Manager,,
EIN,

EIN as [Nurse Decode #],
ApplyMap('Staff_Mapping', EIN) as UniqueFlag
FROM

(
ooxml, embedded labels, table is UpdatedUserList);

Thank you in advance for your help in getting these duplicates removed!

1 Solution

Accepted Solutions

Re: De-duplicating Simple Data Table in Load Script

My bad, use RecNo() instead of RowNo()

Staff_List:

LOAD *

Where Flag = 1;
LOAD Login,
    UserName,
    Title,
    Manager,,
    EIN,

    AutoNumber(RecNo(), EIN) as Flag,
    EIN as [Nurse Decode #]
FROM

(ooxml, embedded labels, table is UpdatedUserList);

Sample attached

8 Replies

Re: De-duplicating Simple Data Table in Load Script

Out of the 2 rows of data that is duplicating one will be selected?

Not applicable

Re: De-duplicating Simple Data Table in Load Script

Yes, I only need one of the rows (don't really care which one), I just need a list with one unique EIN and the respective columns that go with it (again, don't care which one).  While this sample is simple, I have a list of hundreds of employees and there are a quite a few employees that have 5 or even 6 iterations of their names.

Digvijay_Singh
Honored Contributor III

Re: De-duplicating Simple Data Table in Load Script

Try creating one composite field EIN&Manager as EINManager and use

Where not exists(EINManager) while loading.


You may need to do it in preceding/resident load in case it doesn't work while loading from database.

Re: De-duplicating Simple Data Table in Load Script

May be try this:

Staff_List:

LOAD *

Where Flag = 1;
LOAD Login,
    UserName,
    Title,
    Manager,,
    EIN,

    AutoNumber(RowNo(), EIN) as Flag,
    EIN as [Nurse Decode #]
FROM

(ooxml, embedded labels, table is UpdatedUserList);

Not applicable

Re: De-duplicating Simple Data Table in Load Script

Thank you, but this does not work, I still get duplicates.

Re: De-duplicating Simple Data Table in Load Script

My bad, use RecNo() instead of RowNo()

Staff_List:

LOAD *

Where Flag = 1;
LOAD Login,
    UserName,
    Title,
    Manager,,
    EIN,

    AutoNumber(RecNo(), EIN) as Flag,
    EIN as [Nurse Decode #]
FROM

(ooxml, embedded labels, table is UpdatedUserList);

Sample attached

Not applicable

Re: De-duplicating Simple Data Table in Load Script

Thank you, this did it.  Now I will go research what you just did and learn something new.

Re: De-duplicating Simple Data Table in Load Script

AutoNumber(RecNo, EIN) as Flag assigned 1, 2, 3 to each EIN. So for example, your data looks like this

EIN, Flag

123, 1

123, 2

123, 3

345, 1

345, 2

156, 1

156, 2

156, 3

156, 4

I then picked the first instance of each EIN in the preceding load using the Where Flag = 1 statement.

HTH

Best,

Sunny

Community Browser