Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Bucharest on Sept 18th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

avoiding duplicate rows

Hello everyone,

I need help in avoiding some duplicate rows in the data,while loading them.

I have to make sure that no two row should have same data for all the columns against one field.

Let me explain,

suppose there are 10 fields and lets say number is one of those and we have some around 50000 rows.

Now I have to make sure that for number there should not be any row which has all 9 fields same.

i don't understand how to do it.

pleaaase help..\

Thanks..!!

1 Solution

Accepted Solutions
sunny_talwar

You would need to find someway to combine them together into one otherwise checking the repetitions might not be easy.

View solution in original post

8 Replies
sunny_talwar

Try something like this:

Table:

LOAD Number,

          Field2,

          Field3,

          Field4,

          Field5,

          Field6,

          Field7,

          Field8,

          Field9,

          Field10,

          AutoNumber(Field2&'|'&Field3&'|'&Field4&'|'&Field5&'|'&Field6&'|'&Field7&'|'&Field8&'|'&Field9&'|'&Field10) as Check

FROM Source


Right Join (Table)

LOAD Check,

          FirstValue(Number) as FirstNumber

Resident Table

Group By Table;

The above will save the first instance of number in any repeating data.



maxgro
MVP
MVP

and if there are >= 2 rows with the same n fields?

you want to keep one of them? exclude all?

example with 3 fields, adapt to 9

orange = result

keep the first row

X:

load *, f1 & '-' & f2 & '-' & f3 as fcheck inline [

number, f1, f2, f3

1, z, z, z

2, a, a, b

2, a, a, c

3, a, b, c

3, a, b, c

3, d, e, f

];

Y:

NoConcatenate load * Resident X

where number<>Peek('number') or number=Peek('number') and fcheck <> Peek('fcheck')

order by number, fcheck;

DROP Table X;





discard all duplicated rows

X:

load *, f1 & '-' & f2 & '-' & f3 as fcheck inline [

number, f1, f2, f3

1, z, z, z

2, a, a, b

2, a, a, c

3, a, b, c

3, a, b, c

3, d, e, f

];

Right Keep (X)

load * where cnt1 = 1;

load number, f3, count(f3) as cnt1

Resident X

group by number, f3;

rteco0131
Contributor II
Contributor II

Hello @anita sain

Please give an example.


Regards,

Not applicable
Author

exclude one of those.

Not applicable
Author

That's a really good way to keep a check on the fields but what if I have a large number of fields say 100,

??

Not applicable
Author

Hi, use the keyword "DISTINCT" and RowNo() function.

I.E.

TABLE1:

LOAD DISTINCT

    FIELD_01,

    FIELD_02,

    FEILD_03,

    ...

    FIELD_09

    ;

with this keyword all rows with the same values on all fields are reduced to one row and the table are load with no duplicate rows.

TABLE2:

LOAD

     RowNo()  AS NUMBER,

     *

RESIDENT TABLE1;

DROP TABLE1;

with RowNo() you can enumerate the rows in a table.

sunny_talwar

You would need to find someway to combine them together into one otherwise checking the repetitions might not be easy.

Not applicable
Author

agree with sunny, to automate the field concatenation you can use

let vFieldCount = NoOfFields('TblName');

FOR i=1 to $(vFieldCount)

  IF i=1 then

  LET vFields = FieldName(i,'TblName');

  ELSE

  LET  vFields = '$(vFields)' &' | '& FieldName(i,'TblName');

  ENDIF

NEXT;