Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

delete duplicate rows

Good afternoon,

I want to compare rows from my dataset to delete duplicate rows from particular columns.

I've already had an answer from a previous question (named : compare rows of dataset)

Below what I wanted to do and the answer I had :

{

For example, I have the table below :

A    B    C                              D

1    1B    23/03/2017 12:12    344

2    2B    23/03/2017 11:50    20

3    1B    23/03/2017 12:12    344

4    3B    23/03/2017 09:11    32

The expected result would be to add a new column E like this :

A    B    C                              D      E

1    1B    23/03/2017 12:12    344    1

2    2B    23/03/2017 11:50    20      1

3    1B    23/03/2017 12:12    344    0

4    3B    23/03/2017 09:11    32        1


The solution :

Table:

LOAD *,

  AutoNumber(B&D) as Key;

LOAD * INLINE [

    A, B, C, D

    1, 1B, 23/03/2017 12:12, 344

    2, 2B, 23/03/2017 11:50, 20

    3, 1B, 23/03/2017 18:00, 344

    4, 3B, 23/03/2017 09:11, 32

];

FinalTable:

LOAD *,

  If(Key = Previous(Key), 0, 1) as E

Resident Table

Order By Key, C desc;

DROP Table Table;

}


Or now, I have a new column E.

Here the table I have at the beginning :

A    B    C                              D               E

1    1B    23/03/2017 12:12    344     ACCEPTED

2    2B    23/03/2017 11:50    20     ACCEPTED

3    1B    23/03/2017 14:15   344     ACCEPTED

4    3B    23/03/2017 09:11    32      REFUSED

5     4B   23/03/2017 15:00   344     REFUSED

6     4B   23/03/2017 15:01  344     ERROR

7     4B   23/03/2017 15:02   344     ACCEPTED


With the previous solution I will have :

A    B    C                              D               E                    F

1    1B    23/03/2017 12:12    344     ACCEPTED     0

2    2B    23/03/2017 11:50    20     ACCEPTED     0

3    1B    23/03/2017 14:15   344     ACCEPTED     0

4    3B    23/03/2017 09:11    32      REFUSED     1

5     4B   23/03/2017 15:00   344     REFUSED     0

6     4B   23/03/2017 15:01   344     ERROR     0

7     4B   23/03/2017 15:02   344     ACCEPTED     1


Or I would like :

A    B    C                              D               E                    F

1    1B    23/03/2017 12:12    344     ACCEPTED     1

2    2B    23/03/2017 11:50    20     ACCEPTED     1

3    1B    23/03/2017 14:15   344     ACCEPTED     1

4    3B    23/03/2017 09:11    32      REFUSED     1

5     4B   23/03/2017 15:00   344     REFUSED     0

6     4B   23/03/2017 15:01   344     ERROR     0

7     4B   23/03/2017 15:02   344     ACCEPTED          1

Indeed, I want to put 0 only if there was a first row with ERROR or REFUSED in column E.


I don't know if my explanations are very clear.


Thank you for your help.

Paloma

1 Solution

Accepted Solutions
sth_wch004
Contributor III
Contributor III

Table:

LOAD *,

  AutoNumber(B&D) as Key;

LOAD * INLINE [

A,B,C,D,E

1,1B,23/03/201712:12,344,ACCEPTED

2,2B,23/03/201711:50,20,ACCEPTED

3,1B,23/03/201714:15,344,ACCEPTED

4,3B,23/03/201709:11,32,REFUSED

5,4B,23/03/201715:00,344,REFUSED

6,4B,23/03/201715:01,344,ERROR

7,4B,23/03/201715:02,344,ACCEPTED

];

FinalTable:

LOAD *,

  If(Key = Previous(Key) and ( = 'ERROR' or = 'REFUSED'), 0, 1) as F

Resident Table

Order By Key, C desc;

DROP Table Table;

1.JPG

View solution in original post

3 Replies
sth_wch004
Contributor III
Contributor III

Table:

LOAD *,

  AutoNumber(B&D) as Key;

LOAD * INLINE [

A,B,C,D,E

1,1B,23/03/201712:12,344,ACCEPTED

2,2B,23/03/201711:50,20,ACCEPTED

3,1B,23/03/201714:15,344,ACCEPTED

4,3B,23/03/201709:11,32,REFUSED

5,4B,23/03/201715:00,344,REFUSED

6,4B,23/03/201715:01,344,ERROR

7,4B,23/03/201715:02,344,ACCEPTED

];

FinalTable:

LOAD *,

  If(Key = Previous(Key) and ( = 'ERROR' or = 'REFUSED'), 0, 1) as F

Resident Table

Order By Key, C desc;

DROP Table Table;

1.JPG

ahaahaaha
Partner - Master
Partner - Master

Hi Paloma,

May be like this?

Load*,

If(E='ACCEPTED', 1, 0);

Load*Inline

[A,    B,    C,                              D,              E

1,    1B,    23/03/2017 12:12,    344,    ACCEPTED

2,    2B,    23/03/2017 11:50,    20,    ACCEPTED

3,    1B,    23/03/2017 14:15,  344,    ACCEPTED

4,    3B,    23/03/2017 09:11,    32,      REFUSED

5,    4B,  23/03/2017 15:00,  344,    REFUSED

6,    4B,  23/03/2017 15:01,  344,    ERROR

7,    4B,  23/03/2017 15:02,  344,    ACCEPTED

];

Confuses only that in your example this line should have a value of 0 if the condition is important for you - 'Indeed, I want to put 0 only if there was a first row with ERROR or REFUSED in column E.'.

4    3B    23/03/2017 09:11    32      REFUSED  0


Regards,

Andrey

Not applicable
Author

Thank you very much for your help ! This is what I want to do.