Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Removing Redundant Data from a table

Dear All,

I browsed through the forum but didn´t exactly find an answer to my problem:

I have a the following data from a production process in a table which looks like this (total file contains more than 400k entries):

      

EquipmentSystemSerialNoOrder NoArticel CodeStartEndResult
MUMO303053SN16100082415485040X53207A 601Y282 219.12.2016 13:2319.12.2016 13:31PASS
MUMO303052SN16120055105504994X53207A 321Y 31 419.12.2016 13:2219.12.2016 13:23PASS
FLY13SN16110387885498245X53207A 550Y 39 119.12.2016 13:2219.12.2016 13:34FAIL
MUMO303054SN16110454265502653X53207A 356Y 21 419.12.2016 13:2119.12.2016 13:23PASS
MUMO303058SN16120124105505473X53207A 601Y370 519.12.2016 13:2119.12.2016 13:24PASS

It basically shows the Result of a product test process, i.e. on which equipment a product (Articel Code) with a unique serial number for a unique Order No has been tested and when the test process started and ended.

Unfortunately there are duplicates in the file availabe, e.g.:

      

EquipmentSystemSerialNoOrder NoArticel CodeStartEndResult
MUMO303056SN16110529125502273X53207A5558Y421 413.12.2016 02:1113.12.2016 02:25PASS
MUMO303056SN16110529125502273X53207A5558Y421 413.12.2016 02:1113.12.2016 02:25PASS
MUMO303056SN16110529125502273X53207A5558Y421 413.12.2016 01:5413.12.2016 02:02FAIL
MUMO303056SN16110529125502273X53207A5558Y421 413.12.2016 01:5413.12.2016 02:02FAIL
MUMO303056SN16110529125502273X53207A5558Y421 413.12.2016 01:4713.12.2016 01:49FAIL


I tried working with "load distinct" in the data editor script but that of course doesn´t work here, since then to many data gets discarded.

In the end I would like to get the following output:

      

EquipmentSystemSerialNoOrder NoArticel CodeStartEndResult
MUMO303056BF16110529125502273C53207A5558B421 413.12.2016 02:1113.12.2016 02:25PASS
MUMO303056BF16110529125502273C53207A5558B421 413.12.2016 01:5413.12.2016 02:02FAIL
MUMO303056BF16110529125502273C53207A5558B421 413.12.2016 01:4713.12.2016 01:49FAIL

I.e. all duplicates which have the exact same time stamp for a given SerialNo are to be discarded from the list.

Thanks for your support,

S

1 Solution

Accepted Solutions
maxgro
MVP
MVP

Using your second table as source,

tmp:

LOAD Equipment,

     System,

     SerialNo,

     [Order No],

     [Articel Code],

     Start,

     End,

     Result

FROM

[https://community.qlik.com/thread/244538] (html, codepage is 1252, embedded labels, table is @2);

final:

load

  SerialNo & '|' & Start & '|' & End as Key,

  *

Resident tmp

where not Exists(Key, SerialNo & '|' & Start & '|' & End)

order by SerialNo , Start , End;

DROP Table tmp;

View solution in original post

3 Replies
rittermd
Master
Master

Not sure if this would work.

But can you combine the Equipment # and the Start Date/Time into a key and then do a distinct on that key?

maxgro
MVP
MVP

Using your second table as source,

tmp:

LOAD Equipment,

     System,

     SerialNo,

     [Order No],

     [Articel Code],

     Start,

     End,

     Result

FROM

[https://community.qlik.com/thread/244538] (html, codepage is 1252, embedded labels, table is @2);

final:

load

  SerialNo & '|' & Start & '|' & End as Key,

  *

Resident tmp

where not Exists(Key, SerialNo & '|' & Start & '|' & End)

order by SerialNo , Start , End;

DROP Table tmp;

Not applicable
Author

maxgro

Great, that worked! Thanks a lot!