- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Removing Redundant Data from a table
Dear All,
I browsed through the forum but didn´t exactly find an answer to my problem:
I have a the following data from a production process in a table which looks like this (total file contains more than 400k entries):
Equipment | System | SerialNo | Order No | Articel Code | Start | End | Result |
---|---|---|---|---|---|---|---|
MUMO3030 | 53 | SN1610008241 | 5485040 | X53207A 601Y282 2 | 19.12.2016 13:23 | 19.12.2016 13:31 | PASS |
MUMO3030 | 52 | SN1612005510 | 5504994 | X53207A 321Y 31 4 | 19.12.2016 13:22 | 19.12.2016 13:23 | PASS |
FLY | 13 | SN1611038788 | 5498245 | X53207A 550Y 39 1 | 19.12.2016 13:22 | 19.12.2016 13:34 | FAIL |
MUMO3030 | 54 | SN1611045426 | 5502653 | X53207A 356Y 21 4 | 19.12.2016 13:21 | 19.12.2016 13:23 | PASS |
MUMO3030 | 58 | SN1612012410 | 5505473 | X53207A 601Y370 5 | 19.12.2016 13:21 | 19.12.2016 13:24 | PASS |
It basically shows the Result of a product test process, i.e. on which equipment a product (Articel Code) with a unique serial number for a unique Order No has been tested and when the test process started and ended.
Unfortunately there are duplicates in the file availabe, e.g.:
Equipment | System | SerialNo | Order No | Articel Code | Start | End | Result |
---|---|---|---|---|---|---|---|
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 02:11 | 13.12.2016 02:25 | PASS |
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 02:11 | 13.12.2016 02:25 | PASS |
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 01:54 | 13.12.2016 02:02 | FAIL |
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 01:54 | 13.12.2016 02:02 | FAIL |
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 01:47 | 13.12.2016 01:49 | FAIL |
I tried working with "load distinct" in the data editor script but that of course doesn´t work here, since then to many data gets discarded.
In the end I would like to get the following output:
Equipment | System | SerialNo | Order No | Articel Code | Start | End | Result |
---|---|---|---|---|---|---|---|
MUMO3030 | 56 | BF1611052912 | 5502273 | C53207A5558B421 4 | 13.12.2016 02:11 | 13.12.2016 02:25 | PASS |
MUMO3030 | 56 | BF1611052912 | 5502273 | C53207A5558B421 4 | 13.12.2016 01:54 | 13.12.2016 02:02 | FAIL |
MUMO3030 | 56 | BF1611052912 | 5502273 | C53207A5558B421 4 | 13.12.2016 01:47 | 13.12.2016 01:49 | FAIL |
I.e. all duplicates which have the exact same time stamp for a given SerialNo are to be discarded from the list.
Thanks for your support,
S
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Using your second table as source,
tmp:
LOAD Equipment,
System,
SerialNo,
[Order No],
[Articel Code],
Start,
End,
Result
FROM
[https://community.qlik.com/thread/244538] (html, codepage is 1252, embedded labels, table is @2);
final:
load
SerialNo & '|' & Start & '|' & End as Key,
*
Resident tmp
where not Exists(Key, SerialNo & '|' & Start & '|' & End)
order by SerialNo , Start , End;
DROP Table tmp;
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Not sure if this would work.
But can you combine the Equipment # and the Start Date/Time into a key and then do a distinct on that key?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Using your second table as source,
tmp:
LOAD Equipment,
System,
SerialNo,
[Order No],
[Articel Code],
Start,
End,
Result
FROM
[https://community.qlik.com/thread/244538] (html, codepage is 1252, embedded labels, table is @2);
final:
load
SerialNo & '|' & Start & '|' & End as Key,
*
Resident tmp
where not Exists(Key, SerialNo & '|' & Start & '|' & End)
order by SerialNo , Start , End;
DROP Table tmp;
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Great, that worked! Thanks a lot!