Dear All,
I browsed through the forum but didn't find an exact answer to my problem:
I have the following data from a production process in a table which looks like this (the total file contains more than 400k entries):
Equipment | System | SerialNo | Order No | Articel Code | Start | End | Result |
---|---|---|---|---|---|---|---|
MUMO3030 | 53 | SN1610008241 | 5485040 | X53207A 601Y282 2 | 19.12.2016 13:23 | 19.12.2016 13:31 | PASS |
MUMO3030 | 52 | SN1612005510 | 5504994 | X53207A 321Y 31 4 | 19.12.2016 13:22 | 19.12.2016 13:23 | PASS |
FLY | 13 | SN1611038788 | 5498245 | X53207A 550Y 39 1 | 19.12.2016 13:22 | 19.12.2016 13:34 | FAIL |
MUMO3030 | 54 | SN1611045426 | 5502653 | X53207A 356Y 21 4 | 19.12.2016 13:21 | 19.12.2016 13:23 | PASS |
MUMO3030 | 58 | SN1612012410 | 5505473 | X53207A 601Y370 5 | 19.12.2016 13:21 | 19.12.2016 13:24 | PASS |
It basically shows the result of a product test process, i.e. on which equipment a product (Articel Code) with a unique serial number for a given Order No was tested, and when the test process started and ended.
Unfortunately, there are duplicates in the file, e.g.:
Equipment | System | SerialNo | Order No | Articel Code | Start | End | Result |
---|---|---|---|---|---|---|---|
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 02:11 | 13.12.2016 02:25 | PASS |
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 02:11 | 13.12.2016 02:25 | PASS |
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 01:54 | 13.12.2016 02:02 | FAIL |
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 01:54 | 13.12.2016 02:02 | FAIL |
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 01:47 | 13.12.2016 01:49 | FAIL |
I tried working with "LOAD DISTINCT" in the data load editor script, but that of course doesn't work here, since too much data gets discarded.
In the end I would like to get the following output:
Equipment | System | SerialNo | Order No | Articel Code | Start | End | Result |
---|---|---|---|---|---|---|---|
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 02:11 | 13.12.2016 02:25 | PASS |
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 01:54 | 13.12.2016 02:02 | FAIL |
MUMO3030 | 56 | SN1611052912 | 5502273 | X53207A5558Y421 4 | 13.12.2016 01:47 | 13.12.2016 01:49 | FAIL |
I.e. all duplicates that have the exact same timestamps for a given SerialNo should be discarded from the list.
Thanks for your support,
S
Using your second table as the source:
tmp:
LOAD Equipment,
System,
SerialNo,
[Order No],
[Articel Code],
Start,
End,
Result
FROM
[https://community.qlik.com/thread/244538] (html, codepage is 1252, embedded labels, table is @2);
final:
LOAD
    // Composite key: SerialNo plus both timestamps identifies one test run.
    SerialNo & '|' & Start & '|' & End as Key,
    *
RESIDENT tmp
// Exists() is evaluated row by row: the first row with a given Key passes and the
// Key value is stored, so later rows with an identical key are skipped.
WHERE NOT Exists(Key, SerialNo & '|' & Start & '|' & End)
ORDER BY SerialNo, Start, End;
DROP TABLE tmp;
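With the sample duplicates above, the first two rows both produce the Key SN1611052912|13.12.2016 02:11|13.12.2016 02:25, so the second one is dropped by the WHERE NOT Exists() clause, while the rows with different Start/End values are kept.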
Not sure if this would work.
But can you combine the Equipment # and the Start Date/Time into a key and then do a distinct on that key?
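For reference, a minimal sketch of that composite-key idea in load-script form, assuming the duplicate rows are exact copies of each other; the table name tmp and the field names are taken from the script above, and the key field %TestKey is only an illustrative name:
// Hypothetical sketch, not the accepted answer: build a composite key and LOAD DISTINCT.
// Note that DISTINCT compares the whole loaded record, so it only removes rows that are
// identical in every field; retests with different Start/End values are kept.
dedup:
LOAD DISTINCT
    Equipment & '|' & Start as %TestKey,
    Equipment,
    System,
    SerialNo,
    [Order No],
    [Articel Code],
    Start,
    End,
    Result
RESIDENT tmp;
DROP TABLE tmp;
If two rows could share the same key but differ in another field, the Exists() approach above is the safer choice.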
Great, the Exists() approach worked! Thanks a lot!