Hi all,
I have a quick question: I'd like to store the duplicated rows in a separate table so I can deal with them afterwards.
Imagine I have this TABLE_EXAMPLE (my real table is much bigger, with over 200 fields):
field1 | field2 | field3 | field4 |
1 | 1 | 1 | 1 |
1 | 1 | 1 | 1 |
2 | 2 | 2 | 2 |
3 | 3 | 3 | 3 |
Then in the script I run
LOAD DISTINCT * Resident TABLE_EXAMPLE;
and get:
field1 | field2 | field3 | field4 |
1 | 1 | 1 | 1 |
2 | 2 | 2 | 2 |
3 | 3 | 3 | 3 |
Is there a quick way to get the "erased" row into another table, ending up with:
TABLE_DUPLICATED_ROW:
field1 | field2 | field3 | field4 |
1 | 1 | 1 | 1 |
Something like a
LOAD * "duplicated" Resident TABLE_EXAMPLE;?
I want to avoid having to "build" a key across 200 fields.
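To be explicit, here is a sketch of the manual key I want to avoid writing, using the 4-field example above (TABLE_KEYED and RowKey are just illustrative names; with 200+ fields this concatenation becomes unmanageable):
TABLE_KEYED:
LOAD *,
// hand-written concatenation of every field - painful at 200+ fields
Hash128(field1 & '|' & field2 & '|' & field3 & '|' & field4) as RowKey
Resident TABLE_EXAMPLE;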
Kind regards.
I think you need to add a field using Count(duplicatedfield) as Count, grouped by some ID.
Then you can do a resident load of the table with the condition Where Count > 1.
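Something like this sketch on the example table (field names written out by hand; Count and TABLE_DUPLICATED_ROW are just illustrative names):
TABLE_EXAMPLE:
LOAD * INLINE [
field1, field2, field3, field4
1, 1, 1, 1
1, 1, 1, 1
2, 2, 2, 2
3, 3, 3, 3
];
// attach an occurrence count to every field combination
Left Join (TABLE_EXAMPLE)
LOAD field1, field2, field3, field4,
Count(field1) as Count
Resident TABLE_EXAMPLE
Group By field1, field2, field3, field4;
// combinations occurring more than once are the duplicated rows
TABLE_DUPLICATED_ROW:
LOAD DISTINCT field1, field2, field3, field4
Resident TABLE_EXAMPLE
Where Count > 1;
DROP Field Count;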
Without a key it might not be possible, but you can dynamically create the key field like this:
TABLE_EXAMPLE:
LOAD * INLINE [
field1, field2, field3, field4
1, 1, 1, 1
1, 1, 1, 1
2, 2, 2, 2
3, 3, 3, 3
];
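// Transpose the table so the field names end up as values of the Field column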
TempTable:
CrossTable (Field, Value)
LOAD 1 as RowNum,
*
Resident TABLE_EXAMPLE;
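// Build two field-name lists: '&'-joined for expressions, comma-separated for LOAD and Group By clauses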
TempTable2:
LOAD Concat(DISTINCT '[' & Field & ']', '&') as ConcatField1,
Concat(DISTINCT '[' & Field & ']', ',') as ConcatField2
Resident TempTable;
LET vConcatField1 = Peek('ConcatField1');
TRACE $(vConcatField1);
LET vConcatField2 = Peek('ConcatField2');
TRACE $(vConcatField2);
DROP Tables TempTable, TempTable2;
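// Attach an occurrence count to every field combination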
Left Join (TABLE_EXAMPLE)
LOAD $(vConcatField2),
Count($(vConcatField)) as Count
Resident TABLE_EXAMPLE
Group By $(vConcatField2);
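// Keep one copy of each distinct row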
Table:
LOAD DISTINCT $(vConcatField2)
Resident TABLE_EXAMPLE;
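// Combinations with Count > 1 are the duplicated rows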
Duplicate:
LOAD DISTINCT $(vConcatField2),
'Duplicate' as DuplicateFlag
Resident TABLE_EXAMPLE
Where Count > 1;
DROP Table TABLE_EXAMPLE;
Yeah, that's the approach.
But I wanted to know if there was any "out of the box" solution, like a "reverse DISTINCT" 🙂
Seems there isn't.
Thanks!
Thanks a lot.
Just a slight amendment, in case someone copies & pastes your code:
there's a 1 missing in line 3.
Left Join (TABLE_EXAMPLE)
LOAD $(vConcatField2),
Count($(vConcatField1)) as Count
Resident TABLE_EXAMPLE
Group By $(vConcatField2);
The code works fantastically and it's automated, but with 200+ fields per row and 10M+ rows, it makes the server go nuts :)
Anyhow, much appreciated!
Anyone with a less process-hungry idea? :'(
Yup, my bad for the typo...
I guess the number of fields might not be causing the issue, but the Group By is... we might be able to optimize this using Where Exists()... let me play around with it.
Give this a shot
TABLE_EXAMPLE:
LOAD * INLINE [
field1, field2, field3, field4
1, 1, 1, 1
1, 1, 1, 1
2, 2, 2, 2
3, 3, 3, 3
];
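// Transpose only the first record - just to collect the field names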
TempTable:
CrossTable (Field, Value)
LOAD 1 as RowNum,
*
Resident TABLE_EXAMPLE
Where RecNo() = 1;
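// Build two field-name lists: '&'-joined for expressions, comma-separated for LOAD and Order By clauses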
TempTable2:
LOAD Concat(DISTINCT '[' & Field & ']', '&') as ConcatField1,
Concat(DISTINCT '[' & Field & ']', ',') as ConcatField2
Resident TempTable;
LET vConcatField1 = Peek('ConcatField1');
TRACE $(vConcatField1);
LET vConcatField2 = Peek('ConcatField2');
TRACE $(vConcatField2);
DROP Tables TempTable, TempTable2;
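// Inner load: hash each row's full content into Key; the Order By makes identical rows adjacent.
// Outer load: number each occurrence with Previous()/Peek(), so the nth copy of a row gets Key2 = Hash128(content & n)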
TABLE_EXAMPLE2:
LOAD *,
If(Key = Previous(Key), RangeSum(Peek('KeyTemp'), 1), 1) as KeyTemp,
Hash128($(vConcatField1)&AutoNumber(If(Key = Previous(Key), RangeSum(Peek('KeyTemp'), 1), 1), $(vConcatField1))) as Key2;
LOAD *,
Hash128($(vConcatField1)) as Key
Resident TABLE_EXAMPLE
Order By $(vConcatField2);
DROP Table TABLE_EXAMPLE;
RENAME Table TABLE_EXAMPLE2 to TABLE_EXAMPLE;
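// Key3 = Hash128(content & 1), i.e. what Key2 looks like for the first occurrence of a row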
Table:
LOAD DISTINCT $(vConcatField2),
Hash128($(vConcatField1)&1) as Key3
Resident TABLE_EXAMPLE;
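// Rows whose Key2 is not a first-occurrence key are the duplicates - no Group By needed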
Duplicate:
LOAD DISTINCT $(vConcatField2),
'Duplicate' as DuplicateFlag
Resident TABLE_EXAMPLE
Where not Exists(Key3, Key2);
DROP Table TABLE_EXAMPLE;
DROP Field Key3;
Luis, did Sunny's last post and example get you a good working solution? If so, please be sure to come back and use the Accept as Solution button on that post to let others know that it worked and to give Sunny credit for the help. If you have further questions, leave an update.
Regards,
Brett
Sorry Brett, my bad.
Solution accepted!