Re: Combining table without summing rows - Qlik Community

cawestelaken · ‎2018-11-21

Hello,

I've been trying to alter an ETL script so that it produces less rows in 1 table.

One of the potential changes would be to the following:

LOAD
%Incident_key
-- extra information --
FROM Datasource_A;

CONCATENATE LOAD
%Incident_key
1 AS _Flag_NewIncident
FROM Datasource_B;

Datasource_A has many more rows and keys than Datasource_B.

The end result of this script would be the sum of the rows in Datasource_A and Datasource_B.

EXAMPLE (the reasoning behind why you'd want to do this isn't logical in the example):

Datasource_A:

%Incident_Key	Type
1	A
2	A
3	B
4	A
5	A
6	B
7	C
8	B
9	A
10	C

Datasource_B:

%Incident_key

6

7

9

10

End result:

%Incident_Key	Type	_Flag_NewIncident
1	A	.
2	A	.
3	B	.
4	A	.
5	A	.
6	B	.
7	C	.
8	B	.
9	A	.
10	C	.
6	.	1
7	.	1
9	.	1
10	.	1

Is it possible to make the script so that the end result would be:

%Incident_Key	Type	_Flag_NewIncident
1	A	.
2	A	.
3	B	.
4	A	.
5	A	.
6	B	1
7	C	1
8	B	.
9	A	1
10	C	1

Yours sincerely,
Casper Westelaken

jfkinspari · ‎2018-11-21

AFAIKT you want to join the two tables more than you want to concatenate them.
LOAD
%Incident_key
-- extra information --
FROM Datasource_A;

JOIN LOAD
%Incident_key
1 AS _Flag_NewIncident
FROM Datasource_B;

cawestelaken · ‎2018-11-21

I did not realize it was possible to JOIN LOAD.
Thank you very much!

Yours sincerely,

Casper Westelaken

cawestelaken · ‎2018-11-21

Okey,

Whilst this was the perfect solution for my example. Sadly it doesn't solve my issue the way it should.
This is my fault for not correctly describing the issue.

The Incident_key is a combined key in the first table, where in the second table it is a sole key field.
As follows:

Datasource_A:

%Incident_key	%Extra_key	Type
1	11	A
1	12	A
1	13	A
2	11	B
3	11	A
3	12	A
4	22	C
5	31	C
5	41	C
5	23	C
5	12	C
6	56	B
6	32	B
7	12	A
8	43	B
9	23	A
9	54	A
10	32	C

Datasource_B:

%Incident_key

2

4

5

7

9

End Result using original Concatenate:

%Incident_key	%Extra_key	Type	_Flag_IncidentNew
1	11	A
1	12	A
1	13	A
2	11	B
3	11	A
3	12	A
4	22	C
5	31	C
5	41	C
5	23	C
5	12	C
6	56	B
6	32	B
7	12	A
8	43	B
9	23	A
9	54	A
10	32	C
2	Example error		1
4	Example error		1
5	Example error		1
7	Example error		1
9	Example error		1

End result using Join:

%Incident_key	%Extra_key	Type	_Flag_IncidentNew
1	11	A
1	12	A
1	13	A
2	11	B	1
3	11	A
3	12	A
4	22	C	1
5	31	C	1
5	41	C	1
5	23	C	1
5	12	C	1
6	56	B
6	32	B
7	12	A	1
8	43	B
9	23	A	1
9	54	A	1
10	32	C

Desired End Result:

%Incident_key	%Extra_key	Type	_Flag_IncidentNew
1	11	A
1	12	A
1	13	A
2	11	B	1
3	11	A
3	12	A
4	22	C	1
5	31	C	1
5	41	C
5	23	C
5	12	C
6	56	B
6	32	B
7	12	A	1
8	43	B
9	23	A	1
9	54	A
10	32	C

Do you know whether or not this is possible?

jfkinspari · ‎2018-11-22

How do you know which of the rows with e.g. incident_key = 5, that should be flagged?

%Incident_key	%Extra_key	Type	_Flag_IncidentNew
1	11	A
1	12	A
1	13	A
2	11	B	1
3	11	A
3	12	A
4	22	C	1
5	31	C	1
5	41	C
5	23	C
5	12	C

cawestelaken · ‎2018-11-23

In this example I only used an %Extra_key, but since we're working with a facts table and dimensions setup this is actually multiple keys.

One of these keys is a %Date_key.

The row that should be flagged would the one that coincides with the earliest instance of the Incident.
Datasource_B (The table with the Incident dimension) includes a SysCreated field which during the load statement would be turned into a %Date_Key for loading into the facts table only (to avoid a synthetic key)

Sadly though... Theoretically it is possible for 2 rows in the Facts table to have the same %Incident_key and %Date_key. So I'd rather create something that would simply load it in once per Incident and date combination. All that matters is that the earliest date per incident gets flagged, so that no double flags exist.

So essentially I'm loading the _Flag_IncidentNew into the Facts table using the Incidents table in which the %Incident_key is distinct, where during this load I'm transforming one of the columns in the Incidents table into a %Date_key.

In the actual situation (non-example) every %Incident_key gets flagged once.

I guess my examples were insufficient.

The end result would be:

%Incident_key	%Extra_key	Type	_Flag_IncidentNew
1	11	A	1
1	12	A
1	13	A
2	11	B	1
3	11	A	1
3	12	A
4	22	C	1
5	31	C	1
5	41	C
5	23	C
5	12	C