Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
i have two tables
Table1:
client_id campaign_id comm_type_id sent delivered
23 134 44 234 230
24 136 46
23 134 45 4 3
23 135 44 567 560
24 136 45 908 34
23 134 48
24 136 46
Table2:
campaign_id comm_type_id sent_count delivered
134 44 230 230
136 46 76 70
134 45 4 3
135 44 566 560
136 45 907 34
134 48 40 2
136 46 20 2
Now you see there are no counts in Table1 for some rows, but you have those counts in Table2.
So now i want to get those row empty counts in Table1 from Table2, but not those which are present in Table1.
How could i do this Please help me on this
Do you wish to have this done in the load script while getting data in or do you wish to do it with two tables that have already been loaded and do the result in a table in the UI?
HI,
I want to do it in Load script
A script like this should do the trick for you. You will have to do a LEFT JOIN between the two tables. LEFT JOIN makes all the rows of Table1 get all matching rows from Table2 to be joined by the common fields (fields that have exactly the same name):
Table1:
LOAD
client_id,
campaign_id,
comm_type_id,
sent as sent1,
delivered as delivered1
INLINE [
client_id campaign_id comm_type_id sent delivered
23 134 44 234 230
24 136 46
23 134 45 4 3
23 135 44 567 560
24 136 45 908 34
23 134 48
24 136 46
] (delimiter is spaces);
LEFT JOIN
Table2:
LOAD
campaign_id,
comm_type_id,
sent_count as sent2,
delivered as delivered2
INLINE [
campaign_id comm_type_id sent_count delivered
134 44 230 230
136 46 76 70
134 45 4 3
135 44 566 560
136 45 907 34
134 48 40 2
136 46 20 2
] (delimiter is spaces);
DATA:
LOAD
client_id,
campaign_id,
comm_type_id,
if( IsNull(sent1) OR Trim(sent1)='' , sent2 , sent1 ) AS sent,
if( IsNull(delivered1) OR Trim(delivered1)='' , delivered2 , delivered1 ) AS delivered
RESIDENT
Table1;
DROP TABLE Table1;
You would probably read the two tables from external files I guess and then:
Lines
8 to 17 would be replaced by something like this:
FROM [Table1.xlsx] (ooxml.....) ;
Lines 29 to 38 would be replace by:
FROM [Table2.xlsx (ooxml....);
HI Petter,
Thanks Peter, This works!
But I have very large data Tables, if i do left join they will create duplicate records and extra records not required. Is there any other way i could do this, which might take less loading time?
I believe inner join is sufficient when you concerned about campaign and it's counts and also I believe campaign id exists in each table in every case.
How about this for non-join option:
MAP_TABLE:
Mapping load
campaign_id&'|'& comm_type_id AS LookUpKey,
sent_count&'|'&delivered as Value
inline [
campaign_id,comm_type_id,sent_count,delivered
134,44,230,230
136,46,76,70
134,45,4,3
135,44,566,560
136,45,907,34
134,48,40,2
136,46,20,2
];
FACT:
load
client_id,
campaign_id,
comm_type_id,
IF(LEN(trim(sent))>0,sent,SubField(ApplyMap('MAP_TABLE',campaign_id&'|'& comm_type_id),'|',1)) AS sent,
IF(LEN(trim(delivered))>0,delivered,SubField(ApplyMap('MAP_TABLE',campaign_id&'|'& comm_type_id),'|',2)) AS delivered
inline [
client_id,campaign_id,comm_type_id,sent,delivered
23,134,44,234,230
24,136,46,,
23,134,45,4,3
23,135,44,567,560
24,136,45,908,34
23,134,48,,
24,136,46,,
];
Hi Kaan,
What if i have a table with duplicate campaign_id and comm_type_id in Table2, at this time i should be able to sum sent and delivered counts
Table2:
load * inline [
campaign_id,comm_type_id,sent_count,delivered
134,44,230,230
136,46,76,70
136,46,7,2
134,45,4,3
134,45,1,0
135,44,566,560
136,45,907,34
134,48,40,2
136,46,20,2
];
now i should be able to get sum of sent_count and delivered for campaign_id and comm_type_id. for that i used this
MAP_TABLE:
MAPPING LOAD campaign_id&'|'& comm_type_id AS LookUpKey, sum(sent_count)&'|'& sum(delivered) as Value Resident Table2 group by campaign_id, comm_type_id;
this gives me empty values in some fields. How could i approach this?
You should use LookUpKey as group by dimension.
MAP_TABLE:
MAPPING LOAD campaign_id&'|'& comm_type_id AS LookUpKey, sum(sent_count)&'|'& sum(delivered) as Value Resident Table2
group by campaign_id&'|'& comm_type_id;
But the real question is what should happen if there is a duplicate records on FACT table
Table1:
client_id campaign_id comm_type_id sent delivered
23 134 44 234 230
24 136 46
23 134 45 4 3
23 135 44 567 560
24 136 45 908 34
23 134 48
24 136 46
Before looking values from Map table, Grouping the fact table is the best approach I guess
Script should be like this:
MAP_TABLE:
Mapping load
campaign_id&'|'& comm_type_id AS LookUpKey,
sum(sent_count)&'|'&sum(delivered) as Value
inline [
campaign_id,comm_type_id,sent_count,delivered
134,44,230,230
136,46,76,70
136,46,80,120
134,45,4,3
135,44,566,560
136,45,907,34
134,48,40,2
136,46,20,2
]
Group By campaign_id&'|'& comm_type_id;
fact:
Load
client_id,
campaign_id,
comm_type_id,
IF(sent<>0,sent,SubField(ApplyMap('MAP_TABLE',campaign_id&'|'& comm_type_id),'|',1)) AS sent,
IF(delivered<>0,delivered,SubField(ApplyMap('MAP_TABLE',campaign_id&'|'& comm_type_id),'|',2)) AS delivered
;
load
client_id,
campaign_id,
comm_type_id,
sum(sent) as sent,
sum(delivered) as delivered
inline [
client_id,campaign_id,comm_type_id,sent,delivered
23,134,44,234,230
24,136,46,,
23,134,45,4,3
23,135,44,567,560
24,136,45,908,34
23,134,48,,
24,136,46,,
]
Group By client_id,
campaign_id,
comm_type_id