Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Joining two files with duplicate sales amounts

I am doing a left join from file A to file B. For each unique identifier however in file B there are duplicate sales amounts. I only want to join where the sales amounts are unique for each identifier. How would I go about doing this ?

10 Replies
crystles
Partner - Creator III
Partner - Creator III

Try doing the left Join the other way.

Load *

TableA;

Left Join(A)

Load*

TableB;

Not applicable
Author

Hi Sid,

Try use the synthase applyMap..

Anonymous
Not applicable
Author

Hi Smith,

I suggest you to do like the following method:

Step1: In the table B, identify the identifiers for which we have unique amount

Step2: Do the Left Join

Detail of the script=>

TableB_TMP:

Load *

Where Nb_Amount=1;  //We keep unique amount ligne

LOAD Id, count(Amount) as Nb_Amount, Sum(Amount) as Amount

From TableB

Group By Id;

TableA:

LOAD *

From TableA;

Left Join(TableA)

LOAD Id, Amount

Resident TableB_TMP;

DROP Table TableB_TMP;

Not applicable
Author

Table A:

Sales_id, Sales_amount

91000, 100

91000, 50

Table B:        

Sales_id, region, unit,

91000, Asia, AL

91000, Brazil, BR

91000, America, AR

91000, France, FR

91000, Spain, SP

When I load table A and then left join on table B the resultant table is 10 rows of data :

Sales_id, region, unit, Sales_amount

91000, asia, AL, 100

91000, brazil, BR, 100

91000, America, AR, 100

91000, France, FR, 100

91000, Spain, SP, 100

91000, asia, AL, 50

91000, brazil, BR, 50

91000, America, AR, 50

91000, France, FR, 50

91000, Spain, SP, 50

However this is incorrect as it is joining the Sales_id in table A to each instance in table B hence skewing the ‘Sales_amount’.  Effectively the Sales_amount should equal to 150 where sales_id is 91000.

Hope this clarifies.

prashantbaste
Partner - Creator II
Partner - Creator II

Hello Sid

I guess you need to Load first table as -

Load Sales_id, Sum(Sales_Amount) from Table1Group By Sales_id;


Left Join


Load Sales_id, region, unit from Table2;


This may help you to get desired output.


**Please not that sample data you have shown has only one Sales_id 91000. Accordingly it will show Sales_Amount 150 against each row.

maxgro
MVP
MVP

no, I think it doesn't

I think you want

Sales_id, region, unit, Sales_amount

91000, Asia, AL

91000, Brazil, BR

91000, America, AR

91000, France, FR

91000, Spain, SP

But how to distribute 150 between 5 countries?

Random?

Not applicable
Author

Thanks Prashant. I should have mentioned I have more fields than included in my original sample data. Would I need to add all of these in the group by clause as currently it returns an 'invalid expression' error.

prashantbaste
Partner - Creator II
Partner - Creator II

Yes Sid,

All fields other than aggression must be listed in Group By clause.

Not applicable
Author

So I have an update. There is another field in table B called quantity.

What I am being asked for is to join table A ( aggregating the Sales_amount) to table B joining on which ever single line has the maximum quantity value.

Is this even possible (?)