Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Toronto Sept 9th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
jerownimow
Contributor III
Contributor III

How to sum group by with specific conditions?

Hi Folks,
Good day!

I would like to request for your help on my problem:

I have the following requirements:
1. If itemGroupId are all +, add all, and get the items from the first item
2. If itemGroupId are all -, load them as is.
3. If itemGroupId are both + and -, equate them, and get the other fields on the first item.

 

And the source is this:
0683p000009M8fn.png
I somehow manage to achieve this by the below logic:
0683p000009M8fs.png

And the result is this:

0683p000009M8fx.png

The problems with this logic are the following:

1. How can I extract the aggregated level?
2. On 708157001, it was not all summed up because the 2 of 3 of the items have the same sign, which should be added up based on the 3rd requirement.
3. How can I get the other fields from the first item only for 1st and 3rd requirement? because there were other fields aside from those 2.

Badly needed your help. Please let me know if you need more info.

Thank you so much!

Labels (3)
1 Solution

Accepted Solutions
Anonymous
Not applicable

OK, you've got another requirement here. The requirement that the order of the data remains the same. For example, given the following data....

708152001 12.23
708153001 2.54
708152001 -10.12
708152001 9.3
708153001 11.98
708157001 -2.78
708155001 23.32
708152001 1.23
708154001 12.43
708155001 12.76
708153001 -3.45
708157001 -12.45

....I think you need to do the following.

 

1) Establish an order. Simply use a numeric sequence to number the rows from 1 to the total number of rows. Add this as a new column.

2) Next calculate which groups (1,2 or 3) each code group belongs to. You can do this using a tJavaFlex and a HashMap quite easily. This data will need to eventually be added as a column as well.

3) Then sort the rows by product code and then the original row number. This will allow you to process the rows (add the values) in the correct way.

 

I have created a quick job which gets the data I showed above into the state below.....

Code Value Original Order Action
708152001 12.23 1 3
708152001 -10.12 3 3
708152001 9.3 4 3
708152001 1.23 8 3
708153001 2.54 2 3
708153001 11.98 5 3
708153001 -3.45 11 3
708154001 12.43 9 1
708155001 23.32 7 1
708155001 12.76 10 1
708157001 -2.78 6 2
708157001 -12.45 12 2

 

Once you have this, then you can process the data based on the action type and can maintain the order using the Original Order and Code columns.

 

The job I have written can be seen below....

0683p000009M8g7.png

I won't describe the tFileInputExxel_1 as it is just bringing in the data. 

 

The tMap_1 is where I assign the original row order. This is simply done with a tMap variable and the Numeric.sequence method. This is shown below.....

0683p000009M8gC.png

 

The tJavaFlex_1 is where there is a little code. This is important for identifying the action to be taken with each code group. The code can be seen below....

 

Start Code

// start part of your Java code
java.util.HashMap<String, Integer> codeStatus = new java.util.HashMap<String, Integer>();

Main Code

// here is the main part of the component,
// a piece of code executed in the row
// loop
if(((Integer)codeStatus.get(out1.code))==null){
	if(out1.value>=0){
		codeStatus.put(out1.code,1);
	}else{
		codeStatus.put(out1.code,2);
	}
}else if(((Integer)codeStatus.get(out1.code))==1){
	if(out1.value>=0){
		codeStatus.put(out1.code,1);
	}else{
		codeStatus.put(out1.code,3);
	}
}else if(((Integer)codeStatus.get(out1.code))==2){
	if(out1.value>=0){
		codeStatus.put(out1.code,3);
	}else{
		codeStatus.put(out1.code,2);
	}
}else{
	codeStatus.put(out1.code,3);
}

End Code

// end of the component, outside/closing the loop
globalMap.put("codeStatus", codeStatus);

The above stores the action that should be applied to the code group in a HashMap. This will be used later.

 

The tSortRow simply orders by the Code then the Row Number identified in the tMap_1.

 

The tHashOutput_1 simply stores the data in memory.

 

The tHashInput reads the data back from memory in a new SubJob.

 

The tJavaFlex_2 finishes the processing of the data with regard to the action type. The code for that can be seen below. I am adding a new column called "action"

 

Start Code

// start part of your Java code
java.util.HashMap<String, Integer> codeStatus = ((java.util.HashMap<String, Integer>)globalMap.get("codeStatus"));

Main Code

// here is the main part of the component,
// a piece of code executed in the row
// loop
row4.action = ((Integer)codeStatus.get(row5.code));

There is no "End Code"

 

Finally I am just printing the data out using the tLogRow.

 

Once you are at this point, you can easily carry out the addition logic and logic applied to the other columns you mention, based on the action and the ordering.

View solution in original post

11 Replies
Anonymous
Not applicable

I'm afraid this is difficult to understand. I understand requirements 1 and 2, but 3 I am struggling with. What does "If itemGroupId are both + and -, equate them, and get the other fields on the first item" actually mean? Can you give an example? Also, I'm not sure if you've noticed, but the order of the data has changed between input and output.

 

4+5.32+6.25 would look like this...

 

4
9.32

15.57

 

Yet in your output it is shown like this...

 

4

10.25

15.57

 

I'm not sure whether that is important, but I suspect it might be.

 

If you are unsure of how many rows of data you will have, the easiest way of solving a problem like this is to look at the data twice. Go through it the first time to sort it and analyse each group. Which criteria does the group fall into, 1,2 or 3? Add that as another column. Then when you process it, you will know from the first value of the group how you need to deal with it.

jerownimow
Contributor III
Contributor III
Author

Hi @rhall 

 

Thank you so much for your prompt response. For the 3rd requirement here's an example:
0683p000009M8g2.png

This group should add up because they have the same itemGroupId, but if they are all negative they shouldn't.

Regarding the order of the data, you're right, talend sorted it as 4+6.25+5.32. Thank you for that, I will add a criteria on the sorter so that it will be sorted correctly.

Can you illustrate it on an example? I'm sorry I'm just a beginner in Talend. Thank you so much.

Anonymous
Not applicable

I'm still not sure about the 3rd requirement. Is it really 2 requirements, if any are positive then add them, if they are ALL negative leave them as they are? That is the example you have given.

jerownimow
Contributor III
Contributor III
Author

Thanks @rhall . yes, as long as they have the same itemGroupId.

1. the same ItemGroupId with all positive amounts, add them and get the other columns from ItemGroupId.
2. the same ItemGroupId with all negative amounts, leave them be.

3. the same ItemGroupId having both positive and negative amounts, add them and get the other columns from ItemGroupId.

I was planning to handle like this:
tSortRow -> tMap with the following outputs:
1. row3.itemGroupId == Var.prev_itmGrpId && row3.amount >= 0 

2. row3.itemGroupId == Var.prev_itmGrpId && row3.amount < 0  

3. row3.itemGroupId == Var.prev_itmGrpId && row3.amount >= 0 || row3.itemGroupId == Var.prev_itmGrpId && row3.amount <= 0

But the problem here is that the 1st row will always be missed. Although, I haven't tried to simulate this because I knew it was wrong hehe

Anonymous
Not applicable

OK, you've got another requirement here. The requirement that the order of the data remains the same. For example, given the following data....

708152001 12.23
708153001 2.54
708152001 -10.12
708152001 9.3
708153001 11.98
708157001 -2.78
708155001 23.32
708152001 1.23
708154001 12.43
708155001 12.76
708153001 -3.45
708157001 -12.45

....I think you need to do the following.

 

1) Establish an order. Simply use a numeric sequence to number the rows from 1 to the total number of rows. Add this as a new column.

2) Next calculate which groups (1,2 or 3) each code group belongs to. You can do this using a tJavaFlex and a HashMap quite easily. This data will need to eventually be added as a column as well.

3) Then sort the rows by product code and then the original row number. This will allow you to process the rows (add the values) in the correct way.

 

I have created a quick job which gets the data I showed above into the state below.....

Code Value Original Order Action
708152001 12.23 1 3
708152001 -10.12 3 3
708152001 9.3 4 3
708152001 1.23 8 3
708153001 2.54 2 3
708153001 11.98 5 3
708153001 -3.45 11 3
708154001 12.43 9 1
708155001 23.32 7 1
708155001 12.76 10 1
708157001 -2.78 6 2
708157001 -12.45 12 2

 

Once you have this, then you can process the data based on the action type and can maintain the order using the Original Order and Code columns.

 

The job I have written can be seen below....

0683p000009M8g7.png

I won't describe the tFileInputExxel_1 as it is just bringing in the data. 

 

The tMap_1 is where I assign the original row order. This is simply done with a tMap variable and the Numeric.sequence method. This is shown below.....

0683p000009M8gC.png

 

The tJavaFlex_1 is where there is a little code. This is important for identifying the action to be taken with each code group. The code can be seen below....

 

Start Code

// start part of your Java code
java.util.HashMap<String, Integer> codeStatus = new java.util.HashMap<String, Integer>();

Main Code

// here is the main part of the component,
// a piece of code executed in the row
// loop
if(((Integer)codeStatus.get(out1.code))==null){
	if(out1.value>=0){
		codeStatus.put(out1.code,1);
	}else{
		codeStatus.put(out1.code,2);
	}
}else if(((Integer)codeStatus.get(out1.code))==1){
	if(out1.value>=0){
		codeStatus.put(out1.code,1);
	}else{
		codeStatus.put(out1.code,3);
	}
}else if(((Integer)codeStatus.get(out1.code))==2){
	if(out1.value>=0){
		codeStatus.put(out1.code,3);
	}else{
		codeStatus.put(out1.code,2);
	}
}else{
	codeStatus.put(out1.code,3);
}

End Code

// end of the component, outside/closing the loop
globalMap.put("codeStatus", codeStatus);

The above stores the action that should be applied to the code group in a HashMap. This will be used later.

 

The tSortRow simply orders by the Code then the Row Number identified in the tMap_1.

 

The tHashOutput_1 simply stores the data in memory.

 

The tHashInput reads the data back from memory in a new SubJob.

 

The tJavaFlex_2 finishes the processing of the data with regard to the action type. The code for that can be seen below. I am adding a new column called "action"

 

Start Code

// start part of your Java code
java.util.HashMap<String, Integer> codeStatus = ((java.util.HashMap<String, Integer>)globalMap.get("codeStatus"));

Main Code

// here is the main part of the component,
// a piece of code executed in the row
// loop
row4.action = ((Integer)codeStatus.get(row5.code));

There is no "End Code"

 

Finally I am just printing the data out using the tLogRow.

 

Once you are at this point, you can easily carry out the addition logic and logic applied to the other columns you mention, based on the action and the ordering.

jerownimow
Contributor III
Contributor III
Author

Thanks, rhall will definitely try this. I'll let you know. Just a quick question, what do you mean by HashMap? Is it by using tHashInput/Output or tHMap?

Thank you.
Anonymous
Not applicable

A HashMap is a data structure. I used it in the Java code that I gave you. 

jerownimow
Contributor III
Contributor III
Author

Thank you so much @rhall ! It really worked well! How did you come up with this? 0683p000009MACJ.png
Never thought it can be done with tJavaFlex, though I really don't know it's use 0683p000009MAB6.png

Now my next problem is how to get the other fields for the first item only during group by and how to merge the other sources with this, because it will be a whole JSON in the end. I'll try to figure out it first and will just come back here if I don't know how to do it.

Thank you so much! This community is a really huge help!

Anonymous
Not applicable

Not a problem. Good luck with the rest of the problem. You should be able to simply pass through the other columns along with the data needed to solve the above.