Clear tHashOutput buffer?

Anonymous · ‎2012-03-20

Does anyone know of a way to clear a tHashOutput buffer so that rows don't accumulate? After a tFlowToIterate I just want to operate on the current record from before the tFlowToIterate. I typically use tBufferOutput/tBufferInput for this purpose (in conjunction with tBufferOutput.clear method), but I can't use this technique here because I'm already using a tBufferOutput earlier in the job and I'm assuming you can't nest tBufferOutputs.
I am running TIS 4.2.2.
Thanks!

Anonymous · ‎2012-03-21

Hi
You can use tHashOutput/tHashInput instead of tBufferOutput/tBufferInput.
Don't check "Append" of tHashOutput.
Select "Component List" in tHashInput.
Regards,
Pedro

Anonymous · ‎2012-03-21

Thanks for your response, Pedro. This is exactly what I was looking for but, alas, I don't see an "Append" checkbox in the tHashOutput settings. Was this checkbox added after 4.2.2? If so, can I retrofit the updated component into 4.2.2 without upgrading to 4.2.4? I planned to do that upgrade later, but I'm on a tight deadline right now.
Thanks again!

Anonymous · ‎2012-03-21

Hi
Please upload a screenshot of your job.
We will try to find a workaround.
Regards,
Pedro

Anonymous · ‎2012-03-22

I have uploaded the relevant fragment of the joblet. Note the following:
* Remember that I can't use tBufferOutput/tBufferInput unless they can be nested. That's because this joblet is called by a job that has an active tBufferOutput.
* A "Don't append rows" checkbox or Java functionality equivalent to tHashOutput.clear() would be perfect.
* My work tends to be very Java intensive, so a Java based solution would be fine. For example, I notice that tBufferOutput/tBufferInput simply adds String arrays to globalBuffer, which is defined as

List<String[]> globalBuffer

This implies that perhaps I can nest tBufferOutputs.
* It would be so helpful to know the difference between tBufferOutput/tBufferInput and tHashOutput/tHashInput from a Java implementation perspective. I have asked this question before and never got a response.
Thanks!
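
For readers following along, the "clear between iterations" behavior requested above can be sketched on a plain List<String[]>. This is only an illustration of the idea: RowBuffer and its methods are hypothetical names, and nothing here is Talend's actual globalBuffer or tHashOutput code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not Talend's globalBuffer, just the same List<String[]> shape.
class RowBuffer {
    private final List<String[]> rows = new ArrayList<>();

    // Buffer one row, given as its column values.
    void add(String... row) {
        rows.add(row);
    }

    int size() {
        return rows.size();
    }

    // Hand back the buffered rows and empty the buffer for the next iteration.
    List<String[]> drain() {
        List<String[]> copy = new ArrayList<>(rows);
        rows.clear();
        return copy;
    }

    public static void main(String[] args) {
        RowBuffer buffer = new RowBuffer();
        for (String id : new String[] {"a", "b", "c"}) {
            buffer.add(id, "payload-" + id);
            // Only the current row is visible, because drain() cleared the previous one.
            List<String[]> current = buffer.drain();
            System.out.println(current.size() + " row for " + current.get(0)[0]);
        }
    }
}
```

Because each instance owns its own list, two such buffers can be active at once without sharing rows, which is the property the nesting question is really about.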

Anonymous · ‎2012-03-27

Hi Pedro:
I don't mean to bother you, but I'm wondering if you have made any progress in determining a workaround for this?
Thanks,

Anonymous · ‎2012-03-27

Hi Rob,
I have a solution other than tHashOutput.
Save the data in a Java hash map; you will have to create a separate hash map for each output.
Sample:
public static java.util.concurrent.ConcurrentHashMap<String, Object> sample = new java.util.concurrent.ConcurrentHashMap<String, Object>();
Instead of tHashOutput, use a tJavaRow and add:
routines.Lookups.sample.put(KeyColumn, cacheEntry);
Here we are adding the data to the sample hash map.
You can also clear the hash map whenever you want.
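
A minimal, self-contained sketch of that routine, assuming it lives in a class named Lookups as the put call suggests. The cache and reset helper methods and the demo values are hypothetical additions for illustration; in a job, the calls would sit in a tJavaRow or tJava rather than in main.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a Talend user routine such as routines.Lookups.
class Lookups {
    // One shared map per output that needs caching.
    static final ConcurrentHashMap<String, Object> sample = new ConcurrentHashMap<>();

    // Store (or overwrite) the entry for a key; would be called from a tJavaRow.
    static void cache(String key, Object entry) {
        sample.put(key, entry);
    }

    // Drop everything so rows from an earlier iteration do not accumulate.
    static void reset() {
        sample.clear();
    }

    public static void main(String[] args) {
        cache("id-1", new String[] {"id-1", "first"});
        cache("id-2", new String[] {"id-2", "second"});
        System.out.println(sample.size());    // 2 entries cached
        reset();
        System.out.println(sample.isEmpty()); // true after clearing
    }
}
```

Note that ConcurrentHashMap rejects null keys and null values, so a nullable key column would need a placeholder or a different map type.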

Anonymous · ‎2012-03-29

Hi Lijo:
I have already created classes for automatically saving a flow row in a hash map and I do use this approach in many circumstances. The inconvenience comes when I need to generate a flow from that hash map in order to do further flow-based processing (tMaps, etc.). The only way I know of to create a flow from a hash map is to use a tFixedFlowInput. For flows with many columns, this is a real pain. Does anyone know of a component that automatically creates a flow from a hash map based on a schema? If I had the time (which I don't right now), I would write it myself.
What surprises me is that a mostly wonderful platform like Talend doesn't natively support what is, I believe, a very common scenario: I have a flow and I want to have one or more subjobs that process each row in that flow below a tFlowToIterate. This seems so obvious and so common. Am I missing something?

Anonymous · ‎2012-03-29

It seems like this common scrap of reflection code could help if you put it in a tJavaRow linked up with a tIterateToFlow. You need an output schema and a map with property names that match.

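// wholeRowObj carries one row as a Map of column name -> value.
java.util.Map<?, ?> wholeRow = (java.util.Map<?, ?>) input_row.wholeRowObj;
for (java.util.Map.Entry<?, ?> inputField : wholeRow.entrySet()) {
	// Look up the public output_row field whose name matches the map key.
	java.lang.reflect.Field outputField = output_row.getClass().getField((String) inputField.getKey());
	outputField.set(output_row, inputField.getValue());
}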

The top part of the screenshot uses a tFixedFlowInput iterating over a Map of Maps. The bottom part iterates and converts to a flow with a single field, 'wholeRowObj'.
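
To try the reflection idea outside of Talend, here is a rough standalone sketch. OutputRow is a hypothetical stand-in for a generated row struct with public fields, and fill is an illustrative helper name; neither comes from the job in the screenshot.

```java
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

class MapToRowDemo {
    // Hypothetical stand-in for a generated Talend row struct with public fields.
    public static class OutputRow {
        public Integer id;
        public String name;
    }

    // Copy each map entry into the public field of the same name.
    static void fill(Object row, Map<String, ?> values) {
        for (Map.Entry<String, ?> entry : values.entrySet()) {
            try {
                Field field = row.getClass().getField(entry.getKey());
                field.set(row, entry.getValue());
            } catch (ReflectiveOperationException e) {
                // No matching public field, or the field is not accessible.
                throw new IllegalArgumentException("Cannot set field " + entry.getKey(), e);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> wholeRow = new LinkedHashMap<>();
        wholeRow.put("id", 7);
        wholeRow.put("name", "widget");

        OutputRow out = new OutputRow();
        fill(out, wholeRow);
        System.out.println(out.id + " " + out.name); // prints "7 widget"
    }
}
```

The map keys must match the field names exactly, and each value must be assignable to the field's type; a mismatch surfaces as an IllegalArgumentException at run time rather than at design time.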

Anonymous · ‎2013-01-03

Well, 10 months later, after the project dust settled, I finally had a chance to digest Carl's solution. In fact, it inspired me to create a component that I expect to be extremely useful for all future projects. It is called tJavaMapsInput, and I've added it to the component exchange.
tJavaMapsInput is an input component that creates a Talend flow from a Map collection. In particular, it generates a flow from any Iterable<Map<String, ?>> (LinkedList of HashMap, array of LinkedHashMap, etc.).
This component will be very welcome to those of you who do a lot of their Talend work in Java and sometimes need to construct flows from map collections. Previously, you had to do this using tFixedFlowInput or tIterateToFlow (as per Carl), which is a royal PITA. With tJavaMapsInput, you can just manufacture a map collection in Java (with all the attendant benefits of working in the Eclipse/Java environment) and reference the collection from the component.
The component is fully tested (it deals with nullable schema columns and default values correctly, BTW) and the component ZIP includes Test_tJavaMapsInput.zip, which has a test job and supporting routines that you can import into your project for testing and demo purposes.
Hope someone else finds it useful.
Thanks Carl.
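
As an illustration of the kind of input described above, a map collection of type Iterable<Map<String, ?>> might be built as follows. The column names id and name and the buildRows helper are hypothetical, and how the collection is referenced from tJavaMapsInput is not shown because the component's settings are not described in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MapRowsDemo {
    // Build rows as maps keyed by (hypothetical) schema column names.
    static List<Map<String, ?>> buildRows() {
        List<Map<String, ?>> rows = new ArrayList<>();

        Map<String, Object> first = new LinkedHashMap<>();
        first.put("id", 1);
        first.put("name", "alpha");
        rows.add(first);

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("id", 2);
        second.put("name", null); // a nullable column left empty
        rows.add(second);

        return rows;
    }

    public static void main(String[] args) {
        // Any Iterable of maps fits the described input shape.
        Iterable<Map<String, ?>> rows = buildRows();
        for (Map<String, ?> row : rows) {
            System.out.println(row);
        }
    }
}
```

LinkedHashMap is used here only to keep the columns in insertion order when printing; a plain HashMap holds the same data.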

Talend Data Integration

v5.x