Hi all. I've been working on a problem all day and I think I've finally figured out what's going on. I am using the paid version of Talend Studio 7.1 (Big Data).
I have a main job that will call a series of child jobs. This works fine. I am trying to use contexts to pass data from the parent to the child. This always seems to work fine for simple objects (Strings, Integers, etc.).
I am trying to pass a ConcurrentHashMap (basically I am trying to do something almost exactly like this: https://www.talendbyexample.com/talend-returning-values-from-subjobs.html).
What I notice is that if I do NOT check "Use dynamic job", the Object context variable (aka the ConcurrentHashMap) is sent properly. I can also manipulate it in the child job and the changes are available in the parent. Exactly what I want!
The problem is that I do need to use dynamic jobs. When I check this box, all of a sudden the Map is a String!
I ended up putting printlns in tJava components like this:
System.out.println(context.sharedMap.getClass().toString());
I figured out that just by toggling the dynamic job flag (and the job name, of course), the output varies between:
class java.util.concurrent.ConcurrentHashMap [without dynamic flag]
class java.lang.String [with dynamic flag]
Please help me understand what is going on here and more importantly how to fix it.
If anyone wants to replicate it, simply follow the example in the URL above and switch back and forth between dynamic and static child jobs. I'd be curious whether it's just me or if others have the same problem.
Thanks,
Tom
Hi Tom,
I feel your pain with this. I came across this "feature" a while ago. The cause of this is that when you run dynamic jobs, they actually run as an independent process. Essentially you are running them in a different virtual machine and starting them using command-line arguments. However, there is a workaround. It's not ideal, but you can build routines to make it as reusable as possible.
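To illustrate the mechanism (a minimal sketch, not Talend's actual generated code): a process launched with command-line arguments can only ever receive Strings, so whatever object the parent held is flattened to its String form on the way in.

// Minimal sketch, NOT Talend's generated code: a child process started
// with command-line arguments can only receive Strings via main(String[]).
public class ChildProcessDemo {
    public static void main(String[] args) {
        for (String arg : args) {
            // Prints "class java.lang.String" for every argument,
            // regardless of the type the parent originally held.
            System.out.println(arg.getClass().toString());
        }
    }
}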
Essentially what you need to do is serialise your ConcurrentHashMap to a String (or create a new object which will serialise to a String). I have built a quick and (very) dirty example routine for serialising and deserialising a ConcurrentHashMap that will always hold String keys and values. Obviously this can be altered for different types.
package routines;

import java.util.Enumeration;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentHashMapWrapper {

    /**
     * Serialises a ConcurrentHashMap<String,String> to a pipe-delimited String
     * of the form "key1|value1|key2|value2|". Note: this simple format breaks
     * if a key or value itself contains "|".
     *
     * {talendTypes} String
     * {Category} User Defined
     * {param} object chm : the map to serialise
     * {example} serialiseConcurrentHashMap(myMap)
     */
    public static String serialiseConcurrentHashMap(ConcurrentHashMap<String, String> chm) {
        String returnVal = null;
        if (chm != null) {
            returnVal = "";
            Enumeration<String> e = chm.keys();
            while (e.hasMoreElements()) {
                String key = e.nextElement();
                // Append each key/value pair to the output String.
                returnVal += key + "|" + chm.get(key) + "|";
            }
        }
        return returnVal;
    }

    /**
     * Rebuilds a ConcurrentHashMap<String,String> from the pipe-delimited
     * String produced by serialiseConcurrentHashMap.
     *
     * {talendTypes} Object
     * {Category} User Defined
     * {param} string data : the serialised map
     * {example} deserialiseConcurrentHashMap("key1|value1|key2|value2|")
     */
    public static ConcurrentHashMap<String, String> deserialiseConcurrentHashMap(String data) {
        ConcurrentHashMap<String, String> returnVal = null;
        if (data != null && !data.trim().isEmpty()) {
            returnVal = new ConcurrentHashMap<String, String>();
            String[] pairs = data.split("\\|");
            // Tokens alternate key, value, key, value, ...
            for (int i = 1; i < pairs.length; i = i + 2) {
                returnVal.put(pairs[i - 1], pairs[i]);
            }
        }
        return returnVal;
    }
}
This will allow the values to be passed to your child job when using Dynamic jobs.
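For example (a sketch with hypothetical names: it assumes you have added a String context variable called sharedMapString to both jobs), the tJava calls would look something like this:

// Parent job, in a tJava before the tRunJob (hypothetical names):
context.sharedMapString =
    routines.ConcurrentHashMapWrapper.serialiseConcurrentHashMap(myMap);

// Child job, in a tJava at the start: rebuild the map from the String.
java.util.concurrent.ConcurrentHashMap<String, String> myMap =
    routines.ConcurrentHashMapWrapper.deserialiseConcurrentHashMap(context.sharedMapString);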
If you want to send the data back to your parent job, then you will have to go to extra lengths to achieve this. I would recommend using a database (if possible) or a flat file. You will need to write the serialised String out to that shared store from the child job, then read it back into your parent job once the child job has finished processing.
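As a rough sketch of the flat-file option (the path is hypothetical, and you could equally use tFileOutputDelimited/tFileInputDelimited components instead of tJava):

// Child job, in a final tJava: write the serialised map out (hypothetical path).
java.nio.file.Files.write(
    java.nio.file.Paths.get("/tmp/sharedMap.txt"),
    routines.ConcurrentHashMapWrapper.serialiseConcurrentHashMap(myMap)
        .getBytes(java.nio.charset.StandardCharsets.UTF_8));

// Parent job, in a tJava after the tRunJob: read it back and rebuild the map.
java.util.concurrent.ConcurrentHashMap<String, String> returned =
    routines.ConcurrentHashMapWrapper.deserialiseConcurrentHashMap(
        new String(
            java.nio.file.Files.readAllBytes(java.nio.file.Paths.get("/tmp/sharedMap.txt")),
            java.nio.charset.StandardCharsets.UTF_8));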
Granted, this is not ideal. But this will get you around the issue you are experiencing.
Wow. I wasn't expecting such a detailed, well-thought-out response. Thanks. I get it now. OK, I'll have to rethink the design and incorporate what you have suggested.
Thanks again.
No problem. I've been in exactly the same position, so I knew what you'd need to point you in the right direction 🙂