Anonymous
Not applicable

Alternative to globalMap in Talend Big Data Spark

Hello

I have a requirement to use globalMap in the Spark Big Data suite.

I understand this is not supported in version 6.x. Are there any alternatives? I need to populate the globalMap with some static values and reference them in the tMap transformation.

There are about 200 entries in the map. I can get the job working in DI, but not in the Big Data environment.

My flow is:

file --> tJavaFlex (populate globalMap) --> in another subjob, access it in tMap.

Thanks!
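For reference, the DI version of this flow can be sketched in plain Java. This is a minimal sketch, not Talend-generated code: globalMap is simulated with a HashMap, and the sample rows stand in for the file.

```java
import java.util.HashMap;
import java.util.Map;

public class GlobalMapSketch {
    // stand-in for Talend's globalMap in a Standard (DI) job
    static final Map<String, Object> globalMap = new HashMap<>();

    // what the tJavaFlex main part would do for each incoming row
    static void loadRow(String key, String value) {
        @SuppressWarnings("unchecked")
        Map<String, String> lookup = (Map<String, String>) globalMap.get("lookup");
        if (lookup == null) {
            lookup = new HashMap<>();
            globalMap.put("lookup", lookup);
        }
        lookup.put(key, value);
    }

    // what a tMap expression in the next subjob would do
    static String resolve(String key) {
        @SuppressWarnings("unchecked")
        Map<String, String> lookup = (Map<String, String>) globalMap.get("lookup");
        return lookup == null ? null : lookup.get(key);
    }

    public static void main(String[] args) {
        loadRow("A", "1"); // sample rows standing in for the ~200-entry file
        loadRow("B", "2");
        System.out.println(resolve("B")); // prints 2
    }
}
```

In a Standard job this works because all subjobs run in one JVM; in a Spark job the tMap code runs on executors that do not share that JVM, which is the problem discussed below.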


7 Replies
Anonymous
Not applicable
Author

Hello,

In fact, we have no access to globalMap in Spark Batch mode (because of the different implementation of Spark Batch compared to DI).
The reason is that it's difficult to have a synchronized "global" variable in distributed mode, and in addition the globalMap is not fully serializable by default.

Here are some articles about using context in Spark jobs:

https://community.talend.com/t5/Design-and-Development/Using-the-Implicit-Context-Load-Feature-for-a...

https://community.talend.com/t5/Architecture-Best-Practices-and/Spark-Dynamic-Context/ta-p/33038
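The first link describes the Implicit Context Load feature, which amounts to reading a key=value file into the job's context before the Spark code runs. A minimal plain-Java sketch of that parsing step (the StringReader and the sample keys stand in for the real context file):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class ContextLoadSketch {
    // parse key=value lines the way an implicit context load would
    static Properties load(String fileContent) {
        Properties context = new Properties();
        try {
            context.load(new StringReader(fileContent));
        } catch (IOException e) { // cannot happen when reading from a String
            throw new UncheckedIOException(e);
        }
        return context;
    }

    public static void main(String[] args) {
        // sample content standing in for the real context file
        Properties context = load("rateA=1.5\nrateB=2.5\n");
        System.out.println(context.getProperty("rateA")); // prints 1.5
    }
}
```

Note that every value loaded this way is a String, which matters for the cast errors discussed later in this thread.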

Hope this sheds some light on your requirement.

Best regards

Sabrina

Anonymous
Not applicable
Author

Hello

I ran into that article too, but my requirement doesn't change.

I need to pass a hashmap to the Spark jobs; if not globalMap, then some other way.

I tried using a hashmap as a context variable, but couldn't cast it back to a hashmap from the String context value in the Spark job.

My flow was:

Inputfile --> tJava (populate hashmap and add it to a context variable) --> pass context to Spark job --> tMap accesses the hashmap (error: can't cast String to hashmap)

Any suggestions?
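One way to sidestep a String cast error like this: since the context value reaches the Spark job as a String anyway, flatten the hashmap into one delimited String and parse it back inside the Spark job. This is a sketch; the ';' and '=' delimiters are assumptions and must not occur in the keys or values.

```java
import java.util.HashMap;
import java.util.Map;

public class MapAsContextString {
    // flatten the map into one String that can travel as a context variable
    static String encode(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : map.entrySet()) {
            if (sb.length() > 0) sb.append(';');
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // rebuild the map inside the Spark job from the String context value
    static Map<String, String> decode(String s) {
        Map<String, String> map = new HashMap<>();
        if (s == null || s.isEmpty()) return map;
        for (String rec : s.split(";")) {
            String[] kv = rec.split("=", 2);
            map.put(kv[0], kv.length > 1 ? kv[1] : null);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        m.put("A", "1");
        m.put("B", "2");
        Map<String, String> back = decode(encode(m));
        System.out.println(back.get("B")); // prints 2
    }
}
```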

Anonymous
Not applicable
Author

Hello,

Would you mind posting screenshots of your current job settings on the forum? That will help us get more information.

Please mask your sensitive data.

Best regards

Sabrina

Anonymous
Not applicable
Author

My use case was to load a static business file, which is of the form k,v.

The problem is that I cannot use it as a map lookup, because different rows pass different attributes, some static and some not, to find a value. To calculate x I pass col1 from the main flow; to calculate y I pass col2; and so on. So I cannot use any fixed set of columns as the map key.

I resolved this by writing a routine that reads the file and stores it as a string buffer, then parses the string in a loop on the record separator, tries to match, and returns the matched result.
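That routine can be sketched in plain Java, assuming newline as the record separator and a comma between key and value (both are assumptions about the file format):

```java
public class BufferLookup {
    // scan the buffered file content record by record and return the matched value
    static String lookup(String buffer, String key) {
        for (String rec : buffer.split("\n")) {
            String[] kv = rec.split(",", 2);
            if (kv.length == 2 && kv[0].equals(key)) {
                return kv[1];
            }
        }
        return null; // no record matched
    }

    public static void main(String[] args) {
        String buffer = "A,1\nB,2\nC,3"; // stand-in for the loaded business file
        System.out.println(lookup(buffer, "B")); // prints 2
    }
}
```

Since the buffer is a plain String, it serializes cleanly and can be shipped into a Spark job, at the cost of a linear scan per lookup.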


AnandK1
Contributor

I have exactly the same issue.

I have a CSV file that I need to load into an org.apache.commons.collections.map.MultiKeyMap and pass to the Big Data job via context. So I am using a Standard job to read the CSV file -> tJavaRow to build the MultiKeyMap and store it in a context variable called lookupMap of type Object.

After successfully loading the context, for testing I wrote a tJava to read lookupMap and cast it to a MultiKeyMap, but I get an error saying it can't cast String to MultiKeyMap.

lookupMap is defined in the context as Object, not String. So why do I get the cast error?

Attached are the screenshots

Job, 

0683p000009M4SM.png

context,

0683p000009M4Ro.png

tJavaRow (sets the context),

0683p000009M49a.png

and tJava (reads the context).

It looks like the first line is the one getting the cast error.

0683p000009M2q8.png

Here is the error.

Looks like a bug to me.

0683p000009M4SW.png

AnandK1
Contributor

As an FYI, if it is Standard job to Standard job, I already have a solution: I can use the globalMap to save my MultiKeyMap.

But I have to pass this to a Big Data job, and as you mentioned, globalMap doesn't work in Big Data jobs. So it looks like context is the solution, but I get this error.
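A likely cause of the cast error: when context values are handed from one job to another, they travel as Strings, so even an Object-typed context variable arrives as the result of toString(), and the cast back fails. One workaround, sketched below under the assumption that the map class is Serializable (HashMap is; check your commons-collections version for MultiKeyMap), is to serialize the map into a Base64 String yourself and deserialize it on the receiving side.

```java
import java.io.*;
import java.util.Base64;
import java.util.HashMap;

public class MapOverStringContext {
    // serialize the map to a Base64 String that survives String-only context transport
    static String encode(Serializable map) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(map);
            }
            return Base64.getEncoder().encodeToString(bos.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // deserialize it back on the receiving side
    static Object decode(String s) {
        byte[] bytes = Base64.getDecoder().decode(s);
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<>();
        map.put("A", "1");
        @SuppressWarnings("unchecked")
        HashMap<String, String> back = (HashMap<String, String>) decode(encode(map));
        System.out.println(back.get("A")); // prints 1
    }
}
```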

Anonymous
Not applicable
Author

Hi Sabrina,

I followed the page below and was able to write to an RDD, saving the values into globalMap for further use in the job.

Please review the page below and its mention of globalMap usage in a Spark Big Data job. I used a Hive component to execute the query, stored the result in globalMap in a tJava component, and connected OnSubjobOk to the actual Hive components where the globalMap values are used. I am able to execute the query and print the values successfully; the values are properly substituted into the later queries.

https://community.talend.com/t5/Design-and-Development/tcontextload-component-in-talend-Big-data/m-p...
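The substitution pattern described above can be sketched in plain Java (globalMap is simulated with a HashMap; the table, column, and key names are made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class HiveQuerySubstitution {
    // stand-in for Talend's globalMap
    static final Map<String, Object> globalMap = new HashMap<>();

    // build the later Hive query from the value stored by the first query's tJava
    static String buildQuery() {
        String maxDate = (String) globalMap.get("max_load_date");
        return "SELECT * FROM sales WHERE load_date > '" + maxDate + "'";
    }

    public static void main(String[] args) {
        // what the tJava after the first Hive query would do with the fetched value
        globalMap.put("max_load_date", "2019-01-31");
        System.out.println(buildQuery());
    }
}
```

This works here because the globalMap read happens on the driver side, when the query string is built, not inside distributed Spark transformations.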