Hi all,
I have a question about how code is generated by Big Data Studio. My understanding is that if I use any of the Hive/HDFS components, MapReduce jobs are created and submitted to the Hive/Hadoop server.
Let's say I want to read data from a Hive table, apply some transformations, and then load the result into another Hive table. I could do it the following way:
read the data from the Hive table, use tMap to do the transformations, then load the output into the target Hive table.
How will the data be processed in this case? Will a MapReduce program be generated covering all three components, or will the data be fetched to the server where the Talend job is running and transformed there?
Basically, what I want to know is: will all three components use the parallel processing power of the Hive/Hadoop cluster, or will some of the processing happen on the server where the Talend job runs?
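For context, my worry is the second pattern: a standard (non-MapReduce) job where the generated Java streams every row through the job server's JVM. A minimal sketch of that pattern, assuming in-memory rows in place of a real Hive/JDBC connection (the class and method names here are illustrative, not the actual generated code):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: in a standard job, rows are pulled from the source
// table to the job server's JVM, transformed there, and written back.
public class TMapSketch {

    // Stand-in for the tMap step: runs locally in the job's JVM,
    // NOT as a MapReduce task on the cluster.
    static String[] transform(String[] row) {
        // Example transformation: uppercase the second column
        return new String[] { row[0], row[1].toUpperCase() };
    }

    public static void main(String[] args) {
        // Stand-in for rows fetched from the source Hive table
        List<String[]> input = new ArrayList<>();
        input.add(new String[] { "1", "alice" });
        input.add(new String[] { "2", "bob" });

        // Each row crosses the network twice: fetched to the job server,
        // transformed locally, then written back to the target table.
        List<String[]> output = new ArrayList<>();
        for (String[] row : input) {
            output.add(transform(row));
        }
        for (String[] row : output) {
            System.out.println(row[0] + "," + row[1]);
        }
    }
}
```

If this is what happens, the cluster's parallelism is only used for the read and the write, and the tMap work is bounded by the job server; an ELT-style approach that pushes the SQL down to Hive would keep the processing on the cluster instead.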