Hive Query

Anonymous · ‎2016-10-07

Hi all,
I have a question about how code is generated by Big Data Studio. My understanding is that if I use any of the Hive/HDFS components, MR jobs are created and submitted to the Hive/Hadoop server.
Let us say I want to read data from a Hive server, apply some transformations, and then load it into another Hive table. I can do it the following way:
Read the data from the Hive table, use tMap to do the transformations, and load it into a Hive table.
How will the data be processed in this case? Will an MR program be generated for all 3 components, or will the data be fetched to the server where the Talend job is running, with the transformation happening on that server?
Basically, what I want to know is: will all 3 components use the parallel processing power of the Hive server, or will some processing happen on the server where the Talend job is running?

Anonymous · ‎2016-10-10

Please help!

Anonymous · ‎2016-10-10

My understanding is that you should use tELTHiveInput, tELTHiveMap and tELTHiveOutput for this, so that the job takes advantage of the MR functionality in the Hadoop cluster.
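
To illustrate the idea behind the ELT components: as far as I understand, they combine the read, transform and load steps into a single SQL statement that Hive executes itself, so the rows never pass through the machine running the Talend job. Below is a rough Python sketch of that pattern. It is not the code Talend generates; the helper `build_pushdown_query`, the table names `sales_raw` and `sales_clean`, and the column expressions are all made up for illustration.

```python
def build_pushdown_query(source_table, target_table, column_exprs):
    """Build one INSERT ... SELECT statement so that the read, the
    transformation and the load all run inside Hive.

    column_exprs maps each target column name to a HiveQL expression
    over the source table (hypothetical names, for illustration only).
    """
    select_list = ",\n       ".join(
        f"{expr} AS {name}" for name, expr in column_exprs.items()
    )
    return (
        f"INSERT INTO TABLE {target_table}\n"
        f"SELECT {select_list}\n"
        f"FROM {source_table}"
    )


# Hypothetical tables and transformations.
query = build_pushdown_query(
    source_table="sales_raw",
    target_table="sales_clean",
    column_exprs={
        "customer_id": "customer_id",
        "customer_name": "upper(trim(customer_name))",
        "amount_usd": "amount * 1.1",
    },
)
print(query)
```

The point of the design is that the transformations are expressed as HiveQL expressions inside one statement, so Hive plans and runs the whole thing on the cluster. With a row-by-row approach, the data would instead have to be fetched to the job, transformed there, and written back.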

Big Data

v6.x
