Anonymous
Not applicable

Hive Query

Hi all,
I have a question about how Big Data Studio generates code. My understanding is that when I use any of the Hive/HDFS components, MR jobs are created and submitted to the Hive/Hadoop server.
Let us say I want to read data from a Hive server, do some transformations, and then load the result into another Hive table. I can do it the following way:
Read the data from the Hive table, use tMap to do the transformations, and load it into a Hive table.
How will the data be processed in this case? Will an MR program be generated for all 3 components, or will the data be fetched to the server where the Talend job is running, with the transformation happening on that server?
Basically, what I want to know is: will all 3 components use the parallel processing power of the Hive server, or will any processing happen on the server where the Talend job is running?

2 Replies
Anonymous
Not applicable
Author

Please help!
Anonymous
Not applicable
Author

My understanding is that you should use tELTHiveInput, tELTHiveMap and tELTHiveOutput to achieve this, so that you take advantage of the MR functionality in the Hadoop cluster.
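
To illustrate the idea behind the ELT approach, here is a rough Python sketch of the kind of single statement that gets pushed down to Hive, so that the read, transform and load all run inside the cluster instead of on the job server. This is not Talend's generated code. The function name, the table names (raw_orders, clean_orders) and the column expressions are all made up for the example.

```python
def build_pushdown_query(source_table, target_table, column_exprs, where=None):
    """Compose one HiveQL INSERT ... SELECT statement (illustrative only).

    column_exprs maps each output column name to the Hive expression
    that computes it, so the transformation is expressed as SQL.
    """
    select_list = ", ".join(
        f"{expr} AS {alias}" for alias, expr in column_exprs.items()
    )
    query = (
        f"INSERT INTO TABLE {target_table} "
        f"SELECT {select_list} FROM {source_table}"
    )
    if where:
        query += f" WHERE {where}"
    return query


# Hypothetical tables and columns, for illustration only.
query = build_pushdown_query(
    "raw_orders",
    "clean_orders",
    {"order_id": "order_id", "amount_usd": "ROUND(amount * 1.1, 2)"},
    where="amount IS NOT NULL",
)
print(query)
```

Because the whole flow is one statement, Hive plans and executes it on the cluster and no rows need to pass through the machine that submits it. With a tMap in the middle, by contrast, the transformation logic lives in the job itself rather than in the query.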