Anonymous
Not applicable

Sqoop vs Hive in Talend

Hi All,

 

I am slightly confused about Sqoop versus Hive when it comes to saving structured data in HDFS.

Which is the better component to choose, Sqoop or Hive, for saving structured data in HDFS, and why?

 

Thanks


Accepted Solutions
lojdr
Creator II

Hello,

 

Sqoop imports and exports data between an RDBMS and HDFS, whereas Hive is a SQL abstraction layer on top of Hadoop. The two tools serve different purposes.

You can use Sqoop for importing data into HDFS and then use Hive for querying. 
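
As a rough illustration of that two-step flow, here is a minimal Python sketch that only builds the `sqoop import` command line and a matching Hive `CREATE EXTERNAL TABLE` statement; it does not run either tool. The JDBC URL, user, password file, table name, columns, and HDFS path are all made-up placeholders, and the helper function names are hypothetical, not part of Sqoop, Hive, or Talend.

```python
import shlex


def build_sqoop_import_args(jdbc_url, username, password_file,
                            table, target_dir, num_mappers=1):
    """Build the argument list for a `sqoop import` of one RDBMS table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,   # file holding the DB password
        "--table", table,
        "--target-dir", target_dir,         # HDFS directory that receives the files
        "--fields-terminated-by", ",",      # write comma-delimited text files
        "--num-mappers", str(num_mappers),
    ]


def build_hive_external_table_ddl(table, columns, location):
    """Build Hive DDL that exposes the imported files as a queryable table."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"STORED AS TEXTFILE LOCATION '{location}'"
    )


if __name__ == "__main__":
    # All values below are placeholders chosen for illustration only.
    args = build_sqoop_import_args(
        jdbc_url="jdbc:mysql://dbhost/sales",
        username="etl_user",
        password_file="/user/etl/.db_password",
        table="customers",
        target_dir="/data/customers",
    )
    print(shlex.join(args))
    print(build_hive_external_table_ddl(
        "customers", [("id", "INT"), ("name", "STRING")], "/data/customers"))
```

The point of the sketch is the division of labour: Sqoop lands the rows in an HDFS directory, and Hive only adds a schema on top of that directory so the files can be queried with SQL.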

 

Regards

Lojdr


5 Replies
Anonymous
Not applicable
Author

Thank you. 

 

When it comes to saving structured data in HDFS, it can be done through Hive as well as through Sqoop.

Which component should we use to store structured data in HDFS?

lojdr
Creator II


@ShahabZaidi wrote:

    Which component shall we use to store the structured data in HDFS. 


It depends on your needs and preferences. You can use any RDBMS; it is perfect storage for structured data. There is no need to keep both structured and unstructured data in HDFS, but there are several options, such as Kudu, Parquet, and HBase, all designed for analytics workloads. If you need an OLTP workload, I would prefer an RDBMS.

 

Regards

Lojdr

Anonymous
Not applicable
Author

I recently faced this question in a certification exam: which component is best for saving structured data in HDFS? I understand that it is possible through Hive as well as through Sqoop, but only one option can be chosen.

lojdr
Creator II

Sqoop.

My opinion.