Hi All,
I am slightly confused about Sqoop versus Hive when it comes to saving structured data in HDFS.
Which of the two is the better component to choose for saving structured data in HDFS, and why?
Thanks
Hello,
Sqoop is used to import and export data between an RDBMS and HDFS, while Hive is a SQL abstraction layer on top of Hadoop. The two tools serve different purposes.
You can use Sqoop to import data into HDFS and then use Hive to query it.
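To make that split concrete, here is a minimal Python sketch that only builds the two pieces as text and runs nothing against a cluster. The JDBC URL, user name, table name, columns, and HDFS path are all made-up placeholders, and the helper names `sqoop_import_command` and `hive_external_table_ddl` are hypothetical. The Sqoop options used (`--connect`, `--username`, `-P`, `--table`, `--target-dir`, `--fields-terminated-by`, `--num-mappers`) are standard `sqoop import` arguments, but adjust them to your own setup.

```python
import shlex


def sqoop_import_command(jdbc_url, username, table, target_dir):
    """Build (but do not run) a Sqoop command that copies one RDBMS table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,            # JDBC URL of the source database (placeholder)
        "--username", username,
        "-P",                             # ask for the password on the console
        "--table", table,                 # source table in the RDBMS
        "--target-dir", target_dir,       # HDFS directory that receives the files
        "--fields-terminated-by", ",",    # write comma-delimited text files
        "--num-mappers", "1",             # single map task, fine for a small table
    ]


def hive_external_table_ddl(table, columns, location):
    """Build a Hive DDL statement that exposes the imported files for querying."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"STORED AS TEXTFILE LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    hdfs_dir = "/data/sales/orders"
    cmd = sqoop_import_command("jdbc:mysql://dbhost/sales", "etl_user", "orders", hdfs_dir)
    print(shlex.join(cmd))
    print(hive_external_table_ddl("orders", [("id", "INT"), ("amount", "DOUBLE")], hdfs_dir))
```

In this sketch Sqoop is what actually lands the data in HDFS, and the Hive statement only puts a table definition over the same directory so it can be queried with SQL.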
Regards
Lojdr
Thank you.
When it comes to saving structured data in HDFS, it can be done through Hive as well as through Sqoop.
Which component should we use to store structured data in HDFS?
@ShahabZaidi wrote:
Which component should we use to store structured data in HDFS?
It depends on your needs and preferences. You can use any RDBMS; it is a perfectly good store for structured data. There is no need to keep both structured and unstructured data in HDFS, but if you do, there are several tools: Kudu, Parquet, HBase... all designed for analytics workloads. For an OLTP workload, I would prefer an RDBMS.
Regards
Lojdr
I recently faced this question in a certification exam: which component is best for saving structured data in HDFS? I understand that it is possible through Hive as well as through Sqoop, but only one option can be chosen.
Sqoop.
My opinion.