Anonymous
Not applicable

Breaking up Hive query extract into multiple files?

I am using a Hive query as my source, and the result is written to a file on HDFS (7.7 GB). My aim is to move this file into S3, but S3 has a 5 GB limit on single uploads.
Is there a way for me to break this file up into multiple chunks?
tHDFSConnection --> tHiveConnection --> tHiveInput --> tMap --> tHDFSOutput
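For illustration, below is a rough sketch of one way the extract could be chunked with the Hadoop FileSystem API, outside of Talend. The paths and chunk size are placeholders, and the split is on raw bytes, so records may be cut across chunk boundaries if the downstream consumer needs whole rows.

```java
// Hypothetical sketch: split a large HDFS file into chunks that stay under the
// 5 GB single-upload limit on S3. Paths and the chunk size are assumptions.
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsFileSplitter {

    // Keep each chunk comfortably under the 5 GB single-upload limit.
    private static final long CHUNK_BYTES = 4L * 1024 * 1024 * 1024; // 4 GB

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path source = new Path("/user/talend/hive_extract/output.csv"); // assumed path
        Path targetDir = new Path("/user/talend/hive_extract/chunks");  // assumed path
        fs.mkdirs(targetDir);

        byte[] buffer = new byte[8 * 1024 * 1024];
        int part = 0;
        long writtenInPart = 0;

        try (InputStream in = fs.open(source)) {
            OutputStream out = fs.create(new Path(targetDir, "part-" + part));
            int read;
            while ((read = in.read(buffer)) != -1) {
                // Roll over to a new chunk once the current one would exceed the limit.
                if (writtenInPart + read > CHUNK_BYTES) {
                    out.close();
                    part++;
                    writtenInPart = 0;
                    out = fs.create(new Path(targetDir, "part-" + part));
                }
                out.write(buffer, 0, read);
                writtenInPart += read;
            }
            out.close();
        }
        System.out.println("Wrote " + (part + 1) + " chunks to " + targetDir);
    }
}
```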
 
2 Replies
amarouni
Contributor

Hello,
Is the output of tHDFSOutput a single 7.7 GB file? Are you executing the job on a Hadoop cluster?
You can take a look at the tELTHive components (tELTHiveInput, tELTHiveMap, tELTHiveOutput): the output will be written to a Hive table, but the whole job will be executed on the cluster. If you're using a cluster with multiple machines, this will generate separate partition files that you can then move to S3.
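As a rough illustration of the ELT idea (not the code the tELTHive components generate), the statement below pushes the work down to Hive over JDBC. The JDBC URL, credentials, and table names are placeholders. Each map/reduce task writes its own part file under the target table's HDFS directory, so no single oversized file is produced.

```java
// Minimal sketch of ELT-style execution: the transformation runs on the cluster
// via a Hive statement. URL, user, and table names are assumptions for illustration.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveEltSketch {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        String url = "jdbc:hive2://hive-server:10000/default"; // assumed HiveServer2 URL

        try (Connection conn = DriverManager.getConnection(url, "talend", "");
             Statement stmt = conn.createStatement()) {
            // CTAS runs entirely on the cluster; the result lands in the table's
            // HDFS directory as multiple part files (one per task).
            stmt.execute(
                "CREATE TABLE extract_result STORED AS TEXTFILE "
              + "AS SELECT * FROM source_table");
        }
    }
}
```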
Anonymous
Not applicable
Author

Hi,
Which of the tHive and tELTHive components is better? Please also suggest when we should use the tHive components and when we should go for the tELTHive components.