I am unloading data from an Oracle database into Hadoop and landing it as an HDFS file. However, the output is a single file of about 750 MB.
Can anyone please help me understand why it behaves this way? I would like each part file to be 128 MB in size.
Where do I have to update the settings?
Found 2 items
-rw-r--r-- 3 hdfs supergroup 0 2019-02-07 07:37 /var/lib/hadoop-hdfs/lake/itineraryitem2/_SUCCESS
-rw-r--r-- 3 hdfs supergroup 730087184 2019-02-07 07:37 /var/lib/hadoop-hdfs/lake/itineraryitem2/part-00000
@badri-nair, check the link below; it should help you understand.
https://www.talend.com/blog/2018/04/12/apache-spark-performance-and-tuning-blog/
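As a rough sketch, and assuming the job runs on Spark (the thread does not say so explicitly, but the linked post is about Spark tuning): Spark writes one part file per partition, so a single part-00000 suggests the data sat in one partition when it was written. The snippet below only works out how many partitions would be needed for the 730087184-byte file in the listing to be split into parts of at most 128 MB. The function name `target_partitions` is made up for illustration, and actual part sizes depend on the output format and compression.

```python
import math


def target_partitions(total_bytes: int, target_file_bytes: int = 128 * 1024 * 1024) -> int:
    """Return how many part files are needed so each is at most target_file_bytes."""
    if total_bytes <= 0:
        return 1
    return math.ceil(total_bytes / target_file_bytes)


# Size of part-00000 taken from the directory listing in the question.
n = target_partitions(730_087_184)
print(n)  # 6

# Assumption: in a Spark job this count would be applied before the write,
# for example with df.repartition(n), or by reading from Oracle in parallel
# using the JDBC options partitionColumn, lowerBound, upperBound and numPartitions.
```

With six partitions the parts would average roughly 116 MB each if the rows are evenly distributed, which is not guaranteed.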