Hi,
I am running a big data batch job (MapReduce) over 58 million records, and it takes 5 hours to complete. There are a few joins. One thing I noticed is that for every row it writes a line like the ones below to the log, which I think is why it takes so long to complete.
Running job: job_1483713701028_0784
map 0% reduce 0%
1|1|0.0|0.0 <== This is getting printed in the log. How can I stop it from printing?
1|1|0.0|0.0
1|1|0.0|0.0
1|1|0.0|0.0
1|1|0.0|0.0
1|1|0.0|0.0
Please note that I am reading a Parquet file. I filtered the records and tried to display only one record in the console, but even then I get the same repeated "1|1|0.0|0.0" lines as above.
I have increased the map and reduce memory to 5 GB and 10 GB, but the run time has not improved much.
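For reference, the memory change corresponds roughly to the settings below. This is only a sketch: it assumes the standard Hadoop 2 (YARN) property names, and that the 5 GB applies to map tasks and the 10 GB to reduce tasks. How they are actually passed (job configuration, command line, or site file) may differ in my setup.

```
# Assumed property names (Hadoop 2 / YARN); values are in MB
mapreduce.map.memory.mb=5120       # 5 GB per map task container
mapreduce.reduce.memory.mb=10240   # 10 GB per reduce task container
```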
Question:
1) Why is this getting printed, and how do I fix it?
Appreciate your help!!!