Task:
I have a group of messages in a queue. A Spark Streaming job consumes them, gets the latest record among them, and loads it into HDFS.
Issue:
1. I want to save the data to a .csv file, but a numeric suffix is appended to the file name given in the tFileOutput component.
Example: I want to save the data in maindata.csv, but the job creates a folder named maindata.csv-1522775132000 and saves the data inside that folder.
2. The job creates 14 empty partition files and writes all the data into the 15th partition file.
Expected Output:
1. Can I write the data directly to maindata.csv?
2. Can I set the number of partitions according to the data?
Thanks in advance!!
One option for Issue 1 is to check the 'Merge result to single file' option in the tFileOutputDelimited component properties and set the 'Merge File Path' property to your file path for maindata.csv.
This creates a file with the name of your choice, in the path you define, with the data from all the part- files merged into one file. Optionally, you can remove the source directory and/or override the target file.
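To illustrate what such a merge amounts to, here is a minimal sketch in Python. It is not how the Talend component is implemented, and it works on a local directory rather than HDFS. The function name `merge_part_files` and its parameters are hypothetical, chosen only for this example. It concatenates the part- files of an output folder in name order into a single file and can optionally delete the source folder afterwards.

```python
import shutil
from pathlib import Path


def merge_part_files(source_dir, target_file, remove_source=False):
    """Concatenate every part-* file in source_dir into target_file.

    Hypothetical helper for illustration only; it assumes a local
    directory, not an HDFS path.
    """
    source = Path(source_dir)
    target = Path(target_file)
    # Sort by name so part-00000, part-00001, ... keep their order.
    # Marker files such as _SUCCESS do not match the pattern.
    parts = sorted(p for p in source.glob("part-*") if p.is_file())
    with target.open("wb") as out:
        for part in parts:
            # Empty partition files simply contribute no bytes.
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)
    if remove_source:
        # Mirrors the optional "remove source directory" behaviour.
        shutil.rmtree(source)
    return target
```

Called as `merge_part_files("maindata.csv-1522775132000", "maindata.csv", remove_source=True)`, it would leave a single maindata.csv. Any empty part- files in the folder have no effect on the merged result.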
Hope this helps.