Hi!
I'm new to Talend and I've been tasked with creating a job that reads Excel files and outputs them as CSV. I know Talend can read an Excel file, but how do I programmatically read files whose sheet names differ? For example, the sheet name might be "october_PP", "Oct_PP", "10_PP", etc. It is dynamic but follows the pattern "*_PP".
Appreciate all help!!
Hi,
If the sheet name changes but the sheet order does not, you can also replace the sheet name with its position (from 0 to n) or, if there is only 1 sheet per file, tick the option "All sheets".
Hi,
Could you please elaborate on how to achieve this? I have a similar requirement. The sheet name changes every time but follows a fixed pattern such as "Nov 2015 ALL Data", where "Nov 2015" keeps changing and the last two words "ALL DATA" stay the same. Is there a way to read a sheet whose name matches a pattern?
Thanks, but it's not working.
Sheet name is "OCT 2016 (all data)"
Applied regex as "*all data"
That's the whole point: I only need a pattern to match. The enclosing "(" and ")" may not exist in subsequent data sets, but the text "all data" will always be there.
Same error:
java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
*(all data)
^
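For anyone hitting the same exception: it can be reproduced in plain Java, outside Talend. The sketch below is only an illustration; the class and method names are made up, and it assumes the component needs the whole sheet name to match the pattern, the way String.matches does. In a Java regex, * is a quantifier and needs something in front of it to repeat (for example . for "any character"), which is why a glob-style "*all data" is rejected.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, not part of Talend.
class DanglingStarDemo {

    // Returns true if the given string compiles as a Java regex.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // e.g. "Dangling meta character '*' near index 0"
            return false;
        }
    }

    // Whole-string match, as String.matches does (an assumption about
    // how the sheet name is compared against the pattern).
    static boolean sheetMatches(String sheetName, String regex) {
        return sheetName.matches(regex);
    }

    public static void main(String[] args) {
        // Glob-style pattern: '*' has nothing to repeat, so it does not compile.
        System.out.println(isValidRegex("*all data"));
        // Regex equivalent of the glob "*all data*".
        System.out.println(isValidRegex(".*all data.*"));
        // Matches with or without the surrounding parentheses.
        System.out.println(sheetMatches("OCT 2016 (all data)", ".*all data.*"));
        // (?i) makes the match case-insensitive.
        System.out.println(sheetMatches("Nov 2015 ALL Data", "(?i).*all data.*"));
    }
}
```

A pattern like "(?i).*all data.*" does not depend on the parentheses being present, so it should keep working if later files drop them.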
@pratikpandya, if your sheet name is formatted like "OCT 2016 (all data)", the regex must be:
".*\\(all data\\)$"
which means any string ending with the substring "(all data)". Java regexes are case-sensitive by default, so the case must match the sheet name; prefix the pattern with (?i) to ignore case.
If the parentheses may contain any text, change the regex to this one:
".*\\(.*\\)$"
which means any string ending with any substring enclosed in ().
@Victor, if your Excel file contains only 1 sheet, the sheet name doesn't matter, just tick the option "All sheets".
Otherwise, the following regex should work (don't forget to tick the option "Use Regex"):
".*_PP$"
which means any string ending with "_PP".
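To check a pattern before putting it into the component, the regexes above can be tried against a list of candidate sheet names in plain Java. This is a rough sketch, not Talend code: the class name, the helper method and the extra sheet names ("Summary", "PP_notes") are invented for the test, and it assumes a whole-name match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for trying out sheet-name regexes outside Talend.
class SheetNameFilter {

    // Returns the sheet names that match the regex as a whole string.
    static List<String> matchingSheets(List<String> sheetNames, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matches = new ArrayList<>();
        for (String name : sheetNames) {
            if (pattern.matcher(name).matches()) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "october_PP", "Oct_PP", "10_PP", "Summary", "PP_notes");

        // Names ending with "_PP".
        System.out.println(matchingSheets(names, ".*_PP$"));
        // prints [october_PP, Oct_PP, 10_PP]

        // Names ending with any text in parentheses.
        System.out.println(matchingSheets(
                Arrays.asList("OCT 2016 (all data)", "OCT 2016"), ".*\\(.*\\)$"));
        // prints [OCT 2016 (all data)]
    }
}
```

Note that the backslashes are doubled because the pattern is written as a Java string literal, the same way it is typed in the component field.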