
How to read Excel files with dynamic sheet names?
Hi!
I'm new to Talend and I've been tasked with creating a job that reads Excel files and outputs them as CSV. I know I can read an Excel file in Talend, but how do I programmatically read files whose sheet names differ? For example, the sheet name might be "october_PP", "Oct_PP", "10_PP", etc. It is dynamic but follows the pattern "*_PP".
Appreciate all help!
Accepted Solutions

@pratikpandya, if your sheet name is formatted like "OCT 2016 (all data)", the regex must be:
".*\\(all data\\)$"
which means any string ending with the substring "(all data)". Java regexes are case-sensitive by default, so the pattern should use the same case as the sheet name.
If any text may appear between the parentheses, change the regex to:
".*\\(.*\\)$"
which means any string ending with a substring enclosed in parentheses.
@Victor, if your Excel file contains only one sheet, the sheet name doesn't matter; just tick the "All sheets" option.
Otherwise, the following regex should work (don't forget to tick the "Use Regex" option):
".*_PP$"
which means any string ending with "_PP".
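To try these patterns outside Talend, here is a minimal Java sketch. It assumes the component compares the regex against the whole sheet name, which is how `Matcher.matches()` behaves; the class name `SheetRegexDemo`, the helper `select`, and the sample sheet list (including "Summary") are made up for illustration. The `(?i)` prefix is an optional flag that makes the match case-insensitive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class SheetRegexDemo {
    // Returns the sheet names that the regex matches in full.
    // Assumption: Talend applies the pattern to the entire sheet name.
    static List<String> select(String regex, List<String> sheetNames) {
        Pattern pattern = Pattern.compile(regex);
        List<String> selected = new ArrayList<>();
        for (String name : sheetNames) {
            if (pattern.matcher(name).matches()) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical workbook content, for illustration only.
        List<String> sheets = Arrays.asList(
                "october_PP", "Oct_PP", "10_PP", "Summary", "OCT 2016 (all data)");

        System.out.println(select(".*_PP$", sheets));
        // prints [october_PP, Oct_PP, 10_PP]
        System.out.println(select("(?i).*\\(all data\\)$", sheets));
        // prints [OCT 2016 (all data)]
        System.out.println(select(".*\\(.*\\)$", sheets));
        // prints [OCT 2016 (all data)]
    }
}
```

The doubled backslashes are Java string escaping: the regex engine sees `\(`, which matches a literal parenthesis instead of opening a group.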

The tFileInputExcel component has a "Use Regex" option that lets you specify the sheet name as a regular expression.
Regards
Shong

Hi,
If the sheet name changes but the sheet order does not, you can also replace the sheet name with its position (from 0 to n) or, if there is only one sheet per file, tick the "All sheets" option.

Hi,
Could you please elaborate on how to achieve this? I have a similar requirement. The sheet name changes every time but follows a fixed pattern such as "Nov 2015 ALL Data", where "Nov 2015" keeps changing and the last two words, "ALL Data", stay the same. Is there a way to select the sheet by matching its name against a pattern?

Enter "*ALL Data" (including the quotes) in the "Sheet list" area then tick the "Use Regex" option.

Thanks, but it's not working.
The sheet name is "OCT 2016 (all data)".
I applied the regex "*all data".
error.PNG

Try this one:
"*(all data)"

That's the whole point: I only need a pattern to match, and the enclosing "(" and ")" may not exist in subsequent data sets. The text "all data" will always remain, though.

Same error:
java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
*(all data)
^
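For readers who hit the same exception: in a Java regex, `*` is a quantifier that needs something before it to repeat, so a pattern that starts with `*` does not compile. The glob-style "any text" wildcard is written `.*` in a regex. The sketch below reproduces the failure and then tries one possible pattern for the requirement described above (text ending in "all data", with or without a closing parenthesis, in any case). The class name `DanglingStarDemo`, the helper `isValidRegex`, and the `ALL_DATA` pattern are illustrative assumptions, not something taken from Talend.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class DanglingStarDemo {
    // Hypothetical pattern: case-insensitive, any prefix, then "all data",
    // optionally followed by a closing parenthesis at the end of the name.
    static final String ALL_DATA = "(?i).*all data\\)?$";

    // True when the string compiles as a Java regular expression.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // A leading '*' ends up here: "Dangling meta character '*'".
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("*(all data)"));          // prints false
        System.out.println(isValidRegex(".*\\(all data\\)$"));    // prints true

        System.out.println("OCT 2016 (all data)".matches(ALL_DATA)); // prints true
        System.out.println("Nov 2015 ALL Data".matches(ALL_DATA));   // prints true
        System.out.println("Summary".matches(ALL_DATA));             // prints false
    }
}
```

In the Talend "Sheet list" field the pattern would be entered as a quoted Java string, so the backslash stays doubled there as well.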
