Skip to main content
Announcements
A fresh, new look for the Data Integration & Quality forums and navigation! Read more about what's changed.
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

How to read excel file with dynamic sheet names?

Hi !

 

I'm new to Talend and I'm tasked to create a job to read in excel files and output as CSV. I know I can read in excel file in Talend but how do I programatically read in files with different sheet names? For example, sometimes the sheet name might be "october_PP", "Oct_PP", "10_PP",etc.. It is dynamic but follows a pattern of "*_PP".

 

Appreciate all help!!

 

 

Labels (1)
1 Solution

Accepted Solutions
TRF
Champion II

@pratikpandya, if your sheetname is formatted like "OCT 2016 (all data)" the regex must be:

".*\\(All Data\\)$"

wich means any string finishing by the substring "(All Data)".

 

If your string may contained any string between the (), change the regex like this one: 

".*\\(.*\\)$"

wich means any string finishing by any substring contained between ().

 

@Victor, if your excel file contains only 1 sheet, the sheetname doesn't matter, just tick the option "All sheets".

Else, the following regex should work (don't forget to tick the option "Use Regex"):

".*_PP$"

which means any string finishing by "_PP". 

View solution in original post

19 Replies
Anonymous
Not applicable
Author

Hi
tFileInputExcel component supports Use Regex to set the sheet name.

Regards
Shong
TRF
Champion II

Hi,

If the sheetname changes but not the sheet order, you may also replace the sheetname by its position (from 0 to n) or, if there is only 1 sheet per file, tick the option "All sheets".

 

Anonymous
Not applicable
Author

Hi,

Could you please further elaborate how to achieve this, I too have the similar requirement. The sheet name changes every time, and there is a fixed pattern such as "Nov 2015 ALL Data" Where "Nov 2015" will keep on changing but the last two words "ALL DATA" will remain as is. Is there a way i read based on sheet name matching as per pattern?

TRF
Champion II

Hi,
Enter "*ALL Data" (including the quotes) in the "Sheet list" area then tick the "Use Regex" option.
Anonymous
Not applicable
Author

thanks, but it's not working. 

 

Sheet name is "OCT 2016 (all data)" 

 

Applied regex as "*all data" 

 

 

 

 


error.PNG
TRF
Champion II

It doesn't work because it doesn't match the example you've given.
Try this one:
"*(all data)"
Anonymous
Not applicable
Author

That's the whole point, I need only a pattern to match, enclosed "(" and ")" may not exist in subsequent data sets.  Text "all data" will always remain though. 

Anonymous
Not applicable
Author

Same error , 

 

java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
*(all data)
^

TRF
Champion II

@pratikpandya, if your sheetname is formatted like "OCT 2016 (all data)" the regex must be:

".*\\(All Data\\)$"

wich means any string finishing by the substring "(All Data)".

 

If your string may contained any string between the (), change the regex like this one: 

".*\\(.*\\)$"

wich means any string finishing by any substring contained between ().

 

@Victor, if your excel file contains only 1 sheet, the sheetname doesn't matter, just tick the option "All sheets".

Else, the following regex should work (don't forget to tick the option "Use Regex"):

".*_PP$"

which means any string finishing by "_PP".