Skip to main content
Announcements
A fresh, new look for the Data Integration & Quality forums and navigation! Read more about what's changed.
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

tFileList: iterate through Excel files and sum values

Hey,
I have 700 Excel files with identical structure (ie. tables). I want to be able to iterate through each one of them, grab some of the tables' values, sum them up and then finally write them to my Access DB.
I have, by using a tFileList component, created a setup that iterates through a directory and grabs each file. It also identifies the correct tables and grabs their values. However, when I want to write the sum of the table-values to my DB with a tAccessOutput I end up, not with the sum, but only with the last entry in each respective table. I tried adding a tAccessInput to my tMap component (see image) but it seems that the tMap that I use doesn't understand that I want to grab the values there each iteration and instead it looks up the values right at the start of the execution and then uses the same ones over and over (messing up the sum function I use in tMap).
If anyone has any tips or advice that might help (is java code reauired for example?) it would be greatly appreciated.
Thanks in advance!
edit: Nevermind about the 700 files not being seen in the image, in this example I just used 2 files first to test my job
0683p000009MEh5.jpg
Labels (3)
6 Replies
Anonymous
Not applicable
Author

You will need to aggregate all the values for each table, the tAggreagateRow component might do the job.
You just need to work out if it should be before your tMap or after, which depends what the tMap does, but most likely before the tMap.
Anonymous
Not applicable
Author

Thanks, I tried the tAggregateRow and it seems to work partly. However it doesn't seem to want to sum values with 4 digits. I get the error below (image).
The sum of all columns is 10169 and if you calculate 6773+2318+1078 it gives you that. However, what is actually written in the DB after tMap is the 6773. Type is Integer length 255 in Talend so shouldn't be a problem I think?
0683p000009MEhA.jpg
Anonymous
Not applicable
Author

Just to clarify, do you need to sum some columns on a record or do you want to sum all recors in the file with one total value?
if the first than the aggregate is not required and you can just put the formula into your tMap.
Anonymous
Not applicable
Author

No what I meant was the first option you mentioned, one summation per file. The tAggregateRow worked fine. Problem was that the values in thousands were saved with a space "1 234" for example instead of "1234". I wrote a macro to sort it out and now it works.
However, a new problem arose. I manage to iterate through all the .xlsx files except the ones that are empty. My tFileList component just skips these. What I want is to set these files' summation values to "0". I tried a filter with the condition Effectif = null (Effectic being the column I am summing up) but the problem is that the file won't even go through the ouput in the tFileList component.
Any ideas on how to solve this?
edit: added image
0683p000009MEgn.jpg
Anonymous
Not applicable
Author

Solved it with an Excel macro instead. Still, would be nice to know for future reference if this would have had been possible in Talend.
alevy
Specialist

If the file exists and meets the filter criteria in tFileList then it will get as far as tFileInputExcel. The problem is that an empty file will not have any rows to read and thus will not progress past that point.
What you should do is have a RunIf trigger from tFileInputExcel checking e.g. (Integer)globalMap.get("tFileInputExcel_1_NB_LINE")==0, and using a tFixedFlowInput to create a row to send to your DB for the empty files.