Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello, please could you help me.
I have a job that runs an Excel file list, the problem is that the column names change, so I created a generic schema and I want to replace the header with the first line.
Any suggestions are welcomed.
Thanks
Input :
colA | colB | colC |
Name | 201901 | 201902 |
aa | 10 | 20 |
Output :
Name | 201901 | 201902 |
aa | 10 | 20 |
If you need to read the header columns of excel, you can do it easily by specifying the "Limit" attribute of the component. It will limit the number of columns to be read. So you can do a read of all files with header details alone to get the date part and pass them to a separate output for later processing. Please transmit file name also with output so that you will be able to trace which file contain which date value.
Then you can do one more read with data alone (by specifying header value as 1).
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Hi
We can have an header record option in tFileInputExcel, and did you tried with that option?
Thank you for your reply.
Indeed, I created a metadata for an excel file that I use every time but the header of my file’s changes.
For example:
File1:
Name|201901|201902
File2:
Name|201801|201802
So, I need to change column names each execution.
Hi,
Keeping the column name as a year and month combination is not a good long term strategy. You should use it as data for later analysis.
For example, 201901 and 201902 are two column names in your input data. If you are keeping it as columns, you will find it difficult to manage since each table will have its own column names in the form of year+month.
But if you extract your data to another table in a different format, it will be the most easy to use.
Table A - Time dimension with year and month
Table B - Sales for each time dimension. ( You can link the values for each product say aa. ab etc. with time so that you will have full visibility of sales over a period of time)
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Thank you @nthampi for your answer.
That's why I have to keep the column names because I'm going to convert them to data.
I have several excel files that have the same structure, the same number of columns but whose names change and I want to create a single file in with a general structure col1, col2, col3 and that will take each run the first line ( which is the true header of the source file) as a header.
If you need to read the header columns of excel, you can do it easily by specifying the "Limit" attribute of the component. It will limit the number of columns to be read. So you can do a read of all files with header details alone to get the date part and pass them to a separate output for later processing. Please transmit file name also with output so that you will be able to trace which file contain which date value.
Then you can do one more read with data alone (by specifying header value as 1).
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Thank you @nthampi for your response.
I was able to solve the problem with your solution !!