Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
IanM
Contributor III
Contributor III

Duplicate column names - xlsx

Hello,

I'm using the excellent tFileExcelSheetInput component but I'm having an issue. It is not with the component itself but with the incoming data.

tFileExcelSheetInput allows me to 'search' for column names across the top row of each sheet, and then return the data under that column heading into my flow (which is great).

My issue is that the incoming spreadsheet can contain duplicate column names. tFileExcelSheetInput appears to be taking the right hand most column with that heading and returning that data. As I said, it isn't really the component I have an issue with it is the duplicate column names.. The duplicated column names have different data below.

So I can have column headings something like:

Name
Tel
Mobile
House Number
Name
Address1
Address2
etc..


The column positions can change from sheet to sheet - hence why I am using tFileExcelSheetInput.
As you can see above, the Column name 'Name' is repeated.
tFileExcelSheetInput will return the second 'Name' data (applies to a house name here and is hardly ever populated) rather than the persons 'Name' data.

I am getting around this by reading the top row, pivoting it, adding a numeric counter (to maintain its position) then doing a tUniqRow to identify duplicates, then renaming the duplicate with a 1,2,3etc re-forming the column, order it on the numeric counter to get the column positions correct, and then unpivot back to a spreadsheet, where I then append the data from the original worksheet.

In this example I end up with column names:

Name
Tel
Mobile
House Number
Name1
Address1
Address2
etc..

This is a very roundabout process.

Does anyone know of a better way of dealing with the duplicate column names?

many thanks

Ian

Labels (2)
1 Reply
Anonymous
Not applicable

Hi Ian
I think you have to identify the uniqueness of column whatever way you use with community version. You can take a look at the dynamic schema feature in Talend Subscription product.

Best regards
Shong