Solved: how to read an excel file with multiple headers - Qlik Community

muralam · ‎2018-12-11

Hi,

I have an excel file with multiple headers, kindly suggest how can i read this

File looks like above

please suggest any workaround to read this file

Regards,

Meena

Anonymous · ‎2018-12-11

@muralam

Please refer the sample job below.

Job output

First file read

Second file read

By changing the value of header and number of lines to read, you can selectively fetch the records from a sheet. Now in your mind there might be a query that how to fetch the header line or number of lines to read.

Ideally, its source team's responsibility since they are sending the data in this non-standard manner. But if they are not ready to give, then you will have to parse the each line and then identify the string pattern to read the starting and ending position.

Now a third approach will be to read all the lines using single file read and send it to a tjavarow for fetching. If the column1's value is "Name", understand that the following records are a different subset and add a new sequence group number to it. Later separate the records for each group to different flows (you can use a tMap here).

If you do not want to separate the data and just combine all the different datasets under a single file, then after reading, skip the records with "Name" for column1 using a tMap and it will be easy way forward.

Hope I have answered your query. Could you please mark the topic as closed, you are happy with the comments? Kudos are also welcome 🙂

Warm Regards,

Nikhil Thampi

View solution in original post

Anonymous · ‎2018-12-11

Hi,

You will have to read the same file twice with different values for first (header) and last records (limit) in tfileinputExcel.

It will help you to parse the data separately. But the ideal way will be to send the data from source at right format.

Warm Regards,

Nikhil Thampi

muralam · ‎2018-12-11

@nthampi

could you please help to explain in detail

Anonymous · ‎2018-12-11

@muralam

Please refer the sample job below.

Job output

First file read

Second file read

By changing the value of header and number of lines to read, you can selectively fetch the records from a sheet. Now in your mind there might be a query that how to fetch the header line or number of lines to read.

Ideally, its source team's responsibility since they are sending the data in this non-standard manner. But if they are not ready to give, then you will have to parse the each line and then identify the string pattern to read the starting and ending position.

Now a third approach will be to read all the lines using single file read and send it to a tjavarow for fetching. If the column1's value is "Name", understand that the following records are a different subset and add a new sequence group number to it. Later separate the records for each group to different flows (you can use a tMap here).

If you do not want to separate the data and just combine all the different datasets under a single file, then after reading, skip the records with "Name" for column1 using a tMap and it will be easy way forward.

Hope I have answered your query. Could you please mark the topic as closed, you are happy with the comments? Kudos are also welcome 🙂

Warm Regards,

Nikhil Thampi

muralam · ‎2018-12-12

@nthampi,

Thanks for the idea to try with plan B, i have tried the the suggested way unfortunately i am not able to read the file

here the case is headers are not static it may start any where in the file and can end at any point so help is needed to find a generic solution.

Thanks in advance

muralam · ‎2018-12-12

thanks the third step worked as expected

how to read an excel file with multiple headers

Talend Data Integration

v7.x