Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik GA: Multivariate Time Series in Qlik Predict: Get Details
cancel
Showing results for 
Search instead for 
Did you mean: 
MNREDDY
Contributor
Contributor

Convert Excel file to CSV file whose metadata varies each month

Hello Team,

 

I have a requirement like every month we get excel file and we need convert this into csv file for further process. Source and Target file structure like below.

Month 1:
SourceFile.xlsx
CUSTID,CUSTNAME,DOB,GENDER,StepFlag,PreviousStepFlag,CODE1,CODE2,CODE3,CODE4

TargetFile.csv
CUSTID,CUSTNAME,DOB,GENDER,StepFlag,PreviousStepFlag,CODE1,CODE2,CODE3,CODE4

Month 2:
SourceFile.xlsx
CUSTID,CUSTNAME,DOB,GENDER,StepFlag,PreviousStepFlag,CODE2,CODE4,CODE5,CODE6,CODE7,CODE10

TargetFile.csv
CUSTID,CUSTNAME,DOB,GENDER,StepFlag,PreviousStepFlag,CODE2,CODE4,CODE5,CODE6,CODE7,CODE10

Note: We would get every time more than 100 codes in the excel file. Also actual header starts from B8 cell in the excel.

But we shouldn't change the schema metadata  for every monthly run.

Once we loaded to csv file, then it should load into DB table and It is like below 

CUSTID,CUSTNAME,DOB,GENDER,StepFlag,PreviousStepFlag,CODE

For each code 1 record to be inserted, for 4 codes 4 records to be inserted, 6 codes 6 records to be inserted, n codes n records to be inserted.

Please help us to handle this scenario.

 

Regards,

NarsimhaReddy

 

Labels (3)
3 Replies
dipanjan93
Contributor
Contributor

You can use Dynamic schema feature of the Talend Enterprise Edition
akumar2301
Specialist II
Specialist II

@MNREDDY 

 

There is no single component to do so  but you could program it , thanks to tFileexcelInput , Start and end column Properties .

 

1) Read 1st 7 key column. Add sequence number

SEQNO CUSTID CUSTNAME DOB GENDER StepFlag PreviousStepFlag

 

2) You could get number of Column in excel using below command

new org.apache.poi.xssf.usermodel.XSSFWorkbook((String)globalMap.get("filename")).getSheet("SheetName").getRow(rownumber).getLastCellNum()

3) Now as you know column count , you could read columns in loop with set of 50 (thanks to tFileexcelInput , Start and last column Properties ).

e.g. if number of code columns is 60 , read 1st 50 code columns( startcolumn 8 ,endColumn 58) and next iteration 10 ( startcolumn 58 , endColumn 108)

4)

4.1) while reading , again add SEQNO

SEQNO CODE1 CODE2 CODE3 CODE4 ...

4.2) normalize the result  ( tSplitRow)

SEQNO CODE1

SEQNO CODE2

SEQNO CODE3

4.3) you can use tMap to do inner join(all match) between step1 and step 4.2 result on SEQNO

SEQNO CUSTID CUSTNAME DOB GENDER StepFlag PreviousStepFlag CODE1value

SEQNO CUSTID CUSTNAME DOB GENDER StepFlag PreviousStepFlag CODE2value

 

Above result you can update on table.

Hope this will be helpful to start.Let me know if you face problem.

 

 

 

MNREDDY
Contributor
Contributor
Author

@dipanjan93

We have tried this option, but  it is unable to recognize all the columns as we have 90 columns in the sheet. It is reading 12 columns only.

 

Regards,

NarsimhaReddy