Skip to main content
Announcements
SYSTEM MAINTENANCE: Thurs., Sept. 19, 1 AM ET, Platform will be unavailable for approx. 60 minutes.
cancel
Showing results for 
Search instead for 
Did you mean: 
RMotta2408
Creator II
Creator II

Reading CSV files with different schemas

Hello everyone,

I want to develop a Job to read CSV files from a folder and display the results in the same order always.

The problem is, there's no fixed schema for these files.

I mean, the files may come with different schemas:

  • Column_A, Column_B, Column_C, Column_D
  • Column_B Column_D, Column_A, Column_C
  • Column_D, Column_B, Column_C, Column_A

... any possible combination.

My question is: can the Job be dynamic enough so that, no matter the order of the columns, the files always gets read and the results displayed in the right order (A, B, C, D)?

Thank you so much,

Rui

Labels (4)
1 Solution

Accepted Solutions
RMotta2408
Creator II
Creator II
Author

Hi there,

 

I've found a solution.

 

For each CSV file tFileList finds, I do this:

 

1- tFileInputDelimited

  • schema: column01 (dynamic).

2- tExtractDynamicFields

  • input schema: column01 (dynamic).
  • output schema: columnA, columnB, columnC, columnD.

3- tDBOutput

 

This way, no matter what comes in the CSV file, my Job will only consider what values come in these 4 columns. And if one or more of these columns are missing, the Job considers that column with the value "null".

 

Hope this helps someone.

 

Thank you.

View solution in original post

7 Replies
Anonymous
Not applicable

Hello

You need to use Dynamic schema, you have a similar requirement as described in this KB article.

 

Regards

Shong

RMotta2408
Creator II
Creator II
Author

Hi @Shicong Hong​ ,

 

I will have a look RIGHT NOW!

 

Thank you

RMotta2408
Creator II
Creator II
Author

The link is inaccessible. 😮

 

0695b00000hsqafAAA.png

Anonymous
Not applicable

@Rui Motta​ , sorry, I didn't notice that the article is only visible for internal staffs, let me think about what I can do, or I can do my best to provide an example.

RMotta2408
Creator II
Creator II
Author

Hi there,

 

I've found a solution.

 

For each CSV file tFileList finds, I do this:

 

1- tFileInputDelimited

  • schema: column01 (dynamic).

2- tExtractDynamicFields

  • input schema: column01 (dynamic).
  • output schema: columnA, columnB, columnC, columnD.

3- tDBOutput

 

This way, no matter what comes in the CSV file, my Job will only consider what values come in these 4 columns. And if one or more of these columns are missing, the Job considers that column with the value "null".

 

Hope this helps someone.

 

Thank you.

Anonymous
Not applicable

It is a good solution! Thank you for your sharing! @Rui Motta​ 

Regards

Shong

RMotta2408
Creator II
Creator II
Author

My pleasure.

 

Rui