uTK421
Contributor

Complex CSV file as a source - Can Replicate ingest and process it?

Hello folks,

We currently use a different product to take a complex CSV file and load each of its record types into its own database table.  I would like to reproduce this process in Replicate as closely as possible.  Has anyone tried something similar, and if so, how did you get it to work?

From playing around with File endpoint configurations and digging through the documentation, it does NOT appear that Replicate can handle this scenario, unless I am missing something, which I would be glad to have pointed out.

In this case, a complex (or jagged) CSV file is a CSV file that contains records with differing numbers of fields.  For example, the first 10 records may have 25 fields and the next 10 records may have 47 fields.  The number of fields identifies what kind of record it is and which table it should be loaded into.
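For illustration only, a hypothetical jagged file might look like the snippet below; the record types, field counts, and values are made up, just to show how different record shapes share one file:

```
ORD,1001,2023-04-01,49.99
ORD,1002,2023-04-02,15.00
ITEM,1001,1,SKU-123,2,9.99,19.98,N
ITEM,1001,2,SKU-456,3,10.00,30.00,Y
```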

Configuring a File source endpoint with multiple, different tables using multiple CSV files, one per record type, is fairly straightforward.  However, our initial thought that the "File preprocessing command" option could be used for this does not appear to pan out.  The preprocessing command and script look like they are run against every CSV file defined in the "full load file" field of each table definition, and the file defined must match the table definition or Replicate throws an error during processing.  This means the single complex CSV file can't be listed as the "full load file" for each defined table and then preprocessed, from what I'm seeing.

Essentially, table1 and table2 are expected to be defined from and matched to the already existing files csv1 and csv2, and any preprocessing is applied to csv1 and csv2 equally.  In our case, we only have the one complex CSV file at the start, not one per table.

Unless someone has a better idea, or can point out what I've missed, the workaround looks like it will be to process the complex CSV outside of Replicate, before any file ingestion, to generate a set of full-load CSV files, one for each table in the File source configuration (a rough sketch of that splitting step is below).  After that, the File source is simply configured with the multiple full-load CSVs and multiple table definitions.
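As a rough illustration of that external preprocessing step, here is a minimal Python sketch that splits a jagged CSV into one full-load file per record type, keyed on field count. The file names, field counts, and table mapping are assumptions made up for the example, not anything Replicate provides:

```python
import csv
from collections import defaultdict

# Hypothetical mapping: number of fields -> target table / output full-load file.
# Adjust to match the actual record layouts in the jagged source file.
FIELD_COUNT_TO_TABLE = {
    25: "table1",
    47: "table2",
}

def split_jagged_csv(source_path, output_dir="."):
    """Split a jagged CSV into one CSV per record type, keyed on field count."""
    rows_by_table = defaultdict(list)
    with open(source_path, newline="") as src:
        for row in csv.reader(src):
            table = FIELD_COUNT_TO_TABLE.get(len(row))
            if table is None:
                # Unknown record shape; log it or route it to a reject file as needed.
                continue
            rows_by_table[table].append(row)

    # Write one full-load file per table for the Replicate File source endpoint.
    for table, rows in rows_by_table.items():
        with open(f"{output_dir}/{table}.csv", "w", newline="") as out:
            csv.writer(out).writerows(rows)

if __name__ == "__main__":
    split_jagged_csv("complex_source.csv")
```

Each generated file could then be listed as the full-load file for its corresponding table definition in the File source endpoint.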

Using the Change Processing options and files may hold some promise, but it runs into the same issue: full-load files need to be defined and matched to the configured tables.  I have more exploration to do down this route, though it doesn't look like it will work.

Thanks for any possible thoughts or ideas, or other curve-balls to consider.

 


Accepted Solutions
Jonathan_V
Support

Hi team,

 

Replicate cannot read such a file.

The File endpoint expects a fixed-structure file, which you need to define in the connection settings:

https://help.qlik.com/en-US/replicate/May2022/Content/Replicate/Main/File/define_table_and_full_load...

 

Please let me know if you have any additional questions; we may be able to assist you further.

 

Best regards,

Jonathan.


2 Replies
Dana_Baldwin
Support

Hi @uTK421 

Please submit an enhancement request at the link below for our Product Management Team to consider adding support for this:

https://community.qlik.com/t5/Ideas/idb-p/qlik-ideas

Thanks,

Dana