Solved: Re: Lock on flat files - Qlik Community

suvbin · ‎2024-04-29

Hi,

I am facing the error "Failed to write record id: x, Number of values: y is not equal to number of columns: z". It occurs when Qlik Replicate is reading a file that is being overwritten by another job at the same time.

Is there any setting in Replicate to lock the file while Replicate is reading it?

Thanks.

john_wang · ‎2024-04-29

Hello @suvbin ,

Thanks for reaching out to Qlik Community!

Regarding the CSV source files being overwritten during replication: it seems that newer versions of the CSV files were generated before Qlik Replicate had finished reading them, which caused the reported errors.

To improve the process flow and prevent Qlik Replicate from reading incomplete CSV files, consider the following steps:

1. Ensure that third-party applications deposit the CSV files into an interim data folder.
2. Allow Qlik Replicate to finish reading the current batch of files, making sure the task stops once the Full Load is complete.
3. Use a "rename" command to move the CSV files from the interim folder into the Qlik Replicate source data folder for processing. A "rename" is notably faster than a "copy", especially when the two folders are on the same drive, which helps prevent Qlik Replicate from starting to read an incomplete CSV file mid-copy.

By following these steps, you can streamline the process flow and mitigate errors caused by file overwriting and incomplete data reads.
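The hand-off described in the steps above can be sketched as follows. This is a minimal illustration, not a Qlik Replicate feature: the helper name `publish_ready_files` and the folder arguments are hypothetical, and it assumes that the producer has finished writing the files and that both folders are on the same volume, so the rename does not copy any data.

```python
import os
from pathlib import Path


def publish_ready_files(interim_dir, source_dir, pattern="*.csv"):
    """Move completed CSV files from the interim folder into the folder Replicate reads.

    Assumes every file matching `pattern` is fully written and that both
    folders live on the same volume (hypothetical layout).
    """
    moved = []
    for src in sorted(Path(interim_dir).glob(pattern)):
        dst = Path(source_dir) / src.name
        # os.replace renames the file; on the same volume no data is copied,
        # so the final name never points at a partially written file.
        os.replace(src, dst)
        moved.append(dst.name)
    return moved
```

It would be run only after the Replicate task has finished its Full Load (step 2), for example from a scheduler that then starts the next task run.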

Hope this helps.
John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!


Heinvandenheuvel · ‎2024-04-29

As @john_wang writes, it is best NOT to expose Replicate to a file that might still be written to. Do NOT give the file its final name or final location while it is being written. Rename or move it when it is ready for ingestion.

Please consider using a 'reference file', and add a new line referencing a new file to process only once that file is fully available.
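One way to picture this: the producer writes the data file under a temporary name, renames it once complete, and only then appends a line for it to the reference file. A rough sketch follows. The function name, the `.tmp` suffix, and the `table_name,path` line layout are assumptions for illustration, so please check the File source documentation for the exact reference file format your version expects.

```python
import csv
import os
from pathlib import Path


def write_then_register(rows, data_path, reference_path, table_name):
    """Write a CSV fully, give it its final name, then list it in the reference file."""
    data_path = Path(data_path)
    tmp_path = data_path.with_name(data_path.name + ".tmp")  # hypothetical temporary name

    # 1. Write everything under the temporary name and flush it to disk.
    with open(tmp_path, "w", newline="") as f:
        csv.writer(f, lineterminator="\n").writerows(rows)
        f.flush()
        os.fsync(f.fileno())

    # 2. The file gets its final name only once it is complete.
    os.replace(tmp_path, data_path)

    # 3. Append one line pointing at the finished file (line layout is an assumption).
    with open(reference_path, "a", newline="") as ref:
        ref.write(f"{table_name},{data_path}\n")
```

Because the reference line is written last, a reader that follows the reference file should never be pointed at a file that is still being produced.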

Hein.

suvbin · ‎2024-04-30

Thank you for the update. Is there any documentation on this, please?

john_wang · ‎2024-04-30

Hello @suvbin ,

If you are looking for the documentation on how to set up a File source, please check the File Source overview.

Hope it helps.

John.

DesmondWOO · ‎2024-04-30

Hi @suvbin ,

I've conducted a test and Replicate does not lock the file during full load.

I think you could separate the working folders, e.g. "Qlik" for Replicate and "App" for your application.

Replicate reads a source file under "Qlik", while the application appends records to a text file under "App". When the Replicate task starts, it copies the source file from "App" to "Qlik". This can be done with a "Preprocessing Command" defined in the source endpoint. That way your application won't affect Replicate's job.

However, please be aware that you have to update the source table metadata in the endpoint accordingly when you perform a Full Load.

Regards,
Desmond

Heinvandenheuvel · ‎2024-04-30

>> When Replicate task starts, copies the source file from "App" to "Qlik". This can be done by "Preprocessing Command" defined in the source endpoint.

I beg to differ. COPY is NOT the proper method. It creates a timing window and potentially consumes excessive resources. Renaming the file, or moving it into an alternative (sub)directory on the same volume, does NOT involve data movement or access conflicts.
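To illustrate the point about staying on the same volume: a rename within one volume only changes a directory entry, whereas a generic move across volumes falls back to copying the data. The sketch below, with a hypothetical helper name, refuses to hand a file over unless both directories report the same device ID. It is an illustration of the idea rather than anything Replicate provides.

```python
import os


def rename_on_same_volume(src_path, dst_dir):
    """Rename src_path into dst_dir, but only if no data copy would be needed."""
    src_dev = os.stat(os.path.dirname(os.path.abspath(src_path))).st_dev
    dst_dev = os.stat(dst_dir).st_dev
    if src_dev != dst_dev:
        # Different volumes: a generic move (e.g. shutil.move) would become
        # copy + delete, reopening the window for reading a partial file.
        raise OSError(f"{src_path} and {dst_dir} are on different volumes")
    dst_path = os.path.join(dst_dir, os.path.basename(src_path))
    os.replace(src_path, dst_path)  # rename only; the file contents are not rewritten
    return dst_path
```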

Hein.

john_wang · ‎2024-04-30

Totally agree with @Heinvandenheuvel: "rename" is a better approach than "copy" and the other options.
