eksmirnova
Contributor III

Oracle to ADLS Full load is failing with E: Failed to convert file from csv to parquet

Hello,

We have a Full load + Store changes task which loads data from Oracle to ADLS (Parquet format with Snappy compression). It is failing for one table during Full load with the error:

00018089: 2024-06-05T05:47:20 [TARGET_LOAD ]E: Failed to convert file from csv to parquet [1024902] (file_utils.c:899)
00018089: 2024-06-05T05:47:20 [TARGET_LOAD ]E: Failed to convert file '/data/replicate/qlik/tasks/task_name/data_files/MY_TABLE_NAME/LOAD0000000E.tmpcsv'. [1024902] (file_imp.c:2538)
00018088: 2024-06-05T05:47:20 [SOURCE_UNLOAD ]I: Unload finished for table 'FS'.'MY_TABLE_NAME' (Id = 6). 16436666 rows sent. (streamcomponent.c:3784)
00018033: 2024-06-05T05:47:20 [TASK_MANAGER ]W: Table 'FS'.'MY_TABLE_NAME' (subtask 1 thread 1) is suspended. (replicationtask.c:3147)
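
When a task log is long, it can help to filter it down to the error and warning entries first. Below is a minimal Python sketch that does this for lines shaped like the ones quoted above. The regular expression is inferred only from the lines in this thread (an assumption, not a documented Replicate log format), and the reading of the severity letters (E = error, W = warning) is likewise inferred from the examples.

```python
import re

# Pattern inferred from the log lines quoted in this thread; other Replicate
# versions or components may format their lines differently.
LOG_LINE = re.compile(
    r"^(?:(?P<thread>\d+):\s+)?"        # optional thread id, e.g. 00018089
    r"(?P<timestamp>\S+)\s+"            # e.g. 2024-06-05T05:47:20
    r"\[(?P<component>[A-Z_]+)\s*\]"    # e.g. [TARGET_LOAD ]
    r"(?P<severity>[A-Z]):\s+"          # single letter such as E, W, I, T
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def errors_and_warnings(lines):
    """Keep only the entries whose severity letter is E or W."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["severity"] in ("E", "W")]


sample = [
    "00018089: 2024-06-05T05:47:20 [TARGET_LOAD ]E: Failed to convert file from csv to parquet [1024902] (file_utils.c:899)",
    "00018088: 2024-06-05T05:47:20 [SOURCE_UNLOAD ]I: Unload finished for table 'FS'.'MY_TABLE_NAME' (Id = 6). 16436666 rows sent. (streamcomponent.c:3784)",
    "00018033: 2024-06-05T05:47:20 [TASK_MANAGER ]W: Table 'FS'.'MY_TABLE_NAME' (subtask 1 thread 1) is suspended. (replicationtask.c:3147)",
]

for entry in errors_and_warnings(sample):
    print(entry["component"], entry["severity"], entry["message"])
```

To run it against a real log, pass the open file object to `errors_and_warnings` instead of `sample`.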
 
Any suggestions on how to resolve this?

Accepted Solutions
john_wang
Support

Hello @eksmirnova ,

Thanks for reaching out to Qlik Community!

Several different causes can lead to this error, e.g. a lack of resources or invalid UTF-8 characters.
Let's gather additional information to understand the issue:
1- Set the target-side internal parameter "keepCSVFiles" to true.
2- Make sure no files exist in the following folder before you run the task:
"/data/replicate/qlik/tasks/task_name/data_files/MY_TABLE_NAME/"
3- Reduce the "Maximum file size(KB):" setting in the General tab of the ADLS target endpoint.
It is currently set to 1000000 KB (about 1 GB); set it to 10 MB or smaller to make the files easier to analyze.
4- Reproduce the issue, then collect the interim CSV files generated in the folder from step 2 and try to identify the cause.
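
For the invalid UTF-8 possibility, the kept CSV files can be scanned with a short script. The following is a minimal Python sketch, not a Qlik tool: the function names and the `*csv` file pattern are assumptions made for illustration (the log above only shows a `.tmpcsv` extension, so adjust the pattern to whatever files actually appear in the folder).

```python
from pathlib import Path


def invalid_utf8_lines(path, limit=20):
    """Return up to `limit` tuples (line_number, byte_offset_in_line, reason)
    for lines in `path` that are not valid UTF-8."""
    findings = []
    # Read in binary mode: splitting on b"\n" is safe because the newline
    # byte never occurs inside a multi-byte UTF-8 sequence.
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as exc:
                findings.append((number, exc.start, exc.reason))
                if len(findings) >= limit:
                    break
    return findings


def scan_folder(folder, pattern="*csv"):
    """Scan every file in `folder` matching `pattern` (a hypothetical default)
    and return {file_path: findings} for the files that contain bad lines."""
    report = {}
    for path in sorted(Path(folder).glob(pattern)):
        findings = invalid_utf8_lines(path)
        if findings:
            report[str(path)] = findings
    return report
```

A call such as `scan_folder("/data/replicate/qlik/tasks/task_name/data_files/MY_TABLE_NAME/")` would then list the files and line numbers worth opening by hand. An empty result only rules out invalid UTF-8; it says nothing about the other possible causes.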
Hope this helps.

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!


10 Replies
eksmirnova
Contributor III
Author

Thank you John,

I made those changes and the full load succeeded. Does that mean the cause is a lack of resources and we should not create 500 MB files?

Previously, "Maximum file size(KB):" was set to 500 MB.

eksmirnova
Contributor III
Author

We tried to find exactly where it is failing, and found that the arep_csv2prq process is crashing with the error: Process arep_csv2prq crashed with status "Segmentation fault"

@john_wang  Could you please suggest any tuning options which would help us? 


 

Dana_Baldwin
Support

Hi @eksmirnova 

If a component of Qlik Replicate is crashing, please review this link for what we need to collect, then open a support case with that information: Collecting Replicate Process Dumps - Qlik Community - 1746840

Thanks,

Dana

john_wang
Support

Hello @eksmirnova ,

In addition to @Dana_Baldwin's comment: if you are running Qlik Replicate 2022.11 with a build number lower than 628, or Replicate 2023.5 with a build number lower than 213, please upgrade to the latest build of the same major version.
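
The build check can be written down as a small Python sketch. It assumes the version string has the form `major.minor.x.build`, with the build number as the last component, as in a string like `2022.11.0.1001`; the function name and the dictionary are illustrative only.

```python
# Minimum builds quoted in this reply, keyed by major version.
MINIMUM_BUILD = {"2022.11": 628, "2023.5": 213}


def needs_upgrade(version):
    """Return True if the build is below the quoted minimum, False if it is
    not, and None for major versions this reply does not mention."""
    parts = version.split(".")
    major = ".".join(parts[:2])   # e.g. "2022.11"
    build = int(parts[-1])        # assumption: the last component is the build
    minimum = MINIMUM_BUILD.get(major)
    if minimum is None:
        return None
    return build < minimum


print(needs_upgrade("2022.11.0.1001"))
```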

Hope this helps,

John.

eksmirnova
Contributor III
Author

@john_wang  Thank you, I opened a support case.

Our version is 2022.11.0.1001, so the build is greater than 628.

john_wang
Support

Hello @eksmirnova ,

Thanks for the update. Our support team is working on this case.

Regards,

John.

john_wang
Support

Hello @eksmirnova ,

I noticed this line in the task log file:

2024-06-06T15:45:54:794538 [TARGET_LOAD ]T: file converter ended with status 11 (file_utils.c:896)

As a Linux errno value, 11 corresponds to EAGAIN, which means "Try again" or "Resource temporarily unavailable". Note that 11 is also the signal number of SIGSEGV, which would be consistent with the "Segmentation fault" crash of arep_csv2prq reported earlier in this thread. In either case, please check:

1- File system and process limits: use ulimit -a

2- Disk space: use df -h and make sure there is enough space on the relevant partitions

3- Memory usage: use top or htop and make sure there is enough free memory available

4- The Linux server system log, to see whether it contains any clues.
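
The same checks can be collected in one place with a minimal Python sketch using only the standard library. The function names are illustrative. It also prints both readings of the number 11, as an errno value and as a signal number. Note that the limits reported are those of the process running the script, so it should be run as the same user as Replicate, and a service started by the system may still have different limits.

```python
import errno
import resource
import shutil
import signal


def interpret_status(code):
    """Show what `code` means as an errno value and as a signal number."""
    try:
        signal_name = signal.Signals(code).name
    except ValueError:
        signal_name = None
    return {"errno": errno.errorcode.get(code), "signal": signal_name}


def limits():
    """Return (soft, hard) limits, roughly the values shown by `ulimit -a`.
    A value equal to resource.RLIM_INFINITY means unlimited."""
    names = ["RLIMIT_NOFILE", "RLIMIT_FSIZE", "RLIMIT_AS", "RLIMIT_CORE", "RLIMIT_NPROC"]
    report = {}
    for name in names:
        key = getattr(resource, name, None)  # not every platform defines all of them
        if key is not None:
            report[name] = resource.getrlimit(key)
    return report


def free_disk_gib(path="/"):
    """Free space on the partition holding `path`, in GiB (like `df -h`)."""
    return shutil.disk_usage(path).free / 2**30


def available_memory_kib(meminfo="/proc/meminfo"):
    """MemAvailable from /proc/meminfo in KiB, or None if it cannot be read."""
    try:
        with open(meminfo) as handle:
            for line in handle:
                if line.startswith("MemAvailable:"):
                    return int(line.split()[1])
    except FileNotFoundError:
        pass
    return None


print(interpret_status(11))
print(limits())
print(round(free_disk_gib("/"), 1), "GiB free on /")
print(available_memory_kib(), "KiB available memory")
```

For the disk check, pass the partition that holds the task's data_files folder rather than "/".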

Hope this helps.

John.

vinayak_m
Contributor III

Hi @eksmirnova & @john_wang ,

I was going through the above post and found that I am getting a similar error, although my source and target endpoints are different.
My source is DB2 and my target is AWS S3. The maximum file size is currently the default, 1 GB. Should I lower it and then restart the task, or is there another workaround?

[TARGET_LOAD     ]E:  Failed to convert file from csv to parquet
Error:: failed to read csv temp file