eksmirnova
Contributor III

Oracle to ADLS Full load is failing with ]E: Failed to convert file from csv to parquet

Hello,

We have a Full Load + Store Changes task that loads data from Oracle to ADLS (Parquet, Snappy compression). It is failing for one table during Full Load with the error:

00018089: 2024-06-05T05:47:20 [TARGET_LOAD ]E: Failed to convert file from csv to parquet [1024902] (file_utils.c:899)
00018089: 2024-06-05T05:47:20 [TARGET_LOAD ]E: Failed to convert file '/data/replicate/qlik/tasks/task_name/data_files/MY_TABLE_NAME/LOAD0000000E.tmpcsv'. [1024902] (file_imp.c:2538)
00018088: 2024-06-05T05:47:20 [SOURCE_UNLOAD ]I: Unload finished for table 'FS'.'MY_TABLE_NAME' (Id = 6). 16436666 rows sent. (streamcomponent.c:3784)
00018033: 2024-06-05T05:47:20 [TASK_MANAGER ]W: Table 'FS'.'MY_TABLE_NAME' (subtask 1 thread 1) is suspended. (replicationtask.c:3147)
 
Any suggestions to resolve this?
1 Solution

Accepted Solutions
john_wang
Support

Hello @eksmirnova ,

Thanks for reaching out to Qlik Community!

Various reasons may lead to the same error, e.g. a lack of resources, invalid UTF-8 characters, etc.
Let's gather additional information to understand the issue:
1- Set the target-side internal parameter "keepCSVFiles" to true
2- Make sure no files exist before you run the task; the folder is
"/data/replicate/qlik/tasks/task_name/data_files/MY_TABLE_NAME/"
3- Reduce the "Maximum file size (KB):" setting in the General tab of the ADLS target endpoint
It is currently set to 1000000 KB (about 1 GB); set it to 10 MB or smaller for easier file analysis
4- Re-create the issue, then collect the interim CSV files retained by step 1 and try to determine the cause
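The steps above can be sketched as a small shell session. The task folder path is taken from the log lines in the original post; the UTF-8 scan uses iconv to flag one common cause of the conversion failure (the loop and filenames are illustrative, not a Replicate tool):

```shell
# Illustrative only -- adjust TASK_DIR to your own task's data_files folder.
TASK_DIR="/data/replicate/qlik/tasks/task_name/data_files/MY_TABLE_NAME"

# Step 2: confirm the folder is empty before rerunning the task.
ls -l "$TASK_DIR"

# Step 4: after reproducing the failure with keepCSVFiles=true, scan the
# retained interim CSV files for invalid UTF-8 sequences.
for f in "$TASK_DIR"/*.tmpcsv; do
  if ! iconv -f UTF-8 -t UTF-8 "$f" > /dev/null 2>&1; then
    echo "invalid UTF-8: $f"
  fi
done
```

iconv exits non-zero when it hits a byte sequence that is not valid UTF-8, so any file it flags is a candidate for the conversion failure.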
Hope this helps.

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!


10 Replies
eksmirnova
Contributor III
Author

Thank you John,

I made those changes and the full load succeeded. Does this mean the cause was a lack of resources and that we should not create 500 MB files?

Previously, "Maximum file size (KB):" was set to 500 MB.

eksmirnova
Contributor III
Author

We tried to find where exactly it is failing, and found that the arep_csv2prq process is crashing with the error: Process arep_csv2prq crashed with status "Segmentation fault"

@john_wang  Could you please suggest any tuning options which would help us? 


Dana_Baldwin
Support

Hi @eksmirnova 

If a component of Qlik Replicate is crashing, please review this link for what we need to collect, then open a support case with that information: Collecting Replicate Process Dumps - Qlik Community - 1746840
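As a generic OS-level complement (not Qlik-specific; the linked article is authoritative for what Support needs), core dumps on Linux are typically enabled along these lines:

```shell
# Generic Linux steps to let a crashing process write a core dump;
# follow the linked Qlik article for the exact files Support needs.
ulimit -c unlimited                    # lift the core-file size limit in this shell
cat /proc/sys/kernel/core_pattern     # where the kernel writes core files
# Directing cores to a fixed folder requires root, e.g.:
#   sysctl -w kernel.core_pattern=/var/crash/core.%e.%p
```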

Thanks,

Dana

john_wang
Support

Hello @eksmirnova ,

In addition to @Dana_Baldwin's comment: if you are running Qlik Replicate 2022.11 with a build number lower than 628, or Replicate 2023.5 with a build number lower than 213, please upgrade to the latest build of the same major version.

Hope this helps,

John.

eksmirnova
Contributor III
Author

@john_wang  Thank you, I opened a support case.

Our version is 2022.11.0.1001, so the build is greater than 628.

john_wang
Support

Hello @eksmirnova ,

Thanks for the update. Our support team is working on this case.

Regards,

John.

john_wang
Support

Hello @eksmirnova ,

I noticed this line in the task log file:

2024-06-06T15:45:54:794538 [TARGET_LOAD ]T: file converter ended with status 11 (file_utils.c:896)

In Linux, errno 11 typically corresponds to EAGAIN, a "Resource temporarily unavailable" error. (Note that an exit status of 11 can also mean the process was terminated by signal 11, SIGSEGV, which would match the arep_csv2prq segmentation fault reported earlier.) Please check:

1- File system limits: use ulimit -a

2- Disk space: use df -h and ensure there is enough space on the relevant partitions

3- Memory usage: use top or htop and ensure there is enough free memory available

4- Check the Linux server system log for any clues.
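A minimal sketch of those checks, assuming the Replicate data directory lives under /data and using an illustrative 10 GB free-space threshold (both are assumptions; adjust for your host):

```shell
# Illustrative resource checks; paths and thresholds are assumptions.
DATA_DIR="/data"                      # assumed Replicate data partition
[ -d "$DATA_DIR" ] || DATA_DIR="/"    # fall back to root if it does not exist
MIN_FREE_KB=$((10 * 1024 * 1024))     # require at least ~10 GB free

ulimit -a                             # 1- process and file limits

# 2- disk space on the data partition (portable df output)
free_kb=$(df -Pk "$DATA_DIR" | awk 'NR==2 {print $4}')
if [ "$free_kb" -lt "$MIN_FREE_KB" ]; then
  echo "WARNING: only ${free_kb} KB free on $DATA_DIR"
fi

# 3- memory headroom (top/htop for an interactive view)
free -m

# 4- recent kernel messages, e.g. OOM-killer or segfault reports
dmesg 2>/dev/null | tail -n 50        # may require root on some hosts
```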

Hope this helps.

John.

vinayak_m
Contributor II

Hi @eksmirnova & @john_wang ,

I was going through the above post and found that I am getting a similar error, though my source and target endpoints are different.
My source is DB2 and my target is AWS S3. The current maximum file size is the default, 1 GB. Should I lower it and re-run the task, or is there another workaround?

[TARGET_LOAD     ]E:  Failed to convert file from csv to parquet
Error:: failed to read csv temp file