Hello,
We are replicating data from DB2 to an ADLS Gen2 container. Our setup is DB2 -> LogStream -> ADLS,
Full Load + Store Changes, with the Parquet file format.
We are able to read the Full Load files without any issues, but we get an error when reading the _CT files:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 11232.0 failed 4 times, most recent failure: Lost task 0.3 in stage 11232.0 (TID 76655) (10.241.17.215 executor 19): java.io.IOException: Could not read or convert schema for file: abfss://PATH/TABLE_NAME_ct/20240320-145022523.snappy.parquet
The column header__change_mask is causing this issue.
If I uncheck "change_mask" under Change Table Header Columns, we are able to read the file.
How can we solve this problem while keeping the header__change_mask column?
Hello @eksmirnova ,
Thanks for reaching out to Qlik Community!
Are you able to confirm whether NULL change_mask values (or some other pattern) caused the error? In any case, please open a support ticket and attach:
1- The task Diag Packages
2- How you got the error "Could not read or convert schema for file" (the command or tool used, etc.)
Thanks,
John.
Some of the change_mask values are null, and some of them are actual values.
Our assumption is that the FIXED_LEN_BYTE_ARRAY data type is causing this issue.
We are getting this error in Databricks.
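In case it helps anyone check the same assumption, here is a minimal sketch of how the Parquet physical types of a _CT file could be inspected outside Spark. It assumes pyarrow is installed and that the file has been copied somewhere pyarrow can open it (the path in the usage comment is hypothetical). The names ColumnInfo, fixed_len_columns and read_column_info are made up for this example and are not part of Qlik Replicate or Databricks.

```python
from typing import Iterable, List, NamedTuple


class ColumnInfo(NamedTuple):
    name: str
    physical_type: str  # Parquet physical type, e.g. "INT64" or "FIXED_LEN_BYTE_ARRAY"


def fixed_len_columns(columns: Iterable[ColumnInfo]) -> List[str]:
    """Return the names of columns stored as FIXED_LEN_BYTE_ARRAY."""
    return [c.name for c in columns if c.physical_type == "FIXED_LEN_BYTE_ARRAY"]


def read_column_info(path: str) -> List[ColumnInfo]:
    """Read column names and physical types from a Parquet file footer.

    Requires pyarrow; imported lazily so fixed_len_columns has no dependencies.
    """
    import pyarrow.parquet as pq

    schema = pq.ParquetFile(path).schema  # Parquet-level schema, not the Arrow schema
    return [
        ColumnInfo(schema.column(i).name, schema.column(i).physical_type)
        for i in range(len(schema.names))
    ]


# Usage (hypothetical local copy of a _CT file):
# print(fixed_len_columns(read_column_info("/tmp/sample_ct.snappy.parquet")))
```

If header__change_mask shows up in that list, it at least confirms how the column is written in the file. It does not prove by itself that this is what the Spark reader rejects.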
Hi @eksmirnova ,
change_mask values are NULL because those are BEFORE-IMAGE records. It is normal.
Regards,
Desmond
@DesmondWOO yes, I know. This is not the issue.
The issue is that Databricks is not able to read the _CT Parquet file because of the data type of the change_mask column. If I uncheck the "change_mask" header column, we are able to read the file.
Posting here because I think this might be helpful for somebody.
We found two options:
1) Remove the "change_mask" header column from the target files.
2) Set the internal parameter byteNotFixedLenType to true, as described here: Qlik Replicate: header__change_mask column value p... - Qlik Community - 2103612
We decided to go with option #1 because, in our case, nobody was using the header__change_mask column.