Hello,
I am looking for some help with transformations. I want to add a column to the target data, let's say RECORD_ID, and use auto-increment for it, so that for every insert in the source data the target data is numbered 1, 2, 3, and so on.
We have tried adding an identity column on Azure SQL Server, configured as RECORD_ID BIGINT IDENTITY(1,1) NOT NULL, and it works as expected.
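For reference, a minimal sketch of the working SQL Server table (the table and the other column names here are illustrative, not our actual schema):

    CREATE TABLE dbo.target_table (
        ID        INT,
        PAYLOAD   NVARCHAR(100),
        RECORD_ID BIGINT IDENTITY(1,1) NOT NULL
    );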
But we also have a target table in Azure Databricks Lakehouse (Delta), with the column configured as RECORD_ID BIGINT NOT NULL GENERATED BY DEFAULT AS IDENTITY (START WITH 1 INCREMENT BY 1), and there the task fails with the error message "[DELTA_NOT_NULL_CONSTRAINT_VIOLATED] NOT NULL constraint violation for column: RECORD_ID."
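The corresponding Delta definition, again as a sketch (only the RECORD_ID configuration is taken from our real table):

    CREATE TABLE target_table (
        ID        INT,
        PAYLOAD   STRING,
        RECORD_ID BIGINT NOT NULL GENERATED BY DEFAULT AS IDENTITY (START WITH 1 INCREMENT BY 1)
    );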
I previously asked about this in the following post: https://community.qlik.com/t5/Qlik-Replicate/How-to-transformation-column-auto-increment-in-target-d...
Thank you,
Anto
Hi,
The recommendations provided in your previous community entry are still in effect. What is the number of the case that you have opened? Please attach the log with the error to the case so we can see exactly when the error manifests and help troubleshoot the problem.
Regards,
Orit
Hi @AgusBudianto,
As the value of an IDENTITY column is generated by the database, the target table should be created manually and the RECORD_ID column should not be mapped on the transformation page.
I can insert records into the target table; however, RECORD_ID is still NULL. I think it is a limitation. Please create a support ticket and let us check with the R&D team.
Regards,
Desmond
Hi @OritA and @DesmondWOO
We created a ticket with Qlik Support, and the outcome is that there is indeed a limitation for Azure Databricks Delta table targets; they recommend using the randomblob(N) or random() function instead.
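For example, in Qlik Replicate's expression builder (which accepts SQLite syntax), a unique-looking value could be generated with something like the following; the exact expression is our illustration, not Support's wording:

    hex(randomblob(16))  -- 32-character random hex string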
Please note that we use record_id to detect the last update: for example, if one ID is updated several times, we take the latest data based on the highest record_id, i.e. sorted by record_id.
However, we then found a new problem on the SQL Server target, which previously had no issue adding record_id: when we checked, the record_id values were out of order. For example, we did an insert on ID 123 that should get record_id 1 and then an update on ID 123 that should get record_id 2, but the values were reversed. This makes the column unusable for capturing changes; it may be because we use the "Batch optimized apply" method on the task.
In the end, we replaced record_id with the current time, in the hope of being able to pick the latest row by time.
Regards,
Anto
Hello Anto, @AgusBudianto
Thanks for the update.
It seems that using the current timestamp is the best approach to identify the latest updated rows. In Databricks, a column defined as GENERATED BY DEFAULT AS IDENTITY cannot be directly updated for existing data. Additionally, it cannot be populated when the LOAD method is used to insert records into the table. It can, however, be populated through INSERT operations, but Qlik Replicate does not support transactional apply mode for the Databricks endpoint.
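As an illustration (a minimal sketch, assuming the Delta table definition from the original post; the other column names are ours):

    -- Omitting RECORD_ID from the column list lets Databricks generate it.
    INSERT INTO target_table (ID, PAYLOAD) VALUES (123, 'example');

    -- Writing NULL into RECORD_ID explicitly, as a bulk load path may do,
    -- trips the NOT NULL constraint instead.
    INSERT INTO target_table (ID, PAYLOAD, RECORD_ID) VALUES (123, 'example', NULL);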
Regards,
John.
Thanks for the response and explanation @DesmondWOO and @OritA
Thanks @john_wang for convincing us that the method we used before is the best way. We have a CDC task using the Store Changes processing mechanism, and we add RECORD_ID through the global rule "Add a column for column data type %.%": column 'RECORD_ID', STRING(50), sub-type Regular, with the expression $AR_H_TIMESTAMP, so we can retrieve the latest data based on 'RECORD_ID'.
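To pick the latest row per key from the change table, we run something along these lines (a sketch; the table name orders__ct and the key column ID are illustrative):

    SELECT *
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RECORD_ID DESC) AS rn
        FROM orders__ct t
    ) latest
    WHERE rn = 1;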
Regards,
Anto
Hello Anto, @AgusBudianto, copy @OritA, @DesmondWOO
Thank you so much for the feedback. It's helpful for all of us!
Best Regards,
John.
Hi @john_wang, I am interested in this topic.
During CDC, Qlik Replicate should add an incremented sequence number as each record is inserted into the table. Please suggest how we can achieve this. By the way, we are replicating from DB2 for LUW to AlloyDB, and we have to generate a unique primary key for the target AlloyDB table.
The standard AUTO_INCREMENT feature in AlloyDB is not a good option because it automatically applies a NOT NULL constraint, which causes our replication to fail.
Can you advise on this?
Hello @bantuh2,
For an AlloyDB/PostgreSQL database, a SERIAL column should be used. Please check my comments in the article load target table with new identity column (GENERATED BY DEFAULT).
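For illustration, a minimal sketch of such a target table (the table and other column names are hypothetical): BIGSERIAL expands to BIGINT NOT NULL with a sequence-backed default, so the database fills the column in whenever Replicate leaves it unmapped:

    CREATE TABLE public.orders_target (
        order_id   INTEGER,
        order_data TEXT,
        record_id  BIGSERIAL  -- BIGINT NOT NULL DEFAULT nextval('orders_target_record_id_seq')
    );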
Hope it helps.
John.