Hi, we have a task reading from SQL Server and writing to an Azure Data Lake Storage Gen2. Sometimes, during CDC change processing, we receive warnings about the transaction timestamp (The transaction timestamp already exists in an earlier partition, timestamp = '1598272041707000', batch begin time = '1598583600000000'), and after these warnings the apply throughput decreases a lot. The task then starts to accumulate changes and, eventually, we need to run a full load again.
Does anyone know whether there are advanced parameters to increase the apply throughput, or how we can solve this issue? We have a ticket open with support, but unfortunately it has not been resolved yet.
Nuno,
Thank you for posting to the Forums. You can try the steps below to increase the stream buffer count and the stream buffer size to see if this helps.
Please export the task, open the JSON file, and search for common_settings:
Note: 10 buffers at 200 MB each is roughly 2 GB of memory, which will affect the resources on the Replicate server, so take that into account. You may want to try a lower setting first; the default is 3 buffers at 8 MB apiece (24 MB).
"common_settings": {
"change_table_settings": {
Add the 2 lines:
"stream_buffers_number": 10,
"stream_buffer_size": 200,
end result:
"common_settings": {
"stream_buffers_number": 10,
"stream_buffer_size": 200,
"change_table_settings": {
Save the JSON, import it back into Replicate, then stop and resume the task. Also make sure the change processing tuning settings are intact.
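If you would rather not hand-edit the file, a small script can make the same change. This is only a minimal sketch, assuming the exported task file is named exported_task.json (a hypothetical name; use your actual path). Because the exact nesting of the exported JSON can vary, it searches the document recursively for the common_settings object rather than assuming a fixed path:

import json

PATH = "exported_task.json"  # hypothetical; replace with your exported task file

def find_common_settings(node):
    # Recursively locate the first dict stored under a "common_settings" key.
    if isinstance(node, dict):
        if "common_settings" in node:
            return node["common_settings"]
        for value in node.values():
            found = find_common_settings(value)
            if found is not None:
                return found
    elif isinstance(node, list):
        for item in node:
            found = find_common_settings(item)
            if found is not None:
                return found
    return None

with open(PATH) as f:
    task = json.load(f)

settings = find_common_settings(task)
if settings is None:
    raise SystemExit("common_settings not found in the exported JSON")

# Add (or overwrite) the two buffer parameters from the answer above.
settings["stream_buffers_number"] = 10
settings["stream_buffer_size"] = 200  # MB

with open(PATH, "w") as f:
    json.dump(task, f, indent=4)

JSON object keys are unordered, so inserting the two keys anywhere inside common_settings is equivalent to the hand-edited end result shown above, where they appear before change_table_settings.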
Thanks!
Bill
Are there any updates on this? We are having a similar issue with SQL Server, writing to AWS EMR.
No. The issue still occurs.
I recommend opening a support ticket for these types of issues. Support and R&D will request logs and artifacts to help troubleshoot the issue.