Support tomb-stoning methodology for Kafka target endpoints
When a primary key (PK) field changes on the source system, Replicate sends the previous PK data to Kafka as part of the envelope for the update message. However, the update message uses the new PK data as its key, so it can land on a different Kafka partition than the old message. This can affect downstream consumers, since Kafka provides no ordering guarantees across partitions.
The idea is to send Kafka a tombstone record for keys that no longer exist on the source system, rather than only adding the previous PK as metadata (the “beforeData” of each message) in the envelope of a message with a completely different key.
Here is a use case: suppose there is an Email table in the source database whose primary key fields are PersonId and EmailType. Suppose there is a record where PersonId=1 and EmailType=Office. Finally, suppose someone UPDATEs that record to have PersonId=1 and EmailType=Personal.
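To make the partitioning concern concrete, here is a minimal sketch of how the old and new keys from this use case could be assigned to partitions. It is an illustration only: Kafka's default partitioner hashes the serialized key with murmur2, while the sketch uses CRC32 as a dependency-free stand-in, and the partition count and the `partition_for` helper are assumptions made up for the example.

```python
import json
import zlib

NUM_PARTITIONS = 6  # assumed partition count, for illustration only


def partition_for(key: dict, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition using a stand-in hash.

    Kafka's default partitioner uses murmur2 on the serialized key;
    CRC32 is used here only to keep the sketch free of dependencies.
    """
    serialized = json.dumps(key, sort_keys=True).encode("utf-8")
    return zlib.crc32(serialized) % num_partitions


old_key = {"personId": 1, "emailType": "Office"}
new_key = {"personId": 1, "emailType": "Personal"}

# The two keys serialize to different bytes, so nothing ties them
# to the same partition.
print("old key partition:", partition_for(old_key))
print("new key partition:", partition_for(new_key))
```

Whatever hash is used, the point is the same: once the key bytes change, the update is routed independently of the original record.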
In the current behavior, the consumer must inspect the “beforeData” of every message and create the tombstone record itself, possibly processing the original key from the new partition (B). This adds complexity on the consuming side.
With the tomb-stoning methodology, Message2 would be the tombstone message for the key { “personId”: 1, “emailType”: “office” }, handled by the producer and avoiding extra complexity on the consuming side.
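The requested producer-side behavior can be sketched roughly as follows. This is not Replicate code: `expand_pk_update` is a hypothetical helper, and messages are modeled as plain `(key, value)` pairs where a value of `None` stands for a Kafka tombstone.

```python
from typing import List, Optional, Tuple

# (key, value) pair; a value of None represents a tombstone.
Message = Tuple[dict, Optional[dict]]


def expand_pk_update(before_key: dict, after_key: dict, after_row: dict) -> List[Message]:
    """Turn one source UPDATE into the Kafka messages to produce.

    If the primary key is unchanged, a single message keyed by that PK
    is enough. If the PK changed, emit a tombstone for the old key
    first, then the new row under the new key.
    """
    if before_key == after_key:
        return [(after_key, after_row)]
    return [(before_key, None), (after_key, after_row)]


messages = expand_pk_update(
    before_key={"personId": 1, "emailType": "Office"},
    after_key={"personId": 1, "emailType": "Personal"},
    after_row={"personId": 1, "emailType": "Personal"},
)
for key, value in messages:
    print(key, "->", "TOMBSTONE" if value is None else value)
```

Because the tombstone carries the old key, it is routed to the same partition as the original record, so a consumer of that partition sees the delete in order.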
At a minimum, I'd like the replication of source endpoint deletes to Kafka to be configurable: either the classic Replicate method for generating "delete" messages, or a true Kafka tombstone message where the key is provided but the message body is NULL. This option should apply regardless of whether the source PK was changed or the record was simply deleted.
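As a rough sketch of what such a switch could look like: the `delete_mode` option, its two values, and the simplified message body below are all hypothetical and are not existing Replicate settings or the real Replicate envelope format.

```python
from typing import Optional, Tuple


def build_delete(key: dict, before_row: dict,
                 delete_mode: str = "envelope") -> Tuple[dict, Optional[dict]]:
    """Build the (key, value) pair to produce for a source DELETE.

    delete_mode is a hypothetical option:
      "envelope"  - a delete message with a body (simplified stand-in
                    for the classic Replicate delete message)
      "tombstone" - key only, value None (a true Kafka tombstone)
    """
    if delete_mode == "tombstone":
        return key, None
    if delete_mode == "envelope":
        return key, {"operation": "DELETE", "beforeData": before_row}
    raise ValueError(f"unknown delete_mode: {delete_mode!r}")


key = {"personId": 1, "emailType": "Office"}
row = {"personId": 1, "emailType": "Office"}

print(build_delete(key, row, delete_mode="envelope"))
print(build_delete(key, row, delete_mode="tombstone"))
```

Only the tombstone form lets a compacted topic eventually drop the key; a delete message with a non-null body is just another value as far as log compaction is concerned.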
Any movement on this? This is a common feature in most other change data capture products including several of your competitors.
Without this feature, data written to a Kafka topic that is maintained and compacted violates the data rules whenever the topic is partitioned, because a record will show as active in two partitions even though it has only had a primary key update.
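The effect on a compacted topic can be illustrated with a toy model. The `compact` function below is an assumption-laden simplification: it keeps the latest value per key and drops keys whose latest value is a tombstone, ignoring details such as per-partition logs and the retention period real Kafka applies to tombstones before removing them.

```python
from typing import Dict, Hashable, List, Optional, Tuple

Record = Tuple[Hashable, Optional[dict]]  # value None = tombstone


def compact(log: List[Record]) -> Dict[Hashable, dict]:
    """Toy log compaction: keep the latest value per key, then drop
    keys whose latest value is a tombstone."""
    latest: Dict[Hashable, Optional[dict]] = {}
    for key, value in log:
        latest[key] = value  # later records overwrite earlier ones
    return {k: v for k, v in latest.items() if v is not None}


old_key = (1, "Office")    # (PersonId, EmailType) before the update
new_key = (1, "Personal")  # (PersonId, EmailType) after the update
row = {"personId": 1}

without_tombstone = [(old_key, row), (new_key, row)]
with_tombstone = [(old_key, row), (old_key, None), (new_key, row)]

print("without tombstone:", sorted(compact(without_tombstone)))
print("with tombstone:   ", sorted(compact(with_tombstone)))
```

In the first log the old key survives compaction indefinitely, which is the "active in two partitions" situation described above; in the second, only the new key remains.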
Thanks to all who commented and voted on this ideation. We hear you and are looking to address this as one of the higher priority items in our backlog. I can't provide an ETA yet, but I will when I have one.
This has made it up the priority list, and we are planning to enable the Replicate Engine to support updates of PKs in the Nov 22 release, which will enable support of tombstoning.