Kafka - Does Replicate guarantee that a message is delivered only once and in exactly the right order?
[publishing on behalf of Global Support]
Replicate guarantees that messages are delivered to Kafka at least once.
Each message contains a “change sequence” field (the same as in CT tables), which is monotonically increasing. If a message is produced to Kafka more than once, the customer can detect the duplicate by its change sequence value and ignore it.
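The duplicate check described above can be sketched as a small filter on the consuming side. This is only an illustration under stated assumptions, not Replicate or Kafka client code: the record shape `(partition, change_seq, payload)`, the integer change sequence values, and the function name `drop_duplicates` are all hypothetical, and the sketch assumes that records within one partition arrive in increasing change sequence order except when they are resent.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record shape: (partition, change_seq, payload).
# Integer change sequences are an assumption made for easy comparison.
Record = Tuple[int, int, str]


def drop_duplicates(records: Iterable[Record]) -> Iterator[Record]:
    """Yield each record once, skipping resent change sequences."""
    highest_seen: Dict[int, int] = {}  # partition -> highest change_seq processed
    for partition, change_seq, payload in records:
        if change_seq <= highest_seen.get(partition, -1):
            continue  # already processed on this partition, so it is a duplicate
        highest_seen[partition] = change_seq
        yield partition, change_seq, payload


# Records 101 and 102 are delivered twice; 103 is delivered once.
stream = [(0, 101, "a"), (1, 102, "b"), (0, 101, "a"), (1, 102, "b"), (0, 103, "c")]
deduped = list(drop_duplicates(stream))
```

Tracking the highest value per partition keeps the state small. A consumer that cannot rely on per-partition ordering would need to remember the individual change sequence values it has already processed instead.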
Replicate produces messages in batches. Different batches are sent concurrently to different broker machines, depending on which broker is the leader of which partition at a given time.
It is possible that record X is produced to broker B1 and record X+1 is produced to broker B2.
Broker B2 might respond quickly and return an acknowledgment for X+1, while broker B1 might be slower (or down), so that record X goes into recovery or fails. In that case, the Replicate task resends the stream of records as follows:
Replicate v6.2 and lower -- from the earliest record that failed (record X).
Replicate v6.3 and higher -- from the beginning of the failed transaction.
As a result, some of the records that follow X (such as X+1) may be duplicated on Kafka.
As mentioned above, the customer can easily filter out these duplicates, if any exist, by using the change sequence field.
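To make the scenario concrete, the following toy simulation counts which change sequence values would end up on Kafka more than once under each resend policy. It is a sketch under simplifying assumptions, not a model of Replicate internals: the function name `duplicated_after_resend` is hypothetical, the change sequences are small integers, the transaction is assumed to span records 1 to 4, and the resend is assumed to succeed completely.

```python
from collections import Counter
from typing import List


def duplicated_after_resend(acked: List[int], resend_from: int, last: int) -> List[int]:
    """Return the change sequences that end up on Kafka more than once."""
    on_kafka = Counter(acked)  # records acknowledged before the failure
    # Assumption: every record from resend_from through last is resent successfully.
    on_kafka.update(range(resend_from, last + 1))
    return sorted(seq for seq, count in on_kafka.items() if count > 1)


# Record 3 (X) failed on broker B1, while record 4 (X+1) was acknowledged by B2.
acked_before_failure = [1, 2, 4]
transaction_start, failed_record, last_record = 1, 3, 4

# Resend from the earliest failed record (v6.2 and lower).
dups_from_failed_record = duplicated_after_resend(
    acked_before_failure, failed_record, last_record
)
# Resend from the beginning of the failed transaction (v6.3 and higher).
dups_from_transaction_start = duplicated_after_resend(
    acked_before_failure, transaction_start, last_record
)
```

In this toy run, resending from the failed record duplicates only record 4, while resending from the start of the transaction also duplicates records 1 and 2. In both cases every duplicate carries a change sequence value that was already delivered, which is what makes filtering possible.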