We are using Replicate to enable CDC on our operational sources and publish the changes to a target. The operational sources and the targets are physically separated.
In this situation, should the Replicate host be co-located with the source or with the target?
I am assuming that the data flowing in from the source to Replicate is the same as that flowing out from Replicate to the target. Is this a fair assumption?
In that case, will the performance be the same whether we co-locate Replicate with the source or with the target?
Hi @Prabodh, Replicate should always be installed close to the source. Let me know if you have any additional questions.
Regards
JR
Best practice is to have the Replicate server closer to the source than to the target. Lower network latency between the Replicate server and the source will result in lower latency and higher throughput for your Replicate tasks.
May I know if there is any documentation available somewhere indicating such direction? I've heard about these recommendations, but I need some documentation to share internally 🙂
Appreciate any help
I apologize - I checked the User Guide and public facing knowledge articles but could not find this mentioned. We do consistently recommend installing Replicate closer to the source than the target. Maybe your Account Manager knows of a white paper on this?
Sorry I could not be of more help.
Dana
"It depends" - and as such it is best to either build your own insights through experimentation or engage professional support who will know the various factors involved and their influence in your specific suggested solution.
>> I am assuming that the data flowing in from source to Replicate is same as that flowing out from Replicate to target. Is this a fair assumption?
Nope!
Several source endpoints (you did not indicate which one you use) will read/transfer the entire transaction log to the Replicate server (notably non-LogMiner Oracle) and only then select the changes for the tables selected for the task.
Therefore, unless the task selects all active tables in a database, the input data volume read is typically significantly bigger than the data processed and applied to the target. That suggests locating Replicate towards the source. This is notably the case for Oracle with multiple pluggable databases (PDBs) hosted by a single CDB, where there is a single redo log stream for all PDBs whereas a single task can only deal with one PDB at a time.
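To put rough numbers on that volume argument, here is a back-of-the-envelope sketch in Python. It is not anything Replicate provides: the function name, the 20 GiB/hour log rate and the 10% task fraction are all made-up assumptions for illustration, and it ignores protocol overhead and any difference in how changes are formatted on the target side.

```python
# Rough model (assumption): a source endpoint that ships the whole
# transaction log to the Replicate server, which then keeps only the
# changes for the tables selected in the task.

def wan_bytes_per_hour(log_bytes_per_hour, task_fraction, placement):
    """Bytes crossing the long-distance link per hour.

    placement == "source": the full log is read locally, only the
    selected changes travel to the remote target.
    placement == "target": the full log travels over the long link.
    """
    if placement == "source":
        return log_bytes_per_hour * task_fraction
    if placement == "target":
        return log_bytes_per_hour
    raise ValueError("placement must be 'source' or 'target'")


# Hypothetical figures, purely for illustration.
log_rate = 20 * 1024**3   # 20 GiB of log generated per hour
fraction = 0.10           # share of the log relevant to the task

near_source = wan_bytes_per_hour(log_rate, fraction, "source")
near_target = wan_bytes_per_hour(log_rate, fraction, "target")

print(f"Replicate near source: {near_source / 1024**3:.1f} GiB/h over the WAN")
print(f"Replicate near target: {near_target / 1024**3:.1f} GiB/h over the WAN")
```

With a single CDB hosting many PDBs and a task covering only one of them, the fraction can be much smaller than in this example, which widens the gap further.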
Also, the typical source read loop timeout is a second or a few seconds, whereas the target apply may only happen every few minutes. Therefore, for near(er) real-time replication, source proximity, or in the extreme case running on the source server itself, is better.
However, you CAN configure Replicate to read the source less frequently and in large batches (an extreme example is SQL Server processing only from backup logs), while the target might be in transactional mode or receive smaller batches to many tables. In that case placement 'closer' to the target is better.
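The two cases above can be compared with a very crude round-trip count. This Python sketch assumes, unrealistically, exactly one network round trip per source read or per target apply operation; the 50 ms round-trip time and the operation counts are hypothetical, and real endpoints will behave differently.

```python
# Crude model (assumption): time spent waiting on the long-distance link
# is round-trip time multiplied by the number of operations crossing it.

def wan_wait_seconds_per_hour(rtt_seconds, round_trips_per_hour):
    """Total time per hour spent waiting on long-link round trips."""
    return rtt_seconds * round_trips_per_hour


RTT = 0.05  # hypothetical 50 ms between the two sites

# Case 1: frequent source reads, infrequent batch applies.
source_reads = 3600    # one read loop per second
target_applies = 12    # one batch apply every five minutes
case1_near_source = wan_wait_seconds_per_hour(RTT, target_applies)
case1_near_target = wan_wait_seconds_per_hour(RTT, source_reads)

# Case 2: infrequent bulk source reads, transactional apply on the target.
bulk_reads = 6             # one large read every ten minutes
transactional_ops = 50000  # individual apply operations per hour
case2_near_source = wan_wait_seconds_per_hour(RTT, transactional_ops)
case2_near_target = wan_wait_seconds_per_hour(RTT, bulk_reads)

print("Case 1 wait (s/h): near source", case1_near_source,
      "| near target", case1_near_target)
print("Case 2 wait (s/h): near source", case2_near_source,
      "| near target", case2_near_target)
```

Whichever side is the chattier one is the side you want the short hop on, which is why the configuration matters more than a blanket rule.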
Considering the above, where does your planned solution fit? What are the source and target DB types? Is the target a DB or a message stream? How is the source change log reading expected to be configured? How is the target apply expected to be configured? Filters? Data manipulation? Apply, Store, or both?
Do you agree that the answer is 'it depends' and is likely too involved to be properly answered by a bunch of ever-so-well-willing, but insufficiently informed, volunteers in this forum?!
Good luck!
Hein