Solved: Re: Qlik Replicate Performance - Qlik Community

ankuranand · ‎2022-03-16

Hello Experts,

We have a need to keep data between an Oracle database and a SQL Server database in-sync. Data could be modified on either of the two databases and our objective is to have the update replicated to the other database as quickly as possible. We would also like to avoid situations where two users end up updating same row in either database at the same time. Essentially making the two databases act as master database.

Does Qlik Replicate support this kind of scenario? if so, how quick is to update the other end of the database. There will mostly be transactional updates but on occasions we would have full loads as well (not worried too much about it)

if Qlik replicate is not the right product for this ask, do you have another product that could support this scenario?

Thank you

Ankur

SwathiPulagam · ‎2022-03-16

Hi @ankuranand ,

You can achieve this through Bidirectional replication.

Bidirectional replication enables organizations to synchronize data between two endpoints (henceforth referred to as Endpoint A and Endpoint B), ensuring that both endpoints contain identical records. The endpoints can either be the same type (e.g. Oracle-to-Oracle) or different types (e.g. Microsoft SQL Server-to-Oracle).

Please refer to the below user guide link for more information related to Bidirectional and how to setup.

https://help.qlik.com/en-US/replicate/November2021/Content/Global_Common/Content/SharedEMReplicate/D...

Thanks,

Swathi

View solution in original post

SwathiPulagam · ‎2022-03-16

Hi @ankuranand ,

1. How quickly is the data replicated? is it near-instant or some sort of a timed process?

Answer: near-instant

2. How do you prevent two users updating the same row in two databases, if it happens, who wins?

Answer: In bidirectional replication, loopback prevention is a mechanism that prevents the same data from being replicated back and forth in an endless loop. To enable loopback prevention, you need to specify a source and target Loopback prevention table schema.

Below is the user guide link:

https://help.qlik.com/en-US/replicate/November2021/Content/Global_Common/Content/SharedEMReplicate/C...

Thanks,

Swathi

View solution in original post

Heinvandenheuvel · ‎2022-03-16

Be sure to carefully read the Replicate User Guide, chapter 7 "defining and managing tasks" section "Bidirectional replication"

Batch/bulk apply optimization is not available therefor transactional mode is used. That should be fine for hundreds of changes per second, but problematic for thousands of changes per second.

>> How quickly is the data replicated? is it near-instant or some sort of a timed process?

It is a timed process, not a triggered. You can tighten the timing loop to make it near-instant depending on your definition of near instant.

To keep the databases as will in sync as possible you want to tune for speed and with that possibly increase resource usage. For example for the Oracle source endpoint you probably want to set the Advanced Option "Retry interval:" to 1 or 2 seconds, not the default (5?). For the SQL server source endpoint leave logScanPollingCycleInterval set to 1.

There is loopback prevention, meaning that the same change made on one server applied to an other is not going to be re-applied as a fresh change to the first. Great.

However, there is no conflict resolution. If a row row changed on one server is also changed on the other server while the CDC action is 'in flight' they will overwrite each other. There is timing window which may or might not be relevant for the application. You should expect that timing window to be several seconds, not sub-second.

Hein

View solution in original post

SwathiPulagam · ‎2022-03-16

Hi @ankuranand ,

You can achieve this through Bidirectional replication.

Bidirectional replication enables organizations to synchronize data between two endpoints (henceforth referred to as Endpoint A and Endpoint B), ensuring that both endpoints contain identical records. The endpoints can either be the same type (e.g. Oracle-to-Oracle) or different types (e.g. Microsoft SQL Server-to-Oracle).

Please refer to the below user guide link for more information related to Bidirectional and how to setup.

https://help.qlik.com/en-US/replicate/November2021/Content/Global_Common/Content/SharedEMReplicate/D...

Thanks,

Swathi

ankuranand · ‎2022-03-16

Hello Swathi,

Thank you for the quick response. Good to know this is supported. Maybe we can start with a POC in my team.

Follow up questions

1. How quickly is the data replicated? is it near-instant or some sort of a timed process?

2. How do you prevent two users updating the same row in two databases, if it happens, who wins?

Thanks,

Ankur

SwathiPulagam · ‎2022-03-16

Hi @ankuranand ,

1. How quickly is the data replicated? is it near-instant or some sort of a timed process?

Answer: near-instant

2. How do you prevent two users updating the same row in two databases, if it happens, who wins?

Answer: In bidirectional replication, loopback prevention is a mechanism that prevents the same data from being replicated back and forth in an endless loop. To enable loopback prevention, you need to specify a source and target Loopback prevention table schema.

Below is the user guide link:

https://help.qlik.com/en-US/replicate/November2021/Content/Global_Common/Content/SharedEMReplicate/C...

Thanks,

Swathi

ankuranand · ‎2022-03-16

Great. Thank you Swathi. I have enough information to start a POC now. Hopefully this would go well. Thanks again

Heinvandenheuvel · ‎2022-03-16

Be sure to carefully read the Replicate User Guide, chapter 7 "defining and managing tasks" section "Bidirectional replication"

Batch/bulk apply optimization is not available therefor transactional mode is used. That should be fine for hundreds of changes per second, but problematic for thousands of changes per second.

>> How quickly is the data replicated? is it near-instant or some sort of a timed process?

It is a timed process, not a triggered. You can tighten the timing loop to make it near-instant depending on your definition of near instant.

To keep the databases as will in sync as possible you want to tune for speed and with that possibly increase resource usage. For example for the Oracle source endpoint you probably want to set the Advanced Option "Retry interval:" to 1 or 2 seconds, not the default (5?). For the SQL server source endpoint leave logScanPollingCycleInterval set to 1.

There is loopback prevention, meaning that the same change made on one server applied to an other is not going to be re-applied as a fresh change to the first. Great.

However, there is no conflict resolution. If a row row changed on one server is also changed on the other server while the CDC action is 'in flight' they will overwrite each other. There is timing window which may or might not be relevant for the application. You should expect that timing window to be several seconds, not sub-second.

Hein

ankuranand · ‎2022-03-17

Thank you for the detailed explanation @Heinvandenheuvel

The fact that some changes may get overwritten while CDC action is in-flight worries me a bit. Maybe I should change my application logic a bit to support unidirectional replication so we have 100% consistency.

ankuranand · ‎2022-03-17

Does Qlik allow Selective UniDirectional replication. I have two Tables

Oracle to SQL replicate Table A

SQL to Oracle replicate Table B

Would this ensure 100% consistency and no-overwrites, no-conflicts?

Thanks,

Ankur

Heinvandenheuvel · ‎2022-03-17

@ankuranand wrote:

Does Qlik allow Selective UniDirectional replication. I have two Tables

Oracle to SQL replicate Table A

SQL to Oracle replicate Table B

Would this ensure 100% consistency and no-overwrites, no-conflicts?

Yes, you pick and choose your tables.

If there is no overlap, then you can also just have multiple regular tasks going each way. That' more predictable and some what easier as you can create the tasks with full-load as desired and bulk apply optimization.

The bi-directional task type is specifically there to help when the same table is two tasks each with the same source/target databases but in the other direction. It has a valid purpose but comes with the non-conflict resolution gotcha. Now often that's not a big deal. A single customer would supposedly only call on one line and updates made for that customer come through a single agent. However, if for sake of a silly example, if the call handling is in database-A and the customer web access self-service is in database-B and the customer think it is a good idea to update for example an alternate phone-number online where the agent updates another column in the same row at the same time that will be a potential problem. The more the databases are normalized, the less of a problem that is. Here for example if there is not a phone phone-1 and phone-2 in the base 'customer-details' row but instead there is a phone numbers table with customer-id and phone instance as key then this update conflict would be avoided.

fwiw,

Hein

ankuranand · ‎2022-03-25

Thank you for answering my questions. We are starting a POC next week but I see Qlik only offers cloud trial. For data security, we cannot expose our endpoints to the cloud at this point. Is there any downloadable trial option?

Thanks,

Ankur

Qlik Replicate Performance

Configuration

Performance