topic Re: Replicating huge databases in Qlik Replicate

Replicating huge databases

guilherme-matte — Sun, 19 Feb 2023 22:34:36 GMT

Hello guys!

Re: Replicating huge databases

john_wang — Mon, 20 Feb 2023 07:47:33 GMT

Thanks for you reaching out.

Not sure what's the source/target database type and Replicate version, in general the suggestions are:

1. We may utilize filters to 'cut' the massive size data from source to smaller size to transfer to target side; eg each part contains 1 year history data only. or (2)

2. Using Parallel Load to speed up the transfer if network bandwidth is sufficient.

3. Run Full Load ONLY task to pass the unchanged old data to target by dedicated task(s), or by different filter conditions in the single task;

4. We may transfer unchanged data to different temporary table(s) in target in parallel, then merge the temporary table(s) into target table in off-peak time, before the CDC task startup. Transfer the 'history' data prior to 'change' data to avoid the UPDATE/DELETE cannot find the target rows in target side database/tables.

5. If possible please use partition tables (eg 1 partition for 1 year data) in target DB for easy management and better performance, also it has Primary Key/Unique Index/Unique Key etc to make sure no full table scanning during CDC stage otherwise latency builds up.

6. We'd like to suggest PS team engaged as this not an easy performance tuning job, and need to solve the various issues during the long time running.

Hope this helps.

Regards,

John.

Re: Replicating huge databases

guilherme-matte — Tue, 21 Feb 2023 03:00:15 GMT

Hello John!

Thank you as always for your help.

I will get more information about the steps you mentioned (parallel load, partitions, etc...) and also consider engaging the PS team as well after I get some more info.

In general, what would Qlik be able to handle without many issues? I mean, what would be considered a really big database that should require some extra tunning and usually which are the sizes that Qlik would handle without many problems? I know it might depend on extra factors, but its just to have an idea since HUGE databases, as in the question, is a bit subjective.

Cheers!

Re: Replicating huge databases

john_wang — Tue, 21 Feb 2023 09:10:06 GMT

Hello @guilherme-matte ,

In general we need a configuration tuning if the data is huge however it's hard to tell a number , that depends on hardware/OS settings/network throughputs/database type etc factors, just as you said.

Best Regards,

John.