Solved: Moving from On-Prem to Azure Managed SQL - perform... - Qlik Community

RonFusionHSLLC · ‎2022-06-24

We have just moved our production work loads to testing in Azure. We are in the process of testing and tuning.

Out of the box, we see a significant reduction in performance (5x slower) in Azure Managed SQL (MS-CDC source/Azure Managed SQL target) during full load replication. The Azure Managed SQL instance is sized about the same as our on-prem environments in terms of memory/cpu, I believe AMSI uses SSD, we have way slower HDD on-prem.

Using DPA we see about the same graph proportions from our on-prem source/target as we do in Azure...Azure having wait time numbers around 5x greater than on-prem.

DPA is showing the majority are Memory/CPU waits on the target and the majority are AsyncIO waits on the source - during full load.

Has anyone had similar experience, how did you size/tune (either replicate or azure smi)to get Azure to perform similar to on-prem.

ron

Steve_Nguyen · ‎2022-06-24

below are some link on performance :

https://community.qlik.com/t5/Knowledge/Latency-Performance-Troubleshooting-and-Tuning-for-Replicate...

https://community.qlik.com/t5/Knowledge/Improving-performance-on-Replicate-Target-Apply/ta-p/1906500

https://community.qlik.com/t5/Knowledge/Troubleshooting-Qlik-Replicate-Latency-and-Performance-Issue...

Help users find answers! Don't forget to mark a solution that worked for you! If already marked, give it a thumbs up!

View solution in original post

Steve_Nguyen · ‎2022-06-24

@RonFusionHSLLC

1. i am not sure what is the source and target.

2. normally Replicate would be close to the source as possible.

3. you should enable task to have performance logging to trace to try to isolate where is the slow and work from there.

Help users find answers! Don't forget to mark a solution that worked for you! If already marked, give it a thumbs up!

RonFusionHSLLC · ‎2022-06-24

There are 2 Azure Sql Managed Instances in play

Source is Azure Sql Managed Instance (MS-CDC connection) - Instance1

Target is Azure Sql Managed Instance (target ods is setup as sql server connection - per instructions) - Instance2

We also replicate two tables within Instance2 from a small transactional db to the target ods

The replicate server is in Azure on a windows server - it is as close to the Azure Sql Managed Instances as it can get. I believe the ASMI are in their own subnet or even different network (under Microsoft control) and we have a nat'd connection between our lan an the ASMI.

Thanks for the heads up on the performance logging...

Ron

Steve_Nguyen · ‎2022-06-24

below are some link on performance :

https://community.qlik.com/t5/Knowledge/Latency-Performance-Troubleshooting-and-Tuning-for-Replicate...

https://community.qlik.com/t5/Knowledge/Improving-performance-on-Replicate-Target-Apply/ta-p/1906500

https://community.qlik.com/t5/Knowledge/Troubleshooting-Qlik-Replicate-Latency-and-Performance-Issue...

Help users find answers! Don't forget to mark a solution that worked for you! If already marked, give it a thumbs up!

RonFusionHSLLC · ‎2022-06-27

Our current on-prem performance: 288 tables Less than 2hours. 4cpu/32gig Replication server, buffers 3/8meg (default) - 5 simultaneous table loads - Full Re-load

We ran several tests in the Azure environment described above...on a 8cpu/32gig Replication server in Azure

Increased simultaneous table in tests 1, 5, 10, 15, 20, 25 at a time...very little difference between 10 and 25 - finished in 25hours. Less than 10 - would have finished in days

Increased buffers from 3/8 megto 10/40meg

Performance latencies range from 1-5secs

We see strange behavior once the load gets down to less than 10 tables. While the load is running upto 25tables, we can see single table performances over 10,000/sec (aggregated over 100,000/recs sec) . Once we get down to less than 10 tables, the larger tables (1-14million rows) single table performance drops to 1500/recs per second and it seems that only one of the 7-10 tables is actually loading data, the others show very little or very infrequent activity. What causes this?

Ron

Steve_Nguyen · ‎2022-06-28

@RonFusionHSLLC

normally performance tuning are done by our Professional Service team.

if there are no error/warning/fail then it is best that you get with your account manger to get PS help on tuning .

Help users find answers! Don't forget to mark a solution that worked for you! If already marked, give it a thumbs up!

RonFusionHSLLC · ‎2022-06-28

Engaged with PS as of yesterday...we shall see where that goes.

My first question is what is the math behind session buffers/buffer sizing. What are all the terms, table size, cardinality, commit rates????

Second question is LOB loading - understand the math; how is performance effected.

If I tweak some of these numbers, I get huge swings in memory/cpu usage and thus runtimes.

Additional random questions - segmentation/parallel loading, table priority (does size matter), number of current tables loaded.

What's different about Azure source/targets vs. our on-prem, since our on-prem performance is way better than Azure at this point...that makes no sense as our on-prem environment doesnt have anywhere near the specs configured in Azure - faster processors, SSD disk, more memory...etc.

Want to get to understanding and a point where I'm no long 'trying things' and more so calculating outcome.

ron

Steve_Nguyen · ‎2022-06-30

since you have PS, best to engage them with all performance question, as each environment is unique

Help users find answers! Don't forget to mark a solution that worked for you! If already marked, give it a thumbs up!

Moving from On-Prem to Azure Managed SQL - performance issues

Best Practices

Performance