bindupenmatsa
Partner - Contributor II

Increase in Latency

Hi Team,

We have 7 CDC tasks running in Transactional apply mode, each with different SCNs and schemas. Please find below our Transaction Offload Tuning settings:

Total transactions memory size exceeds (MB): 1024
Transactions duration exceeds (seconds): 60

We were not granted ASM privileges, so the logs are stored outside ASM in a Backup folder.

Previously, data accumulated on the source disk while waiting for the source commit, so we requested an increase in memory, virtual processors, and network bandwidth. We have now been allocated 64 GB of memory, 16 virtual processors, and 50 Mbps of network bandwidth from the source. The data is now being stored on disk on the target side, and we have added the following buffer parameters to the task:

"common_settings": {
"stream_buffers_number": 20,
"stream_buffer_size": 100,

Source - Oracle

Target - Google BigQuery [commit mode is enabled]

We can still see increasing latency across all the tasks. Any suggestions on how we can bring down the latency and speed up the log processing time?

6 Replies
Kent_Feng
Support

Hi @bindupenmatsa 

First, you may need to raise the PERFORMANCE logging component to TRACE and then analyze the task log to determine whether it is source latency or target latency in the first place.
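
To illustrate, a small script can sift those PERFORMANCE trace lines out of the task log. This is only a sketch: the line format is assumed from a typical PERFORMANCE entry (one is quoted later in this thread), and the log file name is hypothetical.

import re

# Sketch: extract latency figures from Replicate PERFORMANCE trace lines.
# Pattern assumes the format "[PERFORMANCE ]T: Source latency N seconds,
# Target latency N seconds, Handling latency N seconds".
PATTERN = re.compile(
    r"Source latency ([\d.]+) seconds, "
    r"Target latency ([\d.]+) seconds, "
    r"Handling latency ([\d.]+) seconds"
)

with open("reptask_example.log", encoding="utf-8", errors="replace") as log:  # hypothetical name
    for line in log:
        m = PATTERN.search(line)
        if m:
            source, target, handling = (float(g) for g in m.groups())
            # Target latency includes source latency, so the apply-side
            # share is the difference between the two.
            print(f"source={source:.2f}s  target={target:.2f}s  "
                  f"apply-side={target - source:.2f}s")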

Secondly, may I ask why you use Transactional apply mode instead of Batch optimized (bulk) apply mode? Transactional apply mode usually causes latency issues.

Thirdly, are those 7 CDC tasks using the same source endpoint and the same target endpoint? If so, can you combine them into fewer tasks? If they share the same source endpoint but have different target endpoints, consider using a Log Stream task.

Thanks

Kent

 

*** Greetings from Down Under ***
Dana_Baldwin
Support

Hi @bindupenmatsa 

In addition to what @Kent_Feng shared, if the latency is *only* due to waiting for the source to commit the changes, there isn't much you can do besides work with the source DBA to have smaller / more frequent commits on the source transactions. Replicate only applies committed changes to the target.

Dana

john_wang
Support

In addition to @Dana_Baldwin's and @Kent_Feng's suggestions, you may replace the GBQ target with a NULL device target, keep all other settings as-is, and rerun the task to see whether the latency disappears. If yes, it means the network and the GBQ target cannot absorb the change volume; if no, the issue is on the source side.

Hope this helps to pinpoint where the bottleneck is.

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
bindupenmatsa
Partner - Contributor II
Author

Thanks team for the responses!

We've set PERFORMANCE to TRACE, and we can see latency on both the source and the target. Please find the log line below:

"[PERFORMANCE ]T: Source latency 337814.62 seconds, Target latency 337814.62 seconds, Handling latency 0.00 seconds"

We tried Batch optimized apply mode earlier, but it did not help us achieve real-time CDC; the latency remained huge.

We designed 7 different tasks because we were provided with different SCNs and schemas, and a few tables do not have a primary key.

We also tried the same task settings with a NULL target; there was no decrease in latency.

These are the I/O figures for the source Backup folder: 9,661 IOPS for reads and 1,704 IOPS for writes.

Is there any other workaround for this issue?

john_wang
Support

Hello @bindupenmatsa ,

Thanks for the information.

>> [PERFORMANCE ]T: Source latency 337814.62 seconds, Target latency 337814.62 seconds, Handling latency 0.00 seconds

This means the bottleneck is on the source side rather than the target side.

Target latency equals source latency plus handling latency.
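
Plugging the numbers from the quoted log line into that relationship is a quick sanity check (plain arithmetic, nothing Replicate-specific):

source_latency = 337814.62   # seconds, from the PERFORMANCE line above
handling_latency = 0.00
target_latency = source_latency + handling_latency
print(target_latency)  # 337814.62 -> matches the logged target latency,
                       # so the apply side is adding essentially nothing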

>> We've tried Batch optimized apply mode earlier which didn't help to meet the real time CDC with huge latency.

It's hard to understand why Batch optimized apply mode would perform worse.

>> We've 7 designed different tasks cause we were provided with different SCN's, Schemas and few tables do not have primary key.

It looks to me like the whole set of tasks needs to be redesigned or merged, because:

1- The redo logs are read 7 times. For example, if you have 10 GB of changes (redo log file size) per day, then in total 70 GB will be transferred between the Oracle server and the Qlik Replicate server over the internet/intranet connection every day. Both I/O resources and network bandwidth are wasted.

2- Too many connections are established to the Oracle database. Let's say one task needs 6 connections; then the total is more than 6*7 = 42 connections. It is much the same on the target side. This occupies database resources on both the source and target sides.

3- Too many tasks not only impact the source and target databases; the tasks also impact each other through resource contention and the like, which may lead to latency.

In short, it looks to me like it is too early to propose a workaround. I'd suggest engaging PS (chargeable) to understand the hardware and software environment and to merge the existing tasks into fewer tasks, at most 2 or 3.

Hope this helps.

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
sureshkumar
Support

Hello @bindupenmatsa 

As John mentioned, the whole set of tasks needs to be redesigned. Last week we communicated the same over a call: this is purely an infrastructure-side issue. Kindly engage the PS team as soon as possible to resolve the source latency issue.

 

Regards,

Suresh