Antony_05
Contributor III

How to increase the throughput in full load

Hi Team,

We are replicating data from Oracle to MS SQL Server in a full load (no CDC), but the throughput is very low and the load is taking a long time on the Attunity/Qlik Replicate side.

 

Can anyone please help me with how to increase the throughput of the table in full load? FYI, we are already loading the table in parallel using partitions.

 

Thanks and Regards,

Antony S

3 Replies
Heinvandenheuvel
Specialist II

Hmm, basic 101 performance troubleshooting is needed.

Please make it very clear whether the concern is about a single table or about overall task throughput for multiple tables.

It seems the concern is a single table, and it seems you have already applied the single most important single-table tuning method, parallel partition loading. Correct.

"very less" - less than expected/needed?  What ball-park number? Is the expected vs actual load time seconds? Minutes? Hours? Days?

What is the speed concern? How many rows/sec are being loaded, and how many rows/sec are expected/needed? Ditto for MB/sec.

Does the table have (many) LOB columns? What is the average row size? How many rows?

When you switched from a simple load to a partitioned load, what was the observed improvement, if any?

How many partitions? What is the partitioning method? By PK ranges?
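
For context, range-segmented parallel unload conceptually looks like the sketch below. Replicate does this internally when you define Parallel Load segments, so this is only an illustration of the idea; the table, column, segment boundaries, and connection details are made-up placeholders.

```python
# Conceptual illustration only: Replicate's Parallel Load does the segmented
# unload internally. This just shows the idea of splitting a full-table read
# into ranges and running the ranges concurrently against Oracle. Table,
# column, segment boundaries, and connection details are hypothetical.
from concurrent.futures import ThreadPoolExecutor

import oracledb  # assumed available: pip install oracledb

SEGMENTS = [(202301, 202303), (202304, 202306),
            (202307, 202309), (202310, 202312)]

def unload_segment(bounds):
    lo, hi = bounds
    # Each worker opens its own connection and reads only its MONTH_ID range.
    with oracledb.connect(user="scott", password="tiger",
                          dsn="dbhost/orclpdb") as conn:
        cur = conn.cursor()
        cur.arraysize = 10_000          # large fetch batches help throughput
        cur.execute(
            "SELECT * FROM sales_fact WHERE month_id BETWEEN :lo AND :hi",
            lo=lo, hi=hi)
        rows = 0
        while True:
            batch = cur.fetchmany()
            if not batch:
                break
            rows += len(batch)
        return lo, hi, rows

with ThreadPoolExecutor(max_workers=len(SEGMENTS)) as pool:
    for lo, hi, rows in pool.map(unload_segment, SEGMENTS):
        print(f"segment {lo}-{hi}: {rows} rows unloaded")
```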

Where do you THINK the bottleneck might be? Source? Target? Network? Replicate server? For starters, quickly check the overall system CPU load. Nothing more than 50%, I hope? Then look at the CPU for the active processes in the load, just the per-process load. Is any process near 100% of a core?
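
If it helps, here is a minimal sketch of that per-process check, assuming the psutil Python package is available on the host. Remember that 100% here means one full core, so a single process pinned near 100 can be the bottleneck even when overall CPU looks low.

```python
# Quick per-process CPU check on the Replicate server (or source/target host).
# Minimal sketch assuming psutil is installed: pip install psutil
import time

import psutil

procs = list(psutil.process_iter(['name']))
psutil.cpu_percent(None)                 # prime the system-wide counter
for p in procs:
    try:
        p.cpu_percent(None)              # prime each per-process counter
    except psutil.Error:
        pass

time.sleep(5)                            # sample window

print(f"overall CPU: {psutil.cpu_percent(None):.0f}%")
samples = []
for p in procs:
    try:
        samples.append((p.cpu_percent(None), p.pid, p.info['name']))
    except psutil.Error:
        pass                             # process exited or access denied

# Per-process values above ~100 mean a process is saturating a single core.
for cpu, pid, name in sorted(samples, reverse=True)[:10]:
    print(f"{cpu:6.1f}%  pid={pid:<7} {name}")
```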

What is the top SQL on the source during the load? Does it seem reasonable? How many resources does it use?
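
One quick way to take a top-SQL snapshot on the Oracle source is sketched below (Oracle 12c+ syntax; the credentials and DSN are placeholders, and your DBA may prefer AWR/ASH reports instead).

```python
# Hypothetical sketch: pull a quick "top SQL by elapsed time" snapshot from
# the Oracle source during the load. Requires SELECT access on v$sqlarea.
import oracledb  # assumed available: pip install oracledb

TOP_SQL = """
    SELECT sql_id,
           executions,
           ROUND(elapsed_time / 1e6, 1) AS elapsed_s,
           buffer_gets,
           disk_reads,
           SUBSTR(sql_text, 1, 80)      AS sql_text
      FROM v$sqlarea
     ORDER BY elapsed_time DESC
     FETCH FIRST 10 ROWS ONLY
"""

with oracledb.connect(user="monitor", password="changeme",
                      dsn="dbhost/orclpdb") as conn:
    cur = conn.cursor()
    cur.execute(TOP_SQL)
    for sql_id, execs, elapsed_s, gets, reads, text in cur:
        print(f"{sql_id}  exec={execs}  elapsed={elapsed_s}s  "
              f"gets={gets}  reads={reads}  {text}")
```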

Have you tried a NULL or FILE target, to exclude the target DB and the network to that DB?

Good luck!

 

Antony_05
Contributor III
Author

Hi Heinvandenheuvel,

It's about a single table.

It seems the concern is a single table, and it seems you already applied the single most important single-table tuning method being parallel partition loading. Correct.
ANS:
Yes

"very less" - less than expected/needed? What ball-park number? Is the expected vs actual load time seconds? Minutes? Hours? Days?
ANS:
Current throughput: 8,000 rows/sec
Required throughput: more than 30,000 rows/sec

Current time taken: 9 hrs
Required time: below 4 hrs

Does the table have (many) lob columns? Avg row sizes? How many rows?
ANS:
No LOB columns
Row count: 242,628,371

How many partitions? What is the partitioning method? By PK ranges?
ANS:
Partitioned by Month ID.

Where do you THINK the bottleneck might be? Source? Target? Network? Replicate Server? For starters quickly check the overall system CPU loads. Nothing more than 50% I hope? Now just look at the CPU for the active processes in the load, only the per-process load. Is any process near 100% of a core?
ANS:
As far as I know, the table is being replicated to MSSQL, where the table is compressed while loading.

Have you tried a NULL , or FILE target to exclude a target TB and the network to that DB?
ANS:
Yes, when I tried loading it to a NULL target, it loaded at a very high speed.

 

Thanks and Regards,

Antony S

Heinvandenheuvel
Specialist II

9 hours for 250 million rows is indeed roughly 8,000 rows/sec.
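
The arithmetic, for reference (the 200-byte average row size is only an assumed figure to show the MB/sec side):

```python
# Back-of-the-envelope throughput math for the numbers in this thread.
# The 200-byte average row size is an assumption purely for illustration.
rows        = 242_628_371
hours_taken = 9
avg_row_b   = 200                        # assumed; check the real table

actual_rps  = rows / (hours_taken * 3600)
actual_mbps = actual_rps * avg_row_b / 1e6

target_rps  = 30_000
hours_at_target = rows / target_rps / 3600

print(f"actual: {actual_rps:,.0f} rows/sec (~{actual_mbps:.1f} MB/sec)")
print(f"target: {target_rps:,} rows/sec -> {hours_at_target:.1f} hours")
# actual: 7,489 rows/sec (~1.5 MB/sec)
# target: 30,000 rows/sec -> 2.2 hours
```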

>>> Yes, When I tried loading it to a NULL target it loaded at a very high speed.

Ah! That's critical information. It suggests that the source and the network can deliver. This may well turn out to be 100% a target DB configuration issue. Be sure to engage with your target DB managers/consultants to explore top queries, resource usage, and so on.

Have you loaded into a simple, non-compressed target table as a comparison? I strongly recommend testing with, for example, a 10 million row subset, or just watching the throughput for 5 minutes and "calling it" without waiting for the complete full load. You can also increase the task logging levels for SOURCE_UNLOAD and TARGET_LOAD to TRACE for a few minutes while loading, to see if that gives any hints.

In the task settings, under Full Load Tuning, there is "Commit rate during full load". The default is just 10,000 (to limit Replicate memory usage). Have you tried adding a 0 to make it 100,000? If that helps, try going even higher to see if it helps still more.

Hein