Re: single table parallel load only uses one segme... - Qlik Community

RichJ · ‎2024-01-26

Hi Sir,

I need to replicate a large table from MS SQL Server to Snowflake db. "Use Data Range" is selected for Parallel Load Method, however, there is no parallel loading occurred: the source db did not produce data files in parallel; rather, it unloaded the data for the 1st segment first; once that finished, it started to unload data for the 2nd segment; and so on. For each segment, data files were also produced sequentially without parallelism (there are about 5 or 6 datafiles for each segment).

Are there some more configurations needed to be done in order to have parallel operation?

Thanks,

Richard

john_wang · ‎2024-01-26

Hello @RichJ ,

Thanks for reaching out to Qlik Community!

During the Full Load stage , a partition (or called segment) is handled just like it's a separate 'table'. By default up to 5 segments can be loaded in parallel, this number controlled by the parameter:

Task setting --> Full Load --> Full Load Tuning --> Tuning Settings --> Maximum number of tables (default is 5).

Not sure how many tables are included in the task, if the above setting number is too small then less tables can be handed in parallel. Or you may check the table setting to see the parallel load tables number. In my below sample, my task setting "Maximum number of tables" is set to 12:

BTW, there is Load Order setting in the task. please set the partition table load order priority to see if the table's partition load can be startup prior to other tables. By default, these tables are handled by their name order sequence.

Hope this helps.

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

deepaksahirwar · ‎2024-01-28

Dear @RichJ ,

You have selected “Use Data Range” as the parallel load method, but you did not see any parallel loading happening. This could be due to several reasons, such as:

The data range columns that you selected are not evenly distributed, which means that some segments may have more data than others and take longer to load.
The number of processes that you specified in the task settings is too low, which limits the parallelism that can be achieved.
The source or target endpoints have some limitations or bottlenecks that prevent parallel loading, such as network bandwidth, CPU usage, memory consumption, or disk I/O.

Resolution:

To troubleshoot this issue, you can try the following steps:

Check the data distribution of the columns that you used to define the data ranges, and make sure that they are balanced and indexed. You can use the SQL Server Management Studio to run some queries and analyze the data statistics.

Increase the number of processes in the task settings, under the Full Load tab, in the Full Load Tuning section, in the Maximum number of tables field. You can experiment with different values, but be careful not to overload the source or target endpoints. You may need to consult with your DBA to find the optimal number.

Monitor the performance and resource utilization of the source and target endpoints, and see if there are any factors that slow down the parallel loading. You can use the Qlik Enterprise Manager to view the task logs, metrics, and alerts, and identify any errors or warnings.

If our response has been helpful, please consider clicking "Accept as Solution". This will assist other users in easily finding the answer.

Best Regards,

Deepak

SushilKumar · ‎2024-01-29

Hello team,

If our response has been helpful, please consider clicking "Accept as Solution". This will assist other users in easily finding the answer.

Regards,

Sushil Kumar

RichJ · ‎2024-01-29

Thanks John for the reply.

Maximum number of tables was set to 8 in the Full Load setting, and only that big table is running in the replication. The target is snowflake db that has 8 cores, however we don't see data files coming to SF parallelly. The issue is on the source site (SQL Server) - it produces datafile sequentially; I 'm wondering if some settings in Qlik can affect that.

RichJ · ‎2024-01-29

Thanks Deepak for your reply.

1. we used 8 segments, and data is distributed in the 8 segments quite evenly

2. we used 8 parallel processors in the task setting.

3. the target db is snowflake that has enough computing resources. The issue should be in the source SQL server. The qlik task log shows data files are produced one segment after another segment sequentially - no parallelism.

john_wang · ‎2024-01-30

Hello @RichJ ,

Thanks for the feedback. I'd like suggest you open a support ticket with below information:

1- The source table creation DDL (include the PK and partition information)

2- The target table creation DDL (if the table was NOT created by Qlik Replicate)

3- (optional) the source table size, or partition rows number, it's hopeful for our reproduction

4- The task Diagnostics Packages with source_unload/target_load set to Trace

Qlik Support Team would like to help you further.

Best Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

single table parallel load only uses one segment at a time

Best Practices

Configuration

Performance