Re: Full load alternative - Qlik Community

Kerri · ‎2025-07-10

We have a table that is over 1.5 million rows. This table is replicating from MYSQL to Snowflake via a CDC task. We have to increase the LOB limit because the json values are being truncated. What are some alternatives to repopulating the historical data without having to do a full reload.

john_wang · ‎2025-07-13

Hello @Kerri ,

I'm not entirely sure I understood your question correctly. If I got it right, you're saying that the initial Full Load was completed using a specific LOB size setting. Now you want to increase the LOB size in the Replicate task settings and replicate the LOB columns again with this new (larger) size. Is that correct?

If so, here are two possible options to proceed:

Option 1:

Replicate only the primary key and LOB columns (with the updated, larger LOB size) to a temporary table, then merge the LOB columns from the temporary table into the final target table within Snowflake.

Note: This approach can be complex and may not guarantee full consistency, since the two tasks (original and temporary) run asynchronously.

Option 2:

Reload the original task with the new, larger LOB size setting.

This is the safer and more reliable option for data integrity.

You can also enable Parallel Load to accelerate the initial load process if the dataset is large.

Hope this helps.

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

Kerri · ‎2025-07-15

Hello John!

Yes, you understood the question correctly. Due to the size of the table, a full reload of that table would take several days, which is what we are trying to avoid, therefore reloading the original task isn't really an option.

Option 1 does sound like something we could try, although your point of it being more complex makes me nervous.

We currently have the full load tuning to 8 for this dataset.

john_wang · ‎2025-07-15

Hello @Kerri ,

Thanks for the update.

Yes, neither option is ideal for such a large dataset. Parallel Load might be the more manageable approach, even though it takes time and consumes significant resources.

Best Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

Kerri · ‎2025-07-16

Hello again John,

What is the most conservative approach for this setting? We currently have the value set to 8. I don't want to negatively impact the other tasks nor bring down MYSQL.

Thanks in advance

Full load alternative

Best Practices

CDC - 1-many - Log Stream

Option 1:

Option 2: