Hi all -
I'm wondering if I could collect some recommendations for a full load. I appear to be getting a disconnect-from-the-DB issue that restarts the full load each time. I haven't set the logs to verbose yet to see what's happening, but what I do know is that the tables being loaded are in the tens of billions of records.
Given that record count, I'm wondering if there is something obvious I may need to set to stop this happening. It should be noted it's taking about 5 hours per billion records.
Thanks.
Hello @sreaney89 ,
Thanks for reaching out to Qlik Community!
There are several issues with the task that need to be addressed:
Improper Task Settings:
Please disable Apply Changes Processing and keep only Store Changes Processing enabled. This may resolve some configuration-related errors.
Error: "WAL reader terminated with broken connection / recoverable error. WAL stream loop ended abnormally":
This error is causing the task to stop and attempt auto-recovery. The root cause is likely network-related: potential issues include an unstable connection, a connection timeout, firewall rules closing inactive connections, server settings, or resource constraints on the server. To mitigate this, please enable the WAL heartbeat on the PostgreSQL source endpoint and check whether it improves stability.
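If the drops persist even with the WAL heartbeat enabled, it can also be worth inspecting the timeout and keepalive settings on the PostgreSQL server itself, since any of them can close a long-lived connection mid-load. The statements below are a sketch using standard PostgreSQL settings; the values shown are illustrative, not recommendations:

```sql
-- Inspect server-side timeouts that can terminate a long-lived
-- replication or query connection (all standard PostgreSQL settings):
SHOW wal_sender_timeout;    -- drops replication connections that look dead
SHOW statement_timeout;     -- 0 means no limit on statement runtime
SHOW tcp_keepalives_idle;   -- 0 means "use the OS default"

-- Illustrative adjustment: raise wal_sender_timeout and enable TCP
-- keepalives so intermediate firewalls are less likely to silently
-- drop an idle-looking connection. Requires superuser privileges.
ALTER SYSTEM SET wal_sender_timeout = '5min';
ALTER SYSTEM SET tcp_keepalives_idle = 60;
SELECT pg_reload_conf();    -- apply without a server restart
```

Comparing these values against the timing of the "broken connection" errors in the verbose logs usually makes it clear whether the server or something in between (firewall, load balancer) is closing the connection.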
These errors are negatively affecting the full load performance and may lead to the full load stopping and restarting during recovery.
The current load rate is about 5 hours per billion records, so a table in the tens of billions (say 20 billion rows) would take roughly 100 hours end to end; every disconnect that forces a restart therefore costs days, not hours.
To improve performance, it is worth reviewing the full load tuning options for the task, for example loading a large table in segments with the Parallel Load feature (by ranges or partitions) and adjusting how many tables are loaded in parallel.
Hope this helps.
John.
Just another quick question - if I start and stop the full load task, does it restart from the beginning and attempt to load the entire table? Something I've noticed as well is that some of the estimates it's generating are way off... in the millions rather than billions of records.
Hi @sreaney89 ,
If you stop a task while the full load is in progress, then Stop and Start will reinitiate a fresh reload.
Regards
Arun
Hi @sreaney89
To add to @aarun_arasu 's post, we run a simple "select *" query against the source rather than a "select * order by" for performance reasons. Due to this, there is no way to track where a full load left off in order to resume. Also, this could leave out records if some were inserted after the task stopped and before it was resumed.
I hope this helps!
Dana
Hi @sreaney89 ,
In the past, we had a feature that allowed us to resume loading from a specific record. However, we found this feature to be impractical.
To resume from a processed record, we need to query the records in order, such as by primary key or unique index. If a table contains many records, using "ORDER BY" can place a significant load on the system due to the need to sort the records. Therefore, as @Dana_Baldwin mentioned, we avoid using "ORDER BY" for performance reasons.
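To illustrate the trade-off described above, here is a sketch of the two query shapes. The table name `big_table`, the key column `id`, and the `:last_loaded_id` bookmark parameter are hypothetical illustrations, not what Replicate actually issues:

```sql
-- What a full load effectively runs: a single unordered scan.
-- Fast (no sort step), but there is no stable notion of "position"
-- in the result set, so a stopped load cannot resume part-way.
SELECT * FROM big_table;

-- What a resumable load would need: a deterministic order plus a
-- bookmark. On tens of billions of rows the ORDER BY forces a sort
-- or an index-ordered scan, adding significant load on the source,
-- and rows inserted below the bookmark while the task was stopped
-- would still be missed.
SELECT * FROM big_table
WHERE id > :last_loaded_id   -- bookmark saved from the previous run
ORDER BY id;
```

This is why stopping and restarting a full load always reinitiates a fresh reload: the unordered scan is the only option that is both fast and correct at this scale.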
Regards,
Desmond