We have a task with Microsoft SQL Server as the source and Snowflake as the target. A huge volume of updates (~50 million) is happening on one table, which results in approximately 25-50 million records in the incoming changes. This is causing a latency of roughly 15 hours.
A few observations we captured:
1. In Snowflake we see a time difference of more than 1 minute between two batches (copy, merge, truncate is one batch). A screenshot is attached; see also the sketch after this list for one way to measure the gap.
2. We took logs with TARGET_APPLY at VERBOSE and observed the following:
2.1. The table with these updates has LOB columns (nvarchar(-1), i.e. nvarchar(max)). The slowness happens ONLY during change processing.
2.2. There is approximately 1 minute between two batches (two COPY statements) in the TARGET_APPLY logs. Most of this time appears to be spent looking up the LOB data from SQL Server, one record at a time. The logs are attached (logs.txt); we see 'Going to fill lob data for columns in table dbo.XXX' logged for each record in the batch.
3. I have attached the change processing tuning parameters we use for this task, along with the batch slowness we observed.
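The gap between batches can also be measured on the Snowflake side from QUERY_HISTORY. Below is a rough sketch; the account, credentials, warehouse/database names, and the ILIKE filter are placeholders, not our actual values:

```python
# Hedged sketch: measure the idle gap between consecutive COPY INTO batches
# on the Snowflake side using the INFORMATION_SCHEMA.QUERY_HISTORY table function.
# All connection parameters below are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",      # placeholder
    user="my_user",            # placeholder
    password="my_password",    # placeholder
    warehouse="MY_WH",         # placeholder
    database="MY_DB",          # placeholder
)

cur = conn.cursor()
cur.execute("""
    SELECT query_text, start_time, end_time
    FROM TABLE(information_schema.query_history(result_limit => 1000))
    WHERE query_text ILIKE 'COPY INTO%'
    ORDER BY start_time
""")
rows = cur.fetchall()

# Print the idle time between the end of one COPY and the start of the next.
for prev, curr in zip(rows, rows[1:]):
    gap = (curr[1] - prev[2]).total_seconds()
    print(f"{prev[2]} -> {curr[1]}  gap: {gap:.0f} s")

cur.close()
conn.close()
```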
Questions:
1. Is this issue caused by the LOB lookup? If so, is there a way to have LOBs read 'inline' during change processing as well? Is there something like a 'maxLobSize' option that could be set so the lookup only happens when a value exceeds that size, and values within it are read 'inline'? Note that we are specifically asking about change processing, as full load does not present this issue.
2. I see this in the Qlik documentation: 'When replicating from Microsoft SQL Server, inline LOBS will always be read directly from the logs (i.e. without lookup).' Isn't nvarchar(-1) an inline LOB, and shouldn't it be read inline instead of the lookups we see in the logs?
3. Or is it not related to LOBs at all but something else? I have attached a screenshot of the fine-tuning settings we have for Qlik change processing.
Hi @sandeep_r
Yes, the source lookup is a separate thread/connection to the source that is used to populate the data on the target, and it will impact performance. The maximum size for an inline LOB is 8,000 bytes. You can set the 'Limit LOB size' value in the task settings to below this limit to prevent the lookup.
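Before lowering the LOB size limit, it is worth checking how large the LOB values on the source actually are, so the new limit does not truncate data. A rough sketch, assuming a placeholder table dbo.XXX and a placeholder column lob_column:

```python
# Hedged sketch: check actual LOB sizes on the SQL Server source before
# lowering the task's 'Limit LOB size'. Table name, column name, and the
# connection string are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=my_sql_server;DATABASE=my_db;"   # placeholders
    "UID=my_user;PWD=my_password"
)

cur = conn.cursor()
cur.execute("""
    SELECT MAX(DATALENGTH(lob_column)) AS max_bytes,
           AVG(DATALENGTH(lob_column)) AS avg_bytes
    FROM dbo.XXX
""")
max_bytes, avg_bytes = cur.fetchone()
print(f"max: {max_bytes} bytes, avg: {avg_bytes} bytes")

# If max_bytes is comfortably below 8,000, a 'Limit LOB size' under 8 KB
# should avoid truncation while keeping the values inline.
conn.close()
```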
Also, your change processing tuning settings are at the defaults, which are not ideal for a Snowflake target; Snowflake prefers larger batches. Here are the "best practices" settings. They may require further adjustment for your use case; for help with tuning, our Professional Services team offers (fee-based) assistance.
1) To increase the size of the batches sent to Snowflake, please edit the settings in Task Settings \ Change Processing \ Change Process Tuning:
Set 'Longer than (seconds)' to 299
Set 'But less than (seconds)' to 300
Set 'Force apply a batch when memory exceeds (MB)' to 2000
Transaction offload tuning (helps prevent caching to disk):
Set 'Total transactions memory size exceeds (MB)' to 5000
Set 'Transactions duration exceeds (seconds)' to 86400
2) Depending on the Snowflake warehouse size (XS/S/M/L), set the Max file size on the Advanced tab of the endpoint connection to 500/1000/1500/2000, respectively.
3) Snowflake Advanced tab -> Additional ODBC connection properties:
CLIENT_SESSION_KEEP_ALIVE=true;ABORT_DETACHED_QUERY=true;CLIENT_SESSION_KEEP_ALIVE_HEARTBEAT_FREQUENCY=900;
This helps to ensure the connection stays active.
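To confirm these properties are in effect, you can inspect the session parameters from the Snowflake side with SHOW PARAMETERS. The sketch below is illustrative only: it sets the same parameters via the Python connector's session_parameters and reads them back; connection details are placeholders.

```python
# Hedged sketch: set the same session parameters as the ODBC properties and
# read them back with SHOW PARAMETERS. Connection details are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="my_password",  # placeholders
    session_parameters={
        "CLIENT_SESSION_KEEP_ALIVE": True,
        "ABORT_DETACHED_QUERY": True,
        "CLIENT_SESSION_KEEP_ALIVE_HEARTBEAT_FREQUENCY": 900,
    },
)

cur = conn.cursor()
cur.execute("SHOW PARAMETERS LIKE 'CLIENT_SESSION_KEEP_ALIVE%' IN SESSION")
for row in cur.fetchall():
    # SHOW PARAMETERS returns key, value, default, level, description
    print(row[0], "=", row[1])
conn.close()
```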
Thanks,
Dana