TomaszRomanowski
Partner - Contributor II

Job hangs after start (LogStream to Databricks)

Qlik Replicate version: 2023.11.0.468

Source: MSSQL with LogStream (data rotation 3 days, max data 300 GB).
Target: Databricks Delta

Job (MSSQL -> LogStream) tables:
9 typical tables + 1 table with an XML column (max LOB size 20 MB).

Job (LogStream -> Databricks) tables:
Only the one table with the XML column (max LOB size 20 MB).

Current status:
In Databricks I have the data for these tables, but it is 32 hours behind real time.
So I want to run the task without a full load, starting from a point in time 32 hours in the past.
In LogStream I have 3 days of rotation data (currently about 80 GB).

So when I run the task I use the option:
Advanced Run Options -> Tables are already loaded. Start processing changes from -> Date and Time
There I enter a date 32 hours in the past.

Configuration of the job (LogStream -> Databricks):
- max lob size = 20480 (KB)
- stream_buffers_number = 5
- stream_buffer_size = 1024
- Apply Conflicts -> "Duplicate key when applying Insert" = UPDATE
- one extra calculated column
- Total transactions memory size exceeds (MB) = 2048
- target max file upload size = 2000 MB (2 GB)
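As a sanity check (a quick calculation, not anything Qlik-specific), the 2048 MB transaction memory setting corresponds exactly to the 2147483648-byte "Memory size limit" reported later in the logs, and the "Memory size needed" value exceeds it by only about 200 KB, which is why the sorter starts swapping committed transactions to disk:

```python
# Relate the task setting to the byte values in the SORTER_STORAGE log lines.
limit_mb = 2048                       # "Total transactions memory size exceeds (MB)"
limit_bytes = limit_mb * 1024 * 1024  # reported as "Memory size limit" in the log
needed_bytes = 2147684472             # "Memory size needed" from the log

print(limit_bytes)                    # 2147483648, matching the log line exactly
print(needed_bytes - limit_bytes)     # 200824 bytes (~196 KB) over the limit -> swap to disk
```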


The task starts, sends some data to the target, and then hangs indefinitely.
There is no notable CPU or disk usage.

Logs show the following key messages:

[SORTER ]T: Reading from source is paused (sorter_transaction.c:76)
...
[SORTER_STORAGE ]T: Memory size needed 2147684472, Memory size limit: 2147483648 (transaction_storage.c:5380)
...
[SORTER_STORAGE ]T: Swap committed transaction to free memory, transaction index 10 (transaction_storage.c:5398)
[SORTER_STORAGE ]T: Transaction (Type 'Commited Transactions storage', Id 00000000000000000000000000000bae, Events # 3) is moving to file 'D:\qlik\Replicate\data\tasks\MSSQL_LS_DBRICKS/sorter/ars_swap_tr_00000000000000000001.tswp' (transaction_storage.c:5482)
...
[SORTER_STORAGE ]T: Memory size needed 2094508184, Memory size limit: 2147483648 (transaction_storage.c:5420)
...
[SORTER ]T: Reading from source is paused (sorter_transaction.c:76)
...
[FILE_FACTORY ]T: Source 'D:\qlik\Replicate\data\tasks\MSSQL_LS_DBRICKS\cloud\bulk\CDC00000001.csv.gz' exists (type = 2), size is 7129552 bytes (at_universal_fs_object.c:658)
[FILE_FACTORY ]T: uploading file 'D:\qlik\Replicate\data\tasks\MSSQL_LS_DBRICKS\cloud\bulk\CDC00000001.csv.gz' to '/staging/attrep_changes32432562243S4234D/CDC00000001.csv.gz' (AttAdls2FileFactory.java:52)
[FILE_FACTORY ]T: upload done (AttAdls2FileFactory.java:55)
...
[FILE_FACTORY ]T: upload of file <D:\qlik\Replicate\data\tasks\MSSQL_LS_DBRICKS\cloud\bulk\CDC00000001.
[TARGET_APPLY ]T: Data is copied to attrep_changes table (cloud_bulk.c:1245)
[TARGET_APPLY ]T: cloud_bulk_start_applying - refreshing net changes table (cloud_bulk.c:1527)
[TARGET_APPLY ]T: Refresh table - owner: qlik, table: attrep_changes32432562243S4234D (databricks_imp.c:607)
[TARGET_APPLY ]T: Execute statement: REFRESH TABLE `qlik`.`attrep_changes32432562243S4234D` (databricks_imp.c:612)
..
[TARGET_APPLY ]T: Start applying of 'UPDATE (3)' events for table 'dbo'.'TABLE_XML' (1). (bulk_apply.c:2903)
[AT_GLOBAL ]T: Bulk update statement:
[TARGET_APPLY ]T: Going to run update statement, from seq 1 to seq 3. 'MERGE INTO `qlik`.


Then, repeating in a loop:
[SORTER_STORAGE ]T: Forwarded counters. Tr. index 0, Joined tr. # 259 (transaction_storage.c:2298)
[SORTER_STORAGE ]T: Forwarded counters. Tr. index 1, Joined tr. # 252 (transaction_storage.c:2298)
[SORTER_STORAGE ]T: Forwarded counters. Tr. index 2, Joined tr. # 251 (transaction_storage.c:2298)
[SORTER_STORAGE ]T: Forwarded counters. Tr. index 3, Joined tr. # 242 (transaction_storage.c:2298)
[SORTER_STORAGE ]T: Forwarded counters. Tr. index 4, Joined tr. # 246 (transaction_storage.c:2298)
[SORTER_STORAGE ]T: Forwarded counters. Tr. index 5, Joined tr. # 307 (transaction_storage.c:2298)
[SORTER_STORAGE ]T: Forwarded counters. Tr. index 6, Joined tr. # 370 (transaction_storage.c:2298)
[SORTER_STORAGE ]T: Forwarded counters. Tr. index 7, Joined tr. # 370 (transaction_storage.c:2298)
[SORTER_STORAGE ]T: Forwarded counters. Tr. index 8, Joined tr. # 362 (transaction_storage.c:2298)
[SORTER_STORAGE ]T: Forwarded counters. Tr. index 9, Joined tr. # 318 (transaction_storage.c:2298)
[SORTER_STORAGE ]T: Forwarded counters. Tr. index 10, Joined tr. # 191 (transaction_storage.c:2298)

There is also a single occurrence of:
[IO ]T: rep_net_server_select: Server poll timeout. failed (at_repnet.c:489)

After this, the "Forwarded counters. Tr. index ..." messages still continue.

In the sorter folder, Qlik Replicate created several swap files (some with data, some empty).
After a while I tried to stop the task, but it remained in "stopping" mode for a long time, so I restarted the Qlik Replicate service to regain control.

It seems that Qlik Replicate was somehow unable to read data from LogStream when starting from a point back in time:
it reads part of the data and then hangs.
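To quantify how often the sorter pauses versus makes progress, a small log-triage script can count the key messages in the task log. This is a generic sketch: the message strings come from the excerpts above, and the log path in the usage comment is only an example, not an official Qlik tool:

```python
import re
from collections import Counter

# Messages of interest, taken from the log excerpts above.
PATTERNS = {
    "paused":  re.compile(r"Reading from source is paused"),
    "swap":    re.compile(r"is moving to file .*\.tswp"),
    "timeout": re.compile(r"Server poll timeout\. failed"),
    "forward": re.compile(r"Forwarded counters\. Tr\. index \d+"),
}

def triage(log_text: str) -> Counter:
    """Count occurrences of each key message in a Replicate task log."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pat in PATTERNS.items():
            if pat.search(line):
                counts[name] += 1
    return counts

# Example usage (adjust the path to your own task log):
# with open(r"D:\qlik\Replicate\data\logs\MSSQL_LS_DBRICKS.log") as f:
#     print(triage(f.read()))
```

A high ratio of "paused" and "forward" counts with no new "swap" or apply activity would support the theory that the sorter is stuck waiting rather than the target being slow.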


Questions:
1. Why did replication to the target stall despite no errors on the Databricks side?

2. Are the following log messages indicative of the root cause?
"Reading from source is paused"
"rep_net_server_select: Server poll timeout. failed"

3. Could the issue stem from:
- Memory constraints on the sorter?
- LogStream not delivering data correctly when processing historical changes?

The messages that stand out to me are:
Reading from source is paused
rep_net_server_select: Server poll timeout. failed

1 Reply
sureshkumar
Support

Hello @TomaszRomanowski 

Could you please open a support ticket so we can analyze this further?

 

Regards,

Suresh