Solved: Re: Data gap configuration - Qlik Community

ilham-alldata · ‎2024-09-24

Our case: We want to do an unplanned switchover from source A to source B.

Before doing the switchover, our team conducted testing first:
1. We use DB2 for iSeries (IBM AS400) as the source.
2. We use MIMIX for data replication, connected directly to the source, for High Availability (HA) needs.
3. We also use one server dedicated to Qlik Replicate for data replication and data streaming to the target (Kafka).
4. We ran a replication test for both with the same table, the same amount of data, the same source, and the same duration, but there is a difference in the number of records streamed by MIMIX and by Qlik CDC.
5. We tweaked the Manage Endpoint advanced setting 'Check for changes every (sec)', changing it from the default of 5 sec to 1 sec (attachment: Manage Endpoint Advanced Setting.jpg).

For more details, see the test result table (attachment: Test MIMIX-Qlik.jpg):
- We made sure the test duration matches the time at which the connection to the source was turned off.
- The 'Source Machine (P8GTI3)' column is the number of data records on the source
- The 'Mimix target' column is the number of data records that were successfully replicated by Mimix to the target
- The 'CDC' column is the number of data records that were successfully replicated by Qlik Replicate to the target (Kafka)
This result is in accordance with the total completed in the Full-load process and
- The 'Delta Source-Target Mimix' column is the difference in data records on Source (P8GTI3) and Mimix Target
- The 'Delta Mimix-CDC' column is the difference in data records on Source (P8GTI3) and CDC target
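
For readers following along, the two delta columns described above are plain differences of record counts. A minimal Python sketch, assuming the column definitions exactly as listed; the function name and the counts are placeholders for illustration and are not taken from the attachment:

```python
def record_gaps(source_count, mimix_count, cdc_count):
    """Compute the two delta columns as described in the list above.

    Both deltas are measured against the source count, following the
    column descriptions in the post.
    """
    return {
        "delta_source_mimix": source_count - mimix_count,
        "delta_source_cdc": source_count - cdc_count,
    }


# Placeholder counts for illustration only; not values from Test MIMIX-Qlik.jpg.
gaps = record_gaps(source_count=1000, mimix_count=990, cdc_count=950)
print(gaps)
```

A larger delta for a target means more source records had not yet arrived there when the connection was cut.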

Our questions are:
- What is the best configuration to reduce the data gap in the case above?
- How does Qlik Replicate work for streaming data?
- In the image below (Fullload.jpg), the 'Remaining' column shows *Estimated even though the Full Load process is 100% complete. What does that mean?
- We have a large network bandwidth of 10 Gb/s, but less than 50% of it is used. We want to take advantage of it to boost streaming performance to near real time. What is the best way: combine tables with the same journal and schema in one task, or split the tables into several tasks? Will this have an effect?

@rivan-alldata @nikko_alldataint

john_wang · ‎2024-09-24

Hello @ilham-alldata , copy @rivan-alldata , @nikko_alldataint ,

Thanks for reaching out to Qlik Community!

We need additional information to fully understand the issue. I suggest opening a support ticket with the following information:

1- Whether the table has a Primary Key (PK). Please attach the source table creation DDL to the ticket.

2- Whether any rows were inserted into the table during the Full Load period.

3- (If possible) Please set SOURCE_UNLOAD to Trace, reproduce the behavior, and attach the diagnostics package.

Our support team will be more than happy to assist you.

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!