Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Solving the Informatica Dilemma: On-Demand Briefing - Watch On Demand!
cancel
Showing results for 
Search instead for 
Did you mean: 
DWH_TEAM
Contributor II
Contributor II

Filters in Logstream Change Processing

Dear Qlik Team

I have a question regarding the interplay between Change Processing and Filters. Let's consider the following scenario:

We have a replication form Oracle to SQL Server with Change Processing (using Replicate Log Reader) with "Apply Changes".

Here is the state of the source table before replicating:

ColA, ColB, ColC
secret, a , 500
public, b , 600

In the Out-Task, we activate a filter including only rows with ColA = 'public'.

State of the target table after initial full load:

ColA, ColB, ColC
public, b , 600


While in Change Processing, on the source DB the following Statement is committed:

UPDATE ColB SET ColB = 'secretValue' WHREE ColC = 500

Question:

Which of the following Statements is true? I would like to know for transactional mode and also for Batch Apply mode (attrep_changes temp tables sent to target DB).

  1. The Qlik Server checks locally if the affected row is filtered or not. It notes that the updated row is not included in the filter expression, therefore the commit is ignored and nothing is sent to the target machine.
  2. The Information of the update statement is sent to the target machine, but it will be ignored there.

Best Regards, Chris

 

 

 

Labels (3)
1 Solution

Accepted Solutions
Dana_Baldwin
Support
Support

Hi @DWH_TEAM 

It depends on where/how the filter is applied. A pass-thru filter is only applied during the full load and modifies the source query so that only the selected rows are returned to Qlik Replicate. A standard filter is applied for both full load and change processing phases of the task - but the filter is applied at the Qlik Replicate server and the data being filtered out is not sent to the target.

For reference: Using filters | Qlik Replicate Help

Filter for last 90 days of data in Qlik Replicate - Qlik Community - 1880761

This second link describes how to set up a pass thru filter or regular filter.

The change processing apply mode, batch vs. transactional, does not affect the behavior of either filter.

Hope this helps!

Dana

View solution in original post

5 Replies
Dana_Baldwin
Support
Support

Hi @DWH_TEAM 

It depends on where/how the filter is applied. A pass-thru filter is only applied during the full load and modifies the source query so that only the selected rows are returned to Qlik Replicate. A standard filter is applied for both full load and change processing phases of the task - but the filter is applied at the Qlik Replicate server and the data being filtered out is not sent to the target.

For reference: Using filters | Qlik Replicate Help

Filter for last 90 days of data in Qlik Replicate - Qlik Community - 1880761

This second link describes how to set up a pass thru filter or regular filter.

The change processing apply mode, batch vs. transactional, does not affect the behavior of either filter.

Hope this helps!

Dana

john_wang
Support
Support

Hello @DWH_TEAM ,

In addition to @Dana_Baldwin ’s comment, I understand your concern is about how the filter is applied. Please allow me to explain it in more detail:

In Qlik Replicate, the 'filter' function is undertook by 3 different areas (see below). Their functions may be overlap, or they can cooperate with each other, depends on your needs. So far you can build up complex filter conditions in Qlik Replicate:

john_wang_0-1749894295796.jpeg

 

 
  1. Filter Conditions
    Impacts both Full Load and CDC 

    The "Filter Conditions" has best performance (as the expression will be executed in source side database if possible) but it does not support complex expressions
  2. Record Selection Condition
    Impacts both Full Load and CDC 
    The "Record Selection Condition" supports complex expressions include dynamic input values however the performance is not as good as "Filter Conditions", because in latter filter mode, the rows will be retrieved from source DB, transferred to Replicate machine then being filtered in Replicate server, prior to apply to target side database. It may take more resources (CPU, and especially IO and network)
  3. Fullload Passthru Filter
    Impacts Full LoadONLY (even the task is enabled both Full Load and CDC ). The query will be delivered to source DB for executing and retrieves needful rows and sends to target side.

I hope this clear out the doubts.

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
DWH_TEAM
Contributor II
Contributor II
Author

Thank you very much for the answer, that it is exactly what I wanted to know. My question was related to standard-filters, i forget to mention that, thanks for clarifying, also to you @john_wang.

Out of curiosity:
How does it work internally? I Immagine it is something like this:

  • Qlik saves all Oracle ROWIDs during the initial full load into an internal table and marks them as included or excluded with regards to the filter.
  • The Oracle redolog-file contains the info which ROWIDs to update with which values.
  • When Qlik reads the update statement in the redolog, it looks up if the row-ID was included during the initial load and only then sends the statement to the target, regardless of the current value of the filter expression.

Is that about right?

Greetings,

Chris

 

 

john_wang
Support
Support

Hello Chris, @DWH_TEAM 

Let's assume the task is configured with Full Load + CDC (Apply Changes). During Qlik Replicate startup:

1. Replicate starts the CDC thread first, capturing and caching all changes made in the source while the Full Load is running.

2. A Full Load thread is then initialized, transferring all historical rows from the source endpoint (EP) to the target endpoint.

3. After the Full Load completes, Replicate applies the cached changes captured during the Full Load phase.

4. Finally, Replicate enters the CDC phase, continuously capturing and applying changes from the source EP to the target EP, using filters, sorters, and performance optimizations as configured.

Regarding your concern:
Qlik Replicate does not rely on the Oracle ROWID internal column during either the Full Load or CDC stages. Instead, Replicate uses the System Change Number (SCN) to track progress and determine the recovery/breakpoint position. All changes are captured from Oracle redo log files including archived and online redo log files (with the exception of non-inline LOBs, which are handled differently).

Hope this helps.

John.

 

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
DWH_TEAM
Contributor II
Contributor II
Author

Hello John

What I still don't undestand: From where does Qlik Replicate take the information if a row affected by an update statement is filtered or not?

Let's consider two examples:

  • Redolog contains an update statment for the row with COLID = 1
  • Redolog contains an update statement for the row where column 'primarykey' = 1

How does Qlik determine if the row is included in the filter (--> send the update statement to the target) or if it is excluded by the filter (--> do not send update statement to the target)?

 

Cheers,

Chris