When we do an initial load (DB2 z/OS) of a single DB2 table to a target endpoint (Kafka topic), does the count in the Task Monitor indicate the number of rows currently in the table, or the number of changes recorded in the transaction log, which may include multiple changes to the same row after it was initially inserted?
This question comes to mind because Qlik works on the transaction logs, not on the actual tables, right?
Thank you
Hi @adbdkb ,
The Replicate UI has two monitoring screens, one for Full Load and one for CDC.
Initially, the Full Load tab shows an estimated row count based on table statistics; at the end of the full load, it shows the actual number of rows transferred.
The CDC tab shows the records read from the transaction log.
Thanks,
Swathi
> This question comes to mind because, Qlik works on the transaction logs, not on actual tables, right?
That's a hard NO.
For the full load, and to gather the source table metadata, Replicate accesses the actual tables.
For the CDC portion of the work, transaction logs are indeed used, sometimes with help from the base table to read LOB values if and when needed. The exception to reading a TX log is when 'Using ODBC with CDC as a Source' - there the base table is read and changes are detected using 'at least one special "Context" column'.
>> Task monitor indicate the number of rows that are currently in the table or changes that have happened to the table that are in the transaction log, which may contain rows that may have multiple changes to them after the row was initially inserted.
Correct. The Task Monitor screen for ACTIVELY loading tables shows a 'Cached Changes' column, which provides an INDICATION of how much the final count might still change. Replicate counts the cached changes, but does not yet know whether a given change was already captured during the full load. For end-of-table inserts, a common use case, most inserts made during the full load will already have been read by that full load, so the count will not change much.
fwiw,
Hein.
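To illustrate the point about cached changes with a toy model (this is a hypothetical sketch, not Replicate's actual implementation): rows inserted at the end of the table while the full-load scan is still running are usually read by the scan itself, so replaying the cached change afterwards, with upsert semantics, does not add a new row on the target.

```python
# Toy model of a full load with concurrent end-of-table inserts.
# Assumption: the target is applied with upsert semantics, so replaying
# a cached change for a key the full load already copied is a no-op
# as far as the row count is concerned.

def full_load_with_cached_changes(initial_rows, inserts_during_load):
    target = {}
    # The scan walks the table in key order; end-of-table inserts that
    # land ahead of the scan position are picked up by the scan itself.
    for key in initial_rows + inserts_during_load:
        target[key] = "from-full-load"
    # Cached changes are replayed after the full load; upserting a key
    # that was already copied does not change the row count.
    for key in inserts_during_load:
        target[key] = "from-cached-change"
    return len(target)

print(full_load_with_cached_changes([1, 2, 3], [4, 5]))  # 5, not 3 + 2 + 2
```

The cached-changes counter would have shown 2 here, but the final count only ends up 2 higher than the pre-load estimate, not 4, because both inserts were already swept up by the scan.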
Hi @SwathiPulagam - I am sorry, but I am not seeing the two tabs. Could you post a screenshot here?
Thanks
Hi @adbdkb ,
Below are the screenshots.
Full Load: go to Monitor --> Full Load tab.
Change Processing: go to Monitor --> Change Processing tab.
Thanks,
Swathi
Thanks @Heinvandenheuvel. Can I make the same request here - can you post a screenshot?
The situation we will have is this: a task will do the full load initially and then continue with incremental load indefinitely (at least, that is my understanding of how the tasks work). So after the initial full load, the changes happening in the table will be part of Change Data Capture, and that will continue until we stop the task, right? Is my understanding correct?
Hi @adbdkb ,
Please refer to @Heinvandenheuvel's post, as I only explained it at a high level; Hein also explained the cached events. Also, you cannot depend on the UI count, because when you stop and resume the task the count resets to zero.
Thanks,
Swathi
Thank you both @SwathiPulagam and @Heinvandenheuvel. Is all of this information also available in the control table (Kafka topics)? Can you post a sample JSON for that type of record, if possible?
Thanks
Hi,
Yes, your understanding is correct.
Thanks,
Michael
All of this effort is for reconciliation between DB2 and Kafka. I don't want to check the UI manually, and in production we, as developers, may not even have access to the Qlik UI. Hence the attempt to find a way to automate this.
My first concern was reconciling DB2 table counts with Qlik counts, because I wasn't sure how to separate the number of rows in the table from what was pulled in - I had thought the information was retrieved from the transaction logs. But as @Heinvandenheuvel explained, the full load reads the actual tables, so that count can be treated as count(*) from the actual table, right?
I still don't know how to get the incremental count for an interval and then compare it with the messages received in Kafka during that same interval in order to reconcile.
Thanks
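For what it's worth, the interval reconciliation could be sketched like this (a minimal, hypothetical model - the operation names and message shape here are assumptions for illustration, not Replicate's actual Kafka envelope): take a baseline count(*) at the start of the interval, then net-apply the change messages received during the interval, remembering that a row with multiple changes produces multiple messages but updates do not affect the row count.

```python
# Hypothetical reconciliation sketch. Each Kafka message is modeled as
# a (key, operation) pair; 'INSERT'/'UPDATE'/'DELETE' are placeholder
# operation names, not necessarily what the target envelope uses.

def reconcile(baseline_count, messages):
    """Return the expected row count at the end of the interval,
    given count(*) at the start and the interval's change messages."""
    net = 0
    for _key, op in messages:
        if op == "INSERT":
            net += 1
        elif op == "DELETE":
            net -= 1
        # UPDATE changes a row in place: no effect on the row count,
        # even though the same key may appear in many messages.
    return baseline_count + net

msgs = [("k1", "INSERT"), ("k1", "UPDATE"), ("k2", "INSERT"), ("k1", "DELETE")]
print(reconcile(100, msgs))  # 101: two inserts, one delete, update ignored
```

The expected count would then be compared against a fresh count(*) taken at the end of the interval; a mismatch suggests missed or duplicated messages in that window.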