Solved: Use backup logs for Postgres - Qlik Community

sergsyb · ‎2022-10-27

Hi,

For Microsoft SQL endpoint is it possible to set processing mode as --> Read changes from Backup Logs Only.
Is it possible set up the same for Postgres endpoint? May be there is some internal parameter for it.

john_wang · ‎2022-10-27

Hello @sergsyb ,

The mechanism of changes capture are vary between different databases. In SQL Server or Oracle, there are backup TLOG, or archived REDO log files, so it's possible to read only backup/archived logs. However, for PostgreSQL CDC, the SLOT is used to get changes from WAL files, known as Logical Replication . It's different with other DBs.

Hope this helps.

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

View solution in original post

john_wang · ‎2022-10-27

Hello @sergsyb ,

The mechanism of changes capture are vary between different databases. In SQL Server or Oracle, there are backup TLOG, or archived REDO log files, so it's possible to read only backup/archived logs. However, for PostgreSQL CDC, the SLOT is used to get changes from WAL files, known as Logical Replication . It's different with other DBs.

Hope this helps.

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

sergsyb · ‎2022-10-27

Thanks John. But may be there is a missprint in you answer re MSSQL/Oracle. Instead of ", so it's impossible to read only backup/archived logs." we should read "so it's possible to read only backup/archived logs"

john_wang · ‎2022-10-27

Hello @sergsyb ,

Good catch! my bad, I'm going to correct it right now.

thank you so much,

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

SwathiPulagam · ‎2022-10-27

Hi @sergsyb ,

Yes, it was a typo mistake. What John confirmed is in SQL Server\ Oracle you can read from only backups but not in PostgreSQL.

Thanks,

Swathi

sergsyb · ‎2022-10-27

Thank you guys it is clear now

Heinvandenheuvel · ‎2022-10-27

[update: I now see that other replies came in while I worked on this - there may well be duplication but sometimes other wording helps anyway, so I'm just leaving this]

I don't think such distinction is as critical as it is for SQLserver. Google is your friend for the inner workings of change logging on various DB platforms. Some examples...

How I interpret this is that SQLserver tries to keep the active transactions in memory backed up by structured 'active log' Log Database File (LDF). On a commit/rollback it can figure out the appropriate data file changes all from memory or at least a structure with some sort of keyed access. The backuplog is therefor not so essential and processing can be set to 'simple' (not for Replicate!). Other database, such as Oracle an Postgres, write their datafile changes constantly to a sequential file. For Oracle this is a round robin set of 'REDO' (with Active being the special one). For Postgres this is the WAL ( Write-Ahead Logging ) . For those the changes are written sequentially linearly all the time, and Replicate might just as well read them linearly on the fly as they are produced. For SQlserver there is an active interaction and a different API - sp_repldone and fn_dblog vs fn_dump_dblog. So there it may make good sense for Replicate to stay out of the active TX processing and just read when the dust has settles and all the change data is in the backup.

Admittedly there may still be reasons to distinguish between the 'actively written to' and stable backup files. One reason could be the underlying physical activity. Another (more likely) reason is the OS filesystem interaction with shared access. For example trying to read Oracle redo files on the recommend AIX configuration can lead to overall IO degradation due to 'demoted IO'. Oracle makes it easy for Replicate to know what-is-what through the internal table v$achived_log and the likes and Replicate offers an option to listen to that. For Postgres this may or many not be easy to determine (I never investigated) and Replicate it does not (currently?) appear offer the option. No explicit option in the (advanced) GUI, no obvious internal parameter to flip.

Back to you! Why did you ask? Just curious? or do you believe to see actual (performance?) issue? If so, you may want to carefully and completely gather some evidence and perhaps comparisons and submit an improvement request.

Hope this helps some,

Hein.

sergsyb · ‎2022-10-27

Hi Hein,

No evidence of an effect of Replicate on Postgres performance. The Postgres is our new end point.
We set reading backup logs only for MSSQL, because sometimes there was increased activity from the login under which the REPLICATE is running. And we are adding a new Postgres source right now and looking ahead we wanted to set up it in the same way as MSSQL. Now it is clear it is impossible to do. Thanks all

Dana_Baldwin · ‎2022-10-27

Hi @sergsyb

Just so you have it handy, here is the link to our product enhancement request page: https://community.qlik.com/t5/Ideas/idb-p/qlik-ideas

Thanks,

Dana

Use backup logs for Postgres

Configuration

Connectivity - Sources or Targets