Solved: Re: Target Postgres - Updating unnecessary columns - Qlik Community

Jon_Donker · ‎2024-02-11

Hi Team,

I am investigating performance problems on a target Postgres database. Most of the dml is UPDATE statements; updating a few select columns. As a constraint of the business requirements; we have to have the task in Transaction apply mode.

With logging turned up; I ran the following on the source:

UPDATE dbo.my_table
SET field_1 = 10
WHERE
	key_1 = 'A' AND
	key_2 = 'B' AND
	key_3 = 'C';

In the qlik replicate logs; the update came through as:

UPDATE dbo.my_dest_table
SET	key_1 = ?,
	key_2 = ?,
	key_3 = ?,
	field_1 = ?,
	field_2 = ?,
	field_3 = ?
WHERE
	key_1 = ? AND
	key_2 = ? AND
	key_3 = ?;

This means that qlik is updating columns that it doesn't need to be; and more importantly it is updating the keys. I'm not sure if the Postgres compiler is smart enough to work to know that no action is required; or it will try modifying the index on the keys; adding to performance overhead.

Is there a way to make qlik replicate update only columns that are necessary? i.e.

UPDATE dbo.my_dest_table
SET	field_1 = ?
WHERE
	key_1 = ? AND
	key_2 = ? AND
	key_3 = ?;

john_wang · ‎2024-02-11

Hello @Jon_Donker ,

While Replicate runs on Windows then it's much easier:

Open PostgreSQL target endpoint
Go to the Advanced tab
Open Internal Parameters
Add a new parameter named $info.query_syntax.pk_segments_not_updateable
Set its value to TRUE and re-run the task, a sample:

Again, if the PK columns values are changed , then a reload is needed.

Good luck,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

View solution in original post

john_wang · ‎2024-02-11

Hello @Jon_Donker ,

Thanks for reaching out to Qlik Community!

We may control the UPDATE behavior in Qlik Replicate however would you please elaborate why the task was set to Transactional Applying Mode, maybe the latency still builds up after the PK columns are excluded in the UPDATE SQL. We'd like to get an accurate understanding the task setting.

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

Jon_Donker · ‎2024-02-11

Hi John,

Thanks for the reply.

We have to have the task in transaction apply mode because we will eventually build triggers on the events of downstream tables and therefore must have that transactional integrity. Since batch optimize builds up a net of the transactions; we will lose that integrity.

At the moment there are no triggers on the destination tables; so there should be no performance impact that are caused by them.

john_wang · ‎2024-02-11

Hello @Jon_Donker ,

Thanks for the clarification.

What you got is the default behavior of PostgreSQL target database. We may change the PostgreSQL syntax, set pk_segments_not_updateable, to TRUE to exclude the PK columns in the UPDATE SQL Statement however please be careful that if there is chance the PK columns values are changed in source side, then this changing may cuase data out of integrity issue.

In summary, the default behavior is the most saft mode. Unless you can guarantee NO PK columns values change otherwise the above parameter should not be used.

Hope this helps.

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

Jon_Donker · ‎2024-02-11

Sounds promising - where do I set "pk_segments_not_updateable"?

john_wang · ‎2024-02-11

Hello @Jon_Donker ,

Please share the Replicate server platform (is it a Linux or Windows), and a sample SQL of source table creation DDL (and source DB type), I'd like to confirm it for you.

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

Jon_Donker · ‎2024-02-11

We're on Windows and our source DB type is Microsoft SQL.

The table (obscured for sensitivity is) :

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE dbo.my_table(
	FIELD_01 numeric(5, 0) NULL,
	FIELD_02 numeric(5, 0) NULL,
	FIELD_03 varchar(6) NULL,
	FIELD_04 char(1) NULL,
	KEY_1 numeric(9, 0) NOT NULL,
	KEY_2 char(10) NOT NULL,
	KEY_3 char(2) NOT NULL,
	KEY_4 char(2) NOT NULL,
	FIELD_05 char(3) NULL,
	FIELD_06 char(15) NULL,
	
	UPDATE_FIELD numeric(11, 2) NULL,
	
	FIELD_07 numeric(5, 0) NULL,
	FIELD_08 numeric(11, 2) NULL,
	FIELD_09 numeric(11, 2) NULL,
	FIELD_10 char(20) NULL,
	FIELD_11 char(1) NULL,
	FIELD_12 char(1) NULL,
	FIELD_13 numeric(9, 0) NULL,
	FIELD_14 numeric(9, 0) NULL,
	FIELD_15 numeric(5, 0) NULL,
	x_y varchar(10) NULL,
PRIMARY KEY CLUSTERED 
(
	KEY_1 ASC,
	KEY_2 ASC,
	KEY_3 ASC,
	KEY_4 ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

john_wang · ‎2024-02-11

Hello @Jon_Donker ,

While Replicate runs on Windows then it's much easier:

Open PostgreSQL target endpoint
Go to the Advanced tab
Open Internal Parameters
Add a new parameter named $info.query_syntax.pk_segments_not_updateable
Set its value to TRUE and re-run the task, a sample:

Again, if the PK columns values are changed , then a reload is needed.

Good luck,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

Jon_Donker · ‎2024-02-11

Cheers - I can see that the PK columns are now excluded from the update. Any way we can exclude the other non updated fields?

john_wang · ‎2024-02-12

Hello @Jon_Donker ,

I do not see the possibility for SQL Server source.

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!

Target Postgres - Updating unnecessary columns

Performance