JAfonso
Partner - Contributor II

Mapping a SQL Server view as a source for Qlik Replicate

Hello,

When I try to select Tables/Views in a task using a SQL Server source, I only see tables and can't see a view that was created.

Can anyone advise?

 

16 Replies
Michael_Litz
Support

Hi @JAfonso ,

 

I am posting a link to an article about a Full Load / Split CDC task pair. It discusses some potential issues with this type of approach and how to mitigate them.

https://community.qlik.com/t5/Official-Support-Articles/Qlik-Replicate-Full-Load-and-CDC-Split-Task-...

I hope this helps out.

Thanks,
Michael

JAfonso
Partner - Contributor II
Author

Hi everyone,

The reason for trying to use the view is to work around a persistent timeout issue during full load.
Here is my plan; I would appreciate your comments and advice on whether this can work.

Scenario:
- We have multiple source systems (more than 10), some Oracle and some SQL Server, all using Replicate for change processing
- The table causing the issue is the biggest of 40 tables from a SQL Server source
- The target system for all tasks is an Oracle database

Issue:
Due to the size of the table, the full load never finishes (timeout errors occur and the load process restarts from the beginning).
We get two error messages:

Error 1:
Stream component 'st_2_S_SOARIAN_HBA_PRD_SQLS' terminated
Stream component failed at subtask 2, component st_2_S_SOARIAN_HBA_PRD_SQLS
Error executing source loop
Endpoint is disconnected

Error 2:
RetCode: SQL_ERROR SqlState: HYT00 NativeError: 0 Message: [Microsoft][ODBC Driver 17 for SQL Server]Query timeout expired


Assumption:
- The critical data we need is no more than 1-2 months old, but the source table has more than 1,100,000,000 records (data since 2012)
- Old data is unchanged, so change processing is not needed for that data

Solution A:
1- Create a view in the source containing only data from this year
2- Run a full load from this view into the current target table
3- Keep change processing working based on the view (does this work?)
4- Backfill the oldest data in chunks using INSERT ... SELECT over a database link from target to source
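As a minimal sketch of step 1, assuming the big table has a date column to filter on (all object and column names here are hypothetical):

```sql
-- Hypothetical source-side view exposing only the current year's rows.
-- dbo.BIG_TABLE and CREATED_DATE are placeholder names.
CREATE VIEW dbo.V_BIG_TABLE_CURRENT AS
SELECT *
FROM dbo.BIG_TABLE
WHERE CREATED_DATE >= '20230101';
```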


Solution B:
1- Perform the full load with a filter on the source data
2- Keep change processing working based on the filtered table
3- Backfill the oldest data in chunks using INSERT ... SELECT over a database link from target to source
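The chunked backfill in the last step of both solutions could be sketched on the Oracle side like this, assuming a database link to the SQL Server source already exists (e.g. through Oracle Database Gateway) and a hypothetical CREATED_DATE column, with one yearly chunk per statement:

```sql
-- Hypothetical backfill of one year of old data over a database link.
-- TARGET_TABLE, BIG_TABLE, SQLS_LINK, and CREATED_DATE are placeholders.
INSERT /*+ APPEND */ INTO TARGET_TABLE
SELECT *
FROM "dbo"."BIG_TABLE"@SQLS_LINK
WHERE "CREATED_DATE" >= DATE '2012-01-01'
  AND "CREATED_DATE" <  DATE '2013-01-01';
COMMIT;  -- commit before loading the next yearly chunk
```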

Thanks in advance if you have read this far!!!

JAfonso
Partner - Contributor II
Author

Thank you @Michael_Litz. The issue we are facing is not actually splitting Full Load and CDC, but doing the full load in chunks: the table is huge, and each time there is a failure the target is truncated and the load restarts.

 

Any ideas on how to address this?

 

Thanks

john_wang
Support

Hello @JAfonso ,

It sounds like the huge row count is causing the trouble. Error 1 is caused by the long-lived TCP/IP connection; Error 2 is caused by a specific query timeout. To shorten the full load time and reduce the cached change records:

1. As you mentioned, use a filter;

2. Run a Full Load ONLY task to move the unchanged old data to the target in dedicated task(s), and use Parallel Load to speed up the transfer if network bandwidth is sufficient;

3. Alternatively, transfer the unchanged data to temporary table(s) in the target, then merge the temporary table(s) into the target table during off-peak time, before the CDC task starts;

4. We'd suggest engaging the PS team, as this is not an easy performance-tuning job and the various errors that come up during troubleshooting will need to be solved.
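Point 3 (merging the temporary table(s) into the target during off-peak hours) might look roughly like this in Oracle; the table names are hypothetical:

```sql
-- Hypothetical off-peak merge of staged history data into the target.
INSERT /*+ APPEND */ INTO TARGET_TABLE
SELECT * FROM TARGET_HIST_TMP;
COMMIT;

-- If the target is range-partitioned by year, a partition exchange
-- avoids copying the rows at all:
-- ALTER TABLE TARGET_TABLE
--   EXCHANGE PARTITION p2012 WITH TABLE TARGET_HIST_TMP;
```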

BTW, regarding the question "keep change processing working based on the view (does this work?)": the answer is NO. As explained in my comment, SQL Server does not record a view's data in the transaction log, so CDC cannot capture changes through a view.

Hope this helps.

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
JAfonso
Partner - Contributor II
Author

Hi @john_wang , Thanks a lot for your input.

Regarding the filter part:

Can I do the filtering in the source with views? Like this:

Step 1 - Imagine I create 4 views, SQL2023, SQL2022, SQL2021, SQL2020, and I full load 4 equivalent target tables in Oracle: ORA23, ORA22, ORA21, ORA20.

Step 2 - I truncate the current target table and merge the data from the 4 ORA tables.

Step 3 - I turn on CDC on the target table and it starts syncing.

If the above is possible, is there any issue if later on I merge even older data into the target table (e.g. SQL2019, ...)?
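A rough sketch of those steps, with hypothetical table, view, and column names (the per-year views on SQL Server, then the rebuild on Oracle):

```sql
-- Step 1: one hypothetical per-year view on the SQL Server source
-- (repeat for 2022, 2021, 2020).
CREATE VIEW dbo.SQL2023 AS
SELECT *
FROM dbo.BIG_TABLE
WHERE CREATED_DATE >= '20230101' AND CREATED_DATE < '20240101';

-- Step 2, on the Oracle target: rebuild the final table from the
-- staged per-year tables, oldest first.
TRUNCATE TABLE TARGET_TABLE;
INSERT INTO TARGET_TABLE SELECT * FROM ORA20;
INSERT INTO TARGET_TABLE SELECT * FROM ORA21;
INSERT INTO TARGET_TABLE SELECT * FROM ORA22;
INSERT INTO TARGET_TABLE SELECT * FROM ORA23;
COMMIT;
```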

 

Thanks again.

Regards,

 

Joao

 

 

john_wang
Support

Hello @JAfonso ,

That should be fine. BTW, I think it's better to use a partitioned table (e.g. one partition per year of data) in the Oracle DB for easier management and better performance, and to transfer the 'history' data before the 'change' data so that UPDATEs/DELETEs don't fail to find the target rows.
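A yearly-partitioned target could be sketched like this, using Oracle interval partitioning so new yearly partitions are created automatically (all columns are hypothetical):

```sql
-- Hypothetical yearly-partitioned Oracle target table.
CREATE TABLE TARGET_TABLE (
  ID           NUMBER         NOT NULL,
  CREATED_DATE DATE           NOT NULL,
  PAYLOAD      VARCHAR2(4000)
)
PARTITION BY RANGE (CREATED_DATE)
INTERVAL (NUMTOYMINTERVAL(1, 'YEAR'))
(PARTITION p_pre2013 VALUES LESS THAN (DATE '2013-01-01'));
```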

Good Luck,

John.

JAfonso
Partner - Contributor II
Author

Hi John, we are thinking about partitioning, yes.

The idea is to start with the most recent data (the data that is bound to change) and start CDC ASAP (500,000 records inserted/updated per day), then add the history data, which we know is not updated, while users are already working with the current, up-to-date data.

 

Thank you John and everyone for your very useful advice.