Ole_Dufour
Creator

Insertions and deletions in third table after replication of two

My Qlik Replicate task successfully replicates:

  • source table "Person" to target table "Person"
  • source table "Address" to target table "Address"

Source and target tables are all in Oracle. In the scenario below, John Doe has two addresses.
In my target database there is a third table that joins Persons with their Addresses.

The target table "Person" looks as follows:

| ID (primary key, sequence.nextval) | Person_ID_Source | Name |
|---|---|---|
| 4 | 5665888 | John Doe |

The target table "Address" looks as follows:

| ID (primary key, sequence.nextval) | Person_ID_Source | Street_No | City |
|---|---|---|---|
| 20 | 5665888 | Holland street 123 | New York |
| 21 | 5665888 | Baker street 456 | New York |



Making the connection via Person_ID_Source, I want my third target table "Person_Address" to be filled as follows:

| ID (primary key, sequence.nextval) | PersonID | AddressID |
|---|---|---|
| 1 | 4 | 20 |
| 2 | 4 | 21 |


If an Address is deleted at the source, it should disappear from both the "Address" and "Person_Address" target tables.
If an Address is updated at the source, it should be updated in the "Address" target table.
If an Address is inserted at the source, it should be inserted into the "Address" and "Person_Address" target tables.

If a Person is deleted at the source, their addresses should disappear from the "Address" and "Person_Address" target tables.
If a Person is inserted at the source, their addresses should be inserted into the "Address" and "Person_Address" target tables.

How would I include these operations in a task, for both Full Load and CDC?
Thanks a million.

 

1 Solution

Accepted Solutions
Heinvandenheuvel
Specialist III

Doesn't the 'Person_Address' table also exist on the source, and isn't it maintained there?

Replicating that third source table should take care of it all.

Replicate strictly does ONE-to-ONE row processing. It is NOT Informatica or a similar ETL tool. You cannot use CDC to fix a broken source DB design.

An action on one row in one source table (or, for Full Load, the row's existence) produces an action on one row in the target table assigned to that source table. There is NO option to affect a second row (though you can do source or target lookups on other rows).

Now, the target DB might have a TRIGGER which tries to affect other tables; in your example, a target DB trigger could maintain 'Person_Address'. DELETE would be easy, but INSERTs would be impossible for lack of context: when an address is created, no person is known yet. But wouldn't the source application also update its 'Person_Address' table, so that table could simply be replicated as well?

Without the source making recognizable row changes, I get the feeling you'll end up building an application module on the target, OR PERHAPS A VIEW, which consolidates your person/address data on a regular (hourly? daily?) basis to maintain and expose the desired new connections that do not come with the raw base tables.
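To illustrate the VIEW idea: since target "Address" already carries Person_ID_Source, the Person/Address pairs can be derived on the fly with a join, so nothing extra needs maintaining as rows are replicated. A minimal sketch, using SQLite as a stand-in for the Oracle target; the table and column names come from the post above, the view name Person_Address_V is hypothetical, and a real Oracle view would be created the same way in its own SQL dialect:

```python
import sqlite3

# In-memory SQLite stands in for the Oracle target database.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Person  (ID INTEGER PRIMARY KEY, Person_ID_Source INTEGER, Name TEXT);
CREATE TABLE Address (ID INTEGER PRIMARY KEY, Person_ID_Source INTEGER,
                      Street_No TEXT, City TEXT);

INSERT INTO Person  VALUES (4,  5665888, 'John Doe');
INSERT INTO Address VALUES (20, 5665888, 'Holland street 123', 'New York');
INSERT INTO Address VALUES (21, 5665888, 'Baker street 456',   'New York');

-- The view derives the links on the fly, so deletes, updates and inserts
-- that Replicate applies to the base tables show up here immediately.
CREATE VIEW Person_Address_V AS
SELECT p.ID AS PersonID, a.ID AS AddressID
FROM Person p JOIN Address a ON a.Person_ID_Source = p.Person_ID_Source;
""")

rows = con.execute(
    "SELECT PersonID, AddressID FROM Person_Address_V ORDER BY AddressID"
).fetchall()
print(rows)  # [(4, 20), (4, 21)]
```

Note the view cannot supply its own sequence-generated ID column; if a physical "Person_Address" table with its own keys is a hard requirement, a view alone won't do.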

Greetings,

Hein.


5 Replies

Ole_Dufour
Creator
Author

Hi Hein,

thanks for your response. In the source DB there is no table that joins Persons with their Addresses directly. There is a Family table that joins a Person with a Family, and it's the Family that has one or more addresses, not the Person. So in our source database, all Persons in the same Family share the same Address. Also, no orphan addresses exist, i.e. addresses without a Person attached.

We are creating a .NET application that will be used alongside our current CRM, Siebel. The latter will be decommissioned once the .NET app is fully operational, but that will take a few years.

Consequently, we will probably use triggers to perform the create, update, and delete operations on "Person_Address", as we need changes in the source to be replicated in real time.
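The trigger approach can work here precisely because the replicated "Address" rows carry Person_ID_Source, which gives the trigger the context to look up the matching Person. A sketch of the two Address-side triggers, again using SQLite through Python as a stand-in (an Oracle target would use PL/SQL triggers instead; the trigger names are hypothetical, and this assumes Person rows are loaded before their Address rows during Full Load):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Person  (ID INTEGER PRIMARY KEY, Person_ID_Source INTEGER, Name TEXT);
CREATE TABLE Address (ID INTEGER PRIMARY KEY, Person_ID_Source INTEGER,
                      Street_No TEXT, City TEXT);
CREATE TABLE Person_Address (ID INTEGER PRIMARY KEY AUTOINCREMENT,
                             PersonID INTEGER, AddressID INTEGER);

-- When Replicate applies an INSERT to Address, look up the matching
-- Person via Person_ID_Source and create the link row.
CREATE TRIGGER addr_ins AFTER INSERT ON Address
BEGIN
  INSERT INTO Person_Address (PersonID, AddressID)
  SELECT p.ID, NEW.ID FROM Person p
  WHERE p.Person_ID_Source = NEW.Person_ID_Source;
END;

-- When Replicate applies a DELETE to Address, drop the link row too.
CREATE TRIGGER addr_del AFTER DELETE ON Address
BEGIN
  DELETE FROM Person_Address WHERE AddressID = OLD.ID;
END;
""")

con.execute("INSERT INTO Person VALUES (4, 5665888, 'John Doe')")
con.execute("INSERT INTO Address VALUES (20, 5665888, 'Holland street 123', 'New York')")
con.execute("INSERT INTO Address VALUES (21, 5665888, 'Baker street 456', 'New York')")
links = con.execute(
    "SELECT PersonID, AddressID FROM Person_Address ORDER BY AddressID").fetchall()
print(links)  # [(4, 20), (4, 21)]

con.execute("DELETE FROM Address WHERE ID = 20")
links = con.execute("SELECT PersonID, AddressID FROM Person_Address").fetchall()
print(links)  # [(4, 21)]
```

A Person-side DELETE trigger would cascade the same way; the ordering caveat Hein raised still applies if an Address can ever arrive before its Person.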

Take care,

Ole

Heinvandenheuvel
Specialist III

One more thought: if the structures of the source DB and target DB are significantly different, you may want to consider change tables instead of, or in addition to, applying changes directly.

You could then run a regular job (every minute, hour, or day) to consolidate those changes into the target.

The right design depends on the expected change rate, row width, and change volume.

Unfortunately you cannot enable change tables on a per-table basis; it's all or nothing. If only a few tables need change tables, you could add a separate task for just those tables.

The kind of data you describe typically doesn't change very often. Perhaps hundreds of changes per day for a large organization, not hundreds per second. It may be fine to set up change tables for all tables and use only those you need, blindly truncating the others or simply purging rows older than a few days. That also gives the auditing/debugging benefit of being able to look back and see what happened.

Yours sincerely,

Hein.

Ole_Dufour
Creator
Author

Thank you.
Yes, using change tables is an interesting option.
Unfortunately the regular job you mentioned won't work for us, as our management wants the source data replicated in real time.

Heinvandenheuvel
Specialist III

>> our management wants the source data to be replicated in real time.

Sure they want that, and I want a pot of gold - every week.

Everyone wants (near) realtime. But at what price? What is the actual business need?

With your help they have to define 'real time' (average, max per day, max per year), and then you get to tell them the price (development time and production resources) and how realistic their goal is:

- less than 1 second (average)? Unobtainium.

- less than 10 seconds (average)? 4 weeks of development, 1/4 of a person monitoring all day every day, 1 CPU all day every day.

- less than a minute? 1 week of development, 20% of a CPU all day.

- hourly? Negligible CPU.

Of course I don't really have an idea of the actual development cost, as that depends on skills and available tools, and the runtime resources depend on the server/DB platforms and the volume of changes to be processed.

Good luck,

Hein