Qlik Replicate: How to configure a many-to-one replication

Sue_Macaluso · Apr 9, 2021 10:51:18 AM

Transcript

hello my name is Michael Litz and i work in the Qlick QDI support group we often get asked if replicate can handle moving data from multiple source databases into a single target database the simple answer is yes replicate can handle this type of configuration I'd like to demonstrate how to configure Replicate tests to handle this type of many-to-one replication for example imagine you have multiple separate customer contact databases across many regions of the country you want to bring all the separate data into a single target database for reporting there are three key concepts to this configuration first each instance of the customer database has to have the same table structure second there is no guarantee that there are primary key values will be unique across the databases third in order to make this work correctly the target database tables will need to have unique primary keys which we will accomplish using a table transformation in my example today i have two source databases region 1 and region 2. each source database needs its own replicate task in order to move the data into the target a database called regional let's go have a look in sql server management studio and check out the databases okay here we are in sql server management studio and we're going to look at our two source databases region one region two very simple databases for my example i just have a single table customer in each and the customer table has the exact same identical structure two fields primary key cuss key and a name same thing is true for region two now the interesting thing is if i select from region one you'll see that i have three records with customer key one two and three mike amy and ted when i select from region two you see i have the same customer key one two three with different customer names that's to be expected these are in different regions different parts of the country so this could cause a problem if these two tables end up as a single table in the target because we'd have conflicting primary keys and we're going to solve that problem when we get into replicate by doing a table level transformation we're going to add a region code field and we're going to add that field into the primary key thereby giving us a two-part primary key on the target made up of region code and customer key that will take care of any conflicts we might have with non-unique key values from the different source databases and the one last thing i'll do while i'm here is i'll make a database for our target so i'm going to create a new database and i'm going to call it region all and that's all i have to do i don't have to put any tables in here the replicate tests are going to create the tables for me automatically so at this point i think we're set with sql server so let's now go over and take a look at the replicate tasks and how we would configure them okay now we're in the replicate gui and we're going to look at how you configure the tasks first thing i'm going to take a look at is managing the endpoint connections we need to have a source connection to the region 1 database we need a second connection to the region 2 database and we need a single connection to the regional database to target i'm going to have two tasks region 1 and region 2 and i'll go to the region 1 to region all tasks first and i need to add my end points so now that i've got region 1 as my source and regional is my target i'll come over and do table selection i'm going to search find my customer table and i've added it in to my task once i've done that i'm going to go to the table settings and this is where we do that all important table level transformation where i'm going to add a column i'm going to call this column region code and i'm going to make it part of the primary key by just clicking out here i'll say ok to that i'm going to go over to region 2 and do the same thing region 2 to all i'm going to put my source and my target endpoint i'm going to go in select my tables customer table i'm going to say okay to that now i'm going to go in and do my transformation i'm going to add the exact same name column region code and i'll click over here to make it part of the primary key so that's one of the important key concepts is to go into the table do a transformation and modify the table at least as far as the target's concerned you're not touching the source and we've added region code as a field to the cust key field that was already there for a two-part primary key now we do have to set a default value so whereas this is region two to regional i'll set my region code to two i'm going to say okay to that i'm going to come back to the region one to regional i'm going to go into my table settings i'm going to look at that transformation i created here's my two part key and i'm going to set a value to this region code one so how's this significant if you remember in the source database i had identical primary keys add one two three in region one and one two three in region two what we've just did with those transformations is we've told replicate that there's a two part key and is consisting of the field region code and cusky so now in the target we'll have 1 1 1 2 and 1 3 for region 1 records and we'll have 2 1 2 2 and 2 3 for region 2 records this way there'll be no conflict in the target okay next thing i need to do in designer mode i've got to go over and this is very important on the full load the default on a task is to do a drop and create table and you don't want to do this because it's fine with a single replicate task because if you were to reload you could drop everything and you wouldn't mind because it's all coming from a single source where we have multiple sources we've got to tell replicate to do nothing just load the data and obviously if there's no table out there replicate will create it on the first run but the significant part is when we run the second task or the third of the fourth or the fifth desk it won't alter the structure it won't drop it will not truncate records that are already there that's pretty important otherwise you lose everything you got from your previous regions while you're loading up your additional regions so i'm going to do the same thing with the region 2. once again its default is dropping create i'm gonna tell it to do nothing and i'll save region one i'll save region two and at this point we're ready to run so i'll reload target on region one and that went quick because as you remember we only had three rows but we've completed region 1 full load and it's now in change processing mode waiting for updates or inserts and deletes now do the same reload of the target for region 2 and this has just completed its full load from the region to source database so at this point and it's now in change processing mode waiting for updates inserts and deletes from the source so at this point pretty simple configuration key concept is to have a unique primary key on the target where the source may not and we accomplished that by doing a transformation and adding a field region code which we then hard coded to a particular region so i'm going to leave these tasks running and at this point it's going to make sense to go back to sql server and have a look at our database okay here we are back in sql server management studio and we're going to take a look at the newly created and populated region all database you can see we did get our customer table down here i'm going to right click it and select top 100 rows and you'll see that we have got cuskey1 name mic region code 1 2 3 and we have our three records from region two these two pieces make up our primary key up here and that's why we can do updates and we don't have to worry about any conflicts so what i'm gonna do now is go back into region one and i'm going to do an insert of number four steve then i think i'll change mike to be uppercase and i'm going to make a couple of changes in region two i'll put in five and i'll make dana uppercase so there's an example of an insert and an update and let's go back out to our target regional customer database and let's take a look at what we have out there now so you can see that in region code 1 we have our original mic is now uppercase and the same thing in region code two the record dana is now uppercase we've also got our newly inserted records from region one i inserted steve in region two i insert a bill so that's a very quick and short demonstration of how to create a many-to-one task environment and a couple of the key concepts keep in mind and how to configure that task the number one thing being to have that transformation making a unique primary key on the target and the second most important thing is to set the task to do nothing because when you run that second task you don't want to overwrite or drop or truncate any of the records that have already been loaded from running the first task thank you very much i'm glad you were able to enjoy this video.