Qlik Replicate: Many-to-One Replication Configuration

Michael_Litz
Support

Last Update:

Aug 10, 2022 2:17:24 PM

Updated By:

Dana_Baldwin

Created date:

Mar 1, 2021 7:49:53 PM


Environment

Overview

One popular configuration for Replicate tasks is the Many-to-One scenario. Imagine regional sales databases that you want to consolidate into a single target database to simplify reporting across regions. This article describes how to set up and configure Replicate tasks for this situation and introduces the key concepts that make it work.

Michael_Litz_0-1614629348609.png

Source Database

Let's start with the source databases.

  • Key Concept: The table structure must be identical across all of the source databases.

    Michael_Litz_1-1614631575682.png
  • Key Concept: There is no guarantee that primary key values will be unique across the source databases.
    Michael_Litz_2-1614631640296.png
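
The first key concept can be checked before any task is built. The following is a minimal sketch, not part of Qlik Replicate: it uses in-memory SQLite databases as stand-ins for the regional sources, and the column names, the helper `table_signature`, and the drifted third source are all hypothetical.

```python
import sqlite3

def table_signature(conn, table):
    """Return (column name, declared type, pk position) for each column, in order."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
    return [(r[1], r[2].upper(), r[5]) for r in rows]

def make_source(ddl):
    """Create an in-memory stand-in for one regional source database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    return conn

ddl = "CREATE TABLE customer (CustomerID INTEGER PRIMARY KEY, Name TEXT, City TEXT)"
region_1 = make_source(ddl)
region_2 = make_source(ddl)
# Hypothetical third source whose structure has drifted (City column missing)
region_3 = make_source("CREATE TABLE customer (CustomerID INTEGER PRIMARY KEY, Name TEXT)")

reference = table_signature(region_1, "customer")
same_2 = table_signature(region_2, "customer") == reference
same_3 = table_signature(region_3, "customer") == reference
print(same_2, same_3)
```

Against real sources, the same comparison would be run on the catalog views of the source database product rather than on SQLite's PRAGMA.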

Each source table has its own set of primary key values, which are unique only within that source database. If the same values also appear in another source database's table, they will conflict in the target database as the sources are replicated to it (duplicate primary key errors).

NOTE: This potential conflict is handled in the task design by a table transformation.
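
To make the conflict concrete, here is a small sketch that loads two regions into one target table keyed only on the original primary key. It assumes an in-memory SQLite database as the target and made-up customer rows; the exact error a real target raises will differ by database product.

```python
import sqlite3

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customer (CustomerID INTEGER PRIMARY KEY, Name TEXT)")

region_1_rows = [(1, "Alice"), (2, "Bob")]
region_2_rows = [(1, "Carlos"), (3, "Dana")]  # CustomerID 1 also exists in Region 1

# The first region loads cleanly.
target.executemany("INSERT INTO customer VALUES (?, ?)", region_1_rows)

# The second region reuses CustomerID 1, so the target rejects the insert.
conflict = None
try:
    target.executemany("INSERT INTO customer VALUES (?, ?)", region_2_rows)
except sqlite3.IntegrityError as exc:
    conflict = str(exc)

print(conflict)
```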

Task configuration

When configuring Replicate tasks for a Many-to-One scenario, you need one task for each source database. Each task performs Full Load and CDC to the same target database, creating a merged database that contains the tables and values from every source. In my example, the tasks contain only one table, called customer.
My example has two tasks configured: Region_1 to Region_ALL and Region_2 to Region_ALL.

 

Michael_Litz_1-1614639972388.png

Each task you define has its own unique source endpoint, and all tasks share the same target endpoint.

In this example there are two source endpoints and only one target endpoint defined, because each task uses the same target endpoint.

Michael_Litz_0-1614639918065.png

Key Concept: Each task must be set to Do Nothing on the Full Load/Full Load Settings/Target Table Preparation screen. This setting ensures that when a task runs its full load, it does not drop the tables or delete records loaded by a previous task's full load. NOTE: It is okay to set this during task creation, because the task will still create the tables if they do not already exist.
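
The effect of this setting can be illustrated outside Replicate. The sketch below is only an analogy, assuming an in-memory SQLite target: the two functions are hypothetical stand-ins for a table preparation step that leaves existing tables alone and one that drops them first, and they are not Replicate's implementation. The sample keys are deliberately non-overlapping so that only the table preparation behavior is in play.

```python
import sqlite3

def full_load_do_nothing(target, rows):
    # Create the table only if it is missing; never drop or truncate it.
    target.execute(
        "CREATE TABLE IF NOT EXISTS customer (CustomerID INTEGER PRIMARY KEY, Name TEXT)"
    )
    target.executemany("INSERT INTO customer VALUES (?, ?)", rows)

def full_load_drop_and_create(target, rows):
    # Drop the table first, discarding whatever an earlier task loaded.
    target.execute("DROP TABLE IF EXISTS customer")
    full_load_do_nothing(target, rows)

region_1_rows = [(1, "Alice"), (2, "Bob")]
region_2_rows = [(101, "Carlos"), (102, "Dana")]  # non-overlapping keys on purpose

keep = sqlite3.connect(":memory:")
full_load_do_nothing(keep, region_1_rows)
full_load_do_nothing(keep, region_2_rows)
kept = keep.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

lose = sqlite3.connect(":memory:")
full_load_drop_and_create(lose, region_1_rows)
full_load_drop_and_create(lose, region_2_rows)
lost = lose.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

print(kept, lost)  # 4 rows survive with "do nothing", only 2 after drop-and-create
```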

Michael_Litz_0-1614640214260.png

Key Concept: Add a table-level transformation that adds a new field to the target-side table and makes it part of the table's primary key. You can see the table transformation for the Customer table in this image.

Michael_Litz_1-1614640539868.png

The field RegionCode has been added to the target table Customer. Click the "Key" column to add it to the primary key (PK) on the target. For each task, hard-code a unique number for the region code, e.g. 1, 2, 3.

NOTE: This does not impact the source table; it is target side only.
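
The result of the transformation can be sketched as follows. This is an illustration under assumptions, not Replicate's own mechanism: the target is an in-memory SQLite database, the sample rows are made up, and the hypothetical `replicate` function stands in for one task that stamps its hard-coded region code on every row.

```python
import sqlite3

target = sqlite3.connect(":memory:")
# Target-side table: RegionCode is part of the primary key.
target.execute(
    "CREATE TABLE customer ("
    "RegionCode INTEGER NOT NULL, CustomerID INTEGER NOT NULL, Name TEXT, "
    "PRIMARY KEY (RegionCode, CustomerID))"
)

def replicate(region_code, source_rows):
    # Prepend the task's hard-coded region code to every source row.
    rows = [(region_code, cid, name) for cid, name in source_rows]
    target.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)

replicate(1, [(1, "Alice"), (2, "Bob")])      # task Region_1 to Region_ALL
replicate(2, [(1, "Carlos"), (3, "Dana")])    # task Region_2 to Region_ALL, reuses CustomerID 1

merged = target.execute(
    "SELECT RegionCode, CustomerID, Name FROM customer ORDER BY RegionCode, CustomerID"
).fetchall()
for row in merged:
    print(row)
```

Both rows with CustomerID 1 are kept, because the pair (RegionCode, CustomerID) is unique even though CustomerID alone is not.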

Target Database

Let's have a look at the target-side database and tables. You can see that the RegionCode field has been added to the table and is part of the PK, ensuring there will be no duplicate key errors.

The customer table structure.

Michael_Litz_2-1614640772431.png

 

The customer table showing the merged records from both Region 1 and Region 2 source databases.

Michael_Litz_3-1614640885978.png

 

NOTE: Please also watch the accompanying video on how to configure this many-to-one environment.

The information in this article is provided as-is and is to be used at your own discretion. Depending on the tool(s) used, customization(s), and/or other factors, ongoing support on the solution above may not be provided by Qlik Support.

Comments
BrianS1
Contributor

@Dana_Baldwin  Is many to one replication supported for Azure ADLS targets?  I'm not sure if there is a mechanism in place to handle writes to the same target file.  I'm using Apply Changes which records files with a timestamp and there is the potential for the same filename especially when there are many sources writing to one target. 

Alternatively, is there a way to avoid filename conflicts in the target?  I haven't found a way to modify the target filename and append with something unique like the name of the source but am interested in any suggestions.  There is post-upload processing where the file can be renamed afterwards but that may not help in this case.

Thanks!

BrianS1
Contributor

@Dana_Baldwin I noticed that the Azure ADLS target docs specify the filename format but the last digit is missing on the files that I'm currently writing to data lake.  Note that I'm using parquet rather than CSV but doubt that matters.  I'm wondering if perhaps the last digit is some type of counter to handle name collisions if ADLS doesn't actually support many to one replication.  If not then perhaps the docs are incorrect.

20141029-1134010000.csv <- docs

20240219-175816530.parquet   <- sample file from my data

Thanks again!

Dana_Baldwin
Support

Hi @BrianS1 

I suspect you are right, that this configuration is only feasible with a traditional RDBMS target, not a file based target (or one that uses a file as a delivery method).

@Michael_Litz would you agree?

Michael_Litz
Support

Yes, I believe so. I only tested with a SQL Server target.

Thanks,
Michael

BrianS1
Contributor

@Dana_Baldwin @Michael_Litz thank you. Makes sense for support in RDBMS and not file targets, at least directly. I'm wondering if perhaps Qlik Replicate resolves filename collisions through some other approach, such as incrementing that last digit that's shown in the docs. Can you clarify?

Alternatively, do you have any suggestions on producing unique filenames?  I don't think post-upload processing would resolve the issue because that's after the file is uploaded.  Is there a way to do something like include the name of the task in the filename? 

I could write the files to different folders but that causes some inefficiencies downstream when I process data from multiple folders.

Thanks again!

Dana_Baldwin
Support

@BrianS1 I believe the file names are incremented as you have said. I don't see a way to influence the actual file name used. You might want to submit a feature request here if you would like more options - this goes directly to our Product Managers: https://community.qlik.com/t5/Ideas/idb-p/qlik-ideas

Thanks
