How to design Parent and Child relationship tables... - Page 2 - Qlik Community

Anonymous · ‎2014-05-01

I have to establish the relationship between parent and child tables using Talend.
I tried to design but it is not working as expected.
Please suggest the best way to do this.
I am using Oracle_Sequence for my key column in the the parent table.

Anonymous · ‎2014-05-05

Can any one help me regarding my issue described in my earlier posts..
Currently it is stopping my work.Please let me know if any one needs any more details..

willm1 · ‎2014-05-05

Hello mcgovardhan -
Let's walk through a simple design that would result in an insert/update:
Read data from XML file (Indicate on schema what field is Primary Key) -- dataflow --> tOracleOutput (with Insert/Update)
Doing inserts are easy - the records simply get added to the table. Doing updates require you to have defined a key. That's why I noted above that you have to specify the key field in you XML schema. Here's what tOracleOutput does when you set the DB operation to Insert/Update: it uses the primary key you specified to see if a record exists in that table with that ID. If it finds it, it updates the row. If it does not find a row with that Primary Key, it adds that row.
Now to your problem: You're using a sequence to generate your primary key - which is great. The problem is, a new key is being generated for EVERY XML row. So when you process your XML file twice, the keys are definitely different - because it doesn't have any intelligence to know that it's the same file - and your keys in your final table end up changing.
The first question that I have is - what is the PK of your XML data? You have to have one in order to do updates (that avoid the problem you're describing). Is there a field or set of fields that are unique in every XML row? If so, you can use them for your primary key (check the 'Key' fields in your schema), and those keys will be passed through your tMap and your final tables. And they will be used for the update/insert operation.
IF you want to translate the primary key in your XML file to a surrogate key using the Sequence, you can still do that. In that case, you'll be doing an operation similar to a SCD (slow-changing dimension) where the primary keys you define are translated into a Surrogate key. But before you worry about this, let us know what your design decision is for the question above.
Will

Anonymous · ‎2014-05-06

Thanks Willm,
I have 50-60 XML elements in my XML file..And i do have unique key column in my XML in my case it is case id.
And i have 15-18 target destination tables.
In each table based on the data i will be using sequence keys in each table..
In my destination table i will be creating sequences for below columns..Each column has separate sequence..
ActivityId,
OrgId,
PersonId
offenderId
OfficerId
ChargeId
ActPersonId
ActCommentId
AddressId
WepId
etc..

willm1 · ‎2014-05-06

So it seems like what you have to do is check if an incoming row from the XML has a CaseID that is already in the destination table: if it is, you update some columns only; if it's not in the destination table, you insert (along with new sequence keys).
If this is the case, here's what I'd do:
Bring in your XML row data into a tMap. Do an inner join with the destination table as a LookUp. This way, every row that exists in your XML file and that already existed in your destination table will be selected. Any row that is in your XML but is NOT in your destination table will be rejected.
(1) On the output side of your tMap, create the first output table and select all the fields for the destination table EXCEPT the sequence key fields. Make sure you set the Key field of this table in the tMap to CaseID.
(2) Then create the second output table and turn on "Catch Lookup Inner Join Reject" (set to true). See attached screenshot. This will catch all incoming XML rows that don't exist in your destination table.
Finally, connect the first table (inner join is true) to the same destination table with say a tOracleOutput, and set the table operation to 'Update'. This will update all the fields included in the table based on the CaseID - and will not update the corresponding Sequence IDs.
Then connect the second output table (inner join 'rejects') from the tMap to another tOracleOutput and set the table operation to 'Insert'. In the advanced settings of the tOracleOutput, you can specify the Oracle sequence to be used for each key field. See attached image for how to do this...
Let me know if this works...
Will

willm1 · ‎2014-05-06

And to speed up execution, you could read that XML file once and store the data in a text file which is easier to parse for subsequent multiple reads, or a database table... You'd repeat the process I described above for each destination table - since you need to get the inner join rejects for each table...

Anonymous · ‎2014-05-06

Thanks Willm,Can you provide me the dataflow what you have explained above..
What i mean is something like below..
tFileList -> XMLFile-> tLogRow-> tMap-> "<FirstOutput table>" - > tOracleOutput?
here "First Output table" is temporary table in tMap right side or tOracleOutPut?
Do you want me to create two oracleoutputs?
if it is tOracleoutput again do i need to create one moretMap to Map?
here i have 15-18 target tables do i need to do the same for all the target tables?

willm1 · ‎2014-05-06

mcgovardhan - here's what I came up with... I used the hash components to store data midstream in order to avoid the contention at the DB level by trying to do inserts and updates at the same time...

Anonymous · ‎2014-05-06

Thanks Willm
Update record and Insert records are two separate jobs?
The first image is one single job?or else multiple jobs?
How do i link my source file to different target tables?
I mean i have Activity,ActivityComemnts,Person,PersonComments,PersonPhysical,PersonAddress tables..etc..

willm1 · ‎2014-05-06

Thanks Willm
Update record and Insert records are two separate jobs?
The first image is one single job?or else multiple jobs?
How do i link my source file to different target tables?
I mean i have Activity,ActivityComemnts,Person,PersonComments,PersonPhysical,PersonAddress tables..etc..

Yes, insert and update are separate subjobs as shown. You can simplify what I sent you by removing the hash components and placing the tOracleOutputs directly after the tMap (see screenshot below). The reason you I'm suggesting separate update and delete is because when you're updating, you don't include the Sequence fields, and hence leave them unchanged. When you're inserting, you include them and generate them upon inserting.
The first image is a single job. I've attached a simplified version (re: my comment above about removing the hash components) of the job.
You repeat the above flow for each table... The attached screenshot shows how you'd chain them...

Anonymous · ‎2014-05-07

Thanks Willm.I am trying with dimensional tables option and use your approach for storing data into individual tables....

How to design Parent and Child relationship tables using Talend.

Talend Data Integration

v5.x