Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik GA: Multivariate Time Series in Qlik Predict: Get Details
cancel
Showing results for 
Search instead for 
Did you mean: 
Parikhharshal
Creator III
Creator III

Compare source and target row count and send deleted records

Hi there

I am building a flow where I want to achieve following:

I am building a flow where I want to achieve below:

I’m using Talend 6.4 and redshift

1.Find count of ids in source vs target.
2. Comparison - If count matches then proceed to next job. If counts don’t match then find those ids which aren’t in destination but in source (appearantly been deleted). Therefore, mark them as deleted.

How do I build this flow? What components should I use?
Labels (3)
1 Solution

Accepted Solutions
Jesperrekuh
Specialist
Specialist

Hi, yeah I (think I) get it...  You only want to delete if the count doesn't match? right?

tMSSQLInput_1 : "select count(*) src_cnt from src_table"  -->  tSetGlobalVar1

tMSSQLInput_2 : "select count(*) trg_cnt from trg_table"  --> tSetGlobalVar2

0683p000009Lzh3.png

Basically your count results go into global vars and next after the counts have run... you use an if
In the --- if ---> you check the values which are set in the global var:
((Integer)globalMap.get("src_cnt")) <> ((Integer)globalMap.get("trg_cnt"))
Finally In the tRunJob call the job for deleting and (non) matching records.

 

View solution in original post

5 Replies
Jesperrekuh
Specialist
Specialist

 

I would just go for an inner join, the output ive added the trgt source too to make sure it was null when testing

0683p000009LzTa.png

Parikhharshal
Creator III
Creator III
Author

Hi Dijke thanks for your reply.

How do I get counts of source and target and compare them?
Jesperrekuh
Specialist
Specialist

If you want to count in talend you basically retrieve records first... so you might as well join them to get the 'deleted' records and run the syncing process so you are always correct.
If both sources are tables :
select count(*) cnt, count(distinct id) unique_cnt from src_table

My experience on counting records to track if you;re in sync... is a dirty fix
Think of reasons why your record count between source and target could be equal but when you compare the data it turns out to be different. Like moving records to a backup environment... or ... records in src first deleted (5records) and later coincidence 5 records are added, your count hasn't changed.

Be schwifty 0683p000009MA9p.png
Parikhharshal
Creator III
Creator III
Author

Hi Dijke

Thanks for your prompt reply.

Let me give you little bit of background so that things are clear.

I have already got existing flow which caters for inserts and updates.

Deletes were not handled at all so that’s what I’m doing now.

I only want to run delete flow assuming ids don’t match. In matching case I don’t want to do anything and proceed to next one.

Only because of performance every time I don’t want to check how many records were deleted instead only go to delete flow when records are not matching. Hopefully this makes sense.

So I’m focused more on comparison for counts. Could you please help here?

Thanks
Harshal.
Jesperrekuh
Specialist
Specialist

Hi, yeah I (think I) get it...  You only want to delete if the count doesn't match? right?

tMSSQLInput_1 : "select count(*) src_cnt from src_table"  -->  tSetGlobalVar1

tMSSQLInput_2 : "select count(*) trg_cnt from trg_table"  --> tSetGlobalVar2

0683p000009Lzh3.png

Basically your count results go into global vars and next after the counts have run... you use an if
In the --- if ---> you check the values which are set in the global var:
((Integer)globalMap.get("src_cnt")) <> ((Integer)globalMap.get("trg_cnt"))
Finally In the tRunJob call the job for deleting and (non) matching records.