prabuj
Contributor III

How to extract data like this from the source system in Talend?

Hi Team @xdshi @Richard Hall,

I have data like this arriving daily in the source system, and I need to move it to the target system. How can I handle this kind of data?

Would you set a primary key on some columns, or apply some other rule? Any good suggestion is appreciated. You can add a new column for validation if you wish. But how should I handle this?

In simple terms, to make the problem easier to understand and answer, there are two cases:

 

  1. Same person but different passport no
  2. Different person but same passport no

Same person but different passport no:

0695b00000fLHLgAAO.png

Different person but same passport no:

0695b00000fLHLlAAO.png

How do you correct this kind of data when it flows in?

Thanks in advance


Accepted Solutions
rhall1
Contributor III

Unless someone can legitimately hold two passport numbers (a dual national, perhaps), this should be considered an error. The same goes for different names against the same passport number (this is definitely an error). Errors should be dealt with separately from the "good" data... but how do you know which data is good? To extract these examples, you can aggregate by passport number and by name. If either aggregation returns more than one record, you know you need to treat those records with suspicion.
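
As a rough illustration of the aggregation idea, here is a minimal Python sketch rather than an actual Talend job. The column names `name` and `passport_no` and the sample rows are assumptions, since the original screenshots are not available. It also counts distinct values per group instead of raw records, which is a slight variation that keeps exact duplicate rows from being flagged. In a Talend job the same logic would probably be built from components such as tAggregateRow and tMap, but that mapping is only a suggestion.

```python
from collections import defaultdict


def find_suspect_records(records):
    """Split records into (clean, suspect) using two aggregations.

    'name' and 'passport_no' are hypothetical column names.
    """
    passports_by_name = defaultdict(set)   # aggregation 1: group by name
    names_by_passport = defaultdict(set)   # aggregation 2: group by passport no
    for rec in records:
        passports_by_name[rec["name"]].add(rec["passport_no"])
        names_by_passport[rec["passport_no"]].add(rec["name"])

    clean, suspect = [], []
    for rec in records:
        reasons = []
        # More than one distinct passport number for the same name
        if len(passports_by_name[rec["name"]]) > 1:
            reasons.append("same person, different passport no")
        # More than one distinct name for the same passport number
        if len(names_by_passport[rec["passport_no"]]) > 1:
            reasons.append("different person, same passport no")
        if reasons:
            # Keep the reasons with the record so it can be reviewed later
            suspect.append({**rec, "reasons": reasons})
        else:
            clean.append(rec)
    return clean, suspect


# Made-up sample data covering both cases from the question
sample = [
    {"name": "Alice", "passport_no": "P100"},
    {"name": "Alice", "passport_no": "P200"},
    {"name": "Bob", "passport_no": "P300"},
    {"name": "Carol", "passport_no": "P300"},
    {"name": "Dave", "passport_no": "P400"},
]
clean, suspect = find_suspect_records(sample)
```

The clean rows can continue to the target system, while the suspect rows go to a separate reject output for review. Matching people by name alone is a simplification; real data would likely need a stronger identifier, such as name plus date of birth.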
