Anonymous
Not applicable

Matching using Talend Open studio for DQ (community version)

Hi,
I am trying to match two CSV files on multiple columns (for example first name, last name, date of birth) using TOS for DQ. I found a way to do it using the matching analysis, but I am facing a challenge:
In the matching analysis it seems that I can only work with one CSV file, but I have two files (one input file and one reference or master file).
My question is: how can I match the two CSV files using the matching analysis in DQ?
thank you
Regards
3 Replies
Anonymous
Not applicable
Author

Hi,
Could the tFuzzyMatch component meet your needs? It compares a column from the main flow with a reference column from the lookup flow and outputs the main flow data along with the distance.
Best regards
Sabrina
Anonymous
Not applicable
Author

Hi xdshi,
I already tried this component, but it does not offer many algorithms (Soundex, q-gram, ... are not included). Also, the Levenshtein algorithm does not provide a score (between 0 and 1) but a distance.
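To illustrate the difference between a distance and a score: one common convention is to divide the Levenshtein distance by the length of the longer string and subtract the result from 1. The sketch below is plain Python outside Talend, not anything tFuzzyMatch does itself; the function names `levenshtein` and `similarity` are made up for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: number of single-character insertions,
    deletions or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalise the distance to a score in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

With this convention, two identical strings score 1.0 and two strings with no characters in common at any position score 0.0. Other normalisations exist, so the exact score depends on the convention chosen.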
The other challenge I face is that I need to get from the reference file all the rows that are similar to rows in the main input file. Here is a diagram of the job I built using tFuzzyMatch:
                           tFileInputDelimited (reference flow)
                                            ||
tFileInputDelimited (input flow) =====> tFuzzyMatch =====> matched rows from the input flow
Instead of the matched rows from the input flow, I want the matched rows from the reference flow.
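To make the expected result concrete, here is a rough Python sketch of the behaviour described, outside Talend. It assumes both files share the column names `first_name`, `last_name` and `date_of_birth` (hypothetical names), uses `difflib.SequenceMatcher.ratio()` from the standard library as the per-field score rather than Levenshtein, and averages the field scores with equal weight. The 0.85 threshold is an arbitrary choice for the example.

```python
import csv
import io
from difflib import SequenceMatcher

# Hypothetical column names shared by both files.
COLUMNS = ["first_name", "last_name", "date_of_birth"]

# Inline sample data standing in for the two CSV files.
INPUT_CSV = "first_name,last_name,date_of_birth\nJohn,Smith,1980-01-01\n"
REFERENCE_CSV = (
    "first_name,last_name,date_of_birth\n"
    "Jon,Smith,1980-01-01\n"
    "Maria,Lopez,1975-06-30\n"
)


def field_score(a: str, b: str) -> float:
    """Similarity in [0, 1] for one field, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def matching_reference_rows(input_rows, reference_rows, columns, threshold=0.85):
    """Return (reference_row, best_score) for every reference row whose
    average field score against at least one input row reaches the threshold."""
    matches = []
    for ref in reference_rows:
        best = 0.0
        for inp in input_rows:
            score = sum(field_score(inp[c], ref[c]) for c in columns) / len(columns)
            best = max(best, score)
        if best >= threshold:
            matches.append((ref, round(best, 3)))
    return matches


input_rows = list(csv.DictReader(io.StringIO(INPUT_CSV)))
reference_rows = list(csv.DictReader(io.StringIO(REFERENCE_CSV)))
matches = matching_reference_rows(input_rows, reference_rows, COLUMNS)

for row, score in matches:
    print(row, score)
```

The loop compares every reference row with every input row, which is fine for small files. For larger ones, some form of blocking (comparing only rows that share, say, the same birth year) would be needed to keep the number of comparisons manageable.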
thank you
Anonymous
Not applicable
Author

Hi,
The tRecordMatching component can join two tables by doing a fuzzy match on several columns using a wide variety of comparison algorithms. However, it is only available in the Palette of Talend Studio if you have subscribed to one of the Talend Platform products; it is not included in the open source version.
Best regards
Sabrina