Hi All,
Is it possible to extract the difference data from tFileCompare?
One approach is to compare using a lookup, but I have several sets of files to compare and each set has different headers, so for a lookup I would need to modify the file schema every time. Is there a component that compares two files and outputs the differing data directly?
Thanks
Are you looking for a character-by-character comparison? For example, in the two lines below, the differences would be the extra 1, 2 and 3 in the second line:
aaaaaaaaaaaaaaaabbbbbbbbbbbbbbcccccccccccccccdddddddddddd eeeeeeeeeeeeeeffffffffffffffffffffggggggggggggggggghhhhhhhhhhhhiiiiiiii
1aaaaaaaaaaaaaaaabbb2bbbbbbbbbbbcccccccccccccccdddddddddddd eeeeeeeeeeeeeeffffffffffffffffffffggggggggg3gggggggghhhhhhhhhhhhiiiiiiii
If that is what you want, there is nothing "out of the box" and it might be quite tricky to build this using standard components. You could try the Talend Exchange or look for a Java API to handle this and call it from Talend.
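As a rough illustration of what such a Java routine could look like, here is a minimal sketch of a character-level diff based on the longest common subsequence. The class name CharDiff and the "+c@index" / "-c@index" output format are made up for this example; they are not part of Talend or of any existing library. It uses memory proportional to the product of the two string lengths, so it only suits short lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Talend component.
class CharDiff {
    // Returns edits such as "+1@0" (char '1' inserted at index 0 of b)
    // or "-x@3" (char 'x' removed from index 3 of a).
    static List<String> diff(String a, String b) {
        int n = a.length(), m = b.length();
        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = a.charAt(i) == b.charAt(j)
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
        List<String> edits = new ArrayList<>();
        int i = 0, j = 0;
        while (i < n && j < m) {
            if (a.charAt(i) == b.charAt(j)) {
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                // Dropping a[i] keeps the longest common remainder
                edits.add("-" + a.charAt(i) + "@" + i);
                i++;
            } else {
                // b[j] is an extra character relative to a
                edits.add("+" + b.charAt(j) + "@" + j);
                j++;
            }
        }
        // Whatever is left on either side is a pure deletion or insertion
        while (i < n) { edits.add("-" + a.charAt(i) + "@" + i); i++; }
        while (j < m) { edits.add("+" + b.charAt(j) + "@" + j); j++; }
        return edits;
    }
}
```

In a Talend job, a class like this could live in a custom routine and be called from a tJavaRow, passing in one line from each file.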
Hello,
Where do your input files come from? Tables? Are you looking for redundancy analysis in the Talend Data Quality product?
Best regards
Sabrina
If the tables that are supplying the data have the same schema, you don't need to worry about the headers at all. Just join your two files using a tMap and ensure that every column that should be the same is joined. Then have two outputs: one for the matches and one for rows from the main flow that do not match.
Yes, you can create one job for all tables, but it will only tell you about rows that are exactly the same, and it will be complicated to build if you are new to this.
1) Read the data from your tables with ALL of the columns concatenated and hashed. Output the hash as a String (Varchar). You will also need to output the table's primary key. So the data from each table will have two columns:
Key
ConcatenatedHash
2) In your tMap, join on your ConcatenatedHash column. Remember that the Main flow is the only flow where ALL rows are guaranteed to be tested. If you require both sides to be tested, you will have to reverse the lookup in another tMap.
3) When you identify matches, you can link back to your unconcatenated data using the Key.
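Outside of Talend, the three steps above can be sketched in plain Java roughly as follows. This is only an illustration of the idea, not what tMap does internally. The class and method names (RowHashCompare, rowHash, unmatchedKeys) are invented for the example, and SHA-256 and the separator byte are assumptions, since no particular hash is specified above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not a Talend component.
class RowHashCompare {
    // Step 1: concatenate all columns and hash them into one String.
    static String rowHash(String[] columns) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (String c : columns) {
                // Nulls are treated as empty strings here (an assumption)
                md.update((c == null ? "" : c).getBytes(StandardCharsets.UTF_8));
                // Separator byte so that ("ab","c") and ("a","bc") hash differently
                md.update((byte) 0x1F);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Steps 2 and 3: join main against lookup on the hash and return the
    // primary keys of the main rows that found no match.
    // Both maps go from primary key to the row's column values.
    static List<String> unmatchedKeys(Map<String, String[]> main,
                                      Map<String, String[]> lookup) {
        Set<String> lookupHashes = new HashSet<>();
        for (String[] row : lookup.values()) {
            lookupHashes.add(rowHash(row));
        }
        List<String> unmatched = new ArrayList<>();
        for (Map.Entry<String, String[]> e : main.entrySet()) {
            if (!lookupHashes.contains(rowHash(e.getValue()))) {
                unmatched.add(e.getKey());
            }
        }
        return unmatched;
    }
}
```

As in the tMap version, only the rows of the first argument are all tested. To test both sides, call unmatchedKeys a second time with the arguments swapped.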