Anonymous
Not applicable

Check Sha-1

Hi all,
I am receiving 2 files over FTP:
  - 1 CSV with the data
  - 1 CSV with a SHA-1 hash
I need to check that the data file was not altered, so I want to calculate the SHA-1 of the CSV containing the data and compare it with the hash I received.
How can I perform this task in Talend?
Thank you in advance
Br
7 Replies
Anonymous
Not applicable
Author

Hi,
Check the solution given in this thread:
https://community.talend.com/t5/Design-and-Development/sha1-hash-key/td-p/109750
I would take the following approach:
Generate the SHA-1 of the input file and write it to another file.
Read both SHA-1 values and compare them with an inner join in tMap. If the row is rejected, the file was altered; otherwise it was not.
I hope this gives you the idea.
Vaibhav
Anonymous
Not applicable
Author

Here's a "routine" that generates a hash for a String. This might give you the right pointer if you want to adapt it for hashing an entire file.
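public static String getSHA256(String plainText) {
    // Null input yields null
    if (plainText == null) return null;
    try {
        // Change "SHA-256" to "SHA-1" to match the SHA-1 checksum discussed in this thread
        java.security.MessageDigest md = java.security.MessageDigest.getInstance("SHA-256");
        md.update(plainText.getBytes("UTF-8"));
        byte[] bytes = md.digest();
        java.math.BigInteger bI = new java.math.BigInteger(1, bytes);
        // Pad to 64 hex digits (use 40 for SHA-1); a plain toString(16) drops leading zeros
        return String.format("%064x", bI);
    } catch (Exception e) {
        // Any failure (unknown algorithm or encoding) yields null
        return null;
    }
}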
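To adapt the routine above to an entire file, one option is to stream the file through the digest instead of hashing a String. The following is only a minimal sketch: the class name FileHashSketch and the method name sha1OfFile are made up for illustration and are not part of Talend. It uses SHA-1 because that is what the original question asks for.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not a Talend component or built-in routine
public class FileHashSketch {

    // Returns the SHA-1 of the file content as 40 lowercase hex characters
    public static String sha1OfFile(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] buffer = new byte[8192];
        // Read the file in chunks so large files are not loaded into memory at once
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        // Format each byte as two hex digits so leading zeros are kept
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

The digest is computed over the raw bytes of the file, so the result only matches the sender's hash if the file was transferred without any conversion (for example, FTP in binary mode).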
Anonymous
Not applicable
Author

Thank you,
It works with a single pair of data and checksum files, but my issue is that my folder contains many files (let's say 10 data files and 10 checksum files).
How can I structure my job so that it does the following, file by file?
- read the data file
- calculate the checksum of this file
- compare the value with the checksum file
Thank you
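The three steps listed above (read, calculate, compare) could be sketched for one pair of files as follows. This is an illustration under assumptions, not a confirmed solution: the class PairCheckSketch and its methods are hypothetical, and the code assumes the checksum file holds the SHA-1 as hexadecimal text in its first whitespace-separated token, which the thread does not confirm.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not a Talend component or built-in routine
public class PairCheckSketch {

    // SHA-1 of a byte array as 40 lowercase hex characters
    static String sha1Hex(byte[] data) throws NoSuchAlgorithmException {
        StringBuilder hex = new StringBuilder();
        for (byte b : MessageDigest.getInstance("SHA-1").digest(data)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // True if the SHA-1 of dataFile equals the hash stored in checksumFile
    public static boolean matches(Path dataFile, Path checksumFile)
            throws IOException, NoSuchAlgorithmException {
        // Steps 1 and 2: read the data file and calculate its checksum
        // (readAllBytes loads the whole file; fine for a sketch, not for very large files)
        String computed = sha1Hex(Files.readAllBytes(dataFile));
        // Step 3: compare with the first token of the checksum file (assumed format)
        String content = new String(Files.readAllBytes(checksumFile), StandardCharsets.UTF_8).trim();
        String expected = content.split("\\s+")[0];
        return computed.equalsIgnoreCase(expected);
    }
}
```

In a job, a method like this could be stored as a routine and called once per data file while iterating over the folder, provided the matching checksum file path can be derived for each data file.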
Anonymous
Not applicable
Author

In that scenario, you can:
- Create a single file containing the 10 SHA-1 values calculated from the data files
- Create a single file containing the 10 SHA-1 values read from the checksum files
- Use tMap to compare the two files with each other
This way the hashes are compared with each other.
The reason for not doing this on the existing files directly is that the tFileList component reads files one by one, and there is no option to reload all the files once it has finished.
Thanks
Vaibhav
Anonymous
Not applicable
Author

Thank you Vaibhav,
I will try this.
So if I understand correctly, it is not possible to compare pair by pair, for example (datafile 1 & checksumfile 1), then (datafile 2 & checksumfile 2), ... (datafile n & checksumfile n)?
Best Regards
Anonymous
Not applicable
Author

You can do this if your files follow a specific sort order by name or date; otherwise it is not possible.
Vaibhav
Anonymous
Not applicable
Author

Hi Vaibhav,
My files are named as follows:
data files: "XXXXXX-YYYYMMDDHHMMSS.csv" (Y=year, M=month, D=day, H=hour, M=minute, S=second)
checksum files: "XXXXX_YYYYMMDDHHMMSS.syn" (Y=year, M=month, D=day, H=hour, M=minute, S=second)
Each pair of files (data file, associated checksum file) has the same date and time in the filename.
Regards
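With the naming scheme described above, the shared 14-digit timestamp could serve as the key for matching each data file to its checksum file. The sketch below only shows the pairing logic on file names. The class PairByTimestampSketch and the sample file names are hypothetical, and the code assumes that each timestamp belongs to exactly one pair (a second file with the same timestamp and extension would overwrite the first).

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not a Talend component or built-in routine
public class PairByTimestampSketch {

    // Captures the YYYYMMDDHHMMSS part and the extension at the end of the name
    private static final Pattern STAMP = Pattern.compile("(\\d{14})\\.(csv|syn)$");

    // Returns timestamp -> [dataFileName, checksumFileName];
    // a slot stays null when the corresponding file is missing
    public static Map<String, String[]> pair(List<String> fileNames) {
        Map<String, String[]> pairs = new TreeMap<>();
        for (String name : fileNames) {
            Matcher m = STAMP.matcher(name);
            if (!m.find()) {
                continue; // name does not follow the expected scheme
            }
            String[] slot = pairs.computeIfAbsent(m.group(1), k -> new String[2]);
            // Index 0 holds the .csv data file, index 1 the .syn checksum file
            slot["csv".equals(m.group(2)) ? 0 : 1] = name;
        }
        return pairs;
    }
}
```

Each complete entry in the resulting map gives one (data file, checksum file) pair to verify, and an entry with a null slot points to a file that arrived without its counterpart.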