Hi,
I want to mask sensitive data in my DB with Talend. Some of the data to be masked are key fields, so I need to use them in joins. I used tDataMasking, but when I apply the same function to the same key in two different tables, the output is different. How can I fix this? Is there a particular tDataMasking function that I should choose for this use case (joining on masked data)?
Hi Mark,
That's a very good question. Originally, most of the tDataMasking functions were purely random (i.e. they did not take the input value into account). In 6.3 we added some functions for SSNs (called "Generate unique xxx SSN number", where xxx can be Chinese, French, German, Indian, Japanese, UK or US) that do exactly what you want, provided your input is an SSN. We may do the same for other types (like credit cards). Which functions are you interested in?
Damien
If you don't use SSNs, there is still an approximate way to do it:
first, store all your unique IDs in a file
then use the "Replace by consistent items from input list (or file)" function to read from this file.
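To illustrate the idea behind the two steps above, here is a minimal Python sketch of consistent replacement from a list. It is not Talend's implementation: the function name `consistent_replace`, the hash-based index, the sample list and the two sample tables are all assumptions made for the example. The point is only that the replacement depends on the input value alone, so the same key gets the same masked value in every table.

```python
import hashlib
from typing import Sequence

def consistent_replace(value: str, items: Sequence[str]) -> str:
    """Pick a replacement from `items` that depends only on `value` (hypothetical helper)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(items)
    return items[index]

# Hypothetical replacement list, standing in for the file of unique IDs.
replacement_ids = ["ID-0001", "ID-0002", "ID-0003", "ID-0004", "ID-0005"]

# Two hypothetical tables that share the key "A17".
customers = [{"cust_id": "A17", "name": "Alice"}]
orders = [{"cust_id": "A17", "amount": 120}]

# Mask the key column of each table independently.
masked_customers = [
    {**row, "cust_id": consistent_replace(row["cust_id"], replacement_ids)}
    for row in customers
]
masked_orders = [
    {**row, "cust_id": consistent_replace(row["cust_id"], replacement_ids)}
    for row in orders
]

# The masked keys still match, so a join on cust_id still finds the pair.
joined = [
    (c, o)
    for c in masked_customers
    for o in masked_orders
    if c["cust_id"] == o["cust_id"]
]
print(len(joined))
```

In a sketch like this, two different keys can end up with the same replacement when the list is short, which is one plausible reason to treat the approach as approximate and to keep the list long and free of duplicates.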
Thank you for your answer,
I want to join tables on their keys. These keys can be strings of digits, letters, or both. I tried masking them with "replace all", "replace all digits", "replace all letters" and other functions, but the join does not work correctly, because the same key masked with the same function in two different tables gets a different masked value in each.
Hi Mark,
This cannot work. The replacements done by these functions are purely random.
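A small Python sketch of why a purely random replacement breaks the join. This is only an illustration, not the tDataMasking code: the helper `replace_all_digits` and the sample key are made up for the example, and the two independent random sources stand in for masking each table separately.

```python
import random

def replace_all_digits(value: str, rng: random.Random) -> str:
    """Replace every digit with a random digit; other characters are kept (hypothetical helper)."""
    return "".join(
        str(rng.randrange(10)) if ch.isdigit() else ch
        for ch in value
    )

# Each table is masked separately, so each gets its own random source.
rng_table_a = random.Random()
rng_table_b = random.Random()

key = "AB-2024-000187"  # hypothetical key present in both tables
masked_in_a = replace_all_digits(key, rng_table_a)
masked_in_b = replace_all_digits(key, rng_table_b)

# The format is preserved, but nothing ties the two outputs to each other.
print(masked_in_a)
print(masked_in_b)
```

Since the output ignores the input value, the two masked keys will almost never be equal, and a join on them finds no match.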
As I said in my previous answer:
first, store all your unique keys in a file (ideally, add more keys to this file but avoid duplicates).
then use the "Replace by consistent items from input list (or file)" function to read from this file.
See the attached example.