Anonymous
Not applicable

Data masking and anonymization using Talend

I have a requirement to migrate data from a source database to a destination database. We also want to make sure that confidential data such as passwords and phone numbers is not exposed in the destination database. We are planning to use the Talend data integration tool for this, but I couldn't find any information about data masking in the user manual.
Could someone please let me know whether we can perform data masking/anonymization using Talend?
Thanks in advance for any help you are able to provide.
-Satish
16 Replies
Anonymous
Not applicable
Author

Hi Satish
No, you can't use Talend Studio to perform data masking. It is an ETL tool, not a data masking tool.
Shong
Anonymous
Not applicable
Author

As Shong pointed out, there is no dedicated masking functionality. However, there are ways to achieve a similar result with what is available in Talend: for example, replacing values with random ones, or using a lookup/replace in the map component so that the masked values stay consistent.
Do recognize, though, that it would be easier to do the transformation before or after the migration, perhaps even with a dedicated tool where the data modeling can be done more elaborately.
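The two ideas mentioned in this reply, random replacement and a lookup/replace that keeps results consistent, can be sketched in plain Java. This is only a minimal illustration, not a Talend feature: the class name `MaskingSketch` and its methods are hypothetical, and the in-memory map is an assumption standing in for whatever lookup source a real job would use.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; the names are illustrative and not part of Talend.
final class MaskingSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Lookup table (assumed to fit in memory): original value -> masked value.
    private static final Map<String, String> LOOKUP = new HashMap<>();

    private MaskingSketch() {
    }

    // Random replacement: every ASCII digit becomes a random digit,
    // all other characters (spaces, dashes, brackets) are kept as they are.
    static String randomizeDigits(String value) {
        if (value == null) {
            return null;
        }
        StringBuilder masked = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            if (c >= '0' && c <= '9') {
                masked.append((char) ('0' + RANDOM.nextInt(10)));
            } else {
                masked.append(c);
            }
        }
        return masked.toString();
    }

    // Lookup/replace: the first time a value is seen it gets a random mask,
    // and every later occurrence of the same value gets the same mask.
    static synchronized String maskConsistently(String value) {
        if (value == null) {
            return null;
        }
        return LOOKUP.computeIfAbsent(value, MaskingSketch::randomizeDigits);
    }

    // Passwords are simply replaced by a fixed placeholder.
    static String redact(String value) {
        return value == null ? null : "********";
    }
}
```

A call such as `MaskingSketch.maskConsistently(phone)` could then be used wherever the job accepts a Java expression. Note that the lookup only lives for the duration of one run, so consistency across runs would need the mapping to be stored somewhere, which this sketch does not attempt.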
Anonymous
Not applicable
Author

Thanks Shong and Kootstra for your valuable suggestions.
Anonymous
Not applicable
Author

Talend Open Studio is one of the best tools I have used in recent years, but for applying data masking I have been using Data Masker.
I do agree with Kootstra_a that replacing values with random ones, or using a lookup/replace in the map component to maintain consistency, is an inexpensive approach. However, it won't always stand up in your test use cases.
Check out http://www.datakitchen.com.au/ for info about data masking, data masking methodology, data discovery and test data management.
Anonymous
Not applicable
Author

The key suggestion by the 'data kitchen' (I loved their photo, very over the top) is in their discussion of 'Developing a Data Masking Methodology',
which can be achieved independently of the tool at hand.
So define your requirements, do your research, and then use Talend to deliver it.
I have recently implemented a similar workstream for a client, and it works fine with Talend.
regards,
Anonymous
Not applicable
Author

Hi,
Can somebody explain how to make my job run with different schemas dynamically?
My job has a tFileInputDelimited writing to a tLogRow, and I pass the file name to the tFileInputDelimited through a context variable. I want to display different file contents every time I run the job by passing a different file name, with a different schema, to the context variable.
How can I achieve that just by changing the file name provided in the context variable?
P.S.: The different files have different schemas.
_AnonymousUser
Specialist III

Data masking is actually a very complex topic. There is a lot of IP around it, so the more or less proven solutions are pretty expensive. There are simple ways to obfuscate data in ETL, but would some of you be interested in looking at a component suite that you can call from Talend via a simple API?
Anonymous
Not applicable
Author

Hi,
I wonder whether anybody has written a tutorial on data masking using Talend. A lot of people are talking in general terms, but a detailed tutorial would be most appreciated.
Thanks in advance
Anonymous
Not applicable
Author

Hi spr655,

Here is a KB article on this topic: TalendHelpCenter: How to setup encryption of the passwords in Talend Studio?
In addition, you can use tContextLoad or implicit tContextLoad to hide the password characters assigned to different context variables.
Best regards
Sabrina