Hi there,
I'm using the desktop version of Talend Data Preparation, with which I can clean a few rows from a CSV file based on duplicate values. Let me explain my concern with an example:
Morning:
1. The client provides us a CSV sheet with 2000 rows.
2. I load this file as a dataset and clean some data by applying some functions,
e.g. deleting 4 rows with city = 'ABC'.
3. Total rows = 1996.
4. I save this as a preparation.
5. I either export the file or leave it as it is in the preparation.
Evening:
1. The same client sends the same CSV file with additional rows, say 3000 in total, i.e. 1000 rows added to the previous 2000.
2. In this file there are 8 rows in total with city = 'ABC'.
3. I apply the same function again to clean the data, deleting the rows with city = 'ABC' (including the 4 rows I already deleted from the previous sheet).
4. Total rows = 2992.
5. Export.
The process continues in the same manner, which is definitely not efficient.
How can we save these rules and automate reapplying them?
Is there any solution (tool) provided by Talend for this scenario?
Hi,
What you are looking for is the ability to reapply an existing preparation to another dataset with the same structure, right? If so, that is doable natively in Data Prep - see https://community.talend.com/t5/Data-Quality-and-Preparation/Multiple-Datasets-with-the-same-structu...
Let me know if that matches your needs.
Regards,
Gwendal
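
For readers who want to see the idea outside Data Prep: the pattern of defining the cleaning rules once and reapplying them to every new delivery of the file can be sketched in plain Python. This is only an illustration and not Talend's API. The names `RULES`, `apply_rules`, and `clean_csv`, the `name` and `city` columns, and the tiny sample data are all hypothetical.

```python
import csv
import io

# Hypothetical saved rule set: each rule is a predicate that returns True
# for rows that should be dropped.
RULES = [
    lambda row: row["city"] == "ABC",
]

def apply_rules(rows, rules=RULES):
    """Return only the rows that no rule marks for deletion."""
    return [row for row in rows if not any(rule(row) for rule in rules)]

def clean_csv(text, rules=RULES):
    """Parse CSV text that has a header row and apply the saved rules."""
    reader = csv.DictReader(io.StringIO(text))
    return apply_rules(list(reader), rules)

# Assumed sample data: the evening file is the morning file plus new rows.
morning = "name,city\nAnn,ABC\nBob,XYZ\n"
evening = morning + "Cid,ABC\nDee,PQR\n"

print(len(clean_csv(morning)))  # rows left from the morning file
print(len(clean_csv(evening)))  # rows left from the evening file
```

Because the rules live apart from any single file, the evening delivery is cleaned by the same call with no rule redefined, which is the behaviour a reusable preparation provides.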