Hello,
In Data Preparation, I would like to create a column containing the number of occurrences of each value in another column. This would make it possible to filter out duplicate values, for example.
I could not find a way to do that.
Lilian
Hello,
I am actually not sure why you want to create a new column. As you describe it, the profiling panel of Data Prep seems to be the perfect candidate for such a use case. It shows the number of occurrences of each value, so you can see duplicates and unique values at a glance, and you can then filter on them directly by selecting them.
See the attached screenshot, which shows 2 duplicate values and 1 unique value directly in the profiling bar chart.
Regards,
Patrick
Hello Patrick,
You're right, the profiling panel lets you visualize the duplicates and filter them, but there is no way to filter all duplicated values via an action.
For example: I have the following data:
col_1
-------
1
1
2
3
3
3
4
I would like to apply an action on col_1 that calculates the number of occurrences of each value, as follows:
col_1   col_2
------- -------
1       2
1       2
2       1
3       3
3       3
3       3
4       1
Then, I could apply an action on col_2 to keep only the rows where the value is > 1. That way, I could quickly and automatically handle ALL duplicated values.
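To make the expected result concrete, here is a minimal sketch in plain Python of the calculation described above. It is only an illustration and not a Data Preparation function or API; the name add_occurrence_count is hypothetical.

```python
from collections import Counter


def add_occurrence_count(values):
    """Pair each value with the number of times it appears in the column.

    Hypothetical helper for illustration only; not part of Data Preparation.
    """
    counts = Counter(values)  # value -> number of occurrences
    return [(value, counts[value]) for value in values]


# Sample data from the example above.
col_1 = [1, 1, 2, 3, 3, 3, 4]

# Each row becomes (col_1, col_2), where col_2 is the occurrence count.
rows = add_occurrence_count(col_1)

# Filtering on col_2 > 1 keeps every row whose value is duplicated.
duplicates = [(value, count) for value, count in rows if count > 1]

for value, count in rows:
    print(value, count)
```

With the sample data, the filter on col_2 keeps the rows for the values 1 and 3 and drops the rows for 2 and 4.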
I hope my explanations are clear 😉
Regards,
Lilian