Our Cloudera DataLake holds data from various sources, and we have no insight into or control over changes to the source data.
Currently, when a Primary Key value is changed in a source record, a new record is inserted in the storage table.
Because of this behavior, our BI environment displays a huge number of duplicate values when selecting from the storage base table or the current view.
We need this behavior to change. A possible solution may be to have a header deleted value for the before image.
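To illustrate the request, here is a rough Python sketch of the behavior we see and the behavior we would like. It is only a model under assumptions: the event layout, the `header_deleted` field, and the functions `apply_changes` and `current_view` are hypothetical names made up for this example, not how the product actually stores or applies changes.

```python
# Hypothetical change events; the field names are illustrative only.
events = [
    {"op": "INSERT", "before": None, "after": {"id": 1, "name": "Alice"}},
    # The Primary Key of the same source record changes from 1 to 2.
    {"op": "UPDATE", "before": {"id": 1, "name": "Alice"},
     "after": {"id": 2, "name": "Alice"}},
]


def apply_changes(events, mark_before_image_deleted):
    """Apply change events in order to an append-only storage table."""
    storage = []
    for ev in events:
        pk_changed = (
            ev["op"] == "UPDATE" and ev["before"]["id"] != ev["after"]["id"]
        )
        if pk_changed and mark_before_image_deleted:
            # Proposed behavior (assumption): also write the before image,
            # flagged as deleted, so the old key can be filtered out.
            storage.append({**ev["before"], "header_deleted": True})
        storage.append({**ev["after"], "header_deleted": False})
    return storage


def current_view(storage):
    """Latest row per Primary Key, skipping keys whose latest row is deleted."""
    latest = {}
    for row in storage:
        latest[row["id"]] = row
    return [row for row in latest.values() if not row["header_deleted"]]


today = current_view(apply_changes(events, mark_before_image_deleted=False))
wanted = current_view(apply_changes(events, mark_before_image_deleted=True))
print(len(today), len(wanted))
```

In this model the current view returns two rows for the same source record today (keys 1 and 2), and only the row with key 2 once the before image is written with the deleted flag.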
Thanks
Janine