Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
We have an attribute, for example PERSON_NAME, that sometimes changes case in the source. For example, the person name is "JOHN", but then later changes to "John".
Compose does not detect this change because due to SQL Server columns being case insensitive, it sees that JOHN = John. Therefore, the name change does not flow through to the Data Warehouse and Data Mart.
We thought of altering the collation of PERSON_NAME to SQL_Latin1_General_CP1_CS_AS. This would have to be done on the TSTG table and the target data warehouse table.
However, since the TSTG table is dropped and recreated on each run, then the collation goes back to the default case insensitive.
Is there another recommended way to deal with this situation?
Thanks in advance.
Hello @ricatorrespri
As checked with our internal team
As an ELT solution, we adhere to the collation of the target database.
The recommended approach here is to leverage the database collation to support the type of processing etc.
Another choice is to create a specific field that concatenates and hashes certain columns. (eg an SHA hash of John and JOHN equate to different values. Compose will see this as a change to the data – and type 2 the record.
Regards,
Suresh
Hello @ricatorrespri
As checked with our internal team
As an ELT solution, we adhere to the collation of the target database.
The recommended approach here is to leverage the database collation to support the type of processing etc.
Another choice is to create a specific field that concatenates and hashes certain columns. (eg an SHA hash of John and JOHN equate to different values. Compose will see this as a change to the data – and type 2 the record.
Regards,
Suresh
this is considered the correct solution for pretty much all instances of this issue in computer science. though most people convert to lower... converting to upper will get you some weird looks, but it'll work just as well.