Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi guys
I have a very simple question. There is a product table in my database and in that table I have a 'Category' column .In that column there are variables like watch,jewellery,accesories etc. But table is not that clean as it should be. For example just for accesories, there are more than one name. Lower case, upper case... different ways to dedicate 1 category. I want to take all of these types as 1 and put them in my chart. For example for accesories I have 'accesories' and 'ACCESORIES' when I write it as if (lower(category)='accesories',category). It takes both of the names separately. How ca I write it as 1 while I dont lose any data.
The most general way to solve this is to use a mapping table to map the incorrect items to a common item. Something like this code sample:
Map_Cleanup:
Mapping LOAD * Inline
[
From, To
WxAtCH,Watch
Acessssories,Accesories
Watches,Watch
];
Source:
LOAD ID,
Proper(ApplyMap('Map_Cleanup', Product,Product)) as Product,
Sales
Inline
[
ID,Product,Sales
1,ACCESORIES,100
2,WATCH,120
3,accesories,130
4,Accesories,220
5,WxAtCH,89
6,Acessssories,100
7,Watches,20
];
In this example, the ApplyMap corrects the 3 misspelled items and the Proper() converts them all to a common case (you could use Upper() or Lower() instead if you prefer).
I have used in-lines for illustration purposes, but the mappings could easily come from a file or database table.
Hi Oykes.
The best way to do this, is creating a previos flow/process of datacleasing, either with the laod script or an ETL...
In your case, a quick way to solve your problem may be to use this function on the product field in the load script..:
load
Replace(Upper(ltrim(rtrim(Product))),' ','') ,
sales
FROM ..........
With this function, you can normalize diferent values for de producto field..:
Product;sales
ACCESORIES;100
WATCH;120
accesories;130
Accesories;220
W AtCH;89
Kind regards.
Thank you Jupitopino,
It worked while the difference is only upper case, lower case or space but when we have a different difference between 2 variables,I still can't merge them. The table that I mentioned also uses multiple languages for the same variable so I also need to merge them in 1 variable. Do you know how to do that??
best regards...
The most general way to solve this is to use a mapping table to map the incorrect items to a common item. Something like this code sample:
Map_Cleanup:
Mapping LOAD * Inline
[
From, To
WxAtCH,Watch
Acessssories,Accesories
Watches,Watch
];
Source:
LOAD ID,
Proper(ApplyMap('Map_Cleanup', Product,Product)) as Product,
Sales
Inline
[
ID,Product,Sales
1,ACCESORIES,100
2,WATCH,120
3,accesories,130
4,Accesories,220
5,WxAtCH,89
6,Acessssories,100
7,Watches,20
];
In this example, the ApplyMap corrects the 3 misspelled items and the Proper() converts them all to a common case (you could use Upper() or Lower() instead if you prefer).
I have used in-lines for illustration purposes, but the mappings could easily come from a file or database table.
Thank you so much! It works really well...