Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
Is there any way to perform data cleansing on QlikView, Below attached is my data for your reference,
For Example i want to replace "ADNOC XXXX/XX" with 'ADNOC' only. there are large number of garbage in data and i cannot perform cleansing without doing it mannualy.
The solution i have found is to daily download the data and identify new garbage cases and update with the valid value.
Regards,
Shumail Hussain
This may help as a starting point as long as you have no similarities in the beginning of the names like ADNOC and ADNOCIS...
CleanNames:
LOAD * INLINE [
CleanName,
ADNOC,
TRY2,
];
Data:
LOAD * INLINE [
Crap, CrapValue
ADNOC-xr, 1
ADNOC vm, 2
ADNOC_xy, 3
TRY2 crap, 4
TRY2_crap, 5
TRY2-crap, 6
];
LET NoOfCleanRows = NoOfRows('CleanNames');
FOR i = 0 TO NoOfCleanRows - 1
CrapNames:
LOAD
left(Crap,len(peek('CleanName',$(i),'CleanNames'))) as CrapNameCleaned,
CrapValue
RESIDENT Data
WHERE left(Crap,len(peek('CleanName',$(i),'CleanNames'))) = peek('CleanName',$(i),'CleanNames');
;
NEXT i
Moreover, If you look into the data I am exactly looking for some intelligent solution, which automatically select valid value without doing this exercise manually. Previously i was doing it manually and perform cleansing using excel where I have developed customized formulas i.e. put valid values in front of garbage field column and utilize it on the main sheet using vlookup.
Regards,
Shumail Hussain
You can use mapping functionality for data cleansing, however, there is no "automatic" algorithm for data cleansing. Once the name is mapped, it will stay mapped for the following reloads.
cheers,
Oleg
If I understood your specific example, here's one solution:
if(left("Merchant Name",5)='ADNOC','ADNOC',"Merchant Name") as "Merchant Name"
As there are numerious values, I think I have to go for manual option just like create a new table of Merchants with garbage field and updated field link it with the table.
Regards,
Shumail Hussain
hi Shumail ..
if you trying to create a listbox . inside field list box select expression .............type following code
it gives you dimension with clean data
=if(Len(<column name>)='4',<column name>)
this really defn will reduce manually work 🙂
This may help as a starting point as long as you have no similarities in the beginning of the names like ADNOC and ADNOCIS...
CleanNames:
LOAD * INLINE [
CleanName,
ADNOC,
TRY2,
];
Data:
LOAD * INLINE [
Crap, CrapValue
ADNOC-xr, 1
ADNOC vm, 2
ADNOC_xy, 3
TRY2 crap, 4
TRY2_crap, 5
TRY2-crap, 6
];
LET NoOfCleanRows = NoOfRows('CleanNames');
FOR i = 0 TO NoOfCleanRows - 1
CrapNames:
LOAD
left(Crap,len(peek('CleanName',$(i),'CleanNames'))) as CrapNameCleaned,
CrapValue
RESIDENT Data
WHERE left(Crap,len(peek('CleanName',$(i),'CleanNames'))) = peek('CleanName',$(i),'CleanNames');
;
NEXT i
Hi Archanah,
Length of the value's varies so, i can't specific to any number of length like ADNOC, Carrifor etc...
What i have identified is just like in Excel I can do it through customize formula using VlookUp in excel OR using Join by creating new table for Garbbage values so, the correct values can only be obtain through manual data cleansing.
Regards,
Shumail Hussain
Thanks Christian,
Your code would support me in this, I need to create a cleanname table with all dependencies and use the for loop to generate the final merchant table.but manual dependency is still thier because of human error's which cant be rectified without manual update.
Regards,
Shumail Hussain