Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
shumailh
Creator III
Creator III

Data Cleansing

Hi,

Is there any way to perform data cleansing on QlikView, Below attached is my data for your reference,

For Example i want to replace "ADNOC XXXX/XX" with 'ADNOC' only. there are large number of garbage in data and i cannot perform cleansing without doing it mannualy.

The solution i have found is to daily download the data and identify new garbage cases and update with the valid value.

Regards,
Shumail Hussain

1 Solution

Accepted Solutions
Not applicable

This may help as a starting point as long as you have no similarities in the beginning of the names like ADNOC and ADNOCIS...

CleanNames:
LOAD * INLINE [
CleanName,
ADNOC,
TRY2,
];

Data:
LOAD * INLINE [
Crap, CrapValue
ADNOC-xr, 1
ADNOC vm, 2
ADNOC_xy, 3
TRY2 crap, 4
TRY2_crap, 5
TRY2-crap, 6
];

LET NoOfCleanRows = NoOfRows('CleanNames');

FOR i = 0 TO NoOfCleanRows - 1
CrapNames:
LOAD
left(Crap,len(peek('CleanName',$(i),'CleanNames'))) as CrapNameCleaned,
CrapValue
RESIDENT Data
WHERE left(Crap,len(peek('CleanName',$(i),'CleanNames'))) = peek('CleanName',$(i),'CleanNames');
;
NEXT i

View solution in original post

8 Replies
shumailh
Creator III
Creator III
Author

Moreover, If you look into the data I am exactly looking for some intelligent solution, which automatically select valid value without doing this exercise manually. Previously i was doing it manually and perform cleansing using excel where I have developed customized formulas i.e. put valid values in front of garbage field column and utilize it on the main sheet using vlookup.

Regards,
Shumail Hussain

Oleg_Troyansky
Partner Ambassador/MVP
Partner Ambassador/MVP

You can use mapping functionality for data cleansing, however, there is no "automatic" algorithm for data cleansing. Once the name is mapped, it will stay mapped for the following reloads.

cheers,

Oleg

johnw
Champion III
Champion III

If I understood your specific example, here's one solution:

if(left("Merchant Name",5)='ADNOC','ADNOC',"Merchant Name") as "Merchant Name"

shumailh
Creator III
Creator III
Author

As there are numerious values, I think I have to go for manual option just like create a new table of Merchants with garbage field and updated field link it with the table.

Regards,
Shumail Hussain

Not applicable

hi Shumail ..

if you trying to create a listbox . inside field list box select expression .............type following code

it gives you dimension with clean data

=if(Len(<column name>)='4',<column name>)

this really defn will reduce manually work 🙂

Not applicable

This may help as a starting point as long as you have no similarities in the beginning of the names like ADNOC and ADNOCIS...

CleanNames:
LOAD * INLINE [
CleanName,
ADNOC,
TRY2,
];

Data:
LOAD * INLINE [
Crap, CrapValue
ADNOC-xr, 1
ADNOC vm, 2
ADNOC_xy, 3
TRY2 crap, 4
TRY2_crap, 5
TRY2-crap, 6
];

LET NoOfCleanRows = NoOfRows('CleanNames');

FOR i = 0 TO NoOfCleanRows - 1
CrapNames:
LOAD
left(Crap,len(peek('CleanName',$(i),'CleanNames'))) as CrapNameCleaned,
CrapValue
RESIDENT Data
WHERE left(Crap,len(peek('CleanName',$(i),'CleanNames'))) = peek('CleanName',$(i),'CleanNames');
;
NEXT i

shumailh
Creator III
Creator III
Author

Hi Archanah,

Length of the value's varies so, i can't specific to any number of length like ADNOC, Carrifor etc...

What i have identified is just like in Excel I can do it through customize formula using VlookUp in excel OR using Join by creating new table for Garbbage values so, the correct values can only be obtain through manual data cleansing.

Regards,
Shumail Hussain

shumailh
Creator III
Creator III
Author

Thanks Christian,

Your code would support me in this, I need to create a cleanname table with all dependencies and use the for loop to generate the final merchant table.but manual dependency is still thier because of human error's which cant be rectified without manual update.

Regards,
Shumail Hussain