Qlik Community

New to QlikView

Discussion board where members can get started with QlikView.

Highlighted
martinholm
New Contributor

Search for similar field values, replace and remove duplicates

Hello

I'm working on a project where categories have been saved with similar names multiple times. An issue which should have been fixed at an earlier stage, but cannot be done.

here's an example of values that should all be named "Solumstrand"

clipboard_image_0.png

As there is quite a lot of these mistakes, i would like to give a column of correct names which should be (wild?)matched with the faulty names and replace them.

I have attempted like this for joining and i suppose i can remove the faulty names in a new load statement

//Names i want to correct to
Prosjekttabell:
Load Prosjektnavn;
LOAD Prosjektnavn INLINE [
Prosjektnavn
'Solumstrand',
'Muusøya',
'Kim Karlsen'
'Sande',
'Svelvik'
'Elvika',
'Lier',
'Linnes',
'Mjøndalen',
'Vidar Hansen',
'Åsgårdstrand'
];

//I want to search Prosjekt for substrings found in Prosjekttabell and join the fitting name to all the correct rows
join (Prosjekttabell)
Load distinct PROSJEKT_ID,
PROSJEKT_NR,
PROSJEKT
Resident 'Drammen_test'
where WildMatch(PROSJEKT, [Prosjektnavn])= [Prosjektnavn];

I get the error message "Cannot find the field name Prosjektnavn"

Thanks in advance!

Labels (2)
1 Solution

Accepted Solutions
JGMDataAnalysis
Contributor II

Re: Search for similar field values, replace and remove duplicates

A slightly shorter version...

ProsjektnavnMap:
MAPPING LOAD Lower(Prosjektnavn), '<' & Prosjektnavn & '>'
INLINE [
        Prosjektnavn
        Solumstrand
        Muusøya
        Kim Karlsen
        Sande
        Svelvik
        Elvika
        Lier
        Linnes
        Mjøndalen
        Vidar Hansen
        Åsgårdstrand
];

Test:
NOCONCATENATE
LOAD * WHERE Len(Trim(newPROSJEKT)); // Comment this line to load all rows.
LOAD DISTINCT
    PROSJEKT_ID,
    PROSJEKT_NR,
    PROSJEKT AS oldPROSJEKT,
    TextBetween(
    	MapSubstring('ProsjektnavnMap', Lower(PROSJEKT)), '<', '>'
    ) AS newPROSJEKT
RESIDENT Drammen_test;

View solution in original post

4 Replies
Support
Support

Re: Search for similar field values, replace and remove duplicates

Have a look at the following Design Blog post, that may be a better way to go at things potentially.

https://community.qlik.com/t5/Qlik-Design-Blog/Mapping-and-not-the-geographical-kind/ba-p/1463192

If you want to search for other blog posts, use the following link:

https://community.qlik.com/t5/Qlik-Design-Blog/bg-p/qlik-design-blog

Regards,
Brett

To help users find verified answers, please don't forget to use the "Accept as Solution" button on any posts that helped you resolve your problem or question.
JGMDataAnalysis
Contributor II

Re: Search for similar field values, replace and remove duplicates

What criteria do you use to classify "duplicates"?

Try this script and tell me about it...

Prosjekttabell:
NOCONCATENATE
LOAD * INLINE [
    Prosjektnavn
    Solumstrand
    Muusøya
    Kim Karlsen
    Sande
    Svelvik
    Elvika
    Lier
    Linnes
    Mjøndalen
    Vidar Hansen
    Åsgårdstrand
];

ProsjektnavnMap:
MAPPING LOAD Lower(Prosjektnavn), '<' & Prosjektnavn & '>'
RESIDENT Prosjekttabell;

Test:
NOCONCATENATE
LOAD * WHERE Len(Trim(newPROSJEKT)); // Comment this line to load all rows.
LOAD DISTINCT
    PROSJEKT_ID,
    PROSJEKT_NR,
    PROSJEKT AS oldPROSJEKT,
    TextBetween(
    	MapSubstring('ProsjektnavnMap', Lower(PROSJEKT)), '<', '>'
    ) AS newPROSJEKT
RESIDENT Drammen_test;

DROP TABLE Prosjekttabell;

 

JGMDataAnalysis
Contributor II

Re: Search for similar field values, replace and remove duplicates

A slightly shorter version...

ProsjektnavnMap:
MAPPING LOAD Lower(Prosjektnavn), '<' & Prosjektnavn & '>'
INLINE [
        Prosjektnavn
        Solumstrand
        Muusøya
        Kim Karlsen
        Sande
        Svelvik
        Elvika
        Lier
        Linnes
        Mjøndalen
        Vidar Hansen
        Åsgårdstrand
];

Test:
NOCONCATENATE
LOAD * WHERE Len(Trim(newPROSJEKT)); // Comment this line to load all rows.
LOAD DISTINCT
    PROSJEKT_ID,
    PROSJEKT_NR,
    PROSJEKT AS oldPROSJEKT,
    TextBetween(
    	MapSubstring('ProsjektnavnMap', Lower(PROSJEKT)), '<', '>'
    ) AS newPROSJEKT
RESIDENT Drammen_test;

View solution in original post

martinholm
New Contributor

Re: Search for similar field values, replace and remove duplicates

Hello! Sorry for late answer.

This works like a charm, thanks a lot!