Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
Shefali30
Contributor II
Contributor II

String Search Text in script

Shefali30_0-1598524884746.png

I need to run a search to see if the name Registered is same as Company name using levenshtein.

Rules to be followed while searching text:

1. Excluding expressions like Pvt | Private| Ltd| Limited| &| And

Or there is any other way to find difference between the two.

Please advise

@sunny_talwar @swuehl 

Thanks 

Shefali

1 Solution

Accepted Solutions
Kushal_Chawda

there is function  there in Qlik.

Data:
LOAD *,      LevenshteinDist(Registered_Name_String_clean,Company_Name_String_clean) as Levenshtein_Distances ;
LOAD *,
trim(purgechar(replace(replace(replace(replace(replace(replace(lower(Registered_Name),'pvt',''),'private',''),'ltd',''),'limited',''),' and ',' '),' & ',' '),'.')) as Registered_Name_String_clean,
     trim(purgechar(replace(replace(replace(replace(replace(replace(lower(Company_Name),'pvt',''),'private',''),'ltd',''),'limited',''),' and ',' '),' & ',' '),'.')) as Company_Name_String_clean;

Load * inline [
Company_Name,Registered_Name
Ajanta Creations Pvt Ltd,Ajanta Creations Private Limited
Sanjay Handlooms, Sanjay Handlooms
Heena Pharmaceutical Limited,Heena Pharmaceutical Ltd
Chatterjee Company,Chatterjee
Om Prakash Medical Store, Om Prakash Store
Raj Kumar Company private, Raj Company
Anjum Creations Private,Anjum Creations Pvt.,
Raj and Sons Company, Raj & Sons Company ];

Annotation 2020-08-30 185311.png

 

 

 

View solution in original post

6 Replies
stevedark
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi @Shefali30 

To use levenshtein you will need to connect to a Python server, which may be overkill for what you are after.

What you will want to do is the same set of replace / capitalize statements on both sides and then do a compare.

To avoid duplication I would use a variable which take s a parameter, something like this:

set vClean = replace(replace(replace(UPPER([$1]), 'PVT', ''), 'PRIVATE', ''), 'COMPANY', '');

LOAD
  Field1,
  Field2,
  if($(vClean(Field1)) = $(vClean(Field2)), 'Yes', 'No') as Match,
  ...

 

Obviously the number of replaces in the variable would be a lot more than what are there, but hopefully you get the idea.

Paramterised variables are explained in more details here:

https://www.quickintelligence.co.uk/variables-parameters-load-script/

 

Steve

Kushal_Chawda

Does it not 8 for "Om Prakash Store"  and 6 for "Raj Kumar Company"?

 

Kushal_Chawda

there is function  there in Qlik.

Data:
LOAD *,      LevenshteinDist(Registered_Name_String_clean,Company_Name_String_clean) as Levenshtein_Distances ;
LOAD *,
trim(purgechar(replace(replace(replace(replace(replace(replace(lower(Registered_Name),'pvt',''),'private',''),'ltd',''),'limited',''),' and ',' '),' & ',' '),'.')) as Registered_Name_String_clean,
     trim(purgechar(replace(replace(replace(replace(replace(replace(lower(Company_Name),'pvt',''),'private',''),'ltd',''),'limited',''),' and ',' '),' & ',' '),'.')) as Company_Name_String_clean;

Load * inline [
Company_Name,Registered_Name
Ajanta Creations Pvt Ltd,Ajanta Creations Private Limited
Sanjay Handlooms, Sanjay Handlooms
Heena Pharmaceutical Limited,Heena Pharmaceutical Ltd
Chatterjee Company,Chatterjee
Om Prakash Medical Store, Om Prakash Store
Raj Kumar Company private, Raj Company
Anjum Creations Private,Anjum Creations Pvt.,
Raj and Sons Company, Raj & Sons Company ];

Annotation 2020-08-30 185311.png

 

 

 

swuehl
MVP
MVP

Well done. Kush!
Just discovered the LevenshteinDist() function, it's always good to follow Qlik Community posts.

Strange that this function is still undocumented for three years now...

Kushal_Chawda

@swuehl  Thank you 😊 Actually Kudos to you that you had already discovered this way back which I just referred .

Even I ignored searching on community due to this uncanny function  and started building my own logic using complex loops. I achieved the logic and even posted here. But later I discovered your post and found the function then I deleted that post.😅

Lesson learned- always search on community first.

Shefali30
Contributor II
Contributor II
Author

Thanks. It worked😊