Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
jlahners
Contributor II
Contributor II

Identify all unique substrings

Hi all -

I'm trying to build a view that will show me all the unique words used a field in my product description and am having some challenges.  I'm starting by removing all the special characters found in my data (of which I can get most out but not all) and second I'm trying to remove a few "known words" from my list because I don't want to report on common words. 

Here's an example:

 

Summaries:
MAPPING LOAD * INLINE [
Summary
This is for a desktop deployment
This is for a new switch to replace ABC-123
"Need to install a new mouse"
Running cable for John & Mike
];

WordsToRemove:
LOAD * INLINE [
BadWords
for
the
need
this
];

InitialWords:
LOAD lower(SubField(MapSubString('SpecialChars',[Summary]),' ')) as Word RESIDENT [Summaries];

ValidWords:
LOAD Word AS NewWord
RESIDENT InitialWords 
WHERE LEN(Word)>3 
AND NOT Exists(BadWords,Word);

Problems:

  1. I can't find a way to remove some special characters, especially the Microsoft "long dash" and their fancy quote symbols, as well as the ']' (because it breaks the inline table load)
  2. The remove of the "bad words" isn't working at all - they still show up in my resultant list

Any help would be much appreciated!  Thank you in advance! 

Labels (1)
4 Replies
Anil_Babu_Samineni

Mapping table require 2 field while you are mapping.

1. May be use purgechar() function
2. This will done if you plug corrrct way of syntax.
Best Anil, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful
jlahners
Contributor II
Contributor II
Author

How do I type the Microsoft quotes and long dash in the load editor? As for the Not Exists, I must be missing the syntax error. Can you expand?
jlahners
Contributor II
Contributor II
Author

Thank you for the pointers, I was able to get it fixed up.  My biggest problem was I needed a TRIM() call in the NOT EXISTS section.  

vkish16161
Creator III
Creator III

Try like this:

Load * inline "

";