Hi,
I want to clean a list of words that is over 1 million entries long.
The base data was ratings given in the form of sentences. I broke these down into words with SubField(), but I get words with commas, question marks or other signs attached. As the separator I used ' ' (a space). I need just the words, because I need to count their frequency.
Example
Word
bad
bad,
bad//
-bad
....bad
These are counted as different words, but they are all the same word. How can I remove all these signs around the words?
thanks
felipe
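Use KeepChar() or PurgeChar(). KeepChar() keeps only the characters you list, and PurgeChar() removes the characters you list. Both are case sensitive, so convert the word to lower case first. Like
Load
KeepChar(Lower(Word), 'abcdefghijklmnopqrstuvwxyz') as FreshWord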
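To get from there to the frequency count, something like the following might work. It is only a sketch: the table names RawWords, CleanWords and WordFrequency are made up for illustration, and it assumes the split words are already loaded in a table with a field called Word. Note that the letter list covers a to z only, so accented letters and digits would be stripped as well unless you add them to the list.

```
// Assumption: RawWords is a hypothetical table that already holds
// the split words in a field named Word.
CleanWords:
LOAD
    // Lower() first, because KeepChar() is case sensitive
    KeepChar(Lower(Word), 'abcdefghijklmnopqrstuvwxyz') as FreshWord
RESIDENT RawWords;

WordFrequency:
LOAD
    FreshWord,
    Count(FreshWord) as Frequency
RESIDENT CleanWords
// Skip entries that consisted only of signs and are now empty
WHERE Len(FreshWord) > 0
GROUP BY FreshWord;

DROP TABLE RawWords;
```

With the example above, bad, bad, bad//, -bad and ....bad should all end up as the single word bad.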
thanks!