Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
amiroh81
Creator
Creator

split text from field

Hi All,

I have a field that contains random text and I want to split the text into single words.

The text can be separated by spaces, commas, periods, etc.

I tried using SubField with a space separator but I only split the first word.

In addition, I was unable to separate more than one type of separator.

For example, I have the following sentence:

how are-you.doing today

The result I would like to have is:

how

are

you

doing

today

Thank you!

16 Replies
amiroh81
Creator
Creator
Author

what about split a field on the model itself?

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

What do you want to do with the split words? Show them in a textbox? or a listbox?  Is it only one selection you want to process or all of them?

-Rob

amiroh81
Creator
Creator
Author

In the end I want to count the amount of words that appear in the sentence and how many times each word appeared.

It is important for me to do this in the model, because the counting should be done after the user performs several filters in the model

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

You can break out the words in the script and leave them linked to the sentences. Then the words will reflect selections.

CharMap:

Mapping LOAD *, ' '

Inline [

char

.

,

-

] (delimiter is '\t');

Sentences:

LOAD *, RecNo() as RecId

Inline [

Sentence

how are-you.doing today

I am fine

];

Words:

LOAD

RecId,

SubField(MapSubString('CharMap', Sentence), ' ') as Word

Resident Sentences;

-Rob

MarcoWedel

Hi,

another method to define non-letter characters for the word separation might be (included in the already proposed SubField/MapSubString solution😞

QlikCommunity_Thread_281822_Pic1.JPG

mapNonLetterToSpace:

Mapping

LOAD Chr(RecNo()), ' '

AutoGenerate 65535

Where Upper(Chr(RecNo()))=Lower(Chr(RecNo()));

tabTextLines:

LOAD RecNo() as ID, *

Inline '

    TextLine

    "SubField - script and chart function"

    "Subfield() is used to extract substring components from a parent string field, where the original record fields consist of two or more parts separated by a delimiter."

    "The Subfield() function can be used, for example, to extract first name and surname from a list of records consisting of full names, the component parts of a path name, or for extracting data from comma-separated tables."

    "If you use the Subfield() function in a LOAD statement with the optional field_no parameter left out, one full record will be generated for each substring. If several fields are loaded using Subfield() the Cartesian products of all combinations are created."

    "Syntax:"

    "SubField(text, delimiter[, field_no ])"

    "Return data type: string"

    "Arguments:"

    "Argument Description"

    "text The original string. This can be a hard-coded text, a variable, a dollar-sign expansion, or another expression."

    "delimiter A character within the input text that divides the string into component parts."

    "field_no The optional third argument is an integer that specifies which of the substrings of the parent string text is to be returned. A negative value causes the substring to be extracted from the right-hand side of the string. That is, the string search is from right to left, instead of left to right, if field_no is a positive value."

';

tabWords:

LOAD Distinct * Where Len(Word);

LOAD ID,

    SubField(MapSubString('mapNonLetterToSpace',TextLine),' ') as Word

Resident tabTextLines;

hope this helps

regards

Marco

MarcoWedel

one example of calculating word frequencies and counting words per text line in the front end might be:

QlikCommunity_Thread_281822_Pic2.JPG

QlikCommunity_Thread_281822_Pic3.JPG

QlikCommunity_Thread_281822_Pic4.JPG

QlikCommunity_Thread_281822_Pic5.JPG

QlikCommunity_Thread_281822_Pic6.JPG

mapNonLetterToSpace:

Mapping

LOAD Chr(RecNo()), ' '

AutoGenerate 65535

Where Upper(Chr(RecNo()))=Lower(Chr(RecNo()));

tabTextLines:

LOAD RecNo() as LineID, *

Inline '

    TextLine

    "SubField - script and chart function"

    "Subfield() is used to extract substring components from a parent string field, where the original record fields consist of two or more parts separated by a delimiter."

    "The Subfield() function can be used, for example, to extract first name and surname from a list of records consisting of full names, the component parts of a path name, or for extracting data from comma-separated tables."

    "If you use the Subfield() function in a LOAD statement with the optional field_no parameter left out, one full record will be generated for each substring. If several fields are loaded using Subfield() the Cartesian products of all combinations are created."

    "Syntax:"

    "SubField(text, delimiter[, field_no ])"

    "Return data type: string"

    "Arguments:"

    "Argument Description"

    "text The original string. This can be a hard-coded text, a variable, a dollar-sign expansion, or another expression."

    "delimiter A character within the input text that divides the string into component parts."

    "field_no The optional third argument is an integer that specifies which of the substrings of the parent string text is to be returned. A negative value causes the substring to be extracted from the right-hand side of the string. That is, the string search is from right to left, instead of left to right, if field_no is a positive value."

';

tabWords:

LOAD

    LineID,

    Word,

    AutoNumber(IterNo,'WordNo'&LineID) as WordNo,

    AutoNumber(Hash128(LineID,IterNo),'WordID') as WordID,

    Upper(Word) as WORD

Where Len(Word);

LOAD LineID,

    IterNo() as IterNo,

    SubField(MapLine,' ',IterNo()) as Word

While IterNo()<=SubStringCount(MapLine,' ')+1; 

LOAD LineID,

    MapSubString('mapNonLetterToSpace',TextLine) as MapLine

Resident tabTextLines;

hope this helps

regards

Marco

amiroh81
Creator
Creator
Author

Thank you very much