16 Replies Latest reply: Nov 30, 2017 7:42 AM by amir ohev shalom RSS

    split text from field

    amir ohev shalom

      Hi All,

       

      I have a field that contains random text and I want to split the text into single words.

      The text can be separated by spaces, commas, periods, etc.

      I tried using SubField with a space separator but I only split the first word.

      In addition, I was unable to separate more than one type of separator.

       

       

      For example, I have the following sentence:

      how are-you.doing today

       

      The result I would like to have is:

      how

      are

      you

      doing

      today

       

      Thank you!

        • Re: split text from field
          Youssef Belloum

          Hi,

           

          test:

          LOAD *,

          subfield(Replace(Replace(name,'-', ' '),'.',' '),' ',1) as String1,

          subfield(Replace(Replace(name,'-', ' '),'.',' '),' ',2) as String2,

          subfield(Replace(Replace(name,'-', ' '),'.',' '),' ',3) as String3,

          subfield(Replace(Replace(name,'-', ' '),'.',' '),' ',4) as String4

          Inline [

          name

          how are-you.doing today

          ];

          • Re: split text from field
            Vishwarath Nagaraju

            May be try this using Mapping load. You can add any special characters in the mapping table named Table1 as you go.

             

            Table1:

            Mapping LOAD * INLINE [

            Text, String

            ' ', @

            -, @

            ., @

            ',', @

            ];

             

             

            LOAD *, Subfield(replaceString, '@') AS NewField;

            LOAD *, MapSubString('Table1', Text) AS replaceString INLINE [

            Text

            how are-you.doing today

            ];

            • Re: split text from field
              Antonio Mancini

              Try this

               

              LOAD SubField(SubField(SubField(Field,' '),'.'),'-') as Field

              • Re: split text from field
                Rob Wunderlich

                CharMap:

                Mapping LOAD *, ' '

                Inline [

                char

                .

                ,

                -

                ] (delimiter is '\t');

                 

                data:

                LOAD

                SubField(MapSubString('CharMap', Input), ' ') as Word

                Inline [

                Input

                how are-you.doing today

                ];

                 

                -Rob

                http://masterssummit.com

                http://qlikviewcookbook.com

                • Re: split text from field
                  Vishnu Chakravaram

                  Viswarath's solution is the best of the lot. It will work.

                  • Re: split text from field
                    Marco Wedel

                    Hi,

                     

                    another method to define non-letter characters for the word separation might be (included in the already proposed SubField/MapSubString solution):

                     

                    QlikCommunity_Thread_281822_Pic1.JPG

                     

                    mapNonLetterToSpace:
                    Mapping
                    LOAD Chr(RecNo()), ' '
                    AutoGenerate 65535
                    Where Upper(Chr(RecNo()))=Lower(Chr(RecNo()));
                    
                    tabTextLines:
                    LOAD RecNo() as ID, *
                    Inline '
                        TextLine
                        "SubField - script and chart function"
                        "Subfield() is used to extract substring components from a parent string field, where the original record fields consist of two or more parts separated by a delimiter."
                        "The Subfield() function can be used, for example, to extract first name and surname from a list of records consisting of full names, the component parts of a path name, or for extracting data from comma-separated tables."
                        "If you use the Subfield() function in a LOAD statement with the optional field_no parameter left out, one full record will be generated for each substring. If several fields are loaded using Subfield() the Cartesian products of all combinations are created."
                        "Syntax:"
                        "SubField(text, delimiter[, field_no ])"
                        "Return data type: string"
                        "Arguments:"
                        "Argument Description"
                        "text The original string. This can be a hard-coded text, a variable, a dollar-sign expansion, or another expression."
                        "delimiter A character within the input text that divides the string into component parts."
                        "field_no The optional third argument is an integer that specifies which of the substrings of the parent string text is to be returned. A negative value causes the substring to be extracted from the right-hand side of the string. That is, the string search is from right to left, instead of left to right, if field_no is a positive value."
                    ';
                    
                    tabWords:
                    LOAD Distinct * Where Len(Word);
                    LOAD ID,
                        SubField(MapSubString('mapNonLetterToSpace',TextLine),' ') as Word
                    Resident tabTextLines;
                    

                     

                     

                    hope this helps

                     

                    regards

                     

                    Marco

                    • Re: split text from field
                      Marco Wedel

                      one example of calculating word frequencies and counting words per text line in the front end might be:

                       

                      QlikCommunity_Thread_281822_Pic2.JPG

                      QlikCommunity_Thread_281822_Pic3.JPG

                      QlikCommunity_Thread_281822_Pic4.JPG

                      QlikCommunity_Thread_281822_Pic5.JPG

                      QlikCommunity_Thread_281822_Pic6.JPG

                       

                      mapNonLetterToSpace:
                      Mapping
                      LOAD Chr(RecNo()), ' '
                      AutoGenerate 65535
                      Where Upper(Chr(RecNo()))=Lower(Chr(RecNo()));
                      
                      tabTextLines:
                      LOAD RecNo() as LineID, *
                      Inline '
                          TextLine
                          "SubField - script and chart function"
                          "Subfield() is used to extract substring components from a parent string field, where the original record fields consist of two or more parts separated by a delimiter."
                          "The Subfield() function can be used, for example, to extract first name and surname from a list of records consisting of full names, the component parts of a path name, or for extracting data from comma-separated tables."
                          "If you use the Subfield() function in a LOAD statement with the optional field_no parameter left out, one full record will be generated for each substring. If several fields are loaded using Subfield() the Cartesian products of all combinations are created."
                          "Syntax:"
                          "SubField(text, delimiter[, field_no ])"
                          "Return data type: string"
                          "Arguments:"
                          "Argument Description"
                          "text The original string. This can be a hard-coded text, a variable, a dollar-sign expansion, or another expression."
                          "delimiter A character within the input text that divides the string into component parts."
                          "field_no The optional third argument is an integer that specifies which of the substrings of the parent string text is to be returned. A negative value causes the substring to be extracted from the right-hand side of the string. That is, the string search is from right to left, instead of left to right, if field_no is a positive value."
                      ';
                      
                      tabWords:
                      LOAD
                          LineID,
                          Word,
                          AutoNumber(IterNo,'WordNo'&LineID) as WordNo,
                          AutoNumber(Hash128(LineID,IterNo),'WordID') as WordID,
                          Upper(Word) as WORD
                      Where Len(Word);
                      LOAD LineID,
                          IterNo() as IterNo,
                          SubField(MapLine,' ',IterNo()) as Word
                      While IterNo()<=SubStringCount(MapLine,' ')+1;  
                      LOAD LineID,
                          MapSubString('mapNonLetterToSpace',TextLine) as MapLine
                      Resident tabTextLines;
                      
                      

                       

                       

                      hope this helps

                       

                      regards

                       

                      Marco