7 Replies Latest reply: Dec 10, 2017 4:03 PM by Marco Wedel

    Pair of words from sentence

    amir ohev shalom

      Hi,

       

      I have records, and every record contains a sentence.

      I would like to run tests on the number of occurrences of each word in the sentences.

      I was able to do this for each word individually with the forum's help. Thank you!

       

      As a next step, I would like to display pairs of adjacent words with their counts, for example:

      sentence 1 : "I want to learn English"

      sentence 2 : "I want to learn Spanish"

      sentence 3 : "I want to learn French"


      The result I would like to receive is:

      I want - 3

      want to - 3

      to learn - 3

      learn English - 1

      learn Spanish - 1

      learn French - 1


      Does anyone have an idea?

       

      thanks

        • Re: Pair of words from sentence
          Sunny Talwar

          I have seen some great threads by marcowedel on this same topic; he might be able to offer his expertise here.

          unstructured text analysis

          • Re: Pair of words from sentence
            Clever Anjos

            Try this:

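            // One row per word: Line is the source sentence, Position is a running word index.
            Table:
            LOAD SubField(F1,' ') as Word, RecNo() as Line, RowNo() as Position INLINE [
                F1
                I want to learn English
                I want to learn Spanish
                I want to learn French
            ];

            // Pair every word with every word of the same sentence.
            Left join(Table)
            LOAD
                Word as Word1,
                Position as Position1,
                Line
            Resident Table;

            // Keep only the pairs of directly adjacent words.
            Final:
            Load
                Line,
                Word & ' ' & Word1 as Pair,
                RowNo() as Sequence
            Resident Table
            Where Position +1 = Position1;

            Drop Table Table;

            // Count how often each pair occurs.
            load
                Pair,
                Count(Pair) as Qty
            Resident Final
            Group by Pair;

            Drop Table Final;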

            • Re: Pair of words from sentence
              Youssef Belloum

              Hi,

               

              try this:

               

              test:
              LOAD *
              Inline [
              sentence
              sentence 1 : "I want to learn English"
              sentence 2 : "I want to learn Spanish"
              sentence 3 : "I want to learn French"
              ];

              // Check every sentence against each hand-listed word pair.
              for each var in 'learn English','learn Spanish','learn French','I want','want to','to learn'
              test2:
              LOAD sentence,
                  if(wildmatch(sentence,'*$(var)*'),'$(var)') as lib,
                  if(wildmatch(sentence,'*$(var)*'),1) as num
              resident test;
              next var

               

              attached app

              • Re: Pair of words from sentence
                Marco Wedel

                Hi,

                 

                one solution might be

                 

                [Image: QlikCommunity_Thread_283581_Pic1.JPG]

                 

                mapNonLetterToSpace:
                Mapping
                LOAD Chr(RecNo()), ' '
                AutoGenerate 65535
                Where Upper(Chr(RecNo()))=Lower(Chr(RecNo()));
                
                mapReduceMultispace:
                Mapping
                LOAD Repeat(' ',100-RecNo()), ' '
                AutoGenerate 98;
                
                tabTextLines:
                LOAD RowNo() as LineID,
                    TextLine,
                    Trim(MapSubString('mapReduceMultispace',MapSubString('mapNonLetterToSpace',TextLine))) as TextLineWordSep;
                LOAD * INLINE [
                    TextLine
                    I want to learn English
                    I want to learn Spanish
                    I want to learn French
                ];
                
                tabWordTuples:
                LOAD *,
                    Upper(WordTuple) as WORDTuple,
                    AutoNumber(WordTuple,'WordTupleID') as WordTupleID,
                    AutoNumber(Upper(WordTuple),'WordTupleID') as WORDTupleID,
                    AutoNumber(Hash128(WordTuple,LineID,WordTupleStart),'WordTuplePosID') as WordTuplePosID;
                LOAD LineID,
                    WordTupleStart,
                    IterNo() as WordTupleLength,
                    Left(SubStrRight,Index(SubStrRight&' ',' ',IterNo())-1) as WordTuple
                While IterNo() <= SubStringCount(SubStrRight,' ')+1;   
                LOAD LineID,
                    IterNo() as WordTupleStart,
                    Mid(TextLineWordSep,Index(' '&TextLineWordSep,' ',IterNo())) as SubStrRight
                Resident tabTextLines
                While IterNo() <= SubStringCount(TextLineWordSep,' ')+1;
                
                tabWords:
                LOAD LineID,
                    WordTupleStart as WordNo,
                    AutoNumber(Hash128(LineID,WordTupleStart),'WordID') as WordID,
                    WordTuple as Word,
                    WORDTuple as WORD
                Resident tabWordTuples
                Where WordTupleLength=1
                Order By LineID,WordTupleStart;
                
                tabWordLink:
                LOAD WordTuplePosID,
                    AutoNumber(Hash128(LineID,WordTupleStart+IterNo()-1),'WordID') as WordID
                Resident tabWordTuples
                While IterNo() <= WordTupleLength;
                
                DROP Field LineID From tabWordTuples;
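
                The script above builds word tuples of every length, not only pairs. As a minimal sketch, assuming the tables and fields from that script are loaded unchanged, the pair counts asked for in the question could be derived by keeping only the tuples of length 2. The table name tabPairCounts and the field names Pair and Qty are hypothetical names introduced here for illustration.

```qlik
// Sketch only: tabPairCounts, Pair and Qty are hypothetical names.
// tabWordTuples, WordTuple and WordTupleLength come from the script above.
tabPairCounts:
LOAD WordTuple as Pair,
    Count(WordTuple) as Qty
Resident tabWordTuples
Where WordTupleLength = 2
Group By WordTuple;
```

                Using WORDTuple in place of WordTuple should count the pairs without regard to case, since that field holds the upper-cased tuple.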
                

                 

                (adapting a more general approach I previously created)



                hope this helps


                regards


                Marco

                • Re: Pair of words from sentence
                  Antonio Mancini

                  Maybe this:

                   

                   

                  LOAD Text1, Count(Text1) as Counter Group By Text1;
                  LOAD *, SubField(Text,' ',IterNo()) & ' ' & SubField(Text,' ',IterNo()+1) as Text1
                  Inline [
                  Text
                  "I want to learn English"
                  "I want to learn Spanish"
                  "I want to learn French"
                  ]
                  While IterNo() <= SubStringCount(Text,' ');

                  Regards,

                  Antonio

                  • Re: Pair of words from sentence
                    Marco Wedel

                    Please close your thread if your question is answered:

                    Qlik Community Tip: Marking Replies as Correct or Helpful

                     

                    thanks

                     

                    regards

                     

                    Marco