6 Replies Latest reply: Jul 3, 2013 8:09 AM by Erik Furlanis RSS

    LEFT JOIN and exponential growing of records

      Hello there

       

      I am trying to achieve the following: link to a weekly-updated selection of items some static datamaster information (think about price, weight etcetc), fetched from TWO different tables.

       

      The script goes as following

       

      BO:

       

      LOAD

      *,

      today() as DayStamp

      RESIDENT Lines_temp WHERE QtyUnAll>0 and DateInterpret<today();

      DROP TABLE Lines_temp;

       

      LEFT JOIN(BO)

      LOAD

      *

      RESIDENT PartMaster;

       

      LEFT JOIN(BO)

      LOAD

      *

      RESIDENT PDMCTA;

       

      The script goes well, but the outcomes leaves me baffled a bit. When I reaload the QVW, you can see what happens

      BOreport.JPG

      the weekly selection is 3k lines, however by LEFT JOINING the two datamaster tables, lines increase exponentially.

      From what I have been reading and re-reading, with LEFT JOIN the outcome of lines should be equal to the amount of lines in the first table.

       

      Can you help me out here?

      thank you very much!

        • Re: LEFT JOIN and exponential growing of records
          Jonathan Dienst

          Hi

           

          If you have duplicates in the key fields used for the right side of join, then there will be one copy for each of those duplicate values in the result.Here you have duplicates in both Partmaster and PDMCTA (of the fields used as the key for the join.

           

          A SQL left join has the same behaviour.

           

          Regards

          Jonathan

            • Re: LEFT JOIN and exponential growing of records
              Jonathan Dienst

              By way of example - the following script:

               

              LOAD * Inline
              [
                        Key, Value1
                        A, 1
                        B, 3
                        C, 2
              ];
              
              
              Left Join
              LOAD * Inline
              [
                        Key, Value2
                        A, 5
                        A, 7
                        A, 9
                        B, 3
                        D, 5
              ];
              

               

              Will have the following result table:

               

                        Key             Value1          Value2

                        A                    1                    5

                        A                    1                    7

                        A                    1                    9

                        B                    3                    3

                        C                    2                  null

               

              Regards

              Jonathan

              • Re: LEFT JOIN and exponential growing of records

                is there an easy way to check for duplicates?

                 

                I had the certainty that both PartMaster and PDMCTA had a huge amoun of UNIQUE keys in the key_field....but apprarently is not..!

                 

                ==

                 

                never mind, I managed to find a way.

                This a test of PartMaster, apparently there are just TWO duplicated keys, out of +500k keys available.

                I still don't understand then why so many records are generated with the LEFT JOIN...

                 

                duplicates.JPG

              • Re: LEFT JOIN and exponential growing of records

                Please check, it seems that your left table's key field is not unique.

                 

                 

                Regards,

                Amit

                • Re: LEFT JOIN and exponential growing of records

                  Ok, simply enough the "solution" to this problem is as follows:

                   

                  BOreport.JPG

                   

                  QV is fetching all the lines from the joinee tables, regardless of how many of those match the keys in the joiner table.

                  On the base of Jonathan Dienst example, I created the following

                  LOAD * Inline

                  [

                            Key, Value1

                            A, 1

                            B, 3

                            C, 2

                  ];

                   

                   

                   

                   

                  Left Join

                  LOAD * Inline

                  [

                            Key, Value2

                            A, 5

                            A, 7

                            A, 9

                            B, 3

                            D, 5

                  ];

                   

                   

                  Left Join

                  LOAD * Inline

                  [

                            Key, Value3

                            A, a

                            A, a1

                            A, a2

                            A, a3

                            B, b

                            C, c

                            D, d

                            E, e

                            F, f

                            G, g

                            H, h

                            I, i

                            J, l

                            K, l1

                            L, l2

                            M, l4

                            N, l5

                            O, l6

                            P, p1

                            P, p2

                            P, p3

                  ];

                   

                   

                   

                  as you see, the inline tables are made originally of 3, 5 and 21 lines

                  the script execution progress shows the following:

                   

                  left join QC fetch.JPG

                   

                  so the script is "fetching" at first everything regardless of duplicates, keys and so forth.

                  Then only the lines matching the Keys are kept; and duplicated in case the joinee tables have duplicates!

                   

                  BOreportt4.JPG

                   

                  My thought was that QV was fetching and keeping those 600k and more lines.

                  Meanwhile with "fetched" it is meant barely "read".

                   

                  thank you all for your support