6 Replies Latest reply: Apr 22, 2015 4:10 AM by Chris Olver RSS

    Distinct Inner Join - Strange Results.

    Chris Olver

      Hi,

       

      Found a bizarre issue and can't seem to get my head around what would cause this problem.

       

      I am trying to do a Distinct Inner join on a very large data set to remove data.

       

      Usage:
      LOAD 
           UserID,
           Unit,
           UsageType
      FROM
      [..\3.QVD\DataModels\Usage\2015-*-*.qvd]
      (qvd);
      
      Inner Join
      
      LOAD
        Distinct
        UserID
      FROM
      [..\3.QVD\DataModels\Customer.qvd]
      (qvd);
      
      
      
      

       

      This returns a total of 1600 results which is incorrect. If I do:

       

      Customer:
      LOAD
        Distinct
        UserID
      FROM
      [..\3.QVD\DataModels\Customer.qvd]
      (qvd);
      
      Usage:
      LOAD 
           UserID,
           Unit,
           UsageType
      FROM
      [..\3.QVD\DataModels\Usage\2015-*-*.qvd]
      (qvd);
      
      Inner Join
      
      Load
           UserID
      Resident Customer;
      
      Drop Table Customer;
      
      

       

      This returns 35,502 which is correct. I don't understand why the two results should differ. I have written both methods and the same list of values for UserID are returned. If create an INLINE * FROM [ UserID etc ] this also provides the correct results.

       

      Further testing:

       

      Usage:
      LOAD 
           UserID,
           Unit,
           UsageType
      FROM
      [..\3.QVD\DataModels\Usage\2015-*-*.qvd]
      (qvd);
      
      Inner Join
      
      LOAD
          UserID,
           Count(UserID) As Counter
      FROM
      [..\3.QVD\DataModels\Customer.qvd]
      (qvd)
      Group By UserID;
      
      Drop Field Counter;
      

       

      This also provides the correct as results above.

       

      Any ideas on what could be causing this issue?

       

      -Chris