Feb 7, 2018 8:58 AM by Katarzyna Wójcik

    histogram counts duplicates

    Katarzyna Wójcik

      Suppose I have two tables joined by an id: in the first table this id is the unique identifier, while the second table has multiple records with the same id.



      LOAD * INLINE [
          d1_ID, param1
          A1, 1
          A2, 2
          A3, 2
          A4, 3
          A5, 2
          A6, 4
          A6, 3
          A7, 5
      ];




      LOAD * INLINE [
          d2_ID, d1_ID, other_param
          1, A1, x
          2, A1, y
          3, A1, z
          4, A2, t
          5, A2, x
          6, A3, y
          7, A4, z
          7, A4, t
          8, A6, x
      ];


      I would like to build a histogram of param1 values in which each d1_ID is counted once. With the two tables joined, however, the value '1' is counted three times, because A1 occurs three times in the second table (data2). Any ideas how to achieve this?
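
      To make the goal concrete, here is a minimal sketch of the kind of chart expression I have in mind. It assumes a bar chart with param1 as the dimension and uses Count with the DISTINCT qualifier, so that each d1_ID contributes at most once per param1 value no matter how many matching rows the second table holds. I have not verified this against the real app, so treat it as an illustration of the intended result rather than a confirmed solution.

```
// Chart dimension (assumed): param1
// Measure: number of distinct ids per param1 value.
// Duplicate rows introduced by the join collapse to one per d1_ID.
Count(DISTINCT d1_ID)
```

      With the sample data above, the bar for param1 = 1 should then show 1 instead of 3, since A1 is the only id with that value.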