0 Replies Latest reply: Jul 26, 2014 1:32 PM by Dennis Mornad

    Clustering similar rows

    Dennis Mornad

      Hi All,


      I am trying to accomplish the following with my dataset in QV. For simplicity, let's say I have a Product dimension with 5 members, and for each member I have Sales data from 5 countries. In this 5-by-5 matrix I want to cluster the Products and identify the similar ones, where "similar" means the products with the "closest" sales patterns across all countries. Let's say my table looks like the following (the measures are the ranks of Sales):


      Product     Country
                  USA     France     England     Italy     Germany
      A           5       2          3           1         4
      B           5       2          4           1         3
      C           1       3          5           2         4
      D           1       2          5           3         4
      E           2       5          1           3         4



      Here Products A and B have the closest sales patterns in the data set, and so do Products C and D. Product E is not similar to any other product. So I was thinking of assigning a score to each row based on its ranks in each country, then clustering (sorting) the rows by that score and picking the similar ones. This score cannot be a simple sum of all ranks for a product; it must be sensitive to the rank value per country. Any ideas or approaches?
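
One possible way to make the score sensitive to each country is to compare products pairwise instead of scoring each row on its own. In the sample table every row sums to 15, so a plain sum cannot separate the products at all. The sketch below is written in Python rather than QV script, purely to illustrate the idea; the Manhattan distance (sum of absolute rank differences per country), the `rank_distance` helper, and the `THRESHOLD` cutoff are all assumptions chosen for this example and are not part of the original question.

```python
from itertools import combinations

# Rank table from the post: product -> ranks for USA, France, England, Italy, Germany
ranks = {
    "A": [5, 2, 3, 1, 4],
    "B": [5, 2, 4, 1, 3],
    "C": [1, 3, 5, 2, 4],
    "D": [1, 2, 5, 3, 4],
    "E": [2, 5, 1, 3, 4],
}


def rank_distance(p, q):
    """Sum of absolute per-country rank differences (Manhattan distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Distance for every unordered pair of products
distances = {
    (x, y): rank_distance(ranks[x], ranks[y])
    for x, y in combinations(ranks, 2)
}

# Assumed cutoff: pairs at or below this distance are treated as "similar"
THRESHOLD = 2
similar_pairs = sorted(pair for pair, d in distances.items() if d <= THRESHOLD)

# List all pairs from closest to farthest
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, d)

print("Similar pairs:", similar_pairs)
```

With this data the pairs A/B and C/D come out at distance 2, while every pair involving E is at distance 8 or more, which matches the grouping described above. A different distance (for example one based on rank correlation) or a different cutoff could give different groupings on real data.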