1 Reply Latest reply: Apr 10, 2013 3:26 AM by Izzy Bromberg RSS

    Strange problem creating new grouping table based off of resident table

      Hi,

      I wrote the following query in my load script:

      load

      if(sum(sdk_syncs) <= 50000,'5 - 0-50K',

      if(sum(sdk_syncs) > 50000 and sum( sdk_syncs) <=200000,'4 - 50K - 200K',

      if(sum(sdk_syncs) > 200000 and sum(sdk_syncs) <=500000,'3 - 200K - 500K',

      if(sum(sdk_syncs) > 500000 and sum(sdk_syncs) <=1000000,'2 - 500K - 1M',

      if(sum(sdk_syncs) > 1000000,'1 - > 1M','blank'

      ))))) as Grp,application_key,sum(sdk_syncs) as sum_syncs

      resident sdk_ar_data

      group by application_key;

       

      that dumps data from a resident table into buckets based upon a sum of X (sdk_syncs). When I work with all the data I have - the answers match my source database 100%. When I start using a date dimension, the sum number (sum_syncs) is not correct (at least in certain instances).

       

      Some additional info. So I understand what the problem is - but I am not sure how to fix it.

      So the problem is that this query is correct - but for the entire date range - let's say 2011-2013. However, if I look at 2011 I still see the sum_syncs for the entire date range. When I add a date key, I introduce a diffeent problem - namely that the data is not grouped correctly at all and a single application can fall into multiple grps.

       

      Any ideas?

       

      Thanks!

        • Re: Strange problem creating new grouping table based off of resident table

          So I realized how to solve this.

          As I said the original query in the load script was correct. The problem was that I needed the query to be dynamic and update based upon the dimensions choosen.

          Here is what I did to solve the problem.

           

          For the dimesnsion of my chart I used the following formula:

           

          =aggr(

          if(sum(sdk_syncs) <= 50000,'0-50K',

          if(sum(sdk_syncs) > 50000 and sum( sdk_syncs) <=200000,'50K - 200K',

          if(sum(sdk_syncs) > 200000 and sum(sdk_syncs) <=500000,'200K - 500K',

          if(sum(sdk_syncs) > 500000 and sum(sdk_syncs) <=1000000,'500K - 1M',

          if(sum(sdk_syncs) > 1000000,'> 1M','blank'

          ))))),

          application_key

          )

           

          Then as an expression I used a count distinct of the application_key.