1 Reply Latest reply: May 19, 2011 1:05 PM by John Witherspoon RSS

    Best practice for filtering data (should set analysis be employed in this case?)

      Data set

      I have two tables--one for project attributes and and one for transactions. The projects table is a historical table and includes attributes such as status, estimated cost, etc. snapshotted every day. The transaction table includes all transactions for all projects.



      I'm creating several charts that include trended information (e.g. number of projects over time, sum of estimated costs over time, etc.) as well as some charts that need to show transactions only for projects whose status is active as of today. For this discussion, let's assume I'm creating a straight table with transaction number as the dimension (note that there is a one-to-one relationship between transaction number and project number).



      I'd like to filter these records in what the sages in this forum would consider the most elegant manner. I was wondering if set analysis may be the best way to approach this. (My client suggested that set analysis shold be used in this instance, and I have to admit that I'm not very familiar with set analsys and was hoping to get some feedback here that may set me on the right course--that being of how to build my chart and/or respond to my client :-) )


      Current strategy

      I've tested two methods for filtering the records in the straight table I'm building


      Stragegy 1

      I've added a calculated dimension to my straight table that looks like this:

      = IF ( [SnapshotDate] = Today() and [Status] = 'Active', [TransactionNumber] )


      Strategy 2

      I've added an additional SQL statement to the script editor that gets ProjectNumber and Status just for today. I use TodayStatus as the alias for the Status field, which allows me to use this calculated dimension in the straight table:

      = IF ( [TodayStatus] = 'Active', [TransactionNumber] )


      At this point I'm not certain there's any more elegant way to filter these transaction records to show only those for projects that are active as of today. And at first glance it doesn't appear that set analysis is relevant here. Any thoughts?

        • Re: Best practice for filtering data (should set analysis be employed in this case?)
          John Witherspoon

          Set analysis would work fine, but isn't AS good as handling it in the data model, in script.  I'd do this in the script:


          if(SnapshotDate=today() and Status='Active',TransactionNumber) as TransactionNumberActiveToday


          Then just use that new field as your dimension.  Calculated dimensions tend to be slow, so aren't as good as having a real dimension like this.  Set analysis would be almost as fast as the new field, but not quite.  You'd have to apply it to EVERY expression in every relevant chart, though, which is a better reason to not handle it that way.  Now, you WILL need to learn set analysis.  Even if it's not the best solution for this case, it's the best solution for a lot of cases.  Here's how it might look in your case, and you can see that it can get syntactically dense and you can end up repeating yourself:


          Dimension = TransactionNumber
          Expresison 1 = sum({<SnapshotDate*={'$(=date(today()))'}
                              ,Status*={'Active'}>} Cost)
          Expresison 2 = count({<SnapshotDate*={'$(=date(today()))'}
                               ,Status*={'Active'}>} Something)


          For this, I think the script solution is simpler, more elegant, and should render in the chart slightly more quickly.