0 Replies Latest reply: Dec 1, 2011 8:47 AM by vosdewaela RSS

    Multi-level data aggregation issue

      Hello,

       

      I’m having a problem with a set of multi-level data. Let’s call the parent level “Company-level” and the child-level “Division-level’.

       

      My dataset consists of a two tables. To identify companies, there’s a variable “Company” with the company name (String value). Each record also has a variable “Division” (String value). The tables share a key field, which is a concatenation of the company and division name.

       

      Both tables have records that are constructed along these logical lines:

      • Division-level entry: if there is data on multiple Divisions within one company, each record lists the relevant Division. The collection of Divisions that share the same Company field are assumed to together represent the Company as a whole.
      • Company-level entry: if a single record data is representative of the whole company (Company-level), its Division field list the Company name. In this case the Company and Division fields match.

       

      The trouble is that (by design) the same company can have Division-level entry in table A, while having Company-level entry in table B. Logically, these entries do not have a matching key (as the Division level field is different in each record). This relationship cannot be reversed; Company-level entry in table A always implies Company-level entry in table B.

      Other companies have either division level entry in both tables or company level entry in both tables.

       

      I need to compute Performance Indicators using fields from both tables. If a Company has Division-level data in both tables, the data needs to be reported at that Division level (i.e. the Company is listed multiple times in the result list, once for each division). If a Company has Company-level data in both tables, it needs to be included on that Company-level. Both of the above are easily managed. However, if a company has Division-level entry in table A, while having Company-level entry in table B, the necessary transformation (usually a sum aggregation) is less clear-cut.

       

      How do I achieve the necessary information on the records, without unnecessarily aggregating the desired Division-level records?

       

      As an example, here’s a table of the data structure:

       

      Table A
      Table B
      CompanyDivision
      CompanyDivision
      AlphaAlpha
      AlphaAlpha
      BetaKenny
      BetaKenny
      BetaLarry
      BetaLarry
      GammaYankee
      GammaGamma
      GammaZulu


      The problem is aggregating Gamma without aggregating Beta in Table A. The performance indicator I want to compute is constructed using other fields in both tables.

       

      Thanks in advance for your help.