5 Replies Latest reply: Nov 17, 2010 4:21 AM by adstrickland RSS

    Cross Tabulation of census style data

    adstrickland

      I have "questions", "respondents" and "answers". Below is an example with 5 questions, two respondents and 8 sets of answers (yes = 1/no = 0). Is it feasible to try and get an output table that looks something like:

       


      female? | <30? | 1
      female? | >=30&<=50? | 1


       

      (ie the total of females by age groups)? Multiply questions and respondents by ~20 000, there might be things like (urban?/rural?, state, town etc).

       

      The script:

       




      question:
      LOAD * INLINE [
      question no, question
      q1, male?
      q2, female?
      q3, <30?
      q4, >=30&<=50?
      q5, >50?
      ];

      answer:
      LOAD * INLINE [
      question no, respondent, answer
      q1, r1, 0
      q2, r1, 1
      q3, r1, 1
      q4, r1, 0
      q5, r2, 0
      q1, r2, 0
      q2, r2, 1
      q3, r2, 0
      q4, r2, 1
      q5, r2, 0
      ];




      Thanks for any help, I am a noob so please forgive me if I am missing the completely obvious. I can structure the input in any way I like, so I could duplicate the question tables if that were to help with cross tables (for example).

       

      Regards

      Alex

       

        • Cross Tabulation of census style data
          Miguel Angel Baeyens de Arce

          Hello Alex,

          Check this file with sample code about crosstables. IT may be helpful if your data format is similar to the inline tables I used in it. As far as I see it, any question should be a separate field with their correspondings 1 or 0, so you could accumulate them.

          Hope it helps!

          EDIT: You're likely to find more interesting the GENERIC load with the sample data you provided. There are some useful posts and blog entries about them in the forum as well.

            • Cross Tabulation of census style data
              adstrickland

              Hello Miguel

              I am exploring this problem using the personal edition which means I cannot look at your sample. However I suspect that from your response that I may not be able to manipulate the questions as I would like (you say "question should be a separate field").

              I did experiment with loading the question table twice so that I can use "question x" as an x co-ordinate and "question y" as a y co-ordinate. This looked promising - and looked a bit like the table I posted, but I need to have some cross table query as an expression in col 3. To rephrase my question, is it possible to do a "join" operation on in-memory tables (like SQL)?:

              so "Sum (answer)" would be replaced with a SQL like query of my own which can reference the values in column 1 and column 2. I have been reading the manual quite hard, but still can't see the wood for the trees.

              Thanks

              Alex

               

                • Cross Tabulation of census style data
                  Miguel Angel Baeyens de Arce

                  Hello Alex,

                  Let's see if Jason can take a look at the files and make them available for personal edition.

                  Similar to how you join queries in SQL you can do with tables loaded in QlikView. According to your script above

                   

                  question:LOAD * INLINE [ question no, question q1, male? q2, female? q3, <30? q4, >=30&<=50? q5, >50?]; answer:JOIN LOAD * INLINE [ question no, respondent, answer q1, r1, 0 q2, r1, 1 q3, r1, 1 q4, r1, 0 q5, r2, 0 q1, r2, 0 q2, r2, 1 q3, r2, 0 q4, r2, 1 q5, r2, 0]; T:GENERIC LOAD respondent, question, answerRESIDENT question;


                  Now create a new chart, pivot table, add "<30?" as dimension and

                   

                  Sum({< question = {'female?'} >} answer)


                  As expression.

                  Is this closer to what you are looking?

                    • Cross Tabulation of census style data
                      adstrickland

                      Thank you again for such a thorough attempt to help me, it works well for that smallish set of questions.

                      I think that I am going to hit a rock wall with the number of questions: I must cater for about 20,000.

                      So, unless I am missing something, the "programmability" of the solution is not there, as I don't think it is practical to do the join the way you have, and to have so many field names? The final user should have the ability to grab a bunch of questions from x, and a bunch from y and simply crosstab.

                      I made an assumption that QlikView would provide a query language for joining tables already in memory and it seems I am wrong. I wasn't sure even that would scale, but I was going to test that.

                      Regards

                      Alex

                       

                       

                      • Cross Tabulation of census style data
                        adstrickland

                        Hi Miguel

                        I have done something that works, but is a bit odd, maybe it exploits a bug?

                        Note that in the straight chart and the pivot table, the Sum expression is invalid! I really have not got my head around the set notation at all. However, it does what I want!?!

                        Maybe you can explain, or refine it a bit?

                        Essentially I duplicated all input:

                         



                        questionx:
                        LOAD * INLINE [
                        questionx
                        male?
                        female?
                        <30?
                        >=30&<=50?
                        >50?
                        ];

                        questiony:
                        LOAD * INLINE [
                        questiony
                        male?
                        female?
                        <30?
                        >=30&<=50?
                        >50?
                        ];

                        answerx:
                        LOAD * INLINE [
                        questionx, respondent, answer
                        male? , r1, 1
                        female? , r1, 0
                        <30? , r1, 1
                        >=30&<=50?, r1, 0
                        >50? , r1, 0
                        male? , r2, 0
                        female? , r2, 1
                        <30? , r2, 0
                        >=30&<=50?, r2, 1
                        >50? , r2, 0
                        male? , r3, 0
                        female? , r3, 1
                        <30? , r3, 0
                        >=30&<=50?, r3, 0
                        >50? , r3, 1
                        male? , r4, 0
                        female? , r4, 1
                        <30? , r4, 0
                        >=30&<=50?, r4, 0
                        >50? , r4, 1
                        male? , r5, 1
                        female? , r5, 0
                        <30? , r5, 0
                        >=30&<=50?, r5, 0
                        >50? , r5, 1
                        ];

                        answery:
                        LOAD * INLINE [
                        questiony, respondent, answer
                        male? , r1, 1
                        female? , r1, 0
                        <30? , r1, 1
                        >=30&<=50?, r1, 0
                        >50? , r1, 0
                        male? , r2, 0
                        female? , r2, 1
                        <30? , r2, 0
                        >=30&<=50?, r2, 1
                        >50? , r2, 0
                        male? , r3, 0
                        female? , r3, 1
                        <30? , r3, 0
                        >=30&<=50?, r3, 0
                        >50? , r3, 1
                        male? , r4, 0
                        female? , r4, 1
                        <30? , r4, 0
                        >=30&<=50?, r4, 0
                        >50? , r4, 1
                        male? , r5, 1
                        female? , r5, 0
                        <30? , r5, 0
                        >=30&<=50?, r5, 0
                        >50? , r5, 1
                        ];


                        Thanks and regards

                        Alex