4 Replies Latest reply: Jul 13, 2017 6:31 AM by luis martin-roldan RSS

    Direct Discovery Dimensions are a mess

    luis martin-roldan

      Hi there,

       

      I have recently stepped into the Direct Discovery feature and have encountered a handful of issues I am having difficulties with.

       

      • If I define a measure using an aggregation operation, such as "Sum()", I shall not use any aggregation operation on this dimension in my charts. I believed I could do this so, for example, counting how many different aggregations can be done based on the current selections (via "Count(Measure1))", where "Measure1" could be the aggregated sum of one field times another field, for example).

                E.g.:

                Script:

      DIRECT QUERY

      Dimension

      Campo2 as T2.Campo2,

      //     Campo2,

          Campo3 as T2.Campo3,

          Campo4 as T2.Campo4,

          Campo5 as T2.Campo5,

          Campo6 as T2.Campo6,

          Campo7 as T2.Campo7

          ,native('Campo2 & Chr(95) & Campo3 & Chr(95) & Campo4') as Campo_key

      // Campo2, Campo3, Campo4

      Measure

      // Sum(Campo1) as T2.Campo1

          Sum(Campo1) as T2.Campo1

      // Sum(Campo1) as Campo1

      FROM

          tabla_union

      ;


             Table's Measure Expression (in a table with every Dimension):

                     "T2.Campo1" gives no errors while "Sum(T2.Campo1)" outputs "Database Connection Error". This can be reproduced with any                aggregation function, not only Sum.


      • Using a Hybrid Model (tables loaded in-memory and Directly Discovered Tables), when applying selection on Dimensions from Direct Query, all other fields and values become excluded. This is particularly upsetting.

                E.g.:

                Script:

      [TablaUnion]:

      Load

      *,

          Campo2 &'_'&Campo3&'_'&Date(Campo4)&' '&Time(Campo4,'hh:mm:ss') as Campo_key;

      SQL

      Select Sheet1.Campo1, Sheet1.Campo2, Sheet1.Campo3, Sheet1.Campo4, Sheet1.Campo5,Null as Campo6, Null as Campo7

      From Sheet1

      UNION ALL

      Select Sheet2.Campo1, Sheet2.Campo2, Sheet2.Campo3, Sheet2.Campo4, Null as Campo5, Sheet2.Campo6, Null as Campo7

      From Sheet2

      UNION ALL

      Select Sheet3.Campo1, Sheet3.Campo2, Sheet3.Campo3, Sheet3.Campo4, Null as Campo5, Null as Campo6, Sheet3.Campo7

      From Sheet3;

      //They are essentially the same table. Same fields. Same values. Linked through 'Campo_key', which expression I assure outputs the same values for each load statement.

      [TablaUnion2]:

      DIRECT QUERY

      Dimension

      Campo2 as T2.Campo2,

          Campo3 as T2.Campo3,

          Campo4 as T2.Campo4,

          Campo5 as T2.Campo5,

          Campo6 as T2.Campo6,

          Campo7 as T2.Campo7

          ,native('Campo2 & Chr(95) & Campo3 & Chr(95) & Campo4') as Campo_key

      Measure

          Sum(Campo1) as T2.Campo1

      FROM

          tabla_union

      ;

       

      dirct discovery post1.png

      Fig. 1: Selections on fields outside the Directly Discovered Table make all values of Directly Discovered Table excluded.

       

      dirct discovery post2.png

      Fig. 2: Selections of Dimensions from Direct Query make all other values excluded.

       

      • When using Direct Discovery, I was certain that the Dimensions described would be brought from the Data Source and loaded in the in-memory Model. Not only the meta-data but the actual values as well. The trace of the data loading script shows me the number of different values being loaded for each dimension and, back in the visualization layer, I can create filter panes on these Dimensions. However, the Data Model Viewer states that there is no data stored whatsoever! I don't know what to trust anymore (this bullet may be tightly related to the previous one)

      dirct discovery post3.png

          

       

      Every bit of help is greatly appreciated

      Regards,

      Luis.

        • Re: Direct Discovery Dimensions are a mess
          luis martin-roldan

          Well, I have been trying on and on to get a handle of this whole Direct Discovery and my personal conclusion is that it just doesn't work.

           

          The idea behind it is actually impressive, but this just doesn't work. I will try to do an Advanced Data Loading Script course in the future but my hopes on this are not high...

           

          Still, every bit of help will be greatly appreciated

          • Re: Direct Discovery Dimensions are a mess
            luis martin-roldan

            In response to my last bullet, say

             

            • When using Direct Discovery, I was certain that the Dimensions described would be brought from the Data Source and loaded in the in-memory Model. Not only the meta-data but the actual values as well. The trace of the data loading script shows me the number of different values being loaded for each dimension and, back in the visualization layer, I can create filter panes on these Dimensions. However, the Data Model Viewer states that there is no data stored whatsoever! I don't know what to trust anymore (this bullet may be tightly related to the previous one)

            dirct discovery post3.png

             

            I have understood it: Direct Querying a table brings the distinct values of its Dimensions to its Symbols Table, not Pointers Table (since there are no Pointers Table to begin with) so of course there are no rows in the internal Table.

             

            This still doesn't explain (at least not sufficiently) the rest of my doubts.

             

            Cheers.

              • Re: Direct Discovery Dimensions are a mess
                luis martin-roldan
                • Using a Hybrid Model (tables loaded in-memory and Directly Discovered Tables), when applying selection on Dimensions from Direct Query, all other fields and values become excluded. This is particularly upsetting.
                  (...)

                I got it as well!

                 

                Since there is no data loaded in-memory for these dimensions, there is no Pointers Vector related to it, so records in current selection cannot be kept internally.

                 

                Now, this could (or rather, should) be fixed forcing the Qlik Engine to delegate this selection control on Direct Discovery Dimensions on the remote data source. In other words:

                 

                •     Knowing that the qlik engine transforms a selection on DimensionN into a query like

                          Select fieldList where DimensionN in (selectionN)

                          every selection would override the previous one, since there is only one dimension in the where clause. This can be confirmed using ODBC tracing capabilites, where every selection is transformed into a query like the one above.

                 

                •     The engine should instead query taking into account all current selections and including those who are Directly Discovered, so if Dimensions 1 to 3 are Directly Discovered and have been selected, and there is then a 4th selection on DimensionN made, the last query should be written like

                Select fieldList where Dimension1 in (selection1) and Dimension2 in (selection2) and Dimension3 in (selection3) and DimensionN in (selectionN)


                     Thus delegating these selections on the source instead of trying to manage them internally.


                I guess that the reason for not having worked this out was their intentional use of Direct Discovery Dimensions as just links between in-memory tables and Direct Discovery Tables, but this should not be a necessary condition for a Direct Discovery Dimension: their main task is to work as filters and grouping factors in the evaluation of measures.

                  • Re: Direct Discovery Dimensions are a mess
                    luis martin-roldan

                    Finally, for the first bullet

                     

                    • f I define a measure using an aggregation operation, such as "Sum()", I shall not use any aggregation operation on this dimension in my charts. I believed I could do this so, for example, counting how many different aggregations can be done based on the current selections (via "Count(Measure1))", where "Measure1" could be the aggregated sum of one field times another field, for example).

                              E.g.:

                              Script:

                    DIRECT QUERY

                    Dimension

                    Campo2 as T2.Campo2,

                    //     Campo2,

                        Campo3 as T2.Campo3,

                        Campo4 as T2.Campo4,

                        Campo5 as T2.Campo5,

                        Campo6 as T2.Campo6,

                        Campo7 as T2.Campo7

                        ,native('Campo2 & Chr(95) & Campo3 & Chr(95) & Campo4') as Campo_key

                    // Campo2, Campo3, Campo4

                    Measure

                    // Sum(Campo1) as T2.Campo1

                        Sum(Campo1) as T2.Campo1

                    // Sum(Campo1) as Campo1

                    FROM

                        tabla_union

                    ;


                           Table's Measure Expression (in a table with every Dimension):

                                   "T2.Campo1" gives no errors while "Sum(T2.Campo1)" outputs "Database Connection Error". This can be reproduced with any                aggregation function, not only Sum.

                     

                    This is obvious, I was just mad at things and didn't see I was just asking for sum(sum(medida)) which is considered incorrect syntax by the Access database the data relies on.