6 Replies Latest reply: Feb 19, 2015 4:03 AM by Oliver Eiholzer RSS

    QVD or DATA MODEL : which is the best option to make transformations

    Mark Munich

      Hello community,


      Which is the best stage to make tranformation to the data, is it in QVD Layer or in the Data Model layer.

      Could Some one share their real time experiences. In my case Data Sets in my data Model are more than 300 GB and more.


      Please list advantages and disadvantages or both.


      Need some genuine answers from the experts in the community.




        • Re: QVD or DATA MODEL : which is the best option to make transformations
          Henric Cronström

          I would do most or all transformations before I save to QVD. That way I can load the QVD:s optimized.



          • Re: QVD or DATA MODEL : which is the best option to make transformations
            Ralf Becher

            Hi Mark,


            I think this is hard to answer because there are some more details about your use case needed to discuss this.


            If we're facing this size of data or more we tend to decide to build a data warehouse (star schema) on a database level before loading something into QlikView. So all (or most of) transformations are done by ETL tools before a QlikView load even start. This also has to do with decoupling the source systems and other aspects like single point of truth, the prevention of point-to-point connections, maintainability etc...


            - Ralf

            • Re: QVD or DATA MODEL : which is the best option to make transformations
              Mark Munich

              Can you list out some advantages and disadvantes of optimizing data in QVD Generator layer or DataModel layer.

                • Re: QVD or DATA MODEL : which is the best option to make transformations
                  Steen Schlüter Persson

                  Hi Mark,


                  I think the decision depend a little bit on your scenario. We are loading our data from a Datawarehouse (which are also used for building all our SSAS cubes) and we are trying to follow the following structure -


                  1: We have created views in our DW which we are using when loading to QVD. In these views we do as much "enrichment" of the data as possible. That means that we do things like finding the date for a customers last transaction, appointment etc. The advantage of this is that we will have these kind of attributes available for other applications if needed (which we won't have if we did it in QV) and that we don't have to load all transactions to e.g. get the last transcation for a customer.


                  2: The QVD load is more or less just a straight load from the DW into QVD files where we created one file per. table/view we load. This is done with only some basic limitations in data which means that the QVD files will contain data for various purposes. We don't do much transformation of the data here, because we'd like the qvd files to contains as many "source" data as possible so the same qvd file can be used for various purposes.


                  3: We then load data from the qvd files into a Datamodel, which we then make per. entity and/or businessUnit (this depends a little bit on the scenario so no strict rule). In this load we most of the transformation that is QlikView specific (the things that can't be done in SQL views) and which is relevant to the purpose of the Datamodel file.


                  We then just do a binary load of the datamodel into the application we build.


                  I'm sure there are many other opinions on how to do this, but the above is the structure we try to follow. Actually we don't neccesarily have to load to QVD files, because we read on our DW so it wouldn't be a problem just to read from this directly into the datamodel. We have decided to use the qvd files though, because then we have the process in place in case we at a later stage needs to read from other sources - and especially directly from one of our ERP systems.

                  Since we don't have an issue with load time (yet)  it's not a problem for us to do as described above, but with a 300 GB load like in your case it might be worth focusing on optimized loads etc. like mentioned by Henrik.