3 Replies. Latest reply: Apr 9, 2015 8:44 AM by Olaf Gschweng

    Expansion of QVD file size during iterated process

    Evan Kurowski

      I have noticed something happening with my QVDs in several different situations and I'd like to discover if there is some setting I have not engaged or activated that might help with this.

       

      I'm developing on a local machine using the QlikView developer (not on a server). When I iterate through a data set and write the output to QVDs, the file size of each stored QVD grows in a linear fashion even though the number of rows stored in each QVD doesn't vary by much. In the first few passes the QVDs are very small, and they keep growing until just a few thousand rows can take megabytes to write to disk. I have attached two of the output files from one of these scenarios as an example. The first file, AAAE, was created early in the iteration process and the latter, NINE, was created somewhere in the middle. The last file, ZZ, actually took 7 MB to write to disk with 1386 rows.

       

      In addition, if I take one of these QVDs, load it into a second application that stores it a second time to QVD, the resultant QVD from the second pass through is much smaller, as if all the extra memory has been squashed out.  On a second 'compression' pass of my QVD, NINE stores down to 105k.

       

       

      It seems to me that between QVD writes, even though the tables are being dropped and the variables cleared, that some space of memory is still being reserved and stored with the QVDs between iterations.  I've attached the sample data files:

       

      SECURITIES_HISTORY_AAAE.QVD - size 52k - rows 1286

      SECURITIES_HISTORY_NINE.QVD - size 4938k - rows 1386

        • Re: Expansion of QVD file size during iterated process
          Sara Leslie

          Hello Qlik Community Members - this discussion was answered previously but accidentally deleted, so we are recreating the thread. Here are the responses:


          [From Giuseppe Novello of QlikTech]

          Dear Evan,

          I found a possible workaround for this issue that works with the latest Version 10 SR4 or V11 IR. There is an easter egg setting that suppresses the lineage: set "AllowDataLineage" to "-1". To reach it, open QlikView Desktop > Help > About QlikView..., right-click the QlikView logo in the bottom right corner, look for "AllowDataLineage" in the window that opens, and set Value = "-1". See if this stops the QVD from growing in size.

           

          Giuseppe was right on with this advice.  Once I installed a version recent enough (10 SR4 or later) to have this variable setting present, changed its value, and reloaded the application, the QVD memory expansion went away immediately.  Be aware this Easter Egg variable does not show up in versions prior to V10 SR4.  Thank you Giuseppe!  Thank you Pablo!  Thank you Qlik Tech!

          -----------------------------------------

           

          Helpful Answer by Pablo Labbe

          Pablo Labbe Jan 15, 2012 12:43 PM (in response to EvanKurowski)

          Evan,

           

            Your case is very curious and I think it's related to a new feature called Lineage. Datasource information is stored inside the QVD. Perhaps, because of the multiple iterations of your script, the lineage of every iteration is being stored in each QVD.

           

            If you open the QVD with a text editor like Notepad, you can see how long the lineage information is in the larger file.

           

            I don't know how to solve this. I suggest you open a ticket with QlikTech Support.

           

          Best,

           

          Pablo

          -----------------------------------------------------

          EvanKurowski Jan 15, 2012 5:15 PM (in response to Pablo Labbe)

          It was very helpful to look at the QVD in a text editor, Pablo. In the past I'd usually expected to see encrypted gibberish, but in this case I saw a large chain of XML tagging.

           

          Opening up the larger of the two QVDs I attached in a text editor, I can see it has kept the XML tags from every prior table created in this iteration process. Each pass through the loop keeps the XML tags of every prior pass, even though each pass drops the prior table and creates a new, unique table name.  For some reason, reusing the same segment of code doesn't clear out the XML tags, even if you clear out the data model and drop the table to begin anew.  Thanks for pointing that out Pablo, perhaps I'll see if QlikTech can offer some insight.  ~Evan

          ----------------------------------------------------------

           

          Rob Wunderlich Jan 17, 2012 11:58 PM (in response to EvanKurowski)

          I'll say I'm happy that you found a solution but I'm unhappy with the answer. Lineage, at most, should represent a couple hundred bytes of data. I note that when you reload the data, it shrinks to a reasonable size. I see your problem, but I'm not convinced that lineage is the culprit.

           

          -Rob

          ---------------------------------------------

          Pablo Labbe Jan 18, 2012 11:35 AM (in response to Rob Wunderlich)

          Rob,

           

            Your doubt forced me to create a document with some test cases trying to reproduce this odd behavior.  Well, "unfortunately", the QVDs generated by the test document didn't grow like Evan's QVDs.

           

            I opened the generated QVDs in a text editor to check the lineage info, and what did I find? The lineage info didn't grow as expected, even when the load script has a loop reading and writing the same QVD several times.

           

            I think the problem occurs on very special cases.

           

            I've done the tests running QV10 SR3 and QV10 SR4.

           

            Evan, which version are you running?  Can you share your load script?

           

            This is the sample code I've created trying to reproduce the problem:

          FOR X = 1 TO 3

            SAMPLE_TABLE4:
            LOAD * FROM SAMPLE_TABLE4.QVD (QVD); // fails on the first run because the file does not exist yet

            CONCATENATE
            LOAD Date,
                 Open,
                 High,
                 Low,
                 Close,
                 Volume,
                 [Adj Close]
            FROM
            [http://ichart.finance.yahoo.com/table.csv?s=ABB&a=09&b=12&c=2011&d=00&e=06&f=2012&g=d&ignore=.csv&redirected=1]
            (txt, codepage is 1252, embedded labels, delimiter is ',', msq);

          NEXT X;

          STORE SAMPLE_TABLE4 INTO SAMPLE_TABLE4.qvd;

          DROP TABLE SAMPLE_TABLE4;

          --------------------------------------------

          Rob Wunderlich Jan 18, 2012 2:03 PM (in response to Pablo Labbe)

          I see I should have opened the file with a text editor before posting. There are indeed 113K lines of lineage info, as Evan said, which would account for the size difference. I think what's unique about Evan's case is that he has so many different sources -- the URL changes.

           

          -Rob

          ---------------------------------------------

          EvanKurowski Jan 26, 2012 7:49 PM (in response to Pablo Labbe)

          You have the right idea with the iterator... I think the memory expansion issue occurs when table creation starts getting dynamic.

           

          FOR vIter = 1 TO 3

            [SAMPLE_TABLE$(vIter)]:
            LOAD
              (fields)
            FROM
              (source);

            STORE [SAMPLE_TABLE$(vIter)] INTO [SAMPLE_TABLE$(vIter).qvd];

            DROP TABLE [SAMPLE_TABLE$(vIter)];

          NEXT vIter

          ------------------------------------

          Pablo Labbe Jan 18, 2012 10:51 AM (in response to EvanKurowski)

          Glad to help you Evan.

          -----------------------------------

          Steve Dark Apr 2, 2013 11:42 AM (in response to EvanKurowski)

          Just to add to this discussion... I too saw this problem in a specific scenario - with a looped load through various database connections.  My efforts to recreate the problem in a sandbox all failed - so it certainly took a certain set of criteria.

           

          There is a similar thread to this, with a solution for the problem on Server rather than Desktop, which may be of use to people: http://community.qlik.com/thread/49240

           

          It also seems that upgrading to the latest release should fix the issue - rather than throwing the lineage baby out with the bath water.

           

          - Steve

           

          http://www.quickintelligence.co.uk/


          ---------------------------------

          Bruce Jan 18, 2012 7:11 PM (in response to EvanKurowski)

          I've seen this since we switched from 9 to 10. I was doing a series of qvd generation (within a loop) and each qvd got larger even though it had the same number of rows or less. Even qvd files with zero rows got bigger & bigger.

           

          I'll be happy when they fix this for real (or we get sr4 & we can cheat with the easter egg that Giuseppe suggested).

          • Re: Expansion of QVD file size during iterated process

            QlikView QVD files contain lineage information of the stored data.

             

            The lineage information can be found in the QVD file's XML header and might look like the example below, with connection details and the SQL statements used during reload.

             

             

               <Lineage>
                 <LineageInfo>
                   <Discriminator>INLINE</Discriminator>
                   <Statement></Statement>
                 </LineageInfo>
                 <LineageInfo>
                   <Discriminator>ODBC;DSN=Demo - 3D8</Discriminator>
                   <Statement>SQL SELECT oid,createDate,entryTimeStamp ,modifiedTimeStamp,code,name,myBranch,myStatus, address1, address2, address3,  alpha, emailAddress, fax, mobile, phoneNumber, postCode FROM Customer WHERE (modifiedTimeStamp &gt;= {d '01/01/2010'}) AND (modifiedTimeStamp &lt; {d '02/01/2010'})</Statement>
                 </LineageInfo>
                 <LineageInfo>
                   <Discriminator>ODBC;DSN=Demo - 3D8</Discriminator>
                   <Statement>Tmp:
               Load *;SQL SELECT oid,createDate,entryTimeStamp ,modifiedTimeStamp,code,name,myBranch,myStatus, address1, address2, address3,  alpha, emailAddress, fax, mobile, phoneNumber, postCode FROM Customer WHERE (modifiedTimeStamp &gt;= {d '01/01/2010'}) AND (modifiedTimeStamp &lt; {d '02/01/2010'})</Statement>
                 </LineageInfo>
                 <LineageInfo>
                   <Discriminator>STORE - Data\RPT\Customer\2010\20100101.Qvd(Qvd)</Discriminator>
                   <Statement></Statement>
                 </LineageInfo>
                 <LineageInfo>
                   <Discriminator>c:\venkat works\clients\independence australia\data\rpt\tables.xlsx</Discriminator>
                   <Statement></Statement>
                 </LineageInfo>
               </Lineage>
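
Rather than scrolling through a large QVD in a text editor, the lineage block can be measured with a short script. The following is a minimal Python sketch, not a Qlik-provided tool. It assumes the XML header ends at the first `</QvdTableHeader>` closing tag and that the binary data follows it; the function names `read_qvd_header` and `lineage_summary` are made up for this illustration.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# Assumption: the XML header of a QVD ends with this closing tag,
# and everything after it is binary table data.
HEADER_END = b"</QvdTableHeader>"


def read_qvd_header(raw: bytes) -> ET.Element:
    """Parse only the XML header portion of a QVD file's bytes."""
    end = raw.find(HEADER_END)
    if end == -1:
        raise ValueError("no </QvdTableHeader> closing tag found")
    return ET.fromstring(raw[: end + len(HEADER_END)])


def lineage_summary(raw: bytes) -> Tuple[int, int]:
    """Return (number of LineageInfo entries, their serialized size in bytes)."""
    root = read_qvd_header(raw)
    # iter() searches at any depth, so the exact nesting does not matter.
    entries = list(root.iter("LineageInfo"))
    size = sum(len(ET.tostring(entry)) for entry in entries)
    return len(entries), size
```

To check a real file, pass its bytes in, for example `lineage_summary(Path("SECURITIES_HISTORY_NINE.QVD").read_bytes())` with `Path` imported from `pathlib`. A count in the thousands for a file with only about a thousand rows would point to the lineage accumulation described in this thread.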

             

            The lineage information can be suppressed since QlikView 10 SR4 and 11 IR by using an Easter Egg setting, as described below.

             

            Enable Easter Egg feature in QlikView Desktop client

            1. Open the QlikView Desktop client
            2. Go to Help > About QlikView...
            3. Right click the QlikView logo in the bottom right corner
            4. Highlight the AllowDataLineage value
            5. Set the value to -1 to disable the Lineage information
            6. Click Set to store the value change
            7. Close the settings and about dialog boxes
            8. Reload the QVD generator application           

             

            Enable Easter Egg on QlikView Server

            1. Open the QlikView Server's batch Settings.ini file in a text editor. Default file path:
               C:\Windows\System32\config\systemprofile\AppData\Roaming\QlikTech\QlikViewBatch\Settings.ini
            2. Add AllowDataLineage=-1 in the [Settings 7] section of the settings file.
            3. Save the settings file and close the text editor.
            4. Restart QlikView Server to trigger loading of the altered settings.
            5. Reload the QVW file from the QlikView Management Console to see that the setting has been changed.

            To re-enable the lineage information, follow the steps above for Desktop or Server, but set the value to 0 instead of -1.
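
If the server-side change has to be applied on several machines, the Settings.ini edit can be scripted. The following is a minimal Python sketch that works on the file's text; the function name `set_allow_data_lineage` is hypothetical. Reading and writing the file, including its encoding, is left to the caller and should be tried on a copy first. The sketch writes `\n` line endings.

```python
def set_allow_data_lineage(ini_text: str, value: int = -1) -> str:
    """Return ini_text with AllowDataLineage=<value> in the [Settings 7] section.

    An existing AllowDataLineage line in that section is replaced; otherwise
    the key is added at the end of the section (the section itself is created
    if it is missing). All other lines are passed through unchanged.
    """
    out = []
    in_section = False  # True while scanning lines inside [Settings 7]
    done = False        # True once the key has been written
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving [Settings 7] without having seen the key: add it now.
            if in_section and not done:
                out.append(f"AllowDataLineage={value}")
                done = True
            in_section = stripped.lower() == "[settings 7]"
        elif in_section and stripped.lower().startswith("allowdatalineage="):
            line = f"AllowDataLineage={value}"
            done = True
        out.append(line)
    if not done:
        if not in_section:
            out.append("[Settings 7]")
        out.append(f"AllowDataLineage={value}")
    return "\n".join(out) + "\n"
```

Calling it with `value=0` produces the text that turns lineage back on. QlikView Server still needs a restart afterwards, as in the manual steps.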
            • Re: Expansion of QVD file size during iterated process
              Olaf Gschweng

              In a project we convert tons of MDB files one by one to QVD files with Qlik Sense. QVD files generated in a single run get bigger and bigger. We found no option to turn off lineage information in Sense. But we found a solution anyway. After all tables have been dropped the lineage info seems to be cleared - which makes sense. So just drop all tables before starting with the next file.

               

              I don't know if that works in QlikView, too. It could be a better solution than to change a global option.