<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Data loading crashes with 20GB qvd in Connectivity &amp; Data Prep</title>
    <link>https://community.qlik.com/t5/Connectivity-Data-Prep/Data-loading-crashes-with-20GB-qvd/m-p/2514690#M14799</link>
    <description>Qlik Community thread: an incremental load script for 10-20 GB QVDs runs very slowly and crashes from lack of server resources (64 GB RAM); replies discuss optimized QVD loads and incremental-load patterns.</description>
    <pubDate>Thu, 17 Apr 2025 08:42:33 GMT</pubDate>
    <dc:creator>ChrisAess</dc:creator>
    <dc:date>2025-04-17T08:42:33Z</dc:date>
    <item>
      <title>Data loading crashes with 20GB qvd</title>
      <link>https://community.qlik.com/t5/Connectivity-Data-Prep/Data-loading-crashes-with-20GB-qvd/m-p/2514690#M14799</link>
      <description>&lt;P&gt;Hello,&lt;BR /&gt;I need to optimize a loading script (that I did not create) that handles QVDs ranging from 10 to 20 GB.&lt;BR /&gt;The problem is that data retrieval takes a very long time and then fails for the largest QVD because the server runs out of resources (it has 64 GB of RAM).&lt;/P&gt;&lt;P&gt;The script retrieves data for the current day, concatenates it with the existing data in the QVD, and finally replaces the QVD with the new data. A "where not exists" clause avoids duplicates.&lt;/P&gt;&lt;P&gt;Changes already implemented:&lt;BR /&gt;- CreateSearchIndexOnReload is set to 0&lt;BR /&gt;- We apply "where not exists" only to data dated within the loading period, and load everything before that period unfiltered.&lt;/P&gt;&lt;P&gt;This solved the daily incremental load, but after repeated crashes we now have several days to catch up, and that is too much for the server with the largest QVDs.&lt;/P&gt;&lt;P&gt;I am currently trying to purge the data to keep only one year, but the server does not have enough resources for something like "load * from qvd where date &amp;gt; 04/17/2024" when the QVD exceeds 15 GB.&lt;/P&gt;&lt;P&gt;Also, we currently have 1.5 years of data and, as I said, one of the QVDs is 20 GB. That seems huge to me.&lt;/P&gt;&lt;P&gt;Do you have any ideas on how the script can be optimized, and does the size of the QVD seem normal to you?&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;&lt;P&gt;The script looks like this:&lt;/P&gt;&lt;PRE&gt;LET LastExecTime = if(LastExecTime, LastExecTime, MakeDate(2025,4,1));
LET ThisExecTime = now();

LET v_begin = date(LastExecTime, 'YYYYMMDD hh:mm:ss.fff');
LET v_end = date(ThisExecTime, 'YYYYMMDD hh:mm:ss.fff');

TEST:
LOAD
  DateTime,
  TagName,
  Num(Value, '##,##') AS Value,
  Date(DateTime) &amp;amp; '/' &amp;amp; Time(DateTime) &amp;amp; '/' &amp;amp; TagName &amp;amp; 'XXX' AS Key;
SQL SELECT
  [DateTime],
  [TagName],
  [Value]
FROM XXX
WHERE [DateTime] &amp;gt;= '$(v_begin)'
  AND [DateTime] &amp;lt;= '$(v_end)';

TMP:
NoConcatenate
LOAD *
FROM [lib://XXX/TEST.qvd] (qvd)
Where DateTime &amp;gt;= '$(v_begin)';

Concatenate(TEST)
LOAD *
Resident TMP
WHERE not exists(Key);

DROP TABLE TMP;

Concatenate(TEST)
LOAD *
FROM [lib://XXX/TEST.qvd] (qvd)
Where DateTime &amp;lt; '$(v_begin)';

STORE * FROM TEST INTO [lib://XXX/TEST.qvd] (qvd);

DROP TABLE TEST;&lt;/PRE&gt;</description>
      <pubDate>Thu, 17 Apr 2025 08:42:33 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Connectivity-Data-Prep/Data-loading-crashes-with-20GB-qvd/m-p/2514690#M14799</guid>
      <dc:creator>ChrisAess</dc:creator>
      <dc:date>2025-04-17T08:42:33Z</dc:date>
    </item>
    <item>
      <title>Re: Data loading crashes with 20GB qvd</title>
      <link>https://community.qlik.com/t5/Connectivity-Data-Prep/Data-loading-crashes-with-20GB-qvd/m-p/2514699#M14801</link>
      <description>&lt;P&gt;&lt;a href="https://community.qlik.com/t5/user/viewprofilepage/user-id/335543"&gt;@ChrisAess&lt;/a&gt; I think what you need is an append-and-update incremental load, along these lines:&lt;/P&gt;&lt;PRE&gt;// 1. Load the new data from the SQL source
NEW:
LOAD
  DateTime,
  TagName,
  Num(Value, '##,##') AS Value,
  Date(DateTime) &amp;amp; '/' &amp;amp; Time(DateTime) &amp;amp; '/' &amp;amp; TagName &amp;amp; 'XXX' AS Key;
SQL SELECT
  [DateTime],
  [TagName],
  [Value]
FROM XXX
WHERE [DateTime] &amp;gt;= '$(v_begin)' AND [DateTime] &amp;lt;= '$(v_end)';

// 2. Load the existing data from the QVD (without the Key field)
OLD:
LOAD
  DateTime,
  TagName,
  Value
FROM [lib://XXX/TEST.qvd] (qvd);

// 3. Keep only the old records whose key is NOT in the new data.
// The key is built inside Exists() instead of being created as a
// field first; otherwise every old row would match its own key
// and be filtered out.
CLEANED_OLD:
NoConcatenate
LOAD
  DateTime,
  TagName,
  Num(Value, '##,##') AS Value,
  Date(DateTime) &amp;amp; '/' &amp;amp; Time(DateTime) &amp;amp; '/' &amp;amp; TagName &amp;amp; 'XXX' AS Key
Resident OLD
WHERE NOT Exists(Key, Date(DateTime) &amp;amp; '/' &amp;amp; Time(DateTime) &amp;amp; '/' &amp;amp; TagName &amp;amp; 'XXX');

DROP TABLE OLD;

// 4. Merge cleaned old + new
Concatenate(CLEANED_OLD)
LOAD *
Resident NEW;

DROP TABLE NEW;

// 5. Store the final merged table back to the QVD
STORE CLEANED_OLD INTO [lib://XXX/TEST.qvd] (qvd);

DROP TABLE CLEANED_OLD;&lt;/PRE&gt;</description>
      <pubDate>Thu, 17 Apr 2025 10:58:56 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Connectivity-Data-Prep/Data-loading-crashes-with-20GB-qvd/m-p/2514699#M14801</guid>
      <dc:creator>Bhushan_Mahajan</dc:creator>
      <dc:date>2025-04-17T10:58:56Z</dc:date>
    </item>
    <item>
      <title>Re: Data loading crashes with 20GB qvd</title>
      <link>https://community.qlik.com/t5/Connectivity-Data-Prep/Data-loading-crashes-with-20GB-qvd/m-p/2514721#M14804</link>
      <description>&lt;P&gt;This approach is bound to be slow, because none of the QVD loads is optimized. For a QVD load to stay optimized, no transformations may be applied - the only filter allowed is a where exists() with a single parameter. So instead of:&lt;/P&gt;&lt;P&gt;where date &amp;gt;= MyDate;&lt;/P&gt;&lt;P&gt;it should look like:&lt;/P&gt;&lt;P&gt;where exists(MyDate);&lt;/P&gt;&lt;P&gt;The date values can be created beforehand with something like:&lt;/P&gt;&lt;P&gt;t0: load date(FirstDate + recno() - 1) as MyDate&lt;BR /&gt;autogenerate LastExecutionDate - FirstDate;&lt;/P&gt;&lt;P&gt;Then comes the load of the historic data, followed by the current data:&lt;/P&gt;&lt;P&gt;t1: load * from QVD where exists(MyDate);&lt;/P&gt;&lt;P&gt;concatenate(t1)&lt;/P&gt;&lt;P&gt;load *;&lt;BR /&gt;sql select * from DB where MyDate &amp;gt; LastExecutionDate;&lt;/P&gt;&lt;P&gt;store t1 into ...; drop tables t0;&lt;BR /&gt;let LastExecutionDate = today();&lt;/P&gt;&lt;P&gt;No further resident loads or combined keys are needed.&lt;/P&gt;&lt;P&gt;If your filter field is really a timestamp and not a date, this suggestion will remove some records or load them several times at the period boundary - but that can usually be neglected.&lt;/P&gt;&lt;P&gt;Also be aware that you should only load a time part if it is mandatory for some view; in most scenarios it is better to split the timestamp into separate date and time fields (maybe even separating seconds from milliseconds). Reducing the number of distinct field values minimizes RAM consumption, often significantly.&lt;/P&gt;</description>
      <pubDate>Thu, 17 Apr 2025 13:44:29 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Connectivity-Data-Prep/Data-loading-crashes-with-20GB-qvd/m-p/2514721#M14804</guid>
      <dc:creator>marcus_sommer</dc:creator>
      <dc:date>2025-04-17T13:44:29Z</dc:date>
    </item>
    <item>
      <title>Re: Data loading crashes with 20GB qvd</title>
      <link>https://community.qlik.com/t5/Connectivity-Data-Prep/Data-loading-crashes-with-20GB-qvd/m-p/2516255#M14855</link>
      <description>&lt;P&gt;Thank you very much,&lt;BR /&gt;The loading is running normally after several optimizations.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Apr 2025 08:31:43 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Connectivity-Data-Prep/Data-loading-crashes-with-20GB-qvd/m-p/2516255#M14855</guid>
      <dc:creator>ChrisAess</dc:creator>
      <dc:date>2025-04-30T08:31:43Z</dc:date>
    </item>
  </channel>
</rss>

