sifatnabil
Specialist

QVD concatenation drops data sometimes

Hi all,

I have a batch file that reloads a qvw every hour; the qvw queries a DB, concatenates the result onto a QVD, and then replaces the original QVD. This works fine most of the time, but occasionally (every few weeks) the task drops data: it concatenates the latest load onto the QVD as it looked a month ago, so everything from a month ago up to today is missing from the full QVD.

For example, today (11 Feb, 8 AM) the batch file ran and concatenated the result onto the QVD as it stood on 11 Jan, instead of as it stood at 7 AM on 11 Feb - so the date field in the QVD now jumps from 11 Jan straight to 11 Feb, and the dates in between are missing.

Note that this task has been running since 2014, so the QVD has grown to over 400MB. Could this be why?

1 Solution

Accepted Solutions
gsbeaton
Luminary Alumni

I think you are nearly there. Your pseudocode suggests a couple of logic issues, but it may be OK. Just to be clear though, here are the steps that I think you should be taking (there's a rough script sketch after the list):

  1. Always load your LoginPoll QVD, no need for an IF() around that, you will need it every time, except for the very first time you ever ran the script (which sounds like it was a few years ago).  If the QVD is not there, you have bigger problems.
  2. Put a timestamp in the filenames of all of your snapshots, so they are something like snapshot-2016021114.csv (YYYYMMDDHH)
  3. Use the qvdcreatetime() function to get the time the QVD was last updated, then store that in a variable.
  4. Use the timestamp from your variable created in step 3 as a parameter to load the correct snapshots.  Personally I would do this by creating an array of times which you can then loop through in the next step.  When I say array, I mean a parameterised string, e.g. 2016021110; 2016021111; 2016021112; 2016021113;
  5. Load all of your files in from step 4.  So long as they have the same column names as your QVD from step 1, they will automatically concatenate.
  6. Store your table and drop from memory.
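
A rough sketch of those steps in load script form, purely as an illustration (the snapshot folder, the file naming and the CSV format here are assumptions, not your actual setup):

// Step 1: always load the existing history
LoginPoll:
LOAD * FROM [QVDs\LoginPoll.qvd] (qvd);

// Step 3: note when that QVD was last written, in the same YYYYMMDDHH format as the file names
LET vLastQvdTime = Timestamp(QvdCreateTime('QVDs\LoginPoll.qvd'), 'YYYYMMDDhh');

// Steps 4 and 5: pick up every snapshot taken after that time
FOR Each vFile in FileList('Snapshots\snapshot-*.csv')
    LET vFileStamp = TextBetween('$(vFile)', 'snapshot-', '.csv');
    IF '$(vFileStamp)' > '$(vLastQvdTime)' THEN
        // same column names as LoginPoll, so this auto-concatenates
        LOAD * FROM [$(vFile)] (txt, utf8, embedded labels, delimiter is ',');
    END IF
NEXT vFile

// Step 6: write the history back out and drop it from memory
STORE LoginPoll INTO [QVDs\LoginPoll.qvd] (qvd);
DROP TABLE LoginPoll;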

Another approach would be to use Qlik's wildcard feature to load all the snapshots in the folder, then delete or move them once they are successfully loaded.  Both can be done from within QlikView.  There would be a lot less to go wrong with that approach.
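
For the wildcard route, a minimal sketch might look like this (paths are illustrative, and the EXECUTE line assumes "Can Execute External Programs" is enabled in the reload settings):

LoginPoll:
LOAD * FROM [QVDs\LoginPoll.qvd] (qvd);

// wildcard load: every snapshot still sitting in the folder is picked up and auto-concatenated
LOAD * FROM [Snapshots\snapshot-*.csv] (txt, utf8, embedded labels, delimiter is ',');

STORE LoginPoll INTO [QVDs\LoginPoll.qvd] (qvd);
DROP TABLE LoginPoll;

// move the processed snapshots aside so they are not loaded twice on the next run
EXECUTE cmd.exe /c move "Snapshots\snapshot-*.csv" "Snapshots\processed\";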

Hope that helps

George


17 Replies
swuehl
MVP

Are you keeping a copy of each hourly QVD and deciding with some logic which historic QVD to load?

Or are you using any other time-based logic in your script? (In short: please post your script.)

sifatnabil
Specialist
Author

Hi, the entire task generates only 1 QVD which keeps growing in size due to hourly concatenations to itself.

Here's the simplified algo, which runs hourly.

If exists historicalQVD, then
    load historical QVD
else
    [query DB];

concatenate
[query DB];

store full result as historicalQVD;

swuehl
MVP

I assume this is an incremental approach, so you are looking for the max datetime in the historicalQVD and then filtering the DB query to fetch only rows with datetime > max QVD datetime, right?
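
(For reference, that pattern usually looks roughly like this; the field name EventTime and the SQL table SourceTable are made up for the illustration, and an existing ODBC/OLE DB connection is assumed:)

History:
LOAD * FROM [historicalQVD] (qvd);

// find the newest timestamp already stored
MaxStamp:
LOAD max(EventTime) as MaxEventTime RESIDENT History;
LET vMaxStamp = Peek('MaxEventTime', 0, 'MaxStamp');
DROP TABLE MaxStamp;

// fetch only rows newer than that from the database
Concatenate (History)
LOAD *;
SQL SELECT * FROM SourceTable WHERE EventTime > '$(vMaxStamp)';

STORE History INTO [historicalQVD] (qvd);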

Is the issue always similar, e.g. exactly one month missing?

Are you tracing the max QVD datetime you find to a logfile?

Or do you know how many rows were written to the QVD vs. how many were read back in the subsequent run?

Any more specific observations?

gsbeaton
Luminary Alumni

Hi Sifat,

I don't have a definite answer, but here are a few pointers that may help:

  • There is no maximum size for a QVD; 400MB is relatively small, and I'm working with QVDs of >15GB with no issues.
  • The QVD is locked while reading and writing
  • The STORE process in itself will not drop or truncate fields.

My hunch would be that there is something in your script logic that is causing you to lose records.  It may take a bit of forensic analysis to find out what it is though.  If you want to post more of your script here, then I'd be happy to take a look and help further.

Good luck

George

sifatnabil
Specialist
Author

I'm actually not looking at the max datetime or enforcing any rule while concatenating. And the issue isn't always the same; sometimes it drops 20 days of data, sometimes 10. This time, for example, the historical QVD shrank from 400MB to 370MB, so it lost 30MB of data.

Basically, the DB query takes a "snapshot" of the DB table every hour, and I use reloadtime() to generate the timestamp/date that tags the snapshot. The script then concatenates that snapshot with the historical snapshots. So:

On 11 Jan 8am, the fields will be 8:00:00 11 Jan, field1, field2, field3

On 11 Jan 9am, the fields will be 9:00:00 11 Jan, field1, field2, field3

On 11 Jan 10am, the fields will be 10:00:00 11 Jan, field1, field2, field3


Now when I look at the historical QVD, I can see the growing set of data covering 8am, 9am, 10am, etc.


So I don't use max date or anything like that. Just a blunt concatenate.

Anonymous
Not applicable

Can you share the load script?

sifatnabil
Specialist
Author

swuehl

Here's the actual script:

let ReloadTime = now();

If qvdcreatetime('QVDs\LoginPoll.qvd') Then

USLogin:
LOAD *
FROM [QVDs\LoginPoll.qvd] (qvd);

else

USLogin:
LOAD *,
    '$(ReloadTime)' as ReloadTime
from DB;

end if

Concatenate (USLogin)
USLogin:
LOAD *,
    '$(ReloadTime)' as ReloadTime
from DB;

store [USLogin] into [QVDs\LoginPoll.qvd] (qvd);

swuehl
MVP

Sorry, I don't have any specific ideas then.

If you think that size may matter, does your machine hit its limits when you reload? Which version of QV and OS are you running, and how much RAM does the machine have?

To track down the issue, could you add some TRACE statements, e.g. to log the last timestamp of the historic QVD when it is read in (using Peek()), to see whether the issue arises from reading the QVD or from concatenating the new data?
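
For example, right after the QVD load, something like this (assuming the history is read into USLogin with a ReloadTime field, as in your script):

// value of ReloadTime in the last row read back from the historic QVD
LET vLastQvdStamp = Peek('ReloadTime', -1, 'USLogin');
TRACE Last ReloadTime read from LoginPoll.qvd: $(vLastQvdStamp);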

sifatnabil
Specialist
Author

I see... Actually, I noticed that when the scheduled task runs on the server (Windows 2012 R2), the CPU momentarily goes up to 100% while loading/writing the QVD, which is around 17 million rows now.