Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have read everything I can find on how to set up a partial load script and I still haven't found an example that fits my scenario enough to help me. Wondering if anyone has an example that could help me. Here is what I am doing;
I have a database table with over 50 million rows, each month we add another million. I have already loaded the initial 50 million into a QVD and now want to run a daily refresh that will add the new rows into my existing QVD. I was hoping to use the Partial reload feature in Qlik. I have a event timestamp I can use to determine what is new versus existing.
I don't understand how to write the partial statement so it will work in an automation. I have read that you write your initial load statement and then add a section that starts with a Merge or Add statement but I don't understand how you do that and also reference the timestamp variable. For instance, I currently have a script that is this;
Load
fields;
select
fields
from database table (that is 50 million rows);
store into QVD
... now what do I do in order to add rows into QVD where the timestamp from the database table is greater than the max timestamp from my qvd.
Also, I know how to do this in an incremental reload but the syntax I am using for that doesn't work with the partial reload function in qlik because I am loading the qvd first to get the max timestamp and a partial reload ignore the load statement so it errors.
Appreciate the help!
That was very helpful. I also prefer to run the initial load first and then comment out that part. That is how I had my incremental refresh set up originally. Again, being new to Qlik, I thought that was not how it should be done. Appreciate the help with this one!
@Kelliesy what version of QlikSense are you on?
I'm guessing you are following this documentation correct?
https://help.qlik.com/en-US/sense/May2024/Subsystems/Hub/Content/Sense_Hub/Scripting/ScriptPrefixes/...
We're using Qlik Sense Cloud. Yes, I have read that article along with a bunch of others but it doesn't help. I need some sort of example script that is similar to what I am doing.
I never used partial loads in a productive scenario else just a bit playing with it ages ago. Mainly because this feature isn't in the same way generally applicable as incremental approaches and therefore it's more increasing the complexity as it would be sensible to develop and maintain an environment.
In regard to your case I think an incremental load should be working well, for example in a way like this:
t: load * from X where T > '$(vTmax)';
x: load timestamp(max(fieldvalue('T', recno()))) as x autogenerate fieldvaluecount('T');
let vTmax = peek('x', 0, 'x'); drop tables x;
concatenate(t)
load * from Historic.qvd (qvd);
store t into Historic.qvd (qvd);
The big load of the historic data is performed optimized and the max. timestamp is always read from the current data against the system-table.
This is really good to know. I thought not using the partial reload was me being a bad Qlik user. My incremental refresh is much more complex then what you have so I like yours better. Can you walk my through your example? How can you load data using a variable you don't define until later in the script? Is that the true way I should lay out my refresh? How I currently do it is;
1. Load my existing QVD
2. Find the Max time stamp from that QVD
3. Concatenate my existing QVD with any data from my table that has a timestamp greater than my QVD by starting with a Concatenate (QVD) select * from table where timestamp > 'vmaxtimestamp'
4. Load into my existing QVD.
My suggestion is a shortcut-version for a running task and doesn't include the initial load if it runs the first time. To cover this the above needs to be imbedded within an at least one if-loop checking the length of the variable or the exists of the historic.qvd or similar things - and if not do this and otherwise that.
Of course this is possible but it increased IMO only the complexity/readability of the script just for an unique execution. Therefore I try to avoid such logic and run the first execution manually by setting the variables manually and/or commenting parts of the script - in the above scenario it might be:
t: load * from X; // where T > '$(vTmax)';
x: load timestamp(max(fieldvalue('T', recno()))) as x autogenerate fieldvaluecount('T');
let vTmax = peek('x', 0, 'x'); drop tables x;
// concatenate(t)
// load * from Historic.qvd (qvd);
store t into Historic.qvd (qvd);
This kind of comments is easily applied and removed again and could be also repeated by demand and it's saving development time because to design and check the first run and/or to include further exceptions (data status/quality, error-handling, ...) will need much more efforts as the few lines above.
That was very helpful. I also prefer to run the initial load first and then comment out that part. That is how I had my incremental refresh set up originally. Again, being new to Qlik, I thought that was not how it should be done. Appreciate the help with this one!