Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Preceding Load

A QlikView feature that is little known, yet brilliant in its simplicity, is the Preceding Load.

 

If you don’t know what it is, then I strongly suggest that you read this blog post and find out, because it will help you in your QlikView scripting.

 

So what is it?

 

It is a way for you to define successive transformations and filters so that you can load a table in one pass but still have several transformation steps. Basically it is a Load statement that loads from the Load/SELECT statement below.

 

Example: you have a database where your dates are stored as strings and you want to use the QlikView date functions to interpret the strings. But the QlikView date functions are not available in the SELECT statement. The solution is to put a Load statement in front of the SELECT statement: (Note the absence of “From” or “Resident”.)

 

Load Date#(OrderDate,'YYYYMMDD') as OrderDate;
SQL SELECT OrderDate FROM … ;

 

What happens then is that the SELECT statement is evaluated first, and the result is piped into the Load statement that does the date interpretation. The fact that the SELECT statement is evaluated before the Load is at first glance confusing, but it is not so strange. If you read a Preceding Load as

 

     Load From ( Select From ( DB_TABLE ) )

 

then it becomes clearer. Compare it with nested functions: How would you evaluate “Round( Exp( x ) )”? You would of course evaluate the Exp() function first and then the Round() function. That is, you evaluate it from right to left.

 

[Image: Input - Output]

 

The reason is that the Exp() function is closest to the source data, and therefore should be evaluated first. It’s the same with the Preceding Load: The SELECT is closest to the source data and should therefore be evaluated first. In both cases, you can look at it as a transformation that has an input and an output and to do it correctly, you need to start with the part of the transformation closest to the input.

 

Any number of Loads can be “nested” this way. QlikView will start from the bottom and pipe record by record to the closest preceding Load, then to the next, etc. And it is almost always faster than running a second pass through the same table.
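For example, a stack of three steps might look like this (the table and field names are invented for illustration):

          // Evaluated last: uses the Margin field created below
          Load *,
               If( Margin > 0.3, 'High', 'Low' ) as MarginGroup;

          // Evaluated second: creates Margin from the source fields
          Load *,
               (Sales - Cost) / Sales as Margin;

          // Evaluated first: closest to the source data
          SQL SELECT OrderID, Sales, Cost FROM Orders;

Each record from the SELECT is piped upwards through the two Loads before the next record is fetched.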

 

With preceding Load, you don’t need to have the same calculation in several places. For instance, instead of writing

 

Load  ... ,
   Age( FromDate + IterNo() - 1, BirthDate ) as Age,
   Date( FromDate + IterNo() - 1 ) as ReferenceDate
   Resident Policies
      While IterNo() <= ToDate - FromDate + 1 ;

 

where the same calculation is made for both Age and ReferenceDate, I would in real life define my ReferenceDate only once and then use it in the Age function in a Preceding Load:

 

Load  ..., ReferenceDate,
   Age( ReferenceDate, BirthDate ) as Age;
Load  *,
   Date( FromDate + IterNo() - 1 ) as ReferenceDate
   Resident Policies
      While IterNo() <= ToDate - FromDate + 1 ;

 

The Preceding Load has no disadvantages. Use it. You’ll love it.

 

HIC

96 Comments
Not applicable

Thanks for the good post, really worth reading.

Partner

Hi Henric,

I use Preceding Load very often, but there's only one disadvantage that I know of.

If you're loading several files with a wildcard, the preceding load will be processed only for the first file, and it will also break the 'auto' concatenation.

Check the following demonstration: http://dl.dropbox.com/u/3653801/preceding%20load.zip

Best Regards,

Carlos


You're absolutely right.

I didn't think of this complication since I see it as a flaw of the wildcard load - which I stopped using years ago. Instead I use a For-Next loop:

For each vFile in FileList('C:\path\*.txt')

          Load ... From [$(vFile)] ;

Next vFile

Which is identical to

Load ... From ;

And within the For-Next loop you can put as many Preceding Loads as you wish...
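For example, a sketch of this pattern (the path, file mask and field names are hypothetical):

For each vFile in FileList('C:\Data\*.txt')

          Orders:
          Load *,
               Date#(OrderDate, 'YYYYMMDD') as OrderDateNum;    // preceding Load
          Load OrderID, OrderDate
          From [$(vFile)] (txt);

Next vFile

Since every file produces a table with identical fields, the tables auto-concatenate into one Orders table.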

HIC

Partner

Hi HIC,

Another great QV tips.

MVP & Luminary

As well as the wildcard gotcha, doing a preceding load on a crosstable can also behave in a way you would not expect.  This is one of the cases where I would load into a temporary table and then do further work with a RESIDENT load.

Thanks for another useful post Henric!

Not applicable

Thanks for another wonderful post Henric!

Preceding Load is definitely quite handy for making the script easy to read and maintain, but one must remember to balance it, because overdoing the nesting can sometimes make reloads slow. A preceding load does have some overhead, and writing the expression multiple times can result in a faster load, depending on the expressions used in the load script.

Just my 2 cents!

Regards,
Ashutosh

Not applicable

Hi Henric,

Could you please also throw some light on the comparison between Resident Load & Preceding Load?

Is it always better to use Preceding Load instead of Resident Load to add a few more calculated columns using the already loaded fields?

Is it faster than Resident Load? etc...

Thanks for your time

Regards

hariharasudan_p
Contributor

Thanks HIC. I have used this for doing a small ETL job.

I don't see any performance drawback, but it all depends on what kind of expression is written there.

HIC, I wish to hear from you the pros and cons of Resident Load vs Preceding Load. Can you let us know from your experience?


If you compare a single preceding Load with a single resident Load where the two Loads have identical content, then I would say that the preceding Load is always faster. In the preceding Load, the records are evaluated "on the fly" and this makes it fast.

I made a small test where I had a Load with a few string operations. When I put them in a preceding Load, the script execution time increased by 15-20%. When I put them in a resident Load, script execution time increased by 90-100%. So, I would say: If you can put your calculation in a preceding Load, you should. But sometimes you have to use a resident Load, e.g. after a Crosstable or when you want to make some transformations after a Join.

However, a preceding Load has a small overhead, so if you have multiple preceding Loads, it will be slower. But if a preceding Load helps the manageability and clarity of the script, then I am usually willing to take a small penalty.

HIC

hariharasudan_p
Contributor

Thanks a lot HIC. So according to you, for transformations, string operations and cross tables I can go for resident, and for calculations and small operations I can go for Preceding.


I would put any transformation or string operation or creation of a new field in a preceding Load, rather than in a resident Load. I.e. if possible - always choose preceding Load.

But there are some transformations you cannot do in a preceding Load, e.g anything that you want to do after a Crosstable Load or after a Join. For these I would use a resident Load.

HIC

Not applicable

Hello Henric,

Thanks for the reply.

I assume that, like a Resident Load, a Preceding Load also takes data from RAM instead of directly from the data source (QVD etc.).

If I'm correct, why does it take less time than a Resident Load?

Regards


Let's say that you have the following construction:

          Load *, SomeFunction(X,Y) as Z; // Load statement "B"

          Load X, Y From ... ;            // Load statement "A"

Then each record that is loaded by "A" is piped into "B". This means that "B" processes the first record completely before "A" starts to process the second record. In other words: the result of "A" is never stored as a table - neither on disk, nor in RAM. So you can't say that "B" loads a table from RAM or from disk.

The alternative to the above construction is to use a resident Load:

          A: Load X, Y From ... ;            // Load statement "A"

          C: Load *, SomeFunction(X,Y) as Z Resident A; // Load statement "C"

          Drop Table A;

If the "A" statement alone takes 10 s, then A+B typically would take perhaps 11-12 s and "C" would perhaps take 9-10 s. So the above construction would all in all take 11-12 s and the lower construction would take 19-20 s.

Does this make sense?

HIC

Not applicable

I understand now.

Thanks a lot Henric for the explanation

Regards

Not applicable

Hi,

It also does not work with crosstable. Anyone got a solution for that?

Kind Regards,

Dion


The preceding Load cannot be used after (above) a Join or a Crosstable. The work-around is to do it in two steps:

// === Step 1 ===

TempTable:

Crosstable Load ... From ...;

// === Step 2 ===

FinalTable:

NoConcatenate

Load ... Resident TempTable;

Drop Table TempTable;

HIC

Not applicable

Thx, I find it hard to fully understand, perhaps because I don't even know the alternative/traditional way - non-preceding load. Can anyone suggest an alternative explanation? 

Not applicable

thx I have it

amiumi66
Contributor

Can you explain why I get an error in this code and how a preceding load will help? I get a 'field not found' error.

empData:

LOAD ID,

     Name,

     Amount,

LOAD

     ID & '|' & Name as uniqkey

FROM QVD.qvd (qvd);



Not applicable

Assuming that the fields ID, Name, Amount exist in your QVD.qvd, you need to include either the field names or an asterisk (to call all fields) in the first (bottom) load, then the preceding (next) load will pull the already loaded fields and use those.

empData:

LOAD

     uniqkey

     ID,

     Name,

     Amount,

LOAD

     *,

     ID & '|' & Name as uniqkey

FROM QVD.qvd (qvd);

amiumi66
Contributor

I get so confused with this and can't seem to understand it. Please explain what is happening when the code gets compiled by QlikView.

empData:

Load

     uniqkey,      // what is happening here, qlikview is pulling this field from qvd for the first time????

     ID,

     Name,

     Amount,

Load  // what is this load doing.

*,  // all field are being pulled again why?

ID & '|' & Name as uniqkey     // this is calculating but at this point how does it have field available to calculate.

from QVD.qvd(qvd);

Not applicable

QlikView is pulling whatever fields exist in your QVD file (table).

The ID & '|'& Name as uniqkey essentially says concatenate ID, the pipe symbol and Name into one field and call it uniqkey (e.g. 23|Tom).

The preceding loads work from the bottom up.  Within each preceding load you can then transform the fields extracted from the one prior.

LOAD  *     //preceding load 3

;

LOAD  *     //preceding load 2

;

LOAD  *     //preceding load 1

;

LOAD field1, field2, etc from File.qvd;

amiumi66
Contributor

Thanks, but the code you provided gives me a syntax error.

empData:

LOAD

     uniqkey

     ID,

     Name,

     Amount,

LOAD

     *,

     ID & '|' & Name as uniqkey

FROM QVD.qvd (qvd);

Not applicable

What I provided is a guess, as I don't know which exact fields you are pulling from the QVD or what your objective is.  If all you are trying to do is create a composite key, you don't need a preceding load.  Here's a rework of the syntax from above.  But what are you actually trying to do: learn how to use a preceding load, or create a composite key?

empData:

LOAD

     uniqkey,

     ID,

     Name,

     Amount

;

LOAD

     *,

     ID &' | '& Name as uniqkey

FROM QVD.qvd (qvd);

amiumi66
Contributor

Thanks Thomas

one more question

empData:

LOAD                                                      // this gets evaluated second and loads all the below field.

     uniqkey,

     ID,

     Name,

     Amount

;

LOAD                                                           // so this gets evaluated or compiled first.

     *,                                                                // loads all the fields

     ID &' | '& Name as uniqkey                   // creates uniqkey.

FROM QVD.qvd (qvd);

1) Is my understanding above correct?

2) When QlikView compiles or reloads the script, does it do the compilation twice: first to get all the fields into memory and do the calculation of uniqkey, and then again to display it in the dashboard?

MVP & Luminary

Hi,

When QlikView is running a load script it is not compiling in the way a C program would compile. It is true that the lower level of the load script is processed first, followed by the preceding part. In the case of the code that you have inserted into this thread, only four fields would be present in the data set in the end. This is because whilst you are loading all fields from the QVD in the first instance, you are ignoring those fields in the preceding load (you could put a star there to pick them up).

The reason you would do a preceding load is to use a field that you have created in a lower level of the load. For example, in the preceding load you could write:

if(len(uniqkey) > 10, 'Yes', 'No') as ,

Where if you did it in the lower part of the load it would need to be:

if(len(ID &' | '& Name) > 10, 'Yes', 'No') as ,

as the field with the name uniqkey would not exist in the first load, but it would in the preceding one. Replicating the concatenation of the key shouldn't take more processing (due to caching), but it would mean that if you changed how your key was derived (e.g. using FullName rather than Name) you might miss making the same change to the expression that decides whether the key is long or not.

The other thing to watch with preceding loads and the * symbol is that if you reference a field in the list that also exists in the *, QlikView will fail without giving you much of a sensible error message.

Hope that all makes sense.

Steve

amiumi66
Contributor

Thanks Steve, very well explained. I am creating a composite key in my code because I am trying to find a unique way to display data at a 1:1 level, and then merge new data with the old master data using update and append.

Can the uniqkey I created be used in concatenation with a where exists clause? The code below gives me a 'field not found' error, and the source of it is the where exists clause.


Merge_Data:

LOAD

     uniqkey,

     ID,

     Name,

     amount,

LOAD                                                          

     *,                                                               

     ID &' | '& Name as uniqkey                  

FROM NEW_QVD.qvd (qvd);

Concatenate (Merge_Data)

LOAD

     uniqkey,

     ID,

     Name,

     amount,

LOAD                                                          

     *,                                                               

     ID &' | '& Name as uniqkey                  

FROM MASTER_QVD.qvd (qvd);

WHERE not exists(uniqkey);

MVP & Luminary

At the point of the WHERE EXISTS, that field name doesn't exist. You need to replicate the expression in the exists. Something like:

WHERE NOT EXISTS (uniqkey, ID & '-' & Name)

If the exists statement only has one parameter, it uses that one parameter both to search for and to search in. You can also search for a different field, or an expression, in an existing field.

A much better way of doing this is creating uniqkey when you create the QVD. This removes the need for the preceding load, simplifies the WHERE EXISTS and, most importantly, makes your load optimised.

If you don't know about optimised QVD loads, I suggest you Google it.

Hope that helps,

Steve
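A sketch of the approach Steve describes, assuming the QVD name and fields used earlier in the thread:

// When creating the QVD: build the key once, here.
Staging:
Load ID, Name, amount,
     ID & '|' & Name as uniqkey
From ... ;
Store Staging into MASTER_QVD.qvd (qvd);

// Later: this QVD load can stay optimised, because uniqkey is
// read as-is and the Exists() clause has only a single parameter.
Merge_Data:
Load uniqkey, ID, Name, amount
From MASTER_QVD.qvd (qvd)
Where not Exists(uniqkey);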

amiumi66
Contributor

Thanks Steve

So I am taking the approach of calculating the composite key and then storing the result into a new QVD. However, I am finding an unusual problem. I wonder if this is some kind of bug.

updatedata:

LOAD

     uniqkey,

     ID,

     Name,

     amount,

     ID &' | '& Name as uniqkey                 

FROM NEW_QVD.qvd (qvd);

store updatedata into new_updatedata.qvd;

MasterData:

LOAD

     uniqkey,

     ID,

     Name,

     amount,

     ID &' | '& Name as uniqkey                 

FROM MASTER_QVD.qvd (qvd);

store MasterData into new_masterdata.qvd;

When I reload, the store of the first table works, but the second gives me a 'table not found' error. If I switch the tables, it again fails on the second one.

Anyone with any clues?
