If you've used QlikView or Qlik Sense for more than a day you've probably found a use for the INLINE statement so you could created data on the fly instead of wasting time with a text file.
Load * Inline [
Field1, Field2, Field3
Value1, Value2, Value3
But I'm writing this because the INLINE statement can be used for much more powerful uses. You can use it to maintain an OPTIMIZED LOAD and you can use it to define a SORT ORDER.
Who just yelled "Get out of here?"
I will not. I'm serious. If you haven't started playing with the General Hospital data set yet you should have, and when you do you will see there are lots and lots of hospitals.
One of the other things that I love seeing when I load data is the word OPTIMIZED. It's kind of like Qlik's version of a gold star or happy face on your homework. It tells you that you did something awesome. What it means is that it was able to just ingest the data from your QVD file in the fastest possible manner. Not that big of a deal with the 124 hospitals, but when you start thinking about your hundreds of millions of rows seeing OPTIMIZED is really important.
So here is a real world situation with real world data, we don't want all of the hospitals to be loaded we only want a few of them to be loaded for some reason. First instinct would be to change the LOAD command and add a WHERE CLAUSE:
Where Match([Hospital Name], 'All Saints Hospital', 'General Hospital', 'Bucks County Hospital') > 0;
It works for sure but the problem is those WHERE clauses that we add cause Qlik to NOT be able to read the data in an OPTIMIZED fashion.
But the great news is that is where the INLINE statement can help if you know how to take advantage of it. In the image below you can see that I've simply created an INLINE list of the hospital names that we want to load and instead of using a WHERE clause with conditions I am using the EXISTS syntax to say "only load records if the Hospital Name already exists in memory." Notice that Qlik then gives me the gold star again (OPTIMIZED) in the load. I get just the data I want, but it loads super fast.
But that's not the only benefit of using INLINE. Before I share the other benefit you need to think about how Qlik manages data. You know that whole "Columnar Data Store" thing. Each column is actually it's own in-memory table. When I use the INLINE command it has built the table for the [Hospital Name] field. Here is the sneaky, awesome, cool part ... it not only exists it is SORTED in the way I load the data for it. I intentionally added the values via the INLINE in the SORT ORDER that I wish to see the data.
Come on you gotta love that. Stop just using INLINE for just old fashioned loading of quick data and start letting it:
Have you tried to benchmark the OPTIMIZED and NON OPTIMIZED records in for this Data set ?
I always wonder how much we can earn in term of reloading time % ..
What a GREAT question. I absolutely have.
The base data for the General Hospital data set includes a small, more manageable and downloadable, subset of a much larger set of Quality Measure Data. By large I mean 62.5 million records. If I load the entire data set from a QVD it takes just 33 seconds.
If I change the load script to just load data for System 1 using a WHERE clause it takes 37 seconds instead.
I hear ya, I hear ya. 37 seconds isn't that big of a deal compared to 33 seconds.
But understand that part of the time in the load script to display the time also includes the time to save the application file. Saving 11 million records in the application is much faster than saving 62.5 million records. So now let's compare the time for that WHERE clause to a WHERE EXISTS clause ...
Only 16 seconds. Half the time when doing an OPTIMIZED LOAD using the WHERE EXISTS made possible by the INLINE statement.
I should also point out a neat system variable you may not have seen before CreateSearchIndexOnReload which lets me tell Qlik not to bother building an index for fields so that I don't have to wait so that I can just measure performance of the OPTIMIZED or non optimized loads.
Selectively loading data from a record set that contains 62.5 MILLION records in 16 seconds.That's HOT!!!!!