lazurens
Partner - Contributor III

Resident Load Vs Store QVD

I ran an experiment to find out which method reloads a table faster.

First I loaded the QVD file into Qlik and added a Flag field to split the table into two partitions:

"Partition 1" : first half of data 

"Partition 2" : second half of data

Then I stored this QVD again.

Next I loaded the main QVD (without the partitioning Flag):

Load * From QVD

 

After that, I loaded the same file in two steps:

Load * From QVD Where Flag='Partition1';

Concatenate

Load * From QVD Where Flag='Partition2';

I was surprised by how fast the second method was!

Resident Load took 26 Sec

Partitioning Load took 13 Sec

 

I wanted to share this with you to hear your opinion and to find out whether this could be a technique for reducing the load time of a large application.

 

Best Regards,

Mohamed

1 Solution

Accepted Solutions
Or
MVP

I autogenerated about 5 million rows into a Transactions table. The log below shows that loading from resident (Test1) or from the full QVD (Test3) took 1-2 seconds, while loading the split version from the QVD (Test2) took about 16 seconds. It's always faster to use a Resident or optimized QVD load than a non-optimized QVD load, near as I can tell, and this result suggests the same.

2019-08-15 11:29:30 4,989,186 lines fetched
2019-08-15 11:29:30 0052 Store Transactions into [qvd]
2019-08-15 11:29:31 0054 Test1:
2019-08-15 11:29:31 0055 NOCONCATENATE Load * Resident
2019-08-15 11:29:31 0056 Transactions
2019-08-15 11:29:33 10 fields found: Flag, TransLineID, TransID, Num, Dim1, Dim2, Dim3, Expression1, Expression2, Expression3,
2019-08-15 11:29:33 4,989,186 lines fetched
2019-08-15 11:29:34 0058 Drop table Transactions
2019-08-15 11:29:34 0060 Drop Table Test1
2019-08-15 11:29:34 0064 Test2:
2019-08-15 11:29:34 0065 LOAD
2019-08-15 11:29:34 0066 *
2019-08-15 11:29:34 0067 FROM [qvd]
2019-08-15 11:29:34 0068 (qvd) Where Flag = 0
2019-08-15 11:29:43 10 fields found: Flag, TransLineID, TransID, Num, Dim1, Dim2, Dim3, Expression1, Expression2, Expression3,
2019-08-15 11:29:43 1,246,239 lines fetched
2019-08-15 11:29:44 0069 CONCATENATE
2019-08-15 11:29:44 0070 LOAD
2019-08-15 11:29:44 0071 *
2019-08-15 11:29:44 0072 FROM [qvd]
2019-08-15 11:29:44 0073 (qvd) Where Flag = 1
2019-08-15 11:29:50 10 fields found: Flag, TransLineID, TransID, Num, Dim1, Dim2, Dim3, Expression1, Expression2, Expression3,
2019-08-15 11:29:50 3,740,651 lines fetched
2019-08-15 11:29:51 0075 Drop Table Test2
2019-08-15 11:29:51 0077 Test3:
2019-08-15 11:29:51 0078 Load * FROM [qvd]
2019-08-15 11:29:51 0079 (qvd)
2019-08-15 11:29:52 10 fields found: Flag, TransLineID, TransID, Num, Dim1, Dim2, Dim3, Expression1, Expression2, Expression3,
2019-08-15 11:29:52 4,989,186 lines fetched
2019-08-15 11:29:52 0081 Drop Table Test3
2019-08-15 11:29:52 Execution finished.


2 Replies
marcus_sommer

"Normally" there should be no big difference in run time between a resident load and an optimized QVD load. Your QVD load isn't optimized because it contains processing: an optimized load only allows where exists() with a single parameter and changes to the metadata, such as renaming a field.
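
To illustrate the point about exists(), here is a rough sketch of how one of the partition filters could be rewritten so that the load has a chance of staying optimized. The file name Transactions.qvd, the helper table FlagFilter, and the Flag value 0 are assumptions made for illustration; they are not taken from the original script.

```
// Hypothetical helper table: pre-load the Flag value(s) you want to keep.
FlagFilter:
LOAD * INLINE [
Flag
0
];

// Filter with a single-parameter Exists() instead of Where Flag = 0.
// Transactions.qvd is an assumed file name.
Test2:
LOAD * FROM [Transactions.qvd] (qvd)
WHERE Exists(Flag);

// The helper table is no longer needed once the load has finished.
DROP TABLE FlagFilter;
```

If this works as intended, the reload progress should mark the load as QVD optimized; if it doesn't, something else in the statement is still forcing a row-by-row load.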

But of course it depends on various parameters whether one or the other method is more suitable. The most important are probably the network/storage performance and the available vs. needed RAM for your loads/transformations.

This means you might just have measured the biggest bottleneck in your environment. To be sure, I suggest repeating the test a few times while monitoring the CPU/RAM/storage workload in the Task Manager (maybe there are other parallel processes, too) and making sure that your measurement includes all parts of your task (creating the slices, storing them, and so on).
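
One possible way to time each variant inside the script itself, so that repeated runs are easy to compare, is sketched below. The variable names vStart and vDuration and the file name Transactions.qvd are assumptions, and the sketch relies on the default TimestampFormat being able to read back the stored start time.

```
// Remember when this step started (timer mode 1 = time of the function call).
LET vStart = Now(1);

// The load being measured. Transactions.qvd is an assumed file name.
Test3:
LOAD * FROM [Transactions.qvd] (qvd);

// '$(vStart)' expands to the stored timestamp text, which is read back as a timestamp.
LET vDuration = Interval(Now(1) - '$(vStart)', 'hh:mm:ss');

// Write the result to the script log.
TRACE Full QVD load took $(vDuration);

DROP TABLE Test3;
```

Wrapping each variant (resident, full QVD, split QVD) in the same pattern, and including the Store step for the split files, should give numbers that cover the whole task rather than just the final load.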

- Marcus
