Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Loading Dimensions (Update and Insert) or Full Load in QVD File

Hi Members,

My dimensions are Update (Existing Records) and Insert (New Records) and Same I want to apply in my QlikView Script to Load all dimensions Update/Insert.

What I thought, Instead of doing Update and Insert, each load I can make it Full load for all the dimension. We have 2 years of data in our DWH and source for QlikView is DWH.

The bigger dimension is having 15 million records and rest all dimensions are having less than 200000 records.

I would like to know what would be more efficient ?

  1. Full Load (Each Day)? (15 Million or more records need to load every day)
  2. or Update/Insert? (On average each day, maximum 2 million records need update and insert)

Kindly suggest me the best practice and pros and corns of FULL LOAD and Update/Insert.

Note: My Facts are Incremental Load on daily basis.

Also provide me Update and Insert syntax for below Query

Full Load QlikView Script from My Database:

DIM_EMP_DATA:

LOAD

  EMP_KEY             as EmployeeKey,

  EMP_ID              as EmployeeId,

  EMP_TYPE            as EmployeeType,

  EMP_NAME            as EmployeeName

SQL

SELECT  

  EMP_KEY,

  EMP_ID,

  EMP_TYPE,

  EMP_NAME

FROM STAR.DIM_EMPLOYEE;

STORE DIM_EMP_DATA INTO D:\DIMENSION_DATA\DIM_EMP_DATA.QVD;

DROP TABLE DIM_EMP_DATA;

3 Replies
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

I think you will have to test to be sure. But my best guess is that a full reload would be faster. there is significant overhead in merging 2M changed rows. If your network connection is slow or your columns are wide,  the incremental approach may win out. I would test both to get a profile and some specifics for your scenario.

Rob

Not applicable
Author

Hi Rob,

Currently, Production System is not build so can't test it with production data and in Test Env having less data so unable to differentiate that which is more efficient FULL Load or UPDATE/INSERT?

Will be waiting for your analysis.

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Again, my best guess is that full load would be most efficient in this case and easier to code and maintain,

Decisions about load strategies should be evaluated in terms of "good enough". If you can do a full load and meet your time and resource objectives, go with the full load. It's simple and clean, Only if you have a problem to solve would I go down the road of to implementing an incremental load. If you do, you must test the performance and possibly do some tuning to make a 2M/!5M row update perform better than a full reload.

So again, go with the full reload if it works and meets your time and resource constraints.

-Rob