Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Members,
My dimensions are Update (Existing Records) and Insert (New Records) and Same I want to apply in my QlikView Script to Load all dimensions Update/Insert.
What I thought, Instead of doing Update and Insert, each load I can make it Full load for all the dimension. We have 2 years of data in our DWH and source for QlikView is DWH.
The bigger dimension is having 15 million records and rest all dimensions are having less than 200000 records.
I would like to know what would be more efficient ?
Kindly suggest me the best practice and pros and corns of FULL LOAD and Update/Insert.
Note: My Facts are Incremental Load on daily basis.
Also provide me Update and Insert syntax for below Query
Full Load QlikView Script from My Database:
DIM_EMP_DATA:
LOAD
EMP_KEY as EmployeeKey,
EMP_ID as EmployeeId,
EMP_TYPE as EmployeeType,
EMP_NAME as EmployeeName
SQL
SELECT
EMP_KEY,
EMP_ID,
EMP_TYPE,
EMP_NAME
FROM STAR.DIM_EMPLOYEE;
STORE DIM_EMP_DATA INTO D:\DIMENSION_DATA\DIM_EMP_DATA.QVD;
DROP TABLE DIM_EMP_DATA;
I think you will have to test to be sure. But my best guess is that a full reload would be faster. there is significant overhead in merging 2M changed rows. If your network connection is slow or your columns are wide, the incremental approach may win out. I would test both to get a profile and some specifics for your scenario.
Rob
Hi Rob,
Currently, Production System is not build so can't test it with production data and in Test Env having less data so unable to differentiate that which is more efficient FULL Load or UPDATE/INSERT?
Will be waiting for your analysis.
Again, my best guess is that full load would be most efficient in this case and easier to code and maintain,
Decisions about load strategies should be evaluated in terms of "good enough". If you can do a full load and meet your time and resource objectives, go with the full load. It's simple and clean, Only if you have a problem to solve would I go down the road of to implementing an incremental load. If you do, you must test the performance and possibly do some tuning to make a 2M/!5M row update perform better than a full reload.
So again, go with the full reload if it works and meets your time and resource constraints.
-Rob