Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
ioannagr
Creator III
Creator III

do you do incremental load on dim_dates table?

Hello all.

Is this usual at all? Or are you guys just dropping and recreating the table every time?

Also if you do, do you use the "insert only(no update , no delete" method for incremental load?

My data looks like this. To me using datekey for incremental load looks safe.

Thank you in advance!

 

photo.png

Labels (4)
10 Replies
vikasmahajan

Hi ,

Use below script :

Max: LOAD Max(DATE) as Max_Date FROM Incremental.xlsx (ooxml, embedded labels, table is Sheet1); let VMaxDate= Peek('Max_Date'); Drop table Max; Incremental: LOAD NAME, DATE, DURATION,RecNo() as ID FROM Incremental.xlsx (ooxml, embedded labels, table is Sheet1) where DATE>=Date($(VMaxDate)); Concatenate LOAD NAME, DATE, DURATION ,ID FROM P_Data.qvd (qvd) where not Exists(ID); Store Incremental into P_Data.qvd; Drop table Incremental; Data: LOAD NAME, DATE, DURATION ,ID FROM P_Data.qvd (qvd);
Hope this resolve your issue.
If the issue is solved please mark the answer with Accept as Solution & like it.
If you want to go quickly, go alone. If you want to go far, go together.
Or
MVP
MVP

As a rule of thumb, I only apply incremental loads to tables where this will make a significant impact on performance or bandwidth. Most dimension tables are small enough that there's no real reason to bother with incremental loads, so I just load them fully. If you're looking at a table that has three columns and a few thousands of rows (which would be typical for a date dimension table, since they generally have one row per date), I doubt that an incremental load will save you any noteworthy resources.

As to whether it's safe - that's something you'll need to investigate on your own, as we have no way of knowing if your particular set of data can change retroactively (modified, added, or deleted) and whether or not it contains any sort of last_update field.

ioannagr
Creator III
Creator III
Author

@Or ,  Hello. I consider the "insert, update & delete method" the safest -in my case at least- because I have asked and got the answer that updates and deletes *might* happen but they don't know for sure because the tables get truncated and remade every time. How could my consideration of this method go wrong, in that case that they don't know? 

 

@vikasmahajan thanks for the help 🙂

Or
MVP
MVP

If the tables are truncated each time, there's no way for you to do an incremental update, since you have no way of knowing what changed if every row has changed by virtue of being recreated. I would just do a regular full load.

ioannagr
Creator III
Creator III
Author

Yes, but considering that i already have a qvd, I'd take the new ones as they are, and with the inner join, with "insert update and delete" method, the older ones would get updated or deleted accordingly. That's what I'm thinking. Isn't that so?

I know what you say thought, but the fact tables tend to get really big with time and I know from my little experience that I'll have to face that problem some day :')

Or
MVP
MVP

You would have no way of knowing which rows have changed if the table is being recreated from scratch. This does not meet the prerequisite for incremental loading - A field ModificationTime (or similar) is required for Qlik Sense to recognize which records are new. When the entire table is truncated, all records are new.

ioannagr
Creator III
Creator III
Author

I've come up with 3 questions 🙂 -I LOVE how educational each day on here is!-

1)From what I'm studying now, a modification time field *or similar* is required. What could be a similar field?

2)Also in a dim table, where i have dateinserted, couldn't I consider max(dateinserted) as last date inserted and do an incremental load?

3)What should I suggest to the database creators? To add update fields? I really don't know though if something like that is possible for them to do easily.

Or
MVP
MVP

1) Assuming records can change, this would be a timestamp of when the record was last updated. If records can't change, you can get away with something like an incremental numerator.

2) I don't know. If a record can change retroactively, the the fact it was *inserted* a year ago doesn't mean it hasn't *changed* in the past year, does it? Again, this depends on how your table / data behaves.

3) I would suggest not trying to incremental load on a table like this, assuming records can change or be deleted. If records are only ever added, and you can trust the dateinserted field to show you which ones are new, then you can use an incremental load quite easily - just load the history from QVD and the new records from database. Likewise, if records can be modified but are typically only modified in a given time frame, e.g. only in the month after originally being inserted, you can load all records older than one month from the QVD and all records newer than one month from the source table.

If there's no obvious way to know which records have changed, you'll need to load all the records just to be able to check if any of them have changed. At this point, you're loading all the records from the source anyway, so you may as well just dump the result into the QVD and not try to "Incremental" load.

 

ioannagr
Creator III
Creator III
Author

Hi again , i've seen from the colleagues previous work they did something like that

MaxID:
Load Max(ID) as MaxID
from [$(vPathQVD)Results.qvd] (qvd);

Let MaxID = Peek('MaxID', 0 , MaxID);

Results:

LOAD id,
Result;



SQL SELECT c.id,
c.lastupdate,
cm.result,
cm.id


FROM "public"."record_exams" c
INNER JOIN "public"."record_exams_maths” cm
ON cm.id=c.id
WHERE
c.id >'$(MaxID)';


Concatenate
Load * from [$(vPathQVD)Results.qvd] (qvd);

STORE Results INTO '$(vPathQVD)/Results.qvd';

 

 

@Or  so they chose ID over lastupdate field for incremental load... I checked that for some years the lastupdate field was blank and it looks like the id is coming in with ascending order. That's why I am confused with the implementation of incremental load. I studied and you said that only ID won't recognize new records. But this seems to work for them too... Hmm!

I'm thinking now, wasn't it ok if they just created a qvd and then worked with lastupdate?