Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
joey_lutes
Partner - Creator
Partner - Creator

Incremental Delta Load with Slowly Changing Dimensions

I have multiple dim tables being loaded that will change over time and I'm looking for a bit of a kickstart please.

Ex.  Load 1:

EmployeeID     TruckModel    Color     Location

1                         Ford            Blue        300

2                         Chevy         Red        1990

3                         Ford            Green      400

I have 2 things at work here.  #1, new Employees will be added, some employees will be removed.  Also, multiple attributes for these employees will/could change.

Ex. Load 2:

EmployeeID     TruckModel    Color     Location

1                         Ford            Blue        400

3                         Buick          Green      800

4                         Chevy         Black      900

There are no inherent date fields associated with any of the data so I must create a timestamp of the loads.

The end result should be something like:

EmployeeID     TruckModel    Color     Location     AsOfLoad

1                         Ford            Blue        300              1

1                         Ford            Blue        400              2

2                         Chevy         Red        1990             1

3                         Ford            Green      400              1

3                         Buick          Green      800              2

4                         Chevy         Black      900              2

This will likely manifest in a (variety of) line chart(s) and bar graphs showing trending and change analysis.

I've been reading a lot of the documents that are out there about incremental, delta and scd loads and also AsOfTables - just not quite sure where to start there.  I know I'll be storing the data into a qvd and loading the delta from source.  Also, I believe I want to use Incremental Load #2 (Insert and Update)

Do I need to create a generic key/unique id in the load script?

Thanks in advance!!

4 Replies
kaushiknsolanki
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi,

What I understand is you need the history, so there is no need to update the record.

So for that you will need just insert option.

Now when the data is deleted you can use the Inner join with QVD to delete the records from QVD.

You dont need to have a Generic Key or Unique key.

Regards,

Kaushik Solanki

Please remember to hit the 'Like' button and for helpful answers and resolutions, click on the 'Accept As Solution' button. Cheers!
maxgro
MVP
MVP

It seems to me you just want to add non duplicated records

load

  *,

  AutoNumber(EmployeeID & '|' & TruckModel & '|' & Color & '|' & Location) as Id,

  now(1) as ts

inline [

EmployeeID     TruckModel    Color     Location

1                         Ford            Blue        300

2                         Chevy         Red        1990

3                         Ford            Green      400

] (delimiter is spaces);

sleep(1000);

load

  *,

  AutoNumber(EmployeeID & '|' & TruckModel & '|' & Color & '|' & Location) as Id,

  now(1) as ts

inline [

EmployeeID     TruckModel    Color     Location

1                         Ford            Blue        400

3                         Buick          Green      800

4                         Chevy         Black      900

1                         Ford            Blue        300

2                         Chevy         Red        1990

] (delimiter is spaces)

where not Exists(Id, AutoNumber(EmployeeID & '|' & TruckModel & '|' & Color & '|' & Location))

;


1.png

joey_lutes
Partner - Creator
Partner - Creator
Author

Thanks everyone!

Massimo - that looks great - I have a couple follow-on questions if you don't mind.

a)  this is sample data - the real data doesn't involve employees or cars, actually  It's dozens of columns of data across multiple tables and 100ks of rows - but the construct concept is relevant.

I noticed your 'where not exists' clause included each field I presented.  Would I need to include every field that could possibly change in that clause, or does it just need to be unique?

b)  can you tell me why you prefer AutoNumber over RowNo()?  Just trying to understand that one.

c)  what happens in this scenario if the row persists but has not changed?  will it show up again in the second load identical to the first load?  If so, is this desirable?  I would imagine that this will be the bulk of the data - sorry for not including it in my example.

Thanks so much,

Joey Lutes

joey_lutes
Partner - Creator
Partner - Creator
Author

I actually solved my problem by using a hash function and comparing previousHash to currentHash and creating an active flag thanks to Barry Harmsen's method from the Qlik Master's Summit.

Thanks for the help.

http://www.qlikfix.com/author/barry/