4 Replies Latest reply: Nov 11, 2016 5:00 PM by Joey Lutes RSS

    Incremental Delta Load with Slowly Changing Dimensions

    Joey Lutes

      I have multiple dim tables being loaded that will change over time and I'm looking for a bit of a kickstart please.

      Ex.  Load 1:

      EmployeeID     TruckModel    Color     Location

      1                         Ford            Blue        300

      2                         Chevy         Red        1990

      3                         Ford            Green      400

      I have 2 things at work here.  #1, new Employees will be added, some employees will be removed.  Also, multiple attributes for these employees will/could change.

      Ex. Load 2:

      EmployeeID     TruckModel    Color     Location

      1                         Ford            Blue        400

      3                         Buick          Green      800

      4                         Chevy         Black      900

       

      There are no inherent date fields associated with any of the data so I must create a timestamp of the loads.

       

      The end result should be something like:

      EmployeeID     TruckModel    Color     Location     AsOfLoad

      1                         Ford            Blue        300              1

      1                         Ford            Blue        400              2

      2                         Chevy         Red        1990             1

      3                         Ford            Green      400              1

      3                         Buick          Green      800              2

      4                         Chevy         Black      900              2

       

      This will likely manifest in a (variety of) line chart(s) and bar graphs showing trending and change analysis.

       

      I've been reading a lot of the documents that are out there about incremental, delta and scd loads and also AsOfTables - just not quite sure where to start there.  I know I'll be storing the data into a qvd and loading the delta from source.  Also, I believe I want to use Incremental Load #2 (Insert and Update)

       

      Do I need to create a generic key/unique id in the load script?

      Thanks in advance!!

        • Re: Incremental Delta Load with Slowly Changing Dimensions
          Kaushik Solanki

          Hi,

           

          What I understand is you need the history, so there is no need to update the record.

          So for that you will need just insert option.

           

          Now when the data is deleted you can use the Inner join with QVD to delete the records from QVD.

           

          You dont need to have a Generic Key or Unique key.

           

          Regards,

          Kaushik Solanki

          • Re: Incremental Delta Load with Slowly Changing Dimensions
            Massimo Grossi

            It seems to me you just want to add non duplicated records

             

             

            load

              *,

              AutoNumber(EmployeeID & '|' & TruckModel & '|' & Color & '|' & Location) as Id,

              now(1) as ts

            inline [

            EmployeeID     TruckModel    Color     Location

            1                         Ford            Blue        300

            2                         Chevy         Red        1990

            3                         Ford            Green      400

            ] (delimiter is spaces);

             

            sleep(1000);

             

            load

              *,

              AutoNumber(EmployeeID & '|' & TruckModel & '|' & Color & '|' & Location) as Id,

              now(1) as ts

            inline [

            EmployeeID     TruckModel    Color     Location

            1                         Ford            Blue        400

            3                         Buick          Green      800

            4                         Chevy         Black      900

            1                         Ford            Blue        300

            2                         Chevy         Red        1990

            ] (delimiter is spaces)

            where not Exists(Id, AutoNumber(EmployeeID & '|' & TruckModel & '|' & Color & '|' & Location))

            ;


            1.png

              • Re: Incremental Delta Load with Slowly Changing Dimensions
                Joey Lutes

                Thanks everyone!

                Massimo - that looks great - I have a couple follow-on questions if you don't mind.

                 

                a)  this is sample data - the real data doesn't involve employees or cars, actually  It's dozens of columns of data across multiple tables and 100ks of rows - but the construct concept is relevant.

                I noticed your 'where not exists' clause included each field I presented.  Would I need to include every field that could possibly change in that clause, or does it just need to be unique?

                 

                b)  can you tell me why you prefer AutoNumber over RowNo()?  Just trying to understand that one.

                 

                c)  what happens in this scenario if the row persists but has not changed?  will it show up again in the second load identical to the first load?  If so, is this desirable?  I would imagine that this will be the bulk of the data - sorry for not including it in my example.

                 

                Thanks so much,

                 

                Joey Lutes

              • Re: Incremental Delta Load with Slowly Changing Dimensions
                Joey Lutes

                I actually solved my problem by using a hash function and comparing previousHash to currentHash and creating an active flag thanks to Barry Harmsen's method from the Qlik Master's Summit.

                Thanks for the help.

                http://www.qlikfix.com/author/barry/