19 Replies Latest reply: Sep 29, 2015 10:19 AM by kumar avisakula RSS

    Incremental load

    kumar avisakula

      Hello,

       

      I am new to qlikview. In my project we are doing incremental load and getting data as qvx files.

      On daily basis qvx files are coming and they are coming only incremental data only, but in my script it is converting into qvd from that incremental script only. It is not appending to existing QVD with incremental data.

       

      Inserts:

      LOAD *

      FROM

      $(Vqvxpath)packets_inserts.qvx(qvx);

       

      date_packets:

      LOAD Max(packet_record_updated_datetime) as packets_Maxdate

      Resident Inserts;

       

      LET VDate_packets = Peek('packets_Maxdate',0,'date_packets');

       

       

      DROP Table date_packets;

       

      STORE Inserts into $(Vqvdpath)packets.QVD(qvd);

      DROP Table Inserts;

       

      Updates:

      LOAD *

      FROM

      $(Vqvxpath)packets_updates.qvx(qvx)

      Where (packet_record_updated_datetime >'$(VDate_packets)') ;

       

      Concatenate

      LOAD *

      FROM

      $(Vqvdpath)packets.qvd(qvd)

      Where not Exists(team_key);

       

      STORE Updates into $(Vqvdpath)packets.QVD(qvd);

       

      Pls help on this issue.

       

      Regards,

      Kumar

        • Re: Incremental load
          Massimo Grossi

          add this statement

          trace $(VDate_packets) ;

          and see if you get a num o r  a  date

          and also if the format is the same of packet_record_updated_datetime

          • Re: Incremental load
            Jonathan Dienst

            Change this line:

             

            LET VDate_packets = Peek('packets_Maxdate',0,'date_packets');

            to

            LET VDate_packets = Num(Peek('packets_Maxdate',0,'date_packets'));

             

            And this line

            Where (packet_record_updated_datetime >'$(VDate_packets)') ;

            to

            Where (packet_record_updated_datetime > $(VDate_packets)) ;


            This removes possible confusion due to date formatting.

              • Re: Incremental load
                kumar avisakula

                Hi,

                 

                My requirement is here, I have written incremental script as above. In full load it is loading and converting  QVD.

                In incremental load I will get only limited records as QVX file, if i run script, QVD will be overriding with existing QVD instead of appending new records to existing QVD. So I want to check whether I have written the code is correct or not. I mean it will override or it will do append.

                 

                Regards,

                Kumar

                  • Re: Incremental load
                    Jd Sreeni

                    Reddy -

                     

                    Please be noted that there won't be any appending to QVD... QV always creates new QVD with the timestamp... that means every time it will create a new QVD.

                     

                    HTH

                    SReeni

                      • Re: Incremental load
                        kumar avisakula

                        Hi Sreeni,

                         

                        Thanks for giving reply to my query.

                        My requirement is Everyday I will get inserts and updates QVX files. If run I run this QVD generation script today it will create  one QVD file. Tomorrow I will get some more inserts and updates QVX files which will contain new records. If I run now, that existing records are overwriting with newly records. But it should not happen and I want to see yesterday's data and today's data also.

                        This QVX files are getting overwriting everyday, but my my QVD should not overwrite and should add new records to the existing records.

                         

                        Is it possible to new add records to existing records based on my daily QVX files.

                         

                        Please help on this one.

                         

                        Regards,

                        Kumar

                          • Re: Incremental load
                            Jd Sreeni

                            Hi Kumar,

                             

                            To preserve the history, implement Slowly Changing Dimensions (SCD-2) where you can keep the historic information with an additional column like timestamp or flag...

                             

                            Let me know if you need any help.

                             

                            HTH

                            Sreeni

                            • Re: Incremental load
                              Peter Cammaert

                              Yes, it's perfectly possible to do what you want (even with separate inserts and updates). Have a look at Jonathan's suggestion below, because his script change implements what you're after. Taking into account what Jd Sreeni said, the basic mechanism is as follows:

                               

                              Every day you'll have three sets of rows:

                              • A history file called packets.qvd. That's the complete data set created during yesterday's reload
                              • An "inserts" file called packets_inserts.qvx. Those are the new records, added to the data source since our last reload
                              • An "Updates" file called packets_updates.qvx. Those are the updates to existing records that have been made since our last reload.

                               

                              The incremental load consists of the following steps:

                              • Load all inserts into memory. Let's call this table NewData.
                              • Add all updates to NewData that are more recent than yesterday's history file
                              • Add all records to NewData from yesterday's history file that do not already exist in NewData
                              • Store NewData to packets.qvd. We just created an up-to-date reference file that tomorrow will become the history file.

                               

                              To make this work, you'll need to decide on a Primary Key that identifies each record in your QVD in a unique way. Using this primary key, you can decide whether a record already exists in NewData and should not be loaded again from anywhere else.

                               

                              Best,

                               

                              Peter

                                • Re: Incremental load
                                  kumar avisakula

                                  Hi Peter,

                                   

                                  Still I am in confusion. Please bear with me

                                   

                                  In full load I got packets_inserts.QVX as 400 mb and packets_updates.QVX as 300 mb. Based on this i created packets.qvd.

                                  In second day we have done incremental load and got packets_inserts.QVX as 100 mb packets_updates.qvx as 30 mb and these files overriding the existing QVX files. So I have only fresh data and based on this I want existing full load and incremental data QVD.

                                  According to my script it is creating packet.qvd from incremental data only and it is not having previous data(full load) in packets.QVD. But I have to have in QVD full load data and as well as incremental data.

                                   

                                  Regards,

                                  Kumar

                                    • Re: Incremental load
                                      Peter Cammaert

                                      No problem, I'll just repeat what I said before about the Incremental Load steps:

                                       

                                      • Load all inserts into memory. Let's call this table NewData.
                                      • Add all updates to NewData that are more recent than yesterday's history file
                                      • Add all records to NewData from yesterday's history file that do not already exist in NewData
                                      • Store NewData to packets.qvd. We just created an up-to-date reference file that tomorrow will become the history file.

                                       

                                      You see, there are no STORE statements inbetween these steps. Only at the end and just a single one. The idea is that you assemble your final table in-memory from the different local data sources: Inserts QVX, Updates QVX and History QVD. The three thrown together are stored as the new reference QVD. At that time, this QVD contains records from all three sources and without doubles.

                                       

                                      Adjust your script accordingly. Start by eliminating the different STORE statements, as you'll be overwriting your QVD again and again.

                                       

                                      Best,

                                       

                                      Peter

                                       

                                      [Edit] In bold is the step that adds Full load records that haven't changed. The history file is just yesterday's version of packets.qvd. In the last step, we'll be overwriting that file with today's (new) version.

                                        • Re: Incremental load
                                          kumar avisakula

                                          Hi Peter,

                                          Thanks for your patience.

                                           

                                          Inserts:

                                          LOAD *

                                          FROM

                                          $(Vqvxpath)packets_inserts.qvx(qvx);

                                          date_packets:

                                          LOAD Max(packet_record_updated_datetime) as packets_Maxdate

                                          Resident Inserts;

                                          LET VDate_packets = Peek('packets_Maxdate',0,'date_packets');

                                          DROP Table date_packets;

                                          Concatenate(Inserts)

                                          LOAD *

                                          FROM

                                          $(Vqvxpath)packets_updates.qvx(qvx)

                                          Where (packet_record_updated_datetime >'$(VDate_packets)') ;

                                          Concatenate(Inserts)

                                          LOAD *

                                          FROM

                                          $(Vqvdpath)packets.qvd(qvd)

                                          Where not Exists(team_key);

                                           

                                          In this script it will create Packets.QVD. In incremental load it has to check that already created packets.QVD data and then it has to add incremental data to to existing QVD .  But in our script, we are not able check already created packets.QVD data. like below

                                          maxdate:

                                          load max(date) as maxdate

                                          from old.qvd;


                                          Pls help on this.

                                           

                                          Regards,

                                          Kumar

                                            • Re: Incremental load
                                              Peter Cammaert

                                              There are two ways to go from here:

                                              1. Do you want to keep ony one version of every row with a unique team_key? The effect will be that you will keep the updated row from the Updates QVX, and throw away the old row from packets.qvd.
                                              2. Or do you want to keep all previous versions of rows with a unique team_key value? The effect will be that all rows from the history file will be kept, and all inserts and updates will simply be added to this existing QVD. In this way, you can go back to previous versions by date-of-change.

                                               

                                              I think you want the first solution. But better be sure about this, so please make your selection.

                                               

                                              Peter

                                                • Re: Incremental load
                                                  Peter Cammaert

                                                  If you select the first approach, try with this script:

                                                   

                                                  // Everyday, we have fresh Inserts and fresh Updates in QVX files

                                                   

                                                  NewPackets:

                                                  LOAD *

                                                  FROM [$(Vqvxpath)packets_updates.qvx] (qvx); // Updates may be on Inserts

                                                   

                                                  CONCATENATE (NewPackets)

                                                  LOAD *

                                                  FROM [$(Vqvxpath)packets_inserts.qvx] (qvx)

                                                  WHERE Not Exists(team_key); // Add Inserts that do not exist yet


                                                  CONCATENATE (NewPackets)

                                                  LOAD *

                                                  FROM [$(Vqvdpath)packets.qvd] (qvd)

                                                  WHERE Not Exists(team_key); // Add history that hasn't been updated

                                                   

                                                  STORE NewPackets INTO [$(Vqvdpath)packets.qvd] (qvd); // Overwrite history

                                                   

                                                  Best,

                                                   

                                                  Peter

                                                  • Re: Incremental load
                                                    kumar avisakula

                                                    Hi Peter,

                                                     

                                                    I want to maintain all versions of data including old and new.

                                                     

                                                    Regards,

                                                    Kumar

                                      • Re: Incremental load
                                        Jonathan Dienst

                                        Sreeni is correct. The QVD is overwritten on each STORE. Make these changes:

                                         

                                        Change this:

                                         

                                        STORE Inserts into $(Vqvdpath)packets.QVD(qvd);

                                        DROP Table Inserts;

                                         

                                        Updates:

                                        LOAD *

                                        FROM

                                        $(Vqvxpath)packets_updates.qvx(qvx)

                                        Where (packet_record_updated_datetime >'$(VDate_packets)') ;

                                         

                                        Concatenate

                                        LOAD *

                                        FROM

                                        $(Vqvdpath)packets.qvd(qvd)

                                        Where not Exists(team_key);

                                         

                                        To this:

                                         

                                        Concatenate(Inserts)

                                        LOAD *

                                        FROM

                                        $(Vqvxpath)packets_updates.qvx(qvx)

                                        Where (packet_record_updated_datetime >'$(VDate_packets)') ;

                                         

                                        Concatenate(Inserts)

                                        LOAD *

                                        FROM

                                        $(Vqvdpath)packets.qvd(qvd)

                                        Where not Exists(team_key);