5 Replies Latest reply: Jan 12, 2011 3:10 AM by abcd1234 RSS

    Help with Incremental Reload

    Brian Garland

      We've got a table of about 23.5 million rows which represent 18 months (rolling) of data. There are about 80k new or updated rows every day. Using Rob Wunderlich's example, I have created the following script. But it runs really slow and in the end it fails to update the QVD. I'd appreciate it if someone could take a quick look and let me know if anything looks wrong.

       

      LET SCRIPT_START=now(1);

      LET PKexp=autonumberhash128(Branch_Plant,SKU,Lot_Code,Location,Transaction_Date,Transaction_Time);

      IF filesize('QVD/Detail_Inventory_Transactions_Last2_FY.qvd') > 0 THEN

      SET QVD_EXISTS=1;

      LET QVD_ROWS_BEFORE=QvdNoOfRecords('QVD/Detail_Inventory_Transactions_Last2_FY.qvd');

      ELSE

      SET QVD_EXISTS=1; // Always make True becasue this part isn't working for some reason.

      LET QVD_ROWS_BEFORE=QvdNoOfRecords('QVD/Detail_Inventory_Transactions_Last2_FY.qvd');

      END IF

      IF $(QVD_EXISTS) THEN

      GetLastDateTemp:

      Directory;

      LOAD Record_Update_Date FROM QVD/Detail_Inventory_Transactions_Last2_FY.qvd (qvd);

      LOAD max(Record_Update_Date) as MaxLastUpdateDate RESIDENT GetLastDateTemp;

      LET LastUpdateDate = MaxLastUpdateDate;

      LET MaxLastDate = text(date(fieldValue('MaxLastUpdateDate', 1),'MM/DD/YY hh:mm tt'));

      DROP table GetLastDateTemp;

      ELSE

      LET MaxLastDate = '4/1/2009';

      END IF

      DIT:

      SQL SELECT *

      FROM Detail_Inventory_Transactions

      WHERE "Record_Update_Date" >= '$(MaxLastDate)'

      ;

      IF $(QVD_EXISTS) THEN

      RIGHT JOIN (DIT) LOAD

      *,

      $(PKexp) as PK

      RESIDENT DIT;

      CONCATENATE (DIT) LOAD * FROM QVD/Detail_Inventory_Transactions_Last2_FY.qvd (qvd)

      WHERE NOT exists(PK, $(PKexp));

      DROP FIELD PK;

      END IF

      STORE DIT INTO QVD/Detail_Inventory_Transactions_Last2_FY.qvd;

      DROP TABLE DIT;

      LET LOAD_DURATION=now(1) - SCRIPT_START;

      LET QVD_ROWS_AFTER=QvdNoOfRecords('QVD/Detail_Inventory_Transactions_Last2_FY.qvd');

        • Help with Incremental Reload
          Brian Garland

          I forgot to add:

          The primary problem I'm having is that the new QVD being written only contians the new/updated records. It's not merging in the existing records from the QVD. So my file starts at 1.5GB in size but ends up as only 9MB.

            • Help with Incremental Reload
              Rob Wunderlich

              Briain,

              I don't see anything obvious that would explain why you're not merging. Can you tell from the log if the IF block that contains the RIGHT JOIN and the CONCATENATE is being executed?

              It would be helpful if you posted the document log. Can you post it here?

              -Rob

                • Help with Incremental Reload
                  Brian Garland

                  Thanks for the reply Rob!

                  I think I've got it working properly now (log contains no errors), but I'm including the revised script if you have a few minutes to look it over. The record delete portion is still commented out - does it look correct to you? It creates a 1.5GB QVD file that contains about 22 million records. It takes about 3 hours to run (25 minutes just to get the Max Date from the QVD).

                  - Brian

                  // The following script is based on the Incremental Reload script by Rob Wunderlich

                   

                  CONNECT TO [Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=Fg_DataMart;Data Source=occ01db055r;Use Procedure for Prepare=1;Auto Translate=True;Packet Size=4096;Workstation ID=BOI03LT073;Use Encryption for Data=False;Tag with column collation when possible=False];

                  Directory QVD/;

                   

                  LET SCRIPT_START=now(1);

                  SET QVDFILE=Detail_Inventory_Transactions_Last2_FY.qvd;

                  SET PKexp=autonumberhash128(Branch_Plant,SKU,Lot_Code,Location,Transaction_Date,Transaction_Time);

                  IF FileSize('$(QVDFILE)') > 0 THEN

                  SET QVD_EXISTS=1;

                  Let QVD_ROWS_BEFORE=QvdNoOfRecords ('$(QVDFILE)');

                  ELSE

                  SET QVD_EXISTS=0;

                  END IF

                   

                  IF $(QVD_EXISTS) THEN

                  LET GETMAXDATE_START=now(1);

                  maxdate:

                  LOAD max(Record_Update_Date) as maxdate

                  FROM $(QVDFILE) (qvd);

                  LET GETMAXDATE_END=now(1)-GETMAXDATE_START;

                  LET vMaxDate = date(fieldValue('maxdate', 1),'MM/DD/YYYY hh:mm:ss');

                  DROP table maxdate;

                  END IF

                   

                  IF $(QVD_EXISTS) THEN

                  SET vSQLQuery = SELECT * FROM Detail_Inventory_Transactions WHERE Record_Update_Date >= '$(vMaxDate)';

                  ELSE

                  SET vSQLQuery = SELECT * FROM Detail_Inventory_Transactions WHERE Gl_Date >= '09/01/2007 00:00:00';

                  END IF

                   

                  LET GETNEWDATA_START=now(1);

                  TempTable:

                  SQL $(vSQLQuery);

                  LET GETNEWDATA_END=now(1)-GETNEWDATA_START;

                   

                  IF $(QVD_EXISTS) THEN

                  LET GETALLDATA_START=now(1);

                  RIGHT JOIN (TempTable)

                  LOAD

                  *,

                  $(PKexp) as PK

                  RESIDENT TempTable;

                  CONCATENATE (TempTable)

                  LOAD

                  *

                  FROM $(QVDFILE) (qvd)

                  WHERE NOT exists(PK, $(PKexp));

                  //AND Record_Update_Date >= AddMonths(today(),-24);

                   

                  DROP FIELD PK;

                  LET GETALLDATA_END=now(1)-GETALLDATA_START;

                  END IF

                   

                  LET STOREDATA_START=now(1);

                  STORE TempTable INTO $(QVDFILE);

                  LET STOREDATA_END=now(1)-STOREDATA_START;

                  DROP TABLE TempTable;

                  LET LOAD_DURATION=now(1) - SCRIPT_START;

                  LET QVD_ROWS_AFTER=QvdNoOfRecords ('$(QVDFILE)');

                    • Help with Incremental Reload
                      Rob Wunderlich

                      Brian,

                      Glad you got it working as far as getting correct results. On performance, I'll offer a couple suggestions.

                      - You are probably spending a lot of time generating the PK. Also the PKexp in the WHERE NOT exists() is preventing an optimized load of the QVD. It may be better if you generate the PK only on the new rows and save the PK in the QVD. If you save PK, you can't use autonumber(). You'll have to use something like hash160() which increase the size of your QVD but may be worth the tradeoff. With a saved PK, the WHERE expression is just "WHERE NOT exists(PK)". Exists() with a single expression allows for optimized load.

                      - If you are only running the update once or a few times a day, you might consider another approach to the maxdate problem. I think sweeping the data to get maxdate is the most accurate technique, but it comes at a performance cost. Perhaps you could use QvdCreateTIme() with a day resolution. That is, select based on date only, don't worry about time. You may duplicate some selects day to day, but the PK will keep them out of the QVD.

                      - If possible, don't use CONCATENATE in the QVD merge. It's not necessary unless you are adding fields. I've seen cases where CONCATENATE seemed to prevent an optimized load.

                      - The AddMonths() to roll off older data looks correct, but leave it out for the moment because it will prevent an optimized load. If saving the PKs as suggested gives you an optimized load, it may be better to do the rolloff in a subsequent load against the RESIDENT file.

                      Good luck. Let us know how it turns out. (I'm taking off for a week but will check when I return).

                      -Rob

                      • Help with Incremental Reload

                        I have implemented your code it is working fine my side.

                        thanks to you.

                        can you put some light on -

                        what is the purpose of this -

                        Directory QVD/;

                        is it required?