9 Replies Latest reply: Nov 26, 2013 4:42 AM by Friedrich Hofmann RSS

    LOAD on top of SQL_SELECT - where to put timestamp?

    Friedrich Hofmann

      Hi,

       

      I have a script which loads quite a lot of data from our personell database and, moreover, from a table that is huge - the biggest in the database - and thus it takes a while to load the data from the database.

      I usually display the data in that app only up to (yesterday), however, so the data doesn't change during the day, and some apps in which that data is being used are reloaded every 3 hrs.

      So I want to use a timestamp to make sure that part of the script is only run once a day in the morning and not every time the app is reloaded.

       

      => My question is: Where do I have to put that timestamp? In the LOAD or in the underlying SQL_SELECT - that would make more sense I guess - or both?

       

      Thanks a lot!

      Best regards,

       

      DataNibbler

       

      P.S.: Also, I don't know what would be the command to use in the SQL_SELECT command.

        • Re: LOAD on top of SQL_SELECT - where to put timestamp?
          Jonathan Dienst

          Hi

           

          Not sure if I fully understand, but what about storing a variable with the last reload time, or using the ReloadTime() system variable, like this:


          If Floor(ReloadTime()) < Today(1) Then


               .... perform reload ....


               Store Data into DailyQVD.qvd (qvd);


          Else

              

               .... load for from QVD(s)


          End If

           

          HTH

          Jonathan

           

          Edit: you could also consider partial reloads or a dedicated QVD generator, and then all your models refer to the QVD file(s) generated by the QVD generator.

            • Re: LOAD on top of SQL_SELECT - where to put timestamp?
              Friedrich Hofmann


              Hi Jonathan,

               

              that is different from the way I was planning to do it, but also possible - that way, if I understand correctly,

               

              => it would encompass both the LOAD and the SQL_SELECT and perform both only if a certain condition is met (in my instance, if the current_timestamp (to be generated with NOW(0) or reloadtime().

              (there would be no ELSE in my instance, there are no qvd files for personell data)

               

              Thanks a lot!

              Best regards,

               

              DataNibbler

                • Re: LOAD on top of SQL_SELECT - where to put timestamp?
                  Friedrich Hofmann

                  Hi Jonathan,

                   

                  one more question: I will have to conditionalize several subsequent LOADs (RESIDENTs) according to that timestamp - I can do that all in one IF_then_clause, can't I?

                  => IF the condition is "reload only if NOW() is before 8am" and the app is reloaded again at 1pm and at 7pm - will that data still be available after those reload times?

                   

                  Thanks a lot!

                  Best regards,

                   

                  DataNibbler

                    • Re: LOAD on top of SQL_SELECT - where to put timestamp?
                      Jonathan Dienst

                      That's why the first load of the day in my earlier post saves a QVD. Later loads in the same day reuse the QVD, rather than hitting the SQL database again. The QVD load can be an optimised load which makes it much faster than loading from the database.

                       

                      On my post, the line saying "load for QVD" is a typo, it should say "load from QVD". I have corrected that.

                       

                      You can do it without saving a QVD if you can implement a partial reload, but I am not clear what you are reloading with those subsequent loads. If the data is the same, why do you need reloads? I am assuming that there is another data source in your model which is changing during the day.

                      .

                      And yes, you can enclose multiple loads within the If ... then ... else .... end if structure.

                       

                      Jonathan

                        • Re: LOAD on top of SQL_SELECT - where to put timestamp?
                          Friedrich Hofmann


                          Hi Jonathan,

                           

                          tthere are multiple data sources, yes - all from the personell_database, but from multiple tables.

                          My problem is only with that one which is huge and takes a long time.

                           

                          I interpret your post such that without a qvd, the data won't be available at the time of the next reloads, right?

                          I'll try anyway.

                          That's a bit of a dilemma: I wanted to avoid loading that data all three hrs when it doesn't change during the day. But storing personell_data in a qvd file that by definitionem cannot be secured seems a bit risky. I tried to have a directory generated which only few people have access to, but that would always include all of our IT - globally, which is way too much.

                          I'll try replacing the Pers_ID with an AutoNumber()-value so it is relatively safe to store... I'll have to ask the HR_supervisor, though - if he says no, I cannot do it.

                          And don't worry about typos - I have enough experience to know to read the help_file ;-)

                           

                          Best regards,

                           

                          DataNibbler

                           

                          P.S.: O_O - something seems to be wrong: I just found this line in my log_file

                          >>> 26.11.2013 07:00:03: 0158  IF (NOW(0) > '01:00:00' AND NOW(0) < '08:00:00') then <<<

                          (the next line is the next data_source -> the data inside that clause was not loaded. Can you help me there?

                            • Re: LOAD on top of SQL_SELECT - where to put timestamp?

                              Hi,

                               

                              As Jonathan suggested, you may also use partial reloads if the QVD does not suit you and your company. Even if the QVD solution seems to be the best in terms of efficiency.

                               

                              Use REPLACE ONLY and ADD ONLY prefix before the LOAD statements. But take car:

                              1) the user (or any action) must perform this partial reload

                              2) you have to test correctly the data that are loaded again to not append twice or three times to the database: you must use a exists() on a field or a key on several fields. If you do not test what you ADD ONLY, you will see your data increase load after load.

                               

                              Fabrice

                                • Re: LOAD on top of SQL_SELECT - where to put timestamp?
                                  Friedrich Hofmann

                                  Hi Aunez,

                                   

                                  thanks a lot!

                                  I just talked to the HR_supervisor and I think that I will go with the QVD thing, after all. I have a directory, access to which is rel. restricted. I will see what pieces in the data I can maybe hide to make an association difficult.

                                  Right now, however, it doesn't seem to work at all - see that line from my log_file that I posted.

                                  Can you help me there? It looks absolutely correct - yet that data is not loaded as I can see in the log_file - and, of course, the table is not available...

                                • Re: LOAD on top of SQL_SELECT - where to put timestamp?
                                  Jonathan Dienst

                                  Remember that at the start of a normal reload, QV erases all the data. So you have to load everything, but loading from a QVD is very fast (providing it is an optimised load), even if  the file is huge.

                                   

                                  You could consider using a partial reload for the intraday loads - this means adding REPLACE or ADD prefixes to the load statements (and that the loaded data will replace or add to that table). In a partial reload QV does not erase any data and normal LOAD statements are ignored, only the ADD LOADs and REPLACE LOADs take place.

                                   

                                  As far as the time conditions are concerned, I would not use Now(0) for this purpose - it refers to the time of the previously finished reload. Rather use Now(1) - you can ignore the warning in the manual. The value returned by Now() includes the date, so you have to Frac() it to discard the date portion. Also you may have a time interpretation issue - rather do this

                                   

                                       If Frac(Now(1)) > Time#('01:00:00') And Frac(Now(1)) < Time#('08:00:00') ....

                                   

                                  or even this:

                                   

                                       If Frac(Now(1)) > 1/24 And Frac(Now(1)) < 8/24 ....

                                   

                                  Jonathan