7 Replies Latest reply: Jul 9, 2013 1:46 PM by Rikab Kothari RSS

    PICK or leave based on the timestamp.

    Rikab Kothari

      Hi All,

      Can someone please help me out for the following 2 requirement.

       

      1. I have number of xls which is in ExcelName_YYYYMM.xls format. I want to read all the xls in a particular folder and put a flag as picked based on the value in the variable. Imagine I have below xls in a folder and value in a variable is 4 then it will put latest 4 records as PICK and rest as OTHERS.

       

      ExcelName_201212.xls

      ExcelName_201211.xls

      ExcelName_201210.xls

      ExcelName_201209.xls

      ExcelName_201208.xls

      ExcelName_201207.xls

      ExcelName_201206.xls

      ExcelName_201205.xls

      ExcelName_201204.xls

      ExcelName_201203.xls

      ExcelName_201202.xls

      ExcelName_201201.xls

       

      Required Result:

       

      ExcelName_201212.xls    PICK

      ExcelName_201211.xls    PICK

      ExcelName_201210.xls    PICK

      ExcelName_201209.xls    PICK

      ExcelName_201208.xls    OTHERS

      ExcelName_201207.xls    OTHERS

      ExcelName_201206.xls    OTHERS

      ExcelName_201205.xls    OTHERS

      ExcelName_201204.xls    OTHERS

      ExcelName_201203.xls    OTHERS

      ExcelName_201202.xls    OTHERS

      ExcelName_201201.xls    OTHERS

       

      2. Imagine I have log files in a folder.As you know there can be N number of log files for the same date(as the creation timestamp is different). So based on the value in the variable it should put flag as PICK and rest as OTHERS.For example if the value in the variable is 3 it should put flag as PICK for all the logs for the latest 3 dates and rest as archived. Imagine you have 5 logs for the frst date,10 for second date and 5 for third date and 200 for rest of the dates. then put flag as PICK for first 20 and rest 200 as OTHERS.

       

      Hope my requirement is clear. If you have any questions please ask me.Hope someone can help me out on this.

       

      Regards,

      qvforum

       

      Logfile_20121201.xls    PICK(Put all the 5 files for this date as PICK)

      Logfile_20121202.xls    PICK(Put all the 10 files for this date as PICK)

      Logfile_20121203.xls    PICK(Put all the 5 files for this date as PICK)

      Logfile_20121204.xls    OTHERS(Put rest of the records as OTHERS From here)

      Logfile_20121205.xls    OTHERS

      Logfile_20121206.xls    OTHERS

      Logfile_20121207.xls    OTHERS

      Logfile_20121208.xls    OTHERS

      Logfile_20121209.xls    OTHERS

      Logfile_20121210.xls    OTHERS

      Logfile_20121211.xls    OTHERS

      Logfile_20121212.xls    OTHERS

       

      Thanks,

      qvforum

        • Re: PICK or leave based on the timestamp.
          Stefan Wühl

          Not really sure if I understood what you want to do.

           

          Maybe like attached?

           

          I used some sample Excelfiles with YearMonth and date / timestamp values in the filename:

           

          ExcelName_20120901_100800.xls

          ExcelName_20120901_121008.xls

          ExcelName_201210.xls

          ExcelName_20121002_101008.xls

          ExcelName_20121002_111232.xls

          ExcelName_201211.xls

          ExcelName_20121105_101008.xls

          ExcelName_20121105_121008.xls

          ExcelName_201212.xls

           

          You can parse these filenames and create a table of the names and add new fields to flag the latest files:

           

          For each vFile in FileList('ExcelName*.xls')

           

          Files:

          LOAD *,

               Date(Date#(keepchar(BaseName,'0123456789'),'YYYYMM')) as Date,

               Timestamp(Timestamp#(keepchar(BaseName,'0123456789'),'YYYYMMDDhhmmss')) as Timestamp;

          LOAD '$(vFile)' as Filename,

               subfield('$(vFile)','\',-1) as BaseName

          AutoGenerate 1;

           

          NEXT vFile

           

          Let vTop =2;

           

          Ex:

          LOAD max(Date, $(vTop)) as maxDate, max(daystart(Timestamp),$(vTop)) as maxDateTimestamp Resident Files;

           

          Left Join (Files) LOAD Date, if(Date >= peek('maxDate',0,'Ex'),'Keep','Other') as DateFlag Resident Files;

           

          Left Join (Files) LOAD Timestamp, if(daystart(Timestamp) >= peek('maxDateTimestamp',0,'Ex'),'Keep','Other') as TimestampFlag Resident Files;

           

           

          Hope this helps,

          Stefan

            • Re: PICK or leave based on the timestamp.
              Rikab Kothari

              I think I was not clear in explaining my requirement.

               

              1. Excel thing only have YYYY_MM along with the filename. It will not contain minutes and even days also. So based on my variable(if value is 2) it should latest 2 as pick and rest as others. Basically I want to read the file name of xls, and put flag based on our above calculation.

               

              2. This is for log file. Here we are not considerring the DateStamp which is there along with the logfile. Here will consider only the ModificationTime of file. So based of variable value we will match the date and put flag as pick and rest as others.

               

              Hope it is clear not

                • Re: PICK or leave based on the timestamp.
                  Stefan Wühl

                  1. should already work in above sample, I just called the field Date, but actually checked for your year month in the filename. You can also use other QV string or date / time parsing functions to get the number you are interested in from the BaseName.

                   

                  2. Ok, got it. You can use filetime() function to get the modification date when loading in a file.

                   

                  I changed the script to add 2. and hopefully make the things more clear:

                   

                   

                  For each vFile in FileList('ExcelName*.xls')

                   

                  Files1: // reading excel files with yearmonth in filename

                  LOAD *,

                       Date(Date#(keepchar(BaseName,'0123456789'),'YYYYMM')) as YearMonthFilename;

                  LOAD '$(vFile)' as Filename,

                       subfield('$(vFile)','\',-1) as BaseName

                  autogenerate 1;

                   

                  NEXT vFile

                   

                  Files2: // reading your logs with mod time

                  LOAD Filetime() as TimestampMod FROM *.log (txt)

                  where recno()=1;

                   

                  Let vTop =2;

                   

                  Ex1:

                  LOAD max(YearMonthFilename, $(vTop)) as maxYearMonthFilename Resident Files1;

                   

                  Ex2:

                  LOAD max(daystart(TimestampMod),$(vTop)) as maxDateTimestampMod Resident Files2;

                   

                  Left Join (Files1) LOAD YearMonthFilename, if(YearMonthFilename >= peek('maxYearMonthFilename',0,'Ex1'),'Keep','Other') as YearMonthFilenameFlag Resident Files1;

                   

                  Left Join (Files2) LOAD TimestampMod, if(daystart(TimestampMod) >= peek('maxDateTimestampMod',0,'Ex2'),'Keep','Other') as TimestampModFlag Resident Files2;

                   

                   

                  edit: You probably want to add the filename also to the Files2 table. Look into the QV file system function, you can use filename() for example:

                   

                  Files2: // reading your logs with mod time

                  LOAD FileName() as Filename2,

                      Filetime() as TimestampMod FROM *.log (txt)

                  where recno()=1;

                   

                  Instead of the file filter *.log, you can use the extensions and file name patterns that you need, of course, e.g. Logfile_*.txt (txt)

                   

                   

                    • Re: PICK or leave based on the timestamp.
                      Rikab Kothari

                      Thanks a ton! By looking at the output looks like you have done what I was lookin for. But still I want to check it by loading data from my side. I have couple of doubts below could you please help for the same?

                       

                      1. May I know what is the change in below For statement if all excels are @ C:\Users\Public\Downloads\Excel\

                       

                      For each vFile in FileList('ExcelName*.xls')

                       

                      2. What will be below load statement if my log files are @ C:\Users\Public\Downloads\Log\

                       

                      Files2: // reading your logs with mod time

                      LOAD Filetime() as TimestampMod FROM *.log (txt)

                      where recno()=1;

                       

                      Should I change something like this?

                       

                      1. For each vFile in FileList('C:\Users\Public\Downloads\Excel\*.xls')

                       

                      2. May I know what need to be done for logfile???

                        • Re: PICK or leave based on the timestamp.
                          Stefan Wühl

                          1.

                           

                          For each vFile in FileList('C:\Users\Public\Downloads\Excel\ExcelName*.xls')

                           

                          should do what you want (just change the file name pattern if needed, you may also just use what you suggested, if you are interested in all excel files at this location).

                           

                          Please look into the QV string functions and date / time interpretation functions if you need to parse the year month from your file name and having more complex file names than posted above.

                           

                          2.

                           

                          LOAD Filetime() as TimestampMod FROM C:\Users\Public\Downloads\Log\*.log (txt)

                          where recno()=1;

                           

                          should work, assuming that there log files are text files and have a file extension .log

                          Otherwise change .log to whatever extension you are using (or if you use several extensions, but all files are in this location, just use the star symbol (wildcard).

                           

                          Hope this helps,

                          Stefan

                            • Re: PICK or leave based on the timestamp.
                              Rikab Kothari

                              May I know the use of recno=1 in WHERE condition of the second example i,e while loading from log files?

                                • Re: PICK or leave based on the timestamp.
                                  Rikab Kothari

                                  I have another set of requirement which belongs to the task(Ex:1 above) which I am working on currently.

                                   

                                  I have field called "Location" in excel files which read in Ex:1(above).Once Files1 table is ready in above example with flag. I want to replace the root path in "Location" field of xls with given path.(i,e)Changing changing the root path from DEV environment to PROD environment.Once I replace the root path in above location I want either right back to the same excel and QVD.May I know how this can be achieved?

                                   

                                  I think we can use "REPLACE" function to replace the path explained above.(i,e)

                                  replace('Location','C:\Users\Public\Downloads\','D:\Users\Public\Downloads\') as New_Location

                                   

                                  Before using Replace:

                                   

                                  Location:

                                  C:\Users\Public\Downloads\Excel\

                                  C:\Users\Public\Downloads\xyz\

                                  C:\Users\Public\Downloads\abc\

                                   

                                  After using Replace:

                                  Location:

                                  D:\Users\Public\Downloads\Excel\

                                  D:\Users\Public\Downloads\xyz\

                                  D:\Users\Public\Downloads\abc\

                                   

                                  So may I know how this can be achieved at Excel level and QVD level. Is there is anyway we can replace the location as explained above and save the same in Excel and QVD.I know it should be possible QVD level, but not sure how that can be achieved. Can you please how this can be achived at QVD level and also at Excel level.

                                   

                                  Regards,

                                  qvforum