Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
michiel_
Contributor II
Contributor II

Multiple file load with file name range

Hi,

I'm new to the forum, thanks for all the input I've found here already! I've seen several ways to load multiple files from a folder.


I haven't found a good solution to load a number of files based on a range in their filenam. I'm reaching out to see if multiple methods could be combined to a simple solution.

I have a few requimrents:

  • Load all files
  • Load a range of files
  • Keep the script as short or simple as possible. Short scripts are easier to teach to others.
  • Keep the performance ok
  • Should work with preceeding load

I currently know of the following 4 methods:

  • Load all files with wildcard in FROM statement. Works fine for all files in a folder.
  • Load all files with wildcard using Filelist function. Works fine for all files in a folder. Can also work with preceeding load.
  • Load all files with wildcard and a WHERE statement to only load the desired file name range. This method works, but checks each file for the WHERE statement. This makes the script very slow.
  • Load a range of files using duplicate script. It starts a table from a certain variable and for each file until the next variable it concatenates the next file. Works with name range, but the full load script (all column calculations etc..) is duplicated, which looks very messy.

I think the solution should be in the filelist method. The range would be properly captured with either another function or a variable. The result could then be put in the filelist function.

Does anyone have an idea how to script this?


Let's assume I have the following files in 1 folder I want to load from.:

180814.xlsx

180813.xlsx

180810.xlsx

180809.xlsx

180808.xlsx

180807.xlsx

180806.xlsx

180803.xlsx

180802.xlsx

This is the current filelist script which works fine for all files, but not for a range of files.

FOR EACH File in filelist('lib://.......\*.xlsx');

SET vFile = '$(File)';

Table:

Load *,

    ApplyMap('Map1',[ColumnX],null()) as "MappedValue";

LOAD *,

    FileBaseName() as "FileName",

    date(date#(FileBaseName(),'YYMMDD'),'DD-MM-YYYY') as "FileDate",

FROM [$(vFile)] (ooxml, embedded labels, table is Sheet1)

;

NEXT File;

2 Solutions

Accepted Solutions
michiel_
Contributor II
Contributor II
Author

Hi All,

Thanks a million for all the input. I've kept on searching around and have combined the solution i wanted now.

First I load all the filenames into a table. This is super quick. Next to the full filename I load whatever I want to select on. In this case there is a list list with incremental file ID's like:

Test_1234

Test_1235

Test_1236

Test_1237

In a resident load I can then select the required files.

Then I For...Next through the list of selected files.

Here's the script.

Thanks to https://gist.github.com/mountaindude/fbfefce78ac5e3ba4d5a for the first bit of script.

Thanks to atkinsow‌ for the last push in the thread Re: load files with specific names from folder through for loop

For each Filename in 'lib://.....\'

    sub DoDir (Root)

          for each Ext in 'qvd'

              for each File in filelist (Root&'\Test*.' &Ext)

                    QVD_Files:

                        Load '$(File)' as Filename,

                        Subfield('$(File)','/',-1) as QVDName,

                        Left(SubField('$(File)','_',-1),Len(SubField('$(File)','_',-1))-4) as QVDInteger

//                          FileSize( '$(File)' ) as QVDSize,

//                          FileTime( '$(File)' ) as QVDFileTime

                    autogenerate 1;

              next File

          next Ext

// Uncomment if you need to include QVDs in subdirectories

//          for each Dir in dirlist (Root&'\*' )

//              call DoDir (Dir)

//          next Dir

    end sub

call DoDir ('$(Filename)')

next;

SelectedFiles:

Load

Filename

Resident QVD_Files

Where QVDInteger >= 1235 and QVDInteger <= 1237;

Drop table QVD_Files;

LET NumRows=FieldValueCount('Filename');

FOR i=1 to $(NumRows)

LET vFile = peek('Filename',$(i)-1,'SelectedFiles');

let vQVD= ('$(vFile)');

Table:

Load *;

Concatenate Load * From ['$(vQVD)'] (qvd);

next;

Drop table SelectedFiles;

View solution in original post

michiel_
Contributor II
Contributor II
Author

Hi All,

Next to the solution described above, I've also found another solution with the help of a colleague. 

FOR period = 10001 TO 10101
                IF NOT ISNULL(QVDCREATETIME('lib://PATH\FILE$(period).qvd')) THEN

Table:
LOAD
*,
FROM [lib://PATH\FILE$(period).qvd]
(qvd);
END IF
NEXT;
;

Using this method, any range will be possible in files while not having to look in each file. The QVDCREATETIME function is used to skip missing files in the range. 

I really like this solution as the script is small and fast.

Hope this can be of help for anyone else:)

View solution in original post

6 Replies
bwisealiahmad
Partner - Specialist
Partner - Specialist

Hi,

How about adding a step before the "For Each" look where you load the File names into a Temp Table where you load Filenames based on the range conditions and then use those file names for your For Each clause.

That way you will only load data for the filenames that are picked up on certain conditions.

I found this example on the forum which kind of illustrates the same, except that here they have noted which ones they do not want to include and check against that kind of list.

ExceptionList: 

Load Concat(Chr(39) & Name & Chr(39),', ') as FileList; 

LOAD * INLINE [ 

    Name 

    File1 

    File2 

    File3 

    File4 

    File5 

]; 

 

LET vFileList = Peek('FileList'); 

 

DROP Table ExceptionList; 

 

Data: 

LOAD *, 

Filebasename() as FileName FROM 

 

(qvd) Where Not Match(FileBaseName(), $(vFileList)); 

Best,

Al

marcus_sommer

For me it's not really clear what do you want to do but I assume that you want to load only certain files from a folder. In regard to your example it seems quite easy with something like:

FOR EACH File in filelist('lib://.......\*.xlsx');

     let vDate = (year(today()-14)-2000)*10000+(month(today()-14)*100)+day(today()-14);

     if subfield(subfield('$(File)', '\', -1), '.', 1) >= $(vDate) then

           Table: Load * FROM [$(File)] (ooxml, embedded labels, table is Sheet1);

     end if

NEXT

to load for example the files from the last 14 days - with subfield() or any other string-functions could extract any information from the filename (and this without loading these files) and only if this information matched with your condition the file will be loaded. This meant it depends on the decoded information within the filename and your condition how easy it is to implement. By only period-related information it's often quite simple.

- Marcus

michiel_
Contributor II
Contributor II
Author

Hi Marcus,

This looks like a good idea! Tried it, When I only use the PeriodMin, it still loads all files. When I use both parameters like the the following script, no files are loaded (while there are files which match the criteria. I've tried to Num() to all statements, but no luck. Any ideas?

FOR EACH File in filelist('lib://...\*.xlsx');

SET vFile = '$(File)';

Let vDate.PeriodMin = 180710;

Let vDate.PeriodMax = 180716;

If subfield(subfield('$(File)', '\', -1), '.', 1) >= '$(vDate.PeriodMin)'

and subfield(subfield('$(File)', '\', -1), '.', 1) <= '$(vDate.PeriodMax)'

    then

Table:

Load *

FROM [$(vFile)] (ooxml, embedded labels, table is Blad2)

;

End if

NEXT File;

marcus_sommer

Maybe the extract of the period from the filename didn't work like expected. To check this place the following below your period-variables:

let vCheck = subfield(subfield('$(File)', '\', -1), '.', 1);

TRACE '$(vCheck)';

and then you will see the result within the load progress-window and within the log-file.

But if I look again I assume the issue are the single-quotes around your variables by their call because they make the content to a string and not a number. Therefore try it in this way:

...

If subfield(subfield('$(File)', '\', -1), '.', 1) >= $(vDate.PeriodMin)

and subfield(subfield('$(File)', '\', -1), '.', 1) <= $(vDate.PeriodMax)

    then

...

- Marcus

michiel_
Contributor II
Contributor II
Author

Hi All,

Thanks a million for all the input. I've kept on searching around and have combined the solution i wanted now.

First I load all the filenames into a table. This is super quick. Next to the full filename I load whatever I want to select on. In this case there is a list list with incremental file ID's like:

Test_1234

Test_1235

Test_1236

Test_1237

In a resident load I can then select the required files.

Then I For...Next through the list of selected files.

Here's the script.

Thanks to https://gist.github.com/mountaindude/fbfefce78ac5e3ba4d5a for the first bit of script.

Thanks to atkinsow‌ for the last push in the thread Re: load files with specific names from folder through for loop

For each Filename in 'lib://.....\'

    sub DoDir (Root)

          for each Ext in 'qvd'

              for each File in filelist (Root&'\Test*.' &Ext)

                    QVD_Files:

                        Load '$(File)' as Filename,

                        Subfield('$(File)','/',-1) as QVDName,

                        Left(SubField('$(File)','_',-1),Len(SubField('$(File)','_',-1))-4) as QVDInteger

//                          FileSize( '$(File)' ) as QVDSize,

//                          FileTime( '$(File)' ) as QVDFileTime

                    autogenerate 1;

              next File

          next Ext

// Uncomment if you need to include QVDs in subdirectories

//          for each Dir in dirlist (Root&'\*' )

//              call DoDir (Dir)

//          next Dir

    end sub

call DoDir ('$(Filename)')

next;

SelectedFiles:

Load

Filename

Resident QVD_Files

Where QVDInteger >= 1235 and QVDInteger <= 1237;

Drop table QVD_Files;

LET NumRows=FieldValueCount('Filename');

FOR i=1 to $(NumRows)

LET vFile = peek('Filename',$(i)-1,'SelectedFiles');

let vQVD= ('$(vFile)');

Table:

Load *;

Concatenate Load * From ['$(vQVD)'] (qvd);

next;

Drop table SelectedFiles;

michiel_
Contributor II
Contributor II
Author

Hi All,

Next to the solution described above, I've also found another solution with the help of a colleague. 

FOR period = 10001 TO 10101
                IF NOT ISNULL(QVDCREATETIME('lib://PATH\FILE$(period).qvd')) THEN

Table:
LOAD
*,
FROM [lib://PATH\FILE$(period).qvd]
(qvd);
END IF
NEXT;
;

Using this method, any range will be possible in files while not having to look in each file. The QVDCREATETIME function is used to skip missing files in the range. 

I really like this solution as the script is small and fast.

Hope this can be of help for anyone else:)