Discussion board where members can learn more about Qlik Sense Data Connectivity.
Hi,
I'm new to the forum, thanks for all the input I've found here already! I've seen several ways to load multiple files from a folder.
I haven't found a good solution to load a number of files based on a range in their filenam. I'm reaching out to see if multiple methods could be combined to a simple solution.
I have a few requimrents:
I currently know of the following 4 methods:
I think the solution should be in the filelist method. The range would be properly captured with either another function or a variable. The result could then be put in the filelist function.
Does anyone have an idea how to script this?
Let's assume I have the following files in 1 folder I want to load from.:
180814.xlsx
180813.xlsx
180810.xlsx
180809.xlsx
180808.xlsx
180807.xlsx
180806.xlsx
180803.xlsx
180802.xlsx
This is the current filelist script which works fine for all files, but not for a range of files.
FOR EACH File in filelist('lib://.......\*.xlsx');
SET vFile = '$(File)';
Table:
Load *,
ApplyMap('Map1',[ColumnX],null()) as "MappedValue";
LOAD *,
FileBaseName() as "FileName",
date(date#(FileBaseName(),'YYMMDD'),'DD-MM-YYYY') as "FileDate",
FROM [$(vFile)] (ooxml, embedded labels, table is Sheet1)
;
NEXT File;
Hi All,
Thanks a million for all the input. I've kept on searching around and have combined the solution i wanted now.
First I load all the filenames into a table. This is super quick. Next to the full filename I load whatever I want to select on. In this case there is a list list with incremental file ID's like:
Test_1234
Test_1235
Test_1236
Test_1237
In a resident load I can then select the required files.
Then I For...Next through the list of selected files.
Here's the script.
Thanks to https://gist.github.com/mountaindude/fbfefce78ac5e3ba4d5a for the first bit of script.
Thanks to atkinsow for the last push in the thread Re: load files with specific names from folder through for loop
For each Filename in 'lib://.....\'
sub DoDir (Root)
for each Ext in 'qvd'
for each File in filelist (Root&'\Test*.' &Ext)
QVD_Files:
Load '$(File)' as Filename,
Subfield('$(File)','/',-1) as QVDName,
Left(SubField('$(File)','_',-1),Len(SubField('$(File)','_',-1))-4) as QVDInteger
// FileSize( '$(File)' ) as QVDSize,
// FileTime( '$(File)' ) as QVDFileTime
autogenerate 1;
next File
next Ext
// Uncomment if you need to include QVDs in subdirectories
// for each Dir in dirlist (Root&'\*' )
// call DoDir (Dir)
// next Dir
end sub
call DoDir ('$(Filename)')
next;
SelectedFiles:
Load
Filename
Resident QVD_Files
Where QVDInteger >= 1235 and QVDInteger <= 1237;
Drop table QVD_Files;
LET NumRows=FieldValueCount('Filename');
FOR i=1 to $(NumRows)
LET vFile = peek('Filename',$(i)-1,'SelectedFiles');
let vQVD= ('$(vFile)');
Table:
Load *;
Concatenate Load * From ['$(vQVD)'] (qvd);
next;
Drop table SelectedFiles;
Hi All,
Next to the solution described above, I've also found another solution with the help of a colleague.
FOR period = 10001 TO 10101 IF NOT ISNULL(QVDCREATETIME('lib://PATH\FILE$(period).qvd')) THEN Table: LOAD *, FROM [lib://PATH\FILE$(period).qvd] (qvd); END IF NEXT; ;
Using this method, any range will be possible in files while not having to look in each file. The QVDCREATETIME function is used to skip missing files in the range.
I really like this solution as the script is small and fast.
Hope this can be of help for anyone else:)
Hi,
How about adding a step before the "For Each" look where you load the File names into a Temp Table where you load Filenames based on the range conditions and then use those file names for your For Each clause.
That way you will only load data for the filenames that are picked up on certain conditions.
I found this example on the forum which kind of illustrates the same, except that here they have noted which ones they do not want to include and check against that kind of list.
ExceptionList:
Load Concat(Chr(39) & Name & Chr(39),', ') as FileList;
LOAD * INLINE [
Name
File1
File2
File3
File4
File5
];
LET vFileList = Peek('FileList');
DROP Table ExceptionList;
Data:
LOAD *,
Filebasename() as FileName FROM
(qvd) Where Not Match(FileBaseName(), $(vFileList));
Best,
Al
For me it's not really clear what do you want to do but I assume that you want to load only certain files from a folder. In regard to your example it seems quite easy with something like:
FOR EACH File in filelist('lib://.......\*.xlsx');
let vDate = (year(today()-14)-2000)*10000+(month(today()-14)*100)+day(today()-14);
if subfield(subfield('$(File)', '\', -1), '.', 1) >= $(vDate) then
Table: Load * FROM [$(File)] (ooxml, embedded labels, table is Sheet1);
end if
NEXT
to load for example the files from the last 14 days - with subfield() or any other string-functions could extract any information from the filename (and this without loading these files) and only if this information matched with your condition the file will be loaded. This meant it depends on the decoded information within the filename and your condition how easy it is to implement. By only period-related information it's often quite simple.
- Marcus
Hi Marcus,
This looks like a good idea! Tried it, When I only use the PeriodMin, it still loads all files. When I use both parameters like the the following script, no files are loaded (while there are files which match the criteria. I've tried to Num() to all statements, but no luck. Any ideas?
FOR EACH File in filelist('lib://...\*.xlsx');
SET vFile = '$(File)';
Let vDate.PeriodMin = 180710;
Let vDate.PeriodMax = 180716;
If subfield(subfield('$(File)', '\', -1), '.', 1) >= '$(vDate.PeriodMin)'
and subfield(subfield('$(File)', '\', -1), '.', 1) <= '$(vDate.PeriodMax)'
then
Table:
Load *
FROM [$(vFile)] (ooxml, embedded labels, table is Blad2)
;
End if
NEXT File;
Maybe the extract of the period from the filename didn't work like expected. To check this place the following below your period-variables:
let vCheck = subfield(subfield('$(File)', '\', -1), '.', 1);
TRACE '$(vCheck)';
and then you will see the result within the load progress-window and within the log-file.
But if I look again I assume the issue are the single-quotes around your variables by their call because they make the content to a string and not a number. Therefore try it in this way:
...
If subfield(subfield('$(File)', '\', -1), '.', 1) >= $(vDate.PeriodMin)
and subfield(subfield('$(File)', '\', -1), '.', 1) <= $(vDate.PeriodMax)
then
...
- Marcus
Hi All,
Thanks a million for all the input. I've kept on searching around and have combined the solution i wanted now.
First I load all the filenames into a table. This is super quick. Next to the full filename I load whatever I want to select on. In this case there is a list list with incremental file ID's like:
Test_1234
Test_1235
Test_1236
Test_1237
In a resident load I can then select the required files.
Then I For...Next through the list of selected files.
Here's the script.
Thanks to https://gist.github.com/mountaindude/fbfefce78ac5e3ba4d5a for the first bit of script.
Thanks to atkinsow for the last push in the thread Re: load files with specific names from folder through for loop
For each Filename in 'lib://.....\'
sub DoDir (Root)
for each Ext in 'qvd'
for each File in filelist (Root&'\Test*.' &Ext)
QVD_Files:
Load '$(File)' as Filename,
Subfield('$(File)','/',-1) as QVDName,
Left(SubField('$(File)','_',-1),Len(SubField('$(File)','_',-1))-4) as QVDInteger
// FileSize( '$(File)' ) as QVDSize,
// FileTime( '$(File)' ) as QVDFileTime
autogenerate 1;
next File
next Ext
// Uncomment if you need to include QVDs in subdirectories
// for each Dir in dirlist (Root&'\*' )
// call DoDir (Dir)
// next Dir
end sub
call DoDir ('$(Filename)')
next;
SelectedFiles:
Load
Filename
Resident QVD_Files
Where QVDInteger >= 1235 and QVDInteger <= 1237;
Drop table QVD_Files;
LET NumRows=FieldValueCount('Filename');
FOR i=1 to $(NumRows)
LET vFile = peek('Filename',$(i)-1,'SelectedFiles');
let vQVD= ('$(vFile)');
Table:
Load *;
Concatenate Load * From ['$(vQVD)'] (qvd);
next;
Drop table SelectedFiles;
Hi All,
Next to the solution described above, I've also found another solution with the help of a colleague.
FOR period = 10001 TO 10101 IF NOT ISNULL(QVDCREATETIME('lib://PATH\FILE$(period).qvd')) THEN Table: LOAD *, FROM [lib://PATH\FILE$(period).qvd] (qvd); END IF NEXT; ;
Using this method, any range will be possible in files while not having to look in each file. The QVDCREATETIME function is used to skip missing files in the range.
I really like this solution as the script is small and fast.
Hope this can be of help for anyone else:)