Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I am trying to load multiple excel files from dropbox folder on Qlik Sense Business environment.
I tried the following script:
LIB CONNECT TO 'Betting Stats:Dropbox - denis.m.u@gmail.com';
for each file in filelist('lib://Betting Stats:Dropbox - denis.m.u@gmail.com/Football Stats/27122018\*.xlsx')
Test:
load *
from [$(file)](ooxml, embedded labels, table is Match);
next file
the load is working but it does not load any data.
Can you please let me know what I am missing or doing wrong?
Thanks
Hi,
Each of the different file folder connectors in Sense Business behaves in a slightly different way, and none of them quite like a folder in Enterprise or Desktop.
To enumerate around files in a DropBox you first need to connect to the DropBox MetaData connector. From here you can get a list of files into a temporary table.
Once you have that you can enumerate around and peek out the filename. You can then load each file in turn.
LIB CONNECT TO 'Dropbox_Metadata';
tmpFiles:
LOAD
path_lower
WHERE index(path_lower, '.xlsx') > 0
;
SELECT
path_lower
FROM List
WITH PROPERTIES (
dropboxFolder='',
recursive='false',
includeDeleted='false'
);
for iFile = 0 to NoOfRows('tmpFiles') - 1
let vFile = peek('path_lower', iFile, 'tmpFiles');
LOAD
*
FROM [lib://Dropbox$(vFile)]
(ooxml, embedded labels);
next
DROP TABLE tmpFiles;
This is not as simple as doing a load from a *, but I tend to avoid doing that anyway, as if there is a single file that is slightly different things will fail and it's tricky to debug why.
By looping round (this can be done with a for each vFile in ('lib://library/*.qvd') in Enterprise and Desktop) it gives you the ability to insert code within the enumeration.
The DropBox MetaData connector has a whole bunch of other information you can get from it also, so well worth exploring.
If you have lots of Excel spreadsheets it can slow things down. Though it won't work directly with QSB you might want to look at this article for some ideas around creating QVDs from Excel, to speed up the load:
https://www.quickintelligence.co.uk/convert-drop-folder-files-qvd/
Hope that helps,
Steve
Hello Steve,
Thank you for this reply. I have never looked at the metadata connector, so I guess I know what I will be exploring this weekend. I amended the script, looking only at Sheet2, and it worked like a charm.
I am rather going to change my ETL slightly to upload a QVD, rather than the spreadsheets.
Having said that, this is a very good option.
Thank you.
John
HI Steve,
Thank you for proposed solution.
I was trying it but have one issue that the load fails on data connection, and it might be something simple that I do not get due to lack of knowledge, but what I suppose to put under the From connection part (the part that marked as bold below)?
I know it is something simple but could you tell me what is wrong I am doing?
Thanks
LIB CONNECT TO 'Dropbox_Metadata - denis.m.u@gmail.com';
Tempfiles:
LOAD
path_lower as [List.path_lower]
where index(path_lower, '.xlsx')>0;
SELECT
path_lower
FROM List
WITH PROPERTIES (
dropboxFolder='/Football Stats/27122018',
recursive='false',
includeDeleted='false'
);
for iFile = 0 to NoOfRows('Tempfiles') -1
let vFile = peek('path_lower', iFile,'Tempfiles');
Load *
From [lib://Dropbox_Metadata - denis.m.u@gmail.com/Football Stats/$(vFolder)/$(vFile)]
(ooxml, embedded labels);
next
Drop Table Tempfiles;
Hi Denis,
The second load is from the Dropbox connector, rather than the Dropbox Metadata one.
Probably something like:
From [lib://Dropbox - denis.m.u@gmail.com/Football Stats/$(vFolder)/$(vFile)]
Whether the folder names are correct or not depends on how your DropBox is configured, try loading one file from the Connection wizard and then amend accordingly to have the $(vFile) variable in place of the filename.
I notice you have a variable for the folder in the second load, but not the file list. I would recommend using either 27122018 or $(vFolder) in both places.
Good luck!
Steve
Thanks for confirming that it works for you.
Thanks for getting back Steve.
When I load one file it works but whn I use variable I get an error, please see below. Can you let me know what is wrong please?
Sorry forgot to add my code.
Here is is.
Thank you !
LIB CONNECT TO 'Betting Stats:Dropbox - denis.m.u@gmail.com';
LIB CONNECT TO 'Dropbox_Metadata - denis.m.u@gmail.com';
Tempfiles:
LOAD
path_lower as [List.path_lower]
where index(path_lower,'.xlsx')>0;
SELECT
path_lower
FROM List
WITH PROPERTIES (
dropboxFolder='/Football Stats/27122018',
recursive='false',
includeDeleted='false'
);
for iFile = 0 to NoOfRows('Tempfiles') -1
let vFile = peek('path_lower', iFile,'Tempfiles');
Load *
From [lib://Betting Stats:Dropbox - denis.m.u@gmail.com/Football Stats/27122018/$(vFile)]
(ooxml, embedded labels,table is Match);
next
Drop Table Tempfiles;
Remove the code as [List.path_lower], the peek statement is looking for a field just called path_lower.
That might do it.
Put the line trace $(vFile); after the line with the peek statement and it will prove whether you have found the filename or not.
Hope that helps.
Thanks Steve, removing [List.path_lower] helped, now the load works perfectly, I should be paying more attention to my code :).
But now the issue is when it comes to drop a table it freezes and cannot drop the table.
Is there again I am missing something?
Also can you explain why we need to drop the table?
Thank you very much and sorry for luck of knowledge in this.
Hi Denis,
I very much doubt it is the DROP TABLE which is causing the problem, rather that you have reached the end of the script.
It is quite safe to comment out that line and not drop the table. It will just be a data island in the data model with a list of files. The table isn't required though, and it is best practice to remove it.
If I had to guess what has gone wrong, it would be that one or more tables in the DropBox folder are not in an identical format. This will mean that when they load rather than concatenating they will load into separate tables in the data model, potentially with many common fields causing a large synthetic key. These will show in yellow text in the load progress dialog.
You can;
- Check the DropBox folder to ensure all files are in an identical format
- Check the filename in the loop to ensure the next file to load is one that matches the format
- List all the fields in the LOAD statement so that any rogue fields are ignored and files with missing fields fail to load (so you can identify and fix/remove them)
- Use the CONCATENATE keyword in front of the LOAD. If you go for this you will need to either do differently on the first iteration of the loop, or create a stub table before the loop so that you are always concatenating.
Or you can do all of these things, just to be sure!
Personally I would list all the fields rather than using a * as a matter of course. This could be enough to either fix or highlight your issue.
If not, please post back any error messages.