Hello all,
I have been unable to find if anyone has resolved the issue of creating a wildcard load from within a dropbox folder using the Qlik Dropbox Connector in Business Cloud. Is this possible?
Is a wildcard load only possible from FTP/SFTP?
Thanks
+1 to this thread - I am also looking for a way to load all files in a folder through a web connector (OneDrive in my case, though the same applies to Dropbox), as well as a way to load only the most recent file in the folder.
I suspect it will require loading a tmp table of the file directory showing all the files, and then some kind of loop that uses the file IDs in the tmp table to load each file. While I can imagine how this would work, my coding skills have proven lacking when it comes to actually creating it.
Hi,
Unfortunately the Dropbox Web Storage Provider does not support wildcards, as per the limitations section in the docs.
There are, though, a couple of techniques you might be able to use to get the result you need:
1. You could use the List table in the Dropbox Metadata Connector to load the files in the folder into a table, and then use some Qlik scripting to loop through these and load the ones that match your requirements, e.g. the latest one by date or all files with a certain prefix (there is a rough sketch of this below, after point 2).
2. You could use the Qlik FileList command to enumerate all of the files in a Dropbox directory:
https://help.qlik.com/en-US/qlikview/April2019/Subsystems/Client/Content/QV_QlikView/Scripting/Scrip...
And then inside each iteration decide whether to load the file based on its name.
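To give a flavour of option 1, here is a rough, untested sketch. The connection name, the field names (path, is_dir) and the List table properties are my assumptions, so use the connector wizard to confirm the exact syntax for your account:

LIB CONNECT TO 'Dropbox_Metadata';

// List the folder contents; field and property names are assumptions
tmpFiles:
LOAD
    path
WHERE is_dir = 'false'              // skip sub-folders
  and WildMatch(path, '*.csv')      // emulate a *.csv wildcard in the script
;
SELECT
    path,
    is_dir
FROM List
WITH PROPERTIES (
    path='/Folder Name',
    includeDeleted='false'          // don't pick up deleted files
);

// Load each matching file in turn
for iFile = 0 to NoOfRows('tmpFiles') - 1
    let vFile = peek('path', iFile, 'tmpFiles');

    Data:
    LOAD * FROM [lib://Dropbox - myaccount@gmail.com$(vFile)] (txt);
next
DROP TABLE tmpFiles;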
If you are still interested in this and can't get the script working from the sketch above, I can try and spend some time to leave a fuller example here.
Hi all,
Here are some notes on this topic, hope they are helpful.
Chris, are you sure about the FileList with DropBox? I believe I tried that and didn't get it to work, but it may have been a different connector.
With each of the different file system providers (Google Drive, DropBox and Amazon) I found that looping around the MetaData connector worked, but the syntax for each was a bit different.
I posted code on this thread about achieving the loop on DropBox:
The syntax for doing the same in Google Drive is:
LIB CONNECT TO 'Google_Drive_&_Spreadsheets';

// List all CSV files in Drive, excluding anything in the trash
tmpFiles:
LOAD
    title
WHERE labels_trashed = 'false'
;
SELECT
    title,
    labels_trashed
FROM ListFiles
WITH PROPERTIES (
    query='title contains ''.csv'''
);

// Loop over the file list and load each file in turn
for iFile = 0 to NoOfRows('tmpFiles') - 1
    let vFile = peek('title', iFile, 'tmpFiles');

    LOAD
        *
    FROM [lib://Google_Drive/root/$(vFile)] (txt);
next
DROP TABLE tmpFiles;
Note that it is the Google Drive and Spreadsheets connector that needs to be used for the metadata in this case.
For Amazon S3 the syntax is:
LIB CONNECT TO 'Amazon_S3_Metadata';

// List all files (keys) in the given sub-folder of the bucket
tmpFiles:
LOAD
    Key
;
SELECT
    Key
FROM List
WITH PROPERTIES (
    dir='SubFolder',
    maxResults=''
);

// Loop over the file list and load each QVD in turn
for i = 0 to NoOfRows('tmpFiles') - 1
    let vFile = peek('Key', i, 'tmpFiles');

    LOAD
        *,
        filename() as FileName
    FROM [lib://Amazon_S3/$(vFile)] (qvd);
next
DROP TABLE tmpFiles;  // tidy up, as in the Google Drive example
The approach is very similar in each case; the differences come in the way you have to search and filter the files on each provider, particularly with regard to which folder the files sit in. In each case, use the wizard to get the base syntax and then modify it to match the code above with your chosen filter.
Watch out on Google Drive and DropBox, as both keep archives of deleted files, and unless you filter them out they will get loaded. With Drive there is the labels_trashed field (as used above), and with DropBox it is the includeDeleted property.
How to get this working is not obvious, so post to this thread if you try it and get stuck.
Good luck!
Steve
Hi - yes, FileList, DirList, FileSize and FileTime should all work on Qlik Sense Cloud Business (note these do not currently work on the Windows version of the Dropbox connector).
I just tested it out with the following. Note the absence of a trailing / at the end of the folder name; if you include one, I think it errors.
// FileList returns the full lib:// path of each file in the folder
for each vFile in FileList('lib://Dropbox - myaccount@gmail.com/Folder Name')
    let vFileSize = FileSize('$(vFile)');
    let vFileTime = FileTime('$(vFile)');
    trace '$(vFile) is $(vFileSize) bytes and last modified at $(vFileTime)';
next
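And if you wanted to turn that loop into a wildcard-style load, something like the following should work - a minimal sketch assuming CSV files that all share the same structure (so each load auto-concatenates into one table); the Data table and SourceFile field are just for illustration:

// Load only the files whose names match the pattern, emulating *.csv
for each vFile in FileList('lib://Dropbox - myaccount@gmail.com/Folder Name')
    if WildMatch('$(vFile)', '*.csv') then
        Data:
        LOAD
            *,
            '$(vFile)' as SourceFile
        FROM [$(vFile)] (txt);
    end if
next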
Thank you Chris & Steve!
As I was struggling with this, I switched my architecture to SharePoint and synced the folder path to my server, so that I could use my existing scripts to wildcard load without rewriting my load scripts.
Appreciate the resources though!
Thanks Chris.
That is slightly different syntax to a standard Windows file share, where you would need /*.* on the end of the file path. This could be why I missed the fact that the functionality was there (or it has been fixed since - you never can tell with QSB!).
Do you know how the FileList function handles files which have been deleted from DropBox? Do they get returned or not?
If they do get returned, how can you identify them as being deleted files?
I will add it to my list of things to look into, but if you know already then that would be great.
Thanks!
Hi Steve,
Yes - because different providers like Dropbox, Google Drive etc. provide different ways of querying the list of files, the syntax will differ between them, and in the initial release we are just supporting returning the folder contents.
I checked, and FileList should exclude deleted files.
Thanks Chris.
I totally get the rationale for why the FileList function works differently. But it's not clear how users can be expected to know that FileList('lib://DropBox/Folder') works while FileList('lib://DropBox/Folder/*.*') doesn't - the latter is the syntax users will try first, as it is correct for a local folder.
Could you Qlik boffins not implement the function so that it gets the entire list back from DropBox, regardless of file mask, and then parse that full list using the wildcard provided?
Not a problem for me, as I now know the syntax, but it would perhaps prevent others from missing it as I did initially.
Thanks!
Steve
Hi Steve,
We could do that, but re-implementing wildcard filtering on the connector side would complicate the code. Plus, if the Dropbox folder contains tens of thousands of files, or potentially many more, this would likely perform badly.
Personally, I think that as Dropbox, Google Drive, OneDrive etc. provide their own API-specific ways of querying and searching for files, we should consider giving users a way to leverage these somehow.