Not applicable

Distinct load with updating data

Hi,

My dashboard currently loads a folder full of monthly Excel reports from a network share. Each row has a numeric identifier that I want to be unique in the final loaded data. The problem is that the same ID can occur again in later reports, where other fields on the same row have been updated. A normal distinct load doesn't work, since it only keeps the first corresponding row, leaving my data partly out of date.

The reports' naming convention is Report-YYMM, which could perhaps be used to determine the load order. The only other thing I can think of is renaming the files so that they are loaded from newest to oldest.

Any ideas?

1 Solution

Accepted Solutions
tanelry
Partner - Creator II

I think the best way is to loop over the Excel files from newest to oldest, using "not Exists(ID)" to skip older duplicates.

Here is some script you can try:

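// Get the list of Excel files in a directory
set vDirectory = 'E:\SomeDirectory';

for each vFoundFile in filelist(vDirectory & '\*.xls')
    Files:
    Load
        '$(vFoundFile)' as FileName,
        FileTime('$(vFoundFile)') as Timestamp
    AutoGenerate 1;
next

// Sort the files by timestamp, newest first
SortedFiles:
Load
    FileName as FileNameSorted
Resident Files
Order By Timestamp desc;

// Load data from the Excel files listed in table SortedFiles.
// The newest file is read first, so a row from an older file is skipped
// when its ID has already been loaded.
// "embedded labels" is used so the fields can be referenced by name;
// with "no labels" they would have to be referenced by position instead.
for i = 0 to NoOfRows('SortedFiles') - 1
    let iExcelFile = Peek('FileNameSorted', i, 'SortedFiles');

    Data:
    Load
        IDField, Field1, Field2
    FROM
        [$(iExcelFile)] (biff, embedded labels, table is [SheetName$])
    Where not Exists(IDField);
next

// The helper tables are no longer needed
Drop Tables Files, SortedFiles;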
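
To see why the load order matters, here is a minimal sketch with two inline tables standing in for a newer and an older report. The field name Status and the sample values are made up for illustration only.

```qlik
// Stand-in for the newest report: nothing is loaded yet, so both rows are kept.
Data:
Load * Inline [
IDField, Status
1, closed
2, open
];

// Stand-in for an older report: ID 1 is already loaded, so its outdated row
// is skipped and only ID 3 is added.
Concatenate (Data)
Load * Inline [
IDField, Status
1, open
3, open
] Where not Exists(IDField);
```

After this runs, Data should hold one row per ID, with ID 1 keeping the "closed" value from the newer table. Note that Exists() also sees values loaded earlier in the same Load statement, so an ID repeated within a single file is kept only once as well.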


4 Replies
lironbaram
Partner - Master III

Hi,

You can use FileTime(), which returns the time a file was last modified, to sort the files.

Not applicable
Author

Thanks for the suggestion! I'd still have to loop through all of the files in the folder to find the modified date and then load them in the sorted order. I'm not sure how to do that in the load script.

Not applicable
Author

Thanks! That did the trick after some tweaks. I decided to use the suffix in the filename instead of FileTime(), since someone might accidentally edit old reports.
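
For anyone who wants to do the same, sorting by the filename suffix could look roughly like the sketch below. It is not the exact script used here. It assumes the files are named Report-YYMM.xls, so the four characters before the ".xls" extension are the period; the field name Period and the Report-* file mask are illustrative choices.

```qlik
set vDirectory = 'E:\SomeDirectory';

for each vFoundFile in filelist(vDirectory & '\Report-*.xls')
    Files:
    Load
        '$(vFoundFile)' as FileName,
        // The name ends in "YYMM.xls" (8 characters), so YYMM starts at Len - 7
        Num#(Mid('$(vFoundFile)', Len('$(vFoundFile)') - 7, 4)) as Period
    AutoGenerate 1;
next

// Newest period first, regardless of when a file was last modified
SortedFiles:
Load
    FileName as FileNameSorted
Resident Files
Order By Period desc;

Drop Table Files;
```

The loop that reads the Excel files from SortedFiles stays the same. One caveat: a two-digit year only sorts correctly as long as all reports fall within the same century.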