Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Changing filters messes with load time of Excel files for a table of just the file information

I am working with somebody else's code and there is a table that pull all of the files with .xls extension from a directory. Very simple. The table only contains the following fields:

FileName()           as [File Name],

FileTime()            as [Last Modified],

FileSize()/1024    as [Size (kb)],

Now()                  as [Reload Date]

There is a filter that removes a few rows from the files on load. I thought, "Well, what's the point of that?", and I got rid of the filters. The load time went from ~7 seconds to rolling over 2 minutes, at which point I aborted the reload.

I figured, "It could be that the sheet that's being loaded is large enough to affect the load time", so I applied a filter that removes everything but the first row. Same result as if there were no filters at all.

So the question is: what is going on? Why can't I grab the information about the files, since I don't care about the content at that point, without applying the filters that are there? (which just leave a block of 8x2, a basic info box)

5 Replies
swuehl
MVP
MVP

Hi skarpusenka,

would it be feasible that you post that piece of code including the filters?

Regards,

Stefan

Not applicable
Author

swuehl wrote:

Hi skarpusenka,

would it be feasible that you post that piece of code including the filters?

Regards,

Stefan

DataFilesList:

LOAD FileName()    as [File Name],

     FileTime()            as [Last Modified],

     FileSize()/1024    as [Size (kb)],

     Now()                  as [Reload Date]

FROM

[$(vExternalDataPath)\*.xls]

(biff, embedded labels, table is [Input Sheet$], filters(

    Remove(Row, RowCnd(Interval, Pos(Top, 8), Pos(Bottom, 1), Select(1, 0))),

    Transpose(),

    Remove(Row, RowCnd(CellValue, 1, StrCnd(null)))

));

So, the file has over a few dozens of rows, the top 7 are kept (*which are not even used in the table!)

swuehl
MVP
MVP

Wow, that's kind of strange.

well, I think if the filters result in a transformed input table with more than 1 row, you get copies of FileName etc in your table. You might see this in table view, inspecting the resulting table or add a field RecNo() to the above Load.

So, first I thought the filters should limit the amount of rows read. But that does not make any sense, and there is also a Transpose and  remove which mangels the headers and might remove a former column, but what for?

If you want only one row per file to be read, I would use:

DataFilesList:

LOAD FileName()    as [File Name],

      FileTime()            as [Last Modified],

      FileSize()/1024    as [Size (kb)],

      Now()                  as [Reload Date]

FROM

[$(vExternalDataPath)\*.xls]

(biff, embedded labels, table is [Input Sheet$]) where RecNo() = 1;

this should still give you the wanted file information table.

I have no idea why removing the filters (which I really think are not very logically useful, In fact I first assumed a random filter setup using the filter assistant without knowledge ) result in bad performance.

You might try my version of data limiting without a filter at all.

In my experiments, the performance was quite good (on few files and with different table structure, of course).

Regards,

Stefan

sarahallen1
Creator II
Creator II

Perhaps you could try

FIRST 1 LOAD

instead of just

LOAD

?

whiteline
Master II
Master II

Everything is simple.

1) It's easier to use just wildchars in '*.xls' than to write a foreach statement.

2) FileName() function doesn't (or didn't) work outside the load statemet.

If you want to exclude file opening from your script you should completely change the logic.

Use foreach (look at help) and some string functions to obtain the filename instead of FileName().

And to construct your table row by row use AUTOGENERATE(1).