I can't answer the "why" question, since I virtually never use Excel 2007, but I can tell you that using "where" is fine. Only a load from QVD files can be optimized (or not); optimization is not applicable to Excel.
(Although it is possible that it will be slower than a load from Excel 2003.)
Another idea is to use "First N", but it makes sense only if you know the number of rows beforehand. Maybe it makes sense to use a combination of both. That is, if you expect that the number of rows is always <10,000, use "First 10000" together with the "where" clause.
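The combination of a row cap and a filter can be sketched as follows. This is Python, not QlikView script, and it only illustrates the logic: the names `load_rows`, `max_rows` and `key` are hypothetical, and the sketch assumes the cap is applied to the rows read before the filter drops the empty ones. Whether QlikView evaluates "First N" and "where" in that order is not established in this thread.

```python
from itertools import islice


def load_rows(rows, max_rows, key):
    """Read at most max_rows rows, then keep only rows where key is non-empty.

    Assumption: the cap applies to rows read, not to rows kept.
    """
    capped = islice(rows, max_rows)  # the "First N" part
    # The "where" part: drop rows whose key field is missing or blank.
    return [r for r in capped if str(r.get(key) or "").strip()]


# Three real rows followed by many "empty" rows, as in a sheet
# where formatting extends far below the data.
sheet = [{"Name": "a"}, {"Name": "b"}, {"Name": "c"}] + [{"Name": ""}] * 50000

loaded = load_rows(sheet, 10000, "Name")
print(len(loaded))
```

With the cap in place, only the first 10,000 rows are inspected instead of all 50,003, and the filter still removes the blank ones among them.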
Are you sure that the described Excel files are really empty? Quite often there is invisible formatting applied to all cells, or to just one column, etc. In this case the WHERE clause makes perfect sense.
In day-to-day use I have not seen a difference between Excel 2007 and 2003, but most of our files in daily use are still in 2003 format.
I'm not advising against Excel 2007. It's just not the one I have installed on my machine.
In actual applications, my data sources are virtually always databases, with only occasionally some additional data from files.
And I think that Peter is right: the "empty" cells may not actually be empty.
Michael and Peter, I have carefully checked the whole situation again and discovered that it was not an ETL problem, but rather the behaviour of the Table Box object with the join properties of QlikView.
When I displayed only a list box, the number of rows was right.
When I displayed a Table Box with only 4 fields, the number of rows was right.
When I added 2 key fields to 2 other tables, the number of rows increased.
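One generic way a row count can grow once fields linking to other tables are involved is join fan-out: a key value that matches several rows in another table yields one output row per match. The sketch below is plain Python, not QlikView, and all table and field names are made up for illustration. The thread does not say that this is exactly what the Table Box did here.

```python
def inner_join(left, right, key):
    """Join two lists of dict rows on a single key field."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    # One output row per (left row, matching right row) pair.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical main table: 3 rows.
orders = [
    {"OrderID": 1, "CustomerID": "A", "ProductID": "P1"},
    {"OrderID": 2, "CustomerID": "A", "ProductID": "P2"},
    {"OrderID": 3, "CustomerID": "B", "ProductID": "P1"},
]
# Hypothetical linked tables; some key values occur more than once.
contacts = [
    {"CustomerID": "A", "Contact": "Ann"},
    {"CustomerID": "A", "Contact": "Al"},
    {"CustomerID": "B", "Contact": "Bo"},
]
prices = [
    {"ProductID": "P1", "PriceList": "2009"},
    {"ProductID": "P1", "PriceList": "2010"},
    {"ProductID": "P2", "PriceList": "2009"},
]

with_contacts = inner_join(orders, contacts, "CustomerID")
with_both = inner_join(with_contacts, prices, "ProductID")

print(len(orders), len(with_contacts), len(with_both))  # 3 5 8
```

The main table alone has 3 rows, but the combined result has 8, although no data was loaded twice.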
I am sorry for misleading you with my questions; I was not familiar with the features of the Table Box.
Thanks for your help.