I am loading a simple table of 2,799 rows from an Excel file. For some reason, QV loads much more rows (in Excel2003 format it yielded 5,422 rows, in Excel2007 format it yielded 1,048,000 raws)
The only way I found to overcome it, is to filter out empty rows by a null value in one of the fields through the [Where] button (i.e. exclude from load) .
But as far as I understood from other discussions, using Where clause in Load script, cancels the optimization - so it is not recommended.
Can anybody explain why QV loads empty raws or a better way to fiter them out ?
I can't answer the "why" question - virtualy don't use Excel 2007. but can tell that using "where" is fine. Optimized (or not) can be only load from QVD files. Otimization is not applicable to Excel.
(Although it is possible it will be slower then load from Excel 2003.)
Another idea is ti use "First N" - but it make sense only if you know the number of rows beforehand. Maybe it makes sense to use combination of both. That is, if you expect that the number of rows is always <10,000:
Michael, thanks for the quick reply and the tips.
I am still surprised again and again for the flexibilty of QV's ETL tools.
Is your advice not to use Excel2007 total, or just for my current loading ?
Excel2007 has some improvements (e.g. capacity), which Iwouldn't like to give up.
are you sure that the described Excel-Files are really empty? Quite often there is a - not visible - formatting over all cells or just one column etc. In this case the WHERE-clause makes perfectly sense.
In day-to-day-life have not seen a difference betw Excel 2007 and 2003, but most of our files in daily use are still in 2003-format.
I'm not advising against Excel 2007. It's just not the one i have installed on my machine.
In actual applications, my datasources are virtually always the databases, only occasionally some additional data from files.
And, I think that Peter is right - the "empty" cells may be not actually empty.
Michael and Peter, I have checked again carefully the whole situation and discovered that it was not an ETL problem, rather the behaviour of Table Box object with join properties of QlikView.
When I displayed only a list box the nukmber of rows was right.
When I displayed a Table box with only 4 fields, the nukmber of rows was right.
When I added 2 key fields to other 2 tables, the number of rows increased.
I am sorry for misleading you with my questions, I was not familiar with the features of the Table box.
Thanks for your help