Excel datasource with "outline" structure / staggered rows for fields
Is there a simple way of writing the script so that it deals with Excel datasources in "outline" structure (with staggered rows for different levels of grouping)? I've attached an example Excel file to show you what I mean - it's the standard download format from FactSet, and I have lots of files like this that I'd love to put into QlikView.
I need to preserve the different grouping levels that come from FactSet, so don't want to just exclude the "total" rows (e.g. rows 10, 11, 12, 13 and 14 in the attached file) unless I can somehow read in the group name (which only appears at the "total" row at the top of each group and sub-group of rows) for each row. For example, from the row of data in row 58 of the attached file, I want QV to read in the following fields and field values:
- Date = 30-Sep-10
- Asset class = Equity
- Region = Western Europe
- Country = United Kingdom
- Sector = Consumer Staples
- Industry = Tobacco
- Security name = British American Tobacco PLC
as well as the data points in cells M58:V58 (with column headings at the top which I would manually rename as they are split over different rows).
I would then probably try to put some fields into drill-down groupings within QlikView (e.g. Region as a parent of Country, and Sector as a parent of Industry) - this should be OK, it's just reading in the field values correctly for each row that I can't figure out.
Hope this makes sense - I think it's quite self-explanatory once you look at the Excel file.
Would be brilliant if somebody could get back to me within a day or so - thank you.
Re: Excel datasource with "outline" structure / staggered rows for fields
Thanks very much for your reply. It's certainly a start, and much better than I was managing (so thanks!), but it still doesn't quite read in the data right.
I've attached your document with the columns renamed (for clarity) and I've added a pivot table (as this best reflects how the data was in the source file).
Take Fresnillo PLC, for example. Once I've pivoted the date field to go horizontally along the top, there should only be one row for Fresnillo (as this stock name is the lowest heirarchy) - with multiple columns for "ending weight", "market cap", "# of securities" and "date". However, as the attached document shows, Fresillo appears in 6 rows:
- North America / Mexico / Materials / Metals & Mining (this is the correct classification)
- Western Europe / Mexico / Materials / Metals & Mining
- Western Europe / Spain / Industrials / Airlines
- Western Europe / Spain / Industrials / Metals & Mining
- Western Europe / Spain / Materials / Metals & Mining
It looks like it's reading in some rows below/above the correct row for Fresnillo (and indeed for other stocks).
As you say, it only needs to read from the rows where Stock_Name<>' '. It then needs to pick up the other field values from the first non-empty cells moving left and up.
Thanks for your help so far - would be great if you could suggest a solution to the above problem.