Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Enriquemmesteo
Creator
Creator

Help Loading Table (Formatting issues)

Hello everyone, I have to load a bunch of excel files and among those there is a table in a sheet that needs some preparation to load properly:

The table is structured like this (notice this is a simplified example):

Enriquemmesteo_0-1603282129115.png

Explanation:

  • In the first column we have "Position" of the members of the team (Leader, Trainee,etc...) which is not always the same. One excel file can have 2 positions here and another one could have 12. The names you can see from cell B3 to E6 (notice that the number of columns isn't constant either from file to file as there can be different MILESTONES ,from cell B1 to the last milestone. Each milestone has a date assigned to it in the second row (from cell C1 to the last milestone).
  • The seventh row is the STATUS for each milestone, which can be GOOD, BAD, ETC...

I'm not interested in any of the names or positions. Just need to load the milestones with their dates and scores.

I can't change the excel files, so I need load these tables like they had the following format in the script editor:

Enriquemmesteo_1-1603282997653.png

 

This doesn't look complex but the loading script already has a lot of stuff for the rest of the sheets and files and I need to do this as simple as possible.

Thanks very much

5 Replies
Enriquemmesteo
Creator
Creator
Author

Hello again,

Does anyone have a solution for this?

Thanks again

marcus_sommer

I think this will be helpful for you: The-Crosstable-Load 

- Marcus

Enriquemmesteo
Creator
Creator
Author

Thanks Marcus, 

I'm working with crosstable but I can't get it right and as I said, the load of this app is quite long and complex due to the loops and the variable file names, sheets, etc... so I need to do this simple to implement to the rest of the script.

The problem here is getting the crosstable to load Position up to the row with the value "Status", which should be another column. And of course, the dates which I can get simply with the crosstable. 

Notice the qualifier field should be the MILESTONE...which goes from the second column of the first row up to the last column they have filled (we don't know how many beforehand, one file may have 2 milestones and another may have 20)

Now for each MILESTONE we need the date (second row) and STATUS as dimensions. This is often the last or penultimate row but it always has Status in the first column.

I'm not getting Peek() to work correctly to recover status either.

This has the added difficulty that we should avoid using synthetic keys due to optimizing resources.

Thanks in advance!

Enriquemmesteo
Creator
Creator
Author

I'm making some advances but still need some help identifying where Status is and load that row as Status dimension.

Here is the script that works for a certain example where Status appears on the row 19 of the first column:

[INPUT]:
LOAD*

FROM [lib://Documents/SR_LOG.xlsx]
(ooxml,explicit labels,filters(

Transpose()), table is [SR LOG])Where(@2<>'Position');
[RESULT]:
Load
@1 AS MILESTONE,
@2 AS MILESTONE_DATE,
@19 AS MILESTONE_STATUS
RESIDENT INPUT;
DROP TABLE INPUT;

I need to change the part of the script where I've put "@19"manually because it's gonna be different for each file

 

marcus_sommer

If there are complex datatructures in Excel which don't have a datastructure how classical database tables usually are designed you may need multiple pre-loads to count the columns/records, check the fieldnames, detect the cell-positions and/or the values of your KEY-values to create with these information your real load and/or to filter a previous load appropriate. Even if it might be technically possible to put everything within a single-load the distribution to n load-steps may be more suitable.

In regard to your first answer you could detect the row of Status if you add recno() / rowno() within your load and queries then within a following load-step the rowno(), maybe with:

ReturnRowNo: load RowNo resident FirstLoad where Field = 'Status';

With similar logics you could detect each cell-position for any values which are important for your requirements. The complexity depends completely on your Excel files.

Therefore it could be the most practicable approach to get better structured data - maybe parallel data-outputs - and/or additionally information about the data to minimize the efforts in Qlik.

- Marcus