Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Load nested CSV

Hi,

I've got the problem, that I have CSV file with the structure:

Data
Remark
0001 2012, 12345678
Time: 05.01.2012
Pos Time Value
1         12      41,5625
2         41      41,5687522888184
3         72      41,5687522888184
0002 2012, 12345678
Time: 06.01.2012
Pos Time Value
1         29      39,0499992370605
2         59      39,1312522888184
3         89      39,1937522888184

The list of values continues a lot of times. So this is only an example.

On loading the CSV, I would like to have a table like this:

Pos0001 20120002 2012
141,562539,0499
241,568739,1312
341,568739,1937

Any other blocks in the CSV file should result in an additional column. The header rows and the blank lines should be skip!

Does anybody has an idea to accomplish that.

Many thanks

Michael

1 Solution

Accepted Solutions
rbecher
MVP
MVP

Hi Michael,

please see attached solution. The whole ETL process is done with several steps:

1. load related lines

2. mark blocks for columns

3. split data lines into fields

4. transpose with generic load

5. create target table

6. collect auto created generic tables

7. join everything together to finalize target table

- Ralf

Astrato.io Head of R&D

View solution in original post

6 Replies
rbecher
MVP
MVP

Hi Michael,

this is possible. You have to read the whole reacord as line and parse the columns of the data lines with subfield().

You have to switch between header and data section on a condition.

After you have read the whole file you can transpose the lines into positions and columns.

It makes only sense to develop an example on the real data. Maybe you can attache a file.

- Ralf

Astrato.io Head of R&D
Anonymous
Not applicable
Author

Hi Ralf,

here is an example file from the original data.

Michael

rbecher
MVP
MVP

Hi Michael,

please see attached solution. The whole ETL process is done with several steps:

1. load related lines

2. mark blocks for columns

3. split data lines into fields

4. transpose with generic load

5. create target table

6. collect auto created generic tables

7. join everything together to finalize target table

- Ralf

Astrato.io Head of R&D
Anonymous
Not applicable
Author

Hi Ralf,

wow, that's really great and exactly what I was looking for!

But there is an additional question. The code only reads data from 2012. Is there a more general way, which also works with 2013 data?

Thanks a lot for your effort!

Michael

rbecher
MVP
MVP

Yes, of course. Just replace the wildmatch in Step1 with:

Step1:

LOAD if(wildmatch(Line, '* 201?,*', -1, 0), subfield(Line, ',', 1), peek(Col)) as Col, Line

Resident RawData;

- Ralf

Astrato.io Head of R&D
Anonymous
Not applicable
Author

Thank Ralf,

now it's exactly what I needed!

Michael