Not applicable

Transforming single row datasets into table

Hi all,

I have a single row data file ordered in sets in the following format: {set1},{set2},...,{setn}

Example including two sets:

{"eventId":1,"revision":1,"hash":"4836-1","channel":4836,"gametime":"05:29","timePeriod":329,"gameId":4836,"realTime":"20150916190954","time":"1461760398.0554","period":1,"class":"ShotWide","type":"Skott","description":"45 A Falk","extra":[],"action":"message","source":"api","sourceport":"80","team":"DIF","messagetype":"all","actiontype":"new","location":{"x":148,"y":-44},"status":"new"},{"eventId":2,"revision":1,"hash":"4836-2","channel":4836,"gametime":"05:49","timePeriod":349,"gameId":4836,"realTime":"20150916191014","time":"1461760398.0793","period":1,"class":"Shot","type":"Skott","description":"49 R Rakhshani","extra":[],"action":"message","source":"api","sourceport":"80","team":"LHC","messagetype":"all","actiontype":"new","location":{"x":157,"y":105},"status":"new"}

I want to transform the data into the following table format:

gameid | eventid | gametime | period | class    | description    | team | location x | location y
4836   | 1       | 05:29    | 1      | ShotWide | 45 A Falk      | DIF  | 148        | -44
4836   | 2       | 05:49    | 1      | Shot     | 49 R Rakhshani | LHC  | 157        | 105

Any thoughts on how I can achieve this for a file containing around 150 data sets?

Here's a link to the example file with all the data sets http://www.shl.se/shl-arena/4836/liveevents/0/
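For reference, each set in the row is itself valid JSON, so wrapping the whole row in square brackets turns it into a parseable JSON array. Outside of Qlik, the transformation can be sketched in a few lines of Python (fields abbreviated for brevity; this is an illustration of the idea, not a Qlik solution):

```python
import json

# A single row of {set},{set},... becomes valid JSON once wrapped in [ ... ]
row = ('{"eventId":1,"gameId":4836,"gametime":"05:29","period":1,'
       '"class":"ShotWide","description":"45 A Falk","team":"DIF",'
       '"location":{"x":148,"y":-44}},'
       '{"eventId":2,"gameId":4836,"gametime":"05:49","period":1,'
       '"class":"Shot","description":"49 R Rakhshani","team":"LHC",'
       '"location":{"x":157,"y":105}}')

events = json.loads('[' + row + ']')

# Flatten each event into the desired table columns
table = [
    (e["gameId"], e["eventId"], e["gametime"], e["period"],
     e["class"], e["description"], e["team"],
     e["location"]["x"], e["location"]["y"])
    for e in events
]
```

Treating the file as JSON sidesteps the embedded commas and nested location object that make plain delimiter splitting fragile.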

Best regards,

Mikael Johansson

5 Replies
sunny_talwar

This can be a first step; try to take it forward from here:

Table:
LOAD RowNo() as Sort,
     PurgeChar(SubField(NewField, ':', 1), Chr(34)) as Label,
     PurgeChar(SubField(NewField, ':', 2), Chr(34)) as Value;
LOAD PurgeChar(SubField(Field, ','), '}{') as NewField,
     Field;
LOAD * Inline [
Field
{"eventId":1,"revision":1,"hash":"4836-1","channel":4836,"gametime":"05:29","timePeriod":329,"gameId":4836,"realTime":"20150916190954","time":"1461760398.0554","period":1,"class":"ShotWide","type":"Skott","description":"45 A Falk","extra":[],"action":"message","source":"api","sourceport":"80","team":"DIF","messagetype":"all","actiontype":"new","location":{"x":148,"y":-44},"status":"new"},{"eventId":2,"revision":1,"hash":"4836-2","channel":4836,"gametime":"05:49","timePeriod":349,"gameId":4836,"realTime":"20150916191014","time":"1461760398.0793","period":1,"class":"Shot","type":"Skott","description":"49 R Rakhshani","extra":[],"action":"message","source":"api","sourceport":"80","team":"LHC","messagetype":"all","actiontype":"new","location":{"x":157,"y":105},"status":"new"}
] (delimiter is '\t');

FinalTable:
Generic
LOAD AutoNumber(RowNo(), Label) as UniqueKey,
     Label,
     Value
Resident Table
Order By Label, Sort;

DROP Table Table;

[Screenshot attached: Capture.PNG]
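The Label/Value split in the script above can be mimicked outside Qlik to check the logic. One caveat: SubField(NewField, ':', 2) returns only the second colon-delimited piece, so a value such as "05:29" would appear to come through as just "05". The Python sketch below (illustrative only, not Qlik script) splits on the first colon to avoid that:

```python
# Same data shape as the Inline load, abbreviated
row = '{"eventId":1,"gametime":"05:29","team":"DIF"},{"eventId":2,"gametime":"05:49","team":"LHC"}'

pairs = []
for field in row.split(','):              # SubField(Field, ',')
    field = field.strip('{}')             # PurgeChar(..., '}{')
    label, _, value = field.partition(':')  # split on the FIRST ':' only
    pairs.append((label.strip('"'), value.strip('"')))
```

Splitting on the first colon keeps time-like values intact, which a fixed second-field split would truncate.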

marcus_sommer

If there weren't categories with multiple values, like location - {"x":148,"y":-44} - and possibly further commas within certain values (in the description, I assume), then a simple script like this would be enough:

Load subfield(C, ':', 1) as Category, subfield(C, ':', 2) as Value, rowno() as RowNo;
Load subfield(B, ',') as C;
Load subfield(A, '},{') as B;
LOAD @1 as A
FROM http://www.shl.se/shl-arena/4836/liveevents/0/ (txt, codepage is 1252, no labels, delimiter is '\t');

Catching these and possibly further exceptions is more complicated; it needs more effort and probably a deeper investigation of the data structure, which I couldn't carry out.

I would probably change the load strategy from the preceding loads used above to a chain of resident loads, in which I would try to identify and replace the special cases - with various string functions like Replace/Index/Mid/Left/Right and so on - before doing the next record split with SubField.

If this and maybe some other ideas aren't successful, I would change the record splitting to a while loop and join the resulting incomplete parts together based on RecNo/RowNo/IterNo. Before going down this (worst-case) route, I would try to change the source so that each data part gets a unique delimiter.
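The nesting-aware record split described above - keeping commas inside location:{...} from breaking a record apart - can be sketched as a small depth-counting loop. This is a hypothetical helper in Python for illustration, not Qlik script:

```python
def split_top_level(row, sep=','):
    """Split on sep only at brace-nesting depth zero, so nested
    objects like "location":{"x":148,"y":-44} stay intact."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(row):
        if ch == '{':
            depth += 1
        elif ch == '}':
            depth -= 1
        elif ch == sep and depth == 0:
            # Only a depth-zero separator ends a record
            parts.append(row[start:i])
            start = i + 1
    parts.append(row[start:])
    return parts
```

The same depth-counting idea is what a Qlik while loop over Mid()/Index() would have to reproduce; in Python it is a handful of lines.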

- Marcus

sunny_talwar

I have used a Generic load here, which is why the data doesn't look pretty in the data viewer. But there are ways to combine the generic load data back into one table.

Alternatively, instead of pivoting the data in the script, you can leave it in separate rows and use a pivot table on the front end to visualize it as above. This keeps the data model clean while giving you the result you want.

[Screenshot attached: Capture.PNG]

Not applicable
Author

Thanks for your response, Sunny. I might be in over my head here, but I will try to build on your example.

Not applicable
Author

Thanks for your response, Marcus. I wasn't sure how complex a solution to my problem would be. It might be too advanced for me, but I will try to move forward with your and Sunny's suggestions. Thank you.