Skip to main content
Announcements
Accelerate Your Success: Fuel your data and AI journey with the right services, delivered by our experts. Learn More
cancel
Showing results for 
Search instead for 
Did you mean: 
BSP
Contributor II
Contributor II

Parsing nested data

Hi there,

I got data in csv, loaded it and it has nested column with unknown amount of itwems for parsing.

I tried to use the subfield() function, which works fine, except for the unknown amount of items to parse. So i want to denormalize, but cant figure out how to create separate row for each item in the nested field.

 

Any ideas?

 

Thanks! 

Labels (4)
1 Solution

Accepted Solutions
melissapluke
Partner - Creator
Partner - Creator

If this is a one time parse, load your table without doing subfields with this line: SubStringCount([Field to Parse])+1 as "total count" as a field. Look in your data for the highest "total count" value in the dimension. That would be your IterNo to use when parsing.

Example:

I found that my "total count" was 74 in my data. So i load my subfields as long as the IterNo is less than 75.

[Temp Data]:
LOAD Distinct
(SubField([@7], chr(94),IterNo())) as "Field1",
(SubField([@8], chr(94),IterNo())) as "Field2",
(SubField([@9], chr(94),IterNo())) as "Field3"
FROM [lib://Data/file.txt]
(txt, codepage is 28591, no labels, delimiter is spaces)
While IterNo()<75
;

If this is a daily load, you'd want to do the count and create a variable for it and then do it while IterNo() is less than your variable. Or, if you wanted to roughly estimate it, you could do while IterNo() is less than a number you think it would always be less than to create that temporary table, and then create the permanent table like this:

PermanentTable:
Load
Field1,
Field2,
Field3
Resident "Temp Data"
where not isnull(Field1);

View solution in original post

1 Reply
melissapluke
Partner - Creator
Partner - Creator

If this is a one time parse, load your table without doing subfields with this line: SubStringCount([Field to Parse])+1 as "total count" as a field. Look in your data for the highest "total count" value in the dimension. That would be your IterNo to use when parsing.

Example:

I found that my "total count" was 74 in my data. So i load my subfields as long as the IterNo is less than 75.

[Temp Data]:
LOAD Distinct
(SubField([@7], chr(94),IterNo())) as "Field1",
(SubField([@8], chr(94),IterNo())) as "Field2",
(SubField([@9], chr(94),IterNo())) as "Field3"
FROM [lib://Data/file.txt]
(txt, codepage is 28591, no labels, delimiter is spaces)
While IterNo()<75
;

If this is a daily load, you'd want to do the count and create a variable for it and then do it while IterNo() is less than your variable. Or, if you wanted to roughly estimate it, you could do while IterNo() is less than a number you think it would always be less than to create that temporary table, and then create the permanent table like this:

PermanentTable:
Load
Field1,
Field2,
Field3
Resident "Temp Data"
where not isnull(Field1);