- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Split Field (Column) based on | delimiter into multiple columns (variable number)
Hi All
The problem that I have is that I am loading .csv from AWS and want to split out a field where the the number of columns produced is variable and the delimter is |. The variablity is from not needing a split to having 20 new columns split out. The data is a database of web behavior and visit dates, pages visited, etc. The data is grouped per user within one field. For Example:
Anonymous Unique ID | Pages Visited | Sources |
---|---|---|
aD56-9045id-a4nj | www.example.com|www.example.com/page1|www.example.com/page2 | Google|Facebook|Email |
dRTY-8778id-b3hk | www.example.com|www.example.com/page4|www.example.com/page3|www.example.com | Email|Google |
l0rd-3423id-kl21 | www.example.com|www.example.com/page5 | |
f@rt-3782id-nn2w | www.example.com |
I want to split the Page Visted and dates visited so into columns so they remain as a data point associated to the ID. I have looked on the forums and I know that some of the answers are out there, but I am completely new and I have no idea how to take the scripts I have seen (for example subfield) and translate to work for my example. The more detailed the answer the better. I learn fast with a couple good examples. Thank you for the help!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Try this
Table:
LOAD [Anonymous Unique ID],
[Pages Visited],
SubField([Sources], '|') as [Sources];
LOAD [Anonymous Unique ID],
SubField([Pages Visited], '|') as [Pages Visited],
Sources;
LOAD * INLINE [
Anonymous Unique ID, Pages Visited, Sources
aD56-9045id-a4nj, www.example.com|www.example.com/page1|www.example.com/page2, Google|Facebook|Email
dRTY-8778id-b3hk, www.example.com|www.example.com/page4|www.example.com/page3|www.example.com, Email|Google
l0rd-3423id-kl21, www.example.com|www.example.com/page5, Email
f@rt-3782id-nn2w, www.example.com, Email
];
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The first row from your data have 3 pages visited, but only 2 dates... how do you know which page was visited on what date?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Try this
Table:
LOAD [Anonymous Unique ID],
[Pages Visited],
SubField([Sources], '|') as [Sources];
LOAD [Anonymous Unique ID],
SubField([Pages Visited], '|') as [Pages Visited],
Sources;
LOAD * INLINE [
Anonymous Unique ID, Pages Visited, Sources
aD56-9045id-a4nj, www.example.com|www.example.com/page1|www.example.com/page2, Google|Facebook|Email
dRTY-8778id-b3hk, www.example.com|www.example.com/page4|www.example.com/page3|www.example.com, Email|Google
l0rd-3423id-kl21, www.example.com|www.example.com/page5, Email
f@rt-3782id-nn2w, www.example.com, Email
];
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Sunny,
can u pls explain how this works with preceeding load. I am not able to get its working.