Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hey guys,
I'm working on an assignment for my internship and I've encountered a challenge.
I'm working with Google Analytics(GA) data and I have to distinguish between the different customers, the types of users, modules, articles, etc.
The GA data consists the url's(for now)
The url's containt the delimiter '/' and can vary per customer and per user type.
The lengt of the url's also vary and in order to be able to break down the url's properly for analysis I want to be able to switch between the Subfield structure according to the frequency count of '/'(url length)
My question(s): Is this the proper way to go by such an assignment and how can I switch between how QV structures and processes the url data.
GoogleAnalyticsConnectorV3_DataFromQueryURI:
LOAD
*,
If(Match(frequency, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
If(frequency == 5,
Lower (SubField(DataFromQueryURI_dim_pagePath, '/', 2)) as klant,
Lower (SubField(DataFromQueryURI_dim_pagePath, '/', 3)) as module,
Lower (SubField(DataFromQueryURI_dim_pagePath, '/', 4)) as artikel,
Lower (SubField(DataFromQueryURI_dim_pagePath, '/', 5)) as artikel2;
)
LOAD
//dim_pagePath as PagePath,
//metric_pageviews as Pageviews
DataFromQueryURI_dim_pagePath,
SubStringCount(DataFromQueryURI_dim_pagePath, '/') as frequency
FROM
So basically if the delimiter count is 1 then subfield 2 is klant
if the delimiter count is 2 then subfield 2 is klant and 3 is module
etc.
Thanks all,
Greetings,
Robert
Your preceding load structure is OK. In the preceding load, you will need something like this:
Example, if the substring count is 1, 2, 3, then klant is in field 2; if the count is 4 or 4, klant is in field3,
otherwise klant is in field 4:
Lower(
If(Match(frequency, 1, 2, 3),
SubField(DataFromQueryURI_dim_pagePath, '/', 2),
If(Match(frequency, 4, 5),
SubField(DataFromQueryURI_dim_pagePath, '/', 3),
SubField(DataFromQueryURI_dim_pagePath, '/', 4)
)
)
) As klant,
Repeat for each of the other variable position fields.
If the subfields like klant, module etc are always in the same position if they exist then you don't need to count the delimiters. Simply use the subfield expressions as they are. If there's no value for the nth subfield then it'll return a null.
Hey Gysbert,
Thanks for the reply.. In this case the subfields are not always in the same position as the customers have different user(rights) types. The position of the module can differ or it may occur that there is no module present.
The main challenge I think I have is that for example if there are 4 or more positions, then the last position is the article-page, but there are cases that the last position is not an article but the module page. It depends on how extended the module can be and the type of user.
As you can see the can see the url can contain a module or no module and the last position can be an article, module and or a sub-module)
So I think I'm looking at multiple if statements.. Is that even possible and would it be the correct way to go?
Your preceding load structure is OK. In the preceding load, you will need something like this:
Example, if the substring count is 1, 2, 3, then klant is in field 2; if the count is 4 or 4, klant is in field3,
otherwise klant is in field 4:
Lower(
If(Match(frequency, 1, 2, 3),
SubField(DataFromQueryURI_dim_pagePath, '/', 2),
If(Match(frequency, 4, 5),
SubField(DataFromQueryURI_dim_pagePath, '/', 3),
SubField(DataFromQueryURI_dim_pagePath, '/', 4)
)
)
) As klant,
Repeat for each of the other variable position fields.
Hey Jonathan,
Many Thanks for your reply.. This will help me on my way