Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi there.
Newbie trying to transform a text file that looks like the following:
Request: Login | Date: 200812160750 | Elapsed Time: 367 | User Identifier: 4709
Request: Profile Search | Date: 200812160905 | Elapsed Time: 969 | User Identifier: Mann123
Request: View Profile | Date: 200812160905 | Elapsed Time: 54 | User Identifier: 145992
There's essentially two delimiters. The ":" colon and the "|" pipe.
My script looks like this for the *FIRST* delimiter ":".
LOAD
@1,
@2,
@3,
@4,
@5
FROM
(txt, utf8, explicit labels, delimiter is ':', no quotes);
How do I transfer the second delimiter "|" pipe ?
Thank-you in advance!
Would this work for you?
LOAD
recno() as RecordNumber,
subfield(@1, ':',1) as v1, subfield(@1, ':',2) as v1Val,
subfield(@2, ':',1) as v2, subfield(@2, ':',2) as v2Val,
subfield(@3, ':',1) as v3, subfield(@3, ':',2) as v3Val,
subfield(@4, ':',1) as v4, subfield(@4, ':',2) as v4Val,
subfield(@5, ':',1) as v5, subfield(@5, ':',2) as v5Val
FROM
(txt, utf8, explicit labels, delimiter is '|', no quotes);
Would this work for you?
LOAD
recno() as RecordNumber,
subfield(@1, ':',1) as v1, subfield(@1, ':',2) as v1Val,
subfield(@2, ':',1) as v2, subfield(@2, ':',2) as v2Val,
subfield(@3, ':',1) as v3, subfield(@3, ':',2) as v3Val,
subfield(@4, ':',1) as v4, subfield(@4, ':',2) as v4Val,
subfield(@5, ':',1) as v5, subfield(@5, ':',2) as v5Val
FROM
(txt, utf8, explicit labels, delimiter is '|', no quotes);
Thank-you - your script suggestion worked!
That did the trick. I now have data in this tool to play with.
Or if the fields are always in the same order, and including a little formatting, I'm guessing this would work and give you more meaningful field names rather than legend/value pairs:
LOAD
subfield(@1,': ',2) as "Request"
,date(date#(left (subfield(@2,': ',2),8),'YYYYMMDD'),'MM/DD/YY') as "Date"
,time(time#(right(subfield(@2,': ',2),4),'hhmm'),'hh:mm') as "Time"
,interval(subfield(@3,': ',2)/100,'hh:mm:ss.ff'), as "Elapsed time"
,text(subfield(@4,': ',2)) as "User Identifier"
FROM
Your timing is impeccable.
I was actually going to start tackling the wacky date/time format and was just about to post a question on that. 🙂
I was originally using an open source log analysis tool called Splunk (http://www.splunk.com) - I had to do some regular expressions to parse the log file - but I found the reporting feature to be quite weak.
I think the QlikView tool will allow me to monitor, report and possibly trigger email alerts.
Thank-you again! I'll let you know how your script suggestion works out...
Hi John,
I got the following error with your script suggestion:
Syntax error, missing/misplaced FROM:
LOAD
subfield(@1,': ',2) as "Request",
date(date#(left (subfield(@2,': ',2),8),'YYYYMMDD'),'MM/DD/YY') as "Date",
time(time#(right(subfield(@2,': ',2),4),'hhmm'),'hh:mm') as "Time",
interval(subfield(@3,': ',2)/100,'hh:mm:ss.ff'), as "Elapsed time",
text(subfield(@4,': ',2)) as "User Identifier"
FROM
LOAD
subfield(@1,': ',2) as "Request",
date(date#(left (subfield(@2,': ',2),8),'YYYYMMDD'),'MM/DD/YY') as "Date",
time(time#(right(subfield(@2,': ',2),4),'hhmm'),'hh:mm') as "Time",
interval(subfield(@3,': ',2)/100,'hh:mm:ss.ff'), as "Elapsed time",
text(subfield(@4,': ',2)) as "User Identifier"
FROM
-----
Well, I probably missed a comma, or have misbalanced parentheses or something. QlikView isn't very good at telling you the real problem, I've found. "Missing/Misplaced FROM", more often than not, means "syntax error in your list of fields".
Ah, looks like I put an EXTRA comma after the last parenthesis in the Elapsed Time line. Remove that comma, or just copy the below.
LOAD
subfield(@1,': ',2) as "Request"
,date(date#(left (subfield(@2,': ',2),8),'YYYYMMDD'),'MM/DD/YY') as "Date"
,time(time#(right(subfield(@2,': ',2),4),'hhmm'),'hh:mm') as "Time"
,interval(subfield(@3,': ',2)/(24*60*60*100),'hh:mm:ss.ff') as "Elapsed time"
,text(subfield(@4,': ',2)) as "User Identifier"
FROM
;
Oh, and perhaps I should explain what this is doing. Hard to learn otherwise.
The subfield() function breaks a field apart into more than one field based on a delimiter, so that's how we're handling the multiple delimiters here - one in subfield, the other in the FROM. We're telling it we want the second subfield and that ': ' is our delimiter, so that should return the value portion.
Date#() and time#() functions build a date or time based on a particular format of the input, so that first string is the format of the input.
Date() and time() functions tell the system how you WANT the date and time to be formatted, so the second string is the format we want.
Interval() tells the system how to format an interval of time. However, now that I look at it, I messed it up. I was assuming that your numbers were hundredths of a second. But an interval is the fraction of an entire day, not of a second, so I should have divided by (24*60*60*100). Fixed.
I explicitly tell it that the User Identifier is text() because we've had some issues in the past with IDs like 043E07 being interpreted as scientific notation, and coming out really weird. So when I see a mixture of text and things that could be interpreted as a number, I wrap it in text() just in case.
Hi John.
It works! I should have reviewed the script myself more. Thank-you.
My "Elasped Time" is really just milliseconds but it doesn't need to be modified at this time. I'll take a swag at updating your code this weekend to fix it up.
My goal this weekend is to create some visual charts/graphs (usual the app itself).
Thanks again!
The TextBetween function can also be very useful for this kind of parsing. I like it because it's self documenting and is not dependent on field order. Read the record without a delimiter and then you can do functions like:
trim(TextBetween(@1, 'Request:', '|')) as Request
trim(TextBetween(@1, 'Elapsed Time:', '|')) as ElapsedTIme
etc. You'll still need to do date/time interpretation as before.
-Rob
I'll need to remember that one.