Hello,
I would like to analyze textual data from log files using QlikView. Here is some sample data:
ID, EventID, Message
1 aa Beginning of message here....
2 aa ....middle of message there....
3 aa ....end of message there
4 bb Another message
5 cc And another message
and so on.
If I try to group by EventID and concatenate the messages with "Group By", memory spikes and the load becomes impossible for approximately 2.5 million records. I don't have a database as the data source, so I can't perform the grouping there.
Is there any other way to achieve this in the QlikView script without using Group By?
Thanks in advance,
Viesturs
Viesturs,
Are the message contents ordered by EventID? By that I mean: do all the parts of the message for each EventID appear on consecutive lines?
If so, the script below should not overload your machine, as it processes one line at a time:
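// Same EventID as the previous input row: append to the running message.
// Otherwise start a new message.
LOAD ID,
     EventID,
     Message,
     if(EventID = Previous(EventID),
        Peek('FullMessage') & Message,
        Message) As FullMessage
From MyLogFile.log (...);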
If not, you could load the data first and then do the same as above using a resident load, ordered by EventID and ID:
T_Logfile:
LOAD * From MyLogFile.log (...);
LogMessages:
LOAD ID,
     EventID,
     Message,
     if(EventID = Previous(EventID),
        Peek('FullMessage') & Message,
        Message) As FullMessage
Resident T_Logfile
ORDER BY EventID, ID;
// Drop the unsorted copy so the two tables do not link on their shared fields.
DROP TABLE T_Logfile;
This will demand more memory for the sort operation, but it may be necessary if the log data is not already sorted the way the script needs. Even so, I would expect it to be somewhat less demanding than a Group By.
HTH
Jonathan
Edit: Henric's post is the same as my second proposal
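One thing to keep in mind with the scripts above: FullMessage is built up row by row, so only the last row of each EventID holds the complete text. If one row per EventID is what is needed, something along the following lines might work. It is only a sketch and has not been run against the real log. It assumes the LogMessages table from the second script, and the names Final and IsLast are made up for the example.

```
Final:
// Outer (preceding) load: keep only the flagged rows.
LOAD EventID,
     FullMessage
Where IsLast = 1;
// Inner load: sorted with the highest ID first within each EventID,
// so the first row seen for an EventID is the one holding the complete message.
LOAD EventID,
     FullMessage,
     If(EventID = Previous(EventID), 0, 1) As IsLast
Resident LogMessages
Order By EventID, ID desc;

// Hypothetical cleanup: drop the row-level table so it does not link to Final.
Drop Table LogMessages;
```

This adds a second sort, so it costs some extra memory and time, but it still avoids Group By.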
You could do something along the lines of:
RawData:
Load * From Messages;
Data:
// Peek() expects the field name as a quoted string.
Load *,
     If(EventID = Peek('EventID'), Peek('ConcatenatedMessage') & ' ') & Message as ConcatenatedMessage
Resident RawData Order By EventID, ID;
Drop Table RawData;
HIC
This solution helped a lot. Thanks. However, it's strange that QlikView has such bad performance with Group By.