
Qlikview "Group By" alternative (High memory usage)
Hello,
I would like to analyze textual data from log files using QlikView. Here is some sample data:
ID, EventID, Message
1 aa Beginning of message here....
2 aa ....middle of message there....
3 aa ....end of message there
4 bb Another message
5 cc And another message
and so on.
If I try to group by EventID and concatenate the messages using "Group By", memory usage spikes and the load becomes completely impossible for approximately 2.5 million records. I don't have a database as the data source, so I can't perform the grouping there.
Is there any other way to achieve this in QlikView script without using Group By?
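For reference, the Group By approach described above presumably resembles the sketch below. The table name RawLog and the output names Grouped and FullMessage are hypothetical, and the third Concat() argument is an assumption that the fragments should be joined in ID order.

```qlikview
// Hypothetical names: RawLog stands for the already loaded log rows.
Grouped:
LOAD EventID,
     Concat(Message, '', ID) as FullMessage   // empty delimiter, fragments ordered by ID
Resident RawLog
Group By EventID;
```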
Thanks in advance,
Viesturs
Accepted Solutions


Viesturs,
Are the message contents ordered by EventID? By that I mean, do all the lines making up the complete message for each EventID appear consecutively?
If so, the script below should not overload your machine, as it processes one line at a time:
LOAD ID,
EventID,
Message,
if(EventID = Previous(EventID),
Peek('FullMessage') & Message,
Message) As FullMessage
From MyLogFile.log (...);
If not, you could load the data in first and then do the same as above using a resident load, ordered by EventID and ID:
T_Logfile:
LOAD * From MyLogFile.log (...);
LogMessages:
LOAD ID,
EventID,
Message,
if(EventID = Previous(EventID),
Peek('FullMessage') & Message,
Message) As FullMessage
Resident T_Logfile
ORDER BY EventID, ID;
DROP Table T_Logfile; // free the unsorted copy and avoid a synthetic key between the two tables
This will demand more memory for the sort operation, but may be necessary if the log data is not already sorted in the order the script needs. Even so, I would expect it to be somewhat less demanding than a Group By.
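Either script leaves one row per log line, and only the last row of each EventID holds the complete message (ID 3 for 'aa' in the sample data). A possible follow-up step, sketched here on the assumption that ID increases within each EventID, keeps just those rows without a Group By. The names CompleteMessages and CompleteMessage are made up for illustration.

```qlikview
// Sketch only: assumes the LogMessages table built above, where FullMessage
// holds the running concatenation for each EventID.
CompleteMessages:
NoConcatenate
LOAD EventID,
     FullMessage as CompleteMessage
Resident LogMessages
Where EventID <> Previous(EventID)   // keep only the first row of each EventID in the sorted input
ORDER BY EventID, ID desc;           // descending ID puts the complete message first

DROP Table LogMessages;
```

Sorting by ID in descending order puts the row with the complete message first within each EventID, so the Where clause keeps it and discards the partial rows that follow. This costs one more sort of the resident table.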
HTH
Jonathan
Edit: Henric's post is the same as my second proposal

You could do something along the lines of:
RawData:
Load * From Messages;
Data:
Load *,
If(EventID=Peek('EventID'), Peek('ConcatenatedMessage') & ' ') & Message as ConcatenatedMessage
Resident RawData Order By EventID, ID;
Drop Table RawData;
HIC



This solution helped a lot. Thanks. However, it's strange that QlikView has such bad performance with Group By.
