Not applicable

Qlikview "Group By" alternative (High memory usage)

Hello,

I would like to analyze textual data from log files using QlikView. Here is some sample data:

ID    EventID    Message
1     aa         Beginning of message here....
2     aa         ....middle of message there....
3     aa         ....end of message there
4     bb         Another message
5     cc         And another message

and so on.

If I try to group by EventID and concatenate the messages, "Group By" makes memory usage spike, and it is completely impossible to do it that way for approximately 2.5 million records. I don't have a database as the data source, so I can't perform the grouping there.

Is there any other way to achieve this in QlikView script without using Group By?

Thanks in advance,

Viesturs

1 Solution

Accepted Solutions
jonathandienst
Partner - Champion III

Viesturs,


Are the message lines ordered by EventID? By that I mean, do all the parts of the message for each EventID appear on consecutive lines?

If so, the script below should not overload your machine, as it processes one line at a time:


     LOAD ID,
          EventID,
          Message,
          // Same EventID as the previous input row: append to the running text
          if(EventID = Previous(EventID),
               Peek('FullMessage') & Message,
               Message) As FullMessage
     From MyLogFile.log (...);


If not, you could load the data first and then do the same as above using a resident load, ordered by EventID and ID:


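     T_Logfile:
     LOAD * From MyLogFile.log (...);

     LogMessages:
     LOAD ID,
          EventID,
          Message,
          if(EventID = Previous(EventID),
               Peek('FullMessage') & Message,
               Message) As FullMessage
     Resident T_Logfile
     ORDER BY EventID, ID;

     // Drop the temporary table so it does not link to LogMessages through a synthetic key
     DROP Table T_Logfile;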


This needs more memory for the sort operation, but it may be necessary if the log data is not already sorted by EventID. I would still expect it to be somewhat less demanding than a Group By.
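
Note that the scripts above keep every input row, so FullMessage is a running concatenation and only the last row of each EventID holds the complete message. If one row per EventID is wanted, a second pass along these lines might work. This is a minimal sketch, not tested against the real log file; it assumes the LogMessages table and field names from the script above, and FinalMessages is just an illustrative table name:

```qlikview
// Hypothetical follow-up step: keep only the last (complete) row per EventID.
FinalMessages:
LOAD EventID,
     FullMessage
Resident LogMessages
// After the descending sort, the first row seen for each EventID is its last ID
Where EventID <> Previous(EventID)
ORDER BY EventID, ID desc;

// The row-by-row table is no longer needed
DROP Table LogMessages;
```

With the sample data from the question, this would leave one row each for aa, bb and cc, with the aa row carrying the three message fragments joined together. It costs a second sort, so it is only worth doing if the intermediate rows are not needed.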


HTH

Jonathan


Edit: Henric's post is the same as my second proposal



Logic will get you from a to b. Imagination will take you everywhere. - A Einstein


3 Replies
hic
Former Employee

You could do something along the lines of

RawData:
Load * From Messages;

Data:
Load *,
  If(EventID=Peek('EventID'), Peek('ConcatenatedMessage') & ' ') & Message as ConcatenatedMessage
  Resident RawData Order By EventID, ID;

Drop Table RawData;

HIC

Not applicable
Author

This solution helped a lot. Thanks. However, it's strange that QlikView has such bad performance with Group By.