18 Replies Latest reply: Jul 1, 2016 12:04 AM by Fikri Shihab RSS

    Count distinct based on number of events

    Fikri Shihab

      Hi Qlik Community,

       

      I need to create a chart and table where I need to count number of terminals that are bouncing (out of network or coming in to the network for very short time) and also another count based on different criteria, which is based on number of events (e.g. number of events where terminal is out of network within observation period).

       

      The key information in my table for this is as follows:

       

      Time Stamp, Terminal Id, Network status

       

      Time stamp periodicity is every 5 minutes (each row shows 5 minutes worth of information)

      The content of network status is IN_NETWORK or OUT of NETWORK, with some additional information (text) indicating how long the terminal is in the network.

       

       

       

      I'd like to count:

       

      1. Number of terminals which are bouncing (changing from in-network to out of network or vice versa within certain duration, e.g. 2 or 3 consecutive time stamp samples). Please note each row in the table is 5 minutes data, so 2 consecutive samples where terminal is out of network is 10 minutes.
      2. Number of  terminals which are out of network for 5 minutes ( 1 time stamp sample), 10 minutes ( 2 time stamp samples) and 15 minutes (3 time stamp samples), etc.  This does not need to be with the criteria as above (changing status from in to out or vice versa for certain duration)
      3. Table showing for each terminal number of times where it is out of network more than 10  minutes (e.g. 2 time stamp samples).


      The dimension will be Time stamp and the expression will be the above count. 

       

      What would be the right expression for the above?

       

      Appreciate any help on this.

        • Re: Count distinct based on number of events
          Manish Kachhia

          It would be better if you have provided few lines of sample data along with required output in (may be) excel file.

          It is difficult to work without data or prepare data ourselves to work.

            • Re: Count distinct based on number of events
              Fikri Shihab

              Hi,

               

              Please find attached a sample file. Please note that in this file, out of network is indicated by a text that contains word "UNDETECTED". The terminal status is the right name as opposed to network status in the original posting.  This first 3 columns are the area of interest.

              • Re: Count distinct based on number of events
                Fikri Shihab

                My immediate need is to count for the case no 2 above; Number of  terminals which are out of network for 5 minutes ( 1 time stamp sample), 10 minutes ( 2 time stamp samples) and 15 minutes (3 time stamp samples), etc.


                I tried this expression for 2 time stamp samples but did not display anything.


                count(DISTINCT{<[Terminal Age]={'=count("*UNDETECTED*")>2'}>}([Terminal Name]))


                Any suggestion?

                  • Re: Count distinct based on number of events
                    Fikri Shihab

                    Terminal Age above is actually terminal status column in my sample file.

                      • Re: Count distinct based on number of events
                        Digvijay Singh

                        You should use it like this -

                        =count(DISTINCT{<[Terminal Status]={"=count([Terminal Status]='*UNDETECTED*')>2"}>}[Terminal Id])

                         

                        But confused when I found that no terminal ID in your sample data is having more than one timestamp value, so couldn't understand how terminal ID count can be more than 1 UNDETECTED for more than 5 mins.

                         

                        But your above expression give positive result so I am trying to understand logic.

                          • Re: Count distinct based on number of events
                            Fikri Shihab

                            Hi Digvijay,

                             

                            Thank you for the correction of the syntax in my expression.

                             

                            Some of filed names are not correct in my expression above as I used different filed names in my real data (sorry, forgot to change it for the example file), but the idea is basically to count number of terminals that has undetected for more than 5 minutes, or 10 minutes or etc. (this can be variable inputs).

                             

                            Each time stamp is 5 minutes interval, so the intention is simply to count how many distinct terminals shows undetected more than let's say 10 minutes (which is associated with 2 time stamp samples)  over period of time (e.g. every 30 minutes, or 1 hour or a day).

                             

                            I tried =count(DISTINCT{<[Terminal Status]={"=count([Terminal Status]='*UNDETECTED*')>2"}>}[Terminal Id]) in my sample file and something similar in my real data and now it is showing something but not sure if the chart is even correct as the number seems hig.

                            Any suggestion?

                        • Re: Count distinct based on number of events
                          Sunny Talwar

                          May be this:

                           

                          =Count(DISTINCT{<[Terminal Id] = {"=Count({<[Terminal Status] = {'*UNDETECTED*'}>} [Terminal Id]) > 2"}>} [Terminal Id])

                            • Re: Count distinct based on number of events
                              Digvijay Singh

                              Thanks for correcting it, I realized that I missed set expression in inner count.

                              Not sure but looks like dimension needs to be a calculated one to have buckets of 5 min, 10 min etc.

                                • Re: Count distinct based on number of events
                                  Sunny Talwar

                                  I don't think the expression is working still. I am still having a hard time wrapping my head around the requirement.

                                    • Re: Count distinct based on number of events
                                      Digvijay Singh

                                      I think I understood a bit

                                      He needs to count terminal ids which are 'undetected' for two consecutive time periods, similarly for 3 consecutive time period or 4 etc...

                                      I think the complexity lies in defining calculated dimension defining buckets of these time periods, more complexity may be as time periods will overlap to cover all combinations.

                                    • Re: Count distinct based on number of events
                                      Fikri Shihab

                                      I tried the latest expression suggested by Sunny

                                       

                                      *=Count(DISTINCT{<[Terminal Id] = {"=Count({<[Terminal Status] =

                                      {'UNDETECTED')}>} ) > 2"}>} ) *

                                       

                                      but this is not showing any result in my chart, and this is what I also did

                                      before.

                                       

                                      The expression suggested by Digvijay is showing result but I am not so sure

                                      if the result in my chart is correct. Just feel the number is quite high.

                                       

                                      For the time bucket on the dimesnion, I simply use

                                       

                                      =Timestamp((Ceil(,1/IntervalName)), 'DD-MMM-hh:mm')

                                       

                                      I added the interval name as part of input box and have some numeric values

                                      to define my interval.

                                      In the LOAD script I added this:

                                       

                                      LOAD * INLINE [

                                       

                                          IntervalSize, IntervalName

                                       

                                          288, 5  Minutes

                                          144, 10 Minutes

                                          96,  15 Minutes

                                          48,  30 Minutes

                                          24,  1  Hour

                                      ];

                                       

                                      Any suggestion on the expression or does the suggested expression from

                                      Digvijay shows the correct result?

                                       

                                       

                                      On Thu, Jun 30, 2016 at 12:36 PM, Digvijay Singh <qcwebmaster@qlikview.com>

                                      • Re: Count distinct based on number of events
                                        Fikri Shihab

                                        I have 3 separate requirements as explained in my initial posting, but

                                        let;s deal with requirement number 2 first as it may be simpler than

                                        others.  This is simply counting number of distinct terminals experiencing

                                        more than 2 or 3 or more undetected status within certain period (e.g.

                                        every hour).

                                         

                                        Requirement number 1 is a bit complex where I need to count number of

                                        distinct terminals experiencing change of status from in network or

                                        undetected or vice versa for a period of at least 2 samples (10 minutes).

                                         

                                        Requirement number 3 is pretty much similar with number 2, just in table

                                        format.

                                         

                                        I hope this clarifies.

                                          • Re: Count distinct based on number of events
                                            Digvijay Singh

                                            Do you expect repeated terminal id for different consecutive timestamp values like 29-05-2016 00:00:00, 29-05-2016 00:05:00 etc.

                                            Also shouldn't we count terminal status as 'Undetected' in the inner count  and check with >2?

                                             

                                            Can you check attached sample I have created, and changed data with the intention to repeat HW351.1281 for first two time period. So I understand from this as, this particular terminal id is undetected for two time periods and it should be counted as 1 in the 10 min interval.

                                            Your dimension expression you shared has missing Timestamp field I think. May be a typo.

                                            Capture.JPG

                                              • Re: Count distinct based on number of events
                                                Fikri Shihab

                                                Hi,

                                                 

                                                So I understand from this as, this particular terminal id is undetected for two time periods and it should be counted as 1 in the 10 min interval.


                                                Yes, this is correct.


                                                BTW,   for this question, Do you expect repeated terminal id for different consecutive timestamp values like 29-05-2016 00:00:00, 29-05-2016 00:05:00 etc.


                                                You are correct, the same terminal ID should show right value after 5 minutes. The sample file I sent seems to be incorrect and corrupted.  Please find the correct one attached.


                                                I assume you are referring to requirement number 1 for counting how many bouncing terminals that have bounces more than 1, 2, 3  for certain observation period (e.g. one hour or one day or one week).

                                                 

                                                Bouncing terminal is defined as a terminal whose status changes from  one state to another state (in network or undetected) only for short period of time (two time stamp samples or 10 minutes -certainly we can change it to shorter or longer one) and then go back to its previous state (either undetected or in network). In other words, these terminal are considered unstable terminal.  In this case undetected is simply when status change only happens in two samples (e.g. 10 minutes).

                                                 

                                                The requirement is to count how many terminals that behave like this per observation period (e.g. one day) with a criteria on the number of bounces (per hour or per day), either 2 or 3 bounces per hour or per day.

                                                 

                                                So, for instance the output would be number of terminals that bounces 3 times per observation period .

                                                  • Re: Count distinct based on number of events
                                                    Digvijay Singh

                                                    See attached after applying new file, Can you confirm manually if output is  shown correctly?

                                                    Capture.JPG

                                                      • Re: Count distinct based on number of events
                                                        Fikri Shihab

                                                        Hi,

                                                         

                                                        This seems only shows number terminals with undetected status more than x

                                                        samples, which is the same result as the expression that was suggested

                                                        earlier

                                                         

                                                        =count(DISTINCT{<[Terminal Status]={"=count([Terminal

                                                        Status]='UNDETECTED')=>2"}>}[Terminal Id]).

                                                         

                                                         

                                                        The  intention is to count

                                                         

                                                         

                                                           1. Number of bouncing terminals, where bouncing is defined as terminal

                                                           that changes status from in network to undetected for 10 minutes (this can

                                                           be changed as required) and then go back to in network again  or vice

                                                           versa, from undetected to in network (only for 10 minutes) and then go back

                                                           to in network

                                                           2. Number of bounces that happens for 10 minutes or 15 minutes etc.

                                                         

                                                         

                                                        Any idea how this can be calculated with Qlikview?

                                                         

                                                         

                                                        On Thu, Jun 30, 2016 at 8:43 PM, Digvijay Singh <qcwebmaster@qlikview.com>