10 Replies Latest reply: May 19, 2014 5:32 PM by Dariusz Mielczarek RSS

    Traffic simulation data analysis

      Hello people,


      I find myself stuck on the following problem:

      I have some data from a traffic simulation where I have the data of individual vehicles per second (also see simpiflied example). There is a field containing every second per vehicle. The second field effectively is a collection of seconds (from the first field) to aggregate the data by. The third field contains unique ID codes per individual vehicle.


      Each vehicle (v in the attached example) can have multiple timespans (I will call these 'timespans' for lack of a better explanation/word, r in the attached examples), meaning it can be in found in the simulation network during that time. I want to allocate each vehicle to the maximum timespan (r) in the dataset for analysis purposes, BUT! Only if it is part of that timespan (r) for equal or more than 2 seconds.


      I made a straight table chart with the timespans (r) as dimension, and then using the following expression to count the distinct amount of vehicles (v) that are allocated to that timespan:

      Count(DISTINCT If(r=Aggr(Max({1}r),v),v))

      This works fine, but I don't want to just allocate the vehicles (v) to the maximum timespan (r), but the maximum timespan of which that vehicle is part for 2 or more seconds (t). I've tried the following expression, which gives an error and no result:

      Count(DISTINCT If(r=Max({1}If(Aggr(Count(t),v,r)>=2,r)),v))

      I already found out that the error is a result of the lack of a TOTAL qualifier, but I cannot get any results (or at least not those I am looking for)


      Please find an example dataset / expressions in the attached example.


      Any help will be greatly appreciated!




        • Re: Traffic simulation data analysis
          Dariusz Mielczarek

          once again:


          for you calculation interesting are only those rows, where r>=2??

            • Re: Traffic simulation data analysis

              Hey there,

              What I'm trying to achieve there is to alocate the vehicle (v) to the highest timespan (r) of which it is part at least 2 seconds. I count the seconds (t) and make sure it is 2 or higher, hence the >=2. I know I can simply Count(t), because the dataset only contains unique values for every vehicle (v) in this field, and that every second is in there once.

                • Re: Traffic simulation data analysis
                  Dariusz Mielczarek

                  If i understand, for example for r=1 you expect result 2,


                  because there are vehicles 1,2,3,6, but vehicles 3 and 6 were only 1 time (1 sec).




                    • Re: Traffic simulation data analysis

                      vehicle 3 and 6 should indeed not be alocated to r=1, but not because of the time they are part of it. They are part of later timespans (r) for 2 seconds and thus should be alocated to those. The total expected results in a straight table with r as dimension would be:


                      r, count(v), concat(v)

                      1, 1, (2)

                      2, 2, (1, 3)

                      3, 1, (4)

                      4, 0, -

                      5, 1, (6)


                      In the example the only vehicle for which the result between checking for the max(r) of which it is part for at least 2 seconds would be vehicle 4, which is part of r=4 for only 1 second and is thus alocated to r=3.

                      • Re: Traffic simulation data analysis
                        Dariusz Mielczarek

                        this expression:

                        sum(if(aggr(Count(DISTINCT t),v,r)>=2,1))


                        with r as dimension


                        should give result, i think, you need

                          • Re: Traffic simulation data analysis

                            The expression you propose will alocate each vehicle (v) to each timespan (r) of which it is part for at least two seconds, not just the latest


                            I'm trying to alocate each vehicle distinctly to only the highest value of r of which it is part for at least 2 seconds. This is why my formula contains the Max() function.


                            When I look in the qlikview help at the advanced aggregation section it states the requirement for the TOTAL qualifier. But using it will result in in alocating each vehicle to the overal Max(r)=5, and thus resulting in no (or wrong) results.


                            For example, when using

                                 Count(DISTINCT If(r=Max(TOTAL If(Aggr(Count(DISTINCT t),v,r)>=2,r)),v))


                            the bold part of the function will check r (the dimension) against the Max(TOTAL r), which is 5 for each vehicle because of the TOTAL qualifier. Because only 1 vehicle is actually part of r=5, the expression will only yield the correct result on the last row of the table, showing 0 for each other row. Without the use of the TOTAL qualifier however it will yield no result at all.