7 Replies Latest reply: Apr 26, 2010 10:27 AM by johannesmilling RSS

    Compare Dates and count the missmatches



      I am trying to make an expression that counts the number of deliveries made out of time. I want to count both those which are too late and those that are too early.

      My expression looks like this atm.

      count ( if ( [ Actual Delivery Date ] <> [ Promised Delivery Date ], Delivery_KEY))

      The problem is that I get a higher count with this expression than with:

      Count ( Delivery_KEY)

      Shouldn't I get fewer counts with the first expression? What am I missing?

      And also. Is there a way of counting unique substrings of the field Delivery_KEY? The key is made up by a delivery number, and a row number (since one delivery can be composed by several delivery rows, and there are only an identifying attribute for the rows, and not for the actual delivery).
      For example the keys can look like this: 100|10

      I would therefore like to be able to count only one of the keys starting with "100", so that the total count in this case would become 2, and not 3.

      Any help will be greatly appreciated, even it it only solves a part of my problem. Any suggestion on where to look for answers will also be appreciated.



        • Compare Dates and count the missmatches
          Michael Solomovich

          1. Is it a front-end expression? I assume that Actual and Promised dates are in different logical tables.

          2. I'd do it in the script using count with GROUP BY subfield(DeliveryKey, 'I', 1)

            • Compare Dates and count the missmatches


              1. The expression is made in the expression tab in a regular chart or table, if that is what you mean by front-end. And no, the actual and promised dates are in the same data table, as are the keys. For example, one line in the data table can look like this (simplified)
              Delivery_KEY ActualDate PromisedDate
              100|10 08.05.24 08.05.23

              2. I am somewhat handicapped (I think) by the fact that I am importing the data like this:
              Binary data.qvw;

              Therefore it doesn't seem like I can use regular commands to create new or modify fields and such. If I could import the data using Load (from a file) or through a DB connection I would know how to find the answers I need. Unfortunately I cannot change the way I import my data. When I try to write something in the script as one would normally do, and reload, all the data vanishes. Exactly how the binary data is loaded from the qvw file I do not know.

              Forgive my lack of understanding regarding the issue of retrieving data , but I am new to this.



            • Compare Dates and count the missmatches


              you can try this :

              sum(fabs(sign( {<Delivery_KEY={'100|*'} >} [ Actual Delivery Date ] - [ Promised Delivery Date ] )))

              or if you want to change the Delivery_key, use a variable.



                • Compare Dates and count the missmatches

                  First of all, thank you for your replies. It has really helped me moving forward.


                  I've tried alot of different things now, and this is the closest I have come to solve the problem.

                  if( date#( [ Actual Delivery Date ] - [ Promised Delivery Date ], 'DD') < 0, 1 )

                  Using the deliveries as a dimension, I can get this to flag each delivery where the dates don't match. Don't ask me why it's "<" instead of "<>". I don't understand the logic behind that part, but it works, i've checked and double checked.

                  The problem is that I cannot count or sum all the 1's into a total count when using time as the dimension. I've tried a number of different solutions to this.

                  sum(if(date#([Actual Delivery Date ] - [ Promised Delivery Date ],'DD')<0, 1))

                  This returns the value 0 for example, even though my current selection contains posts that get the value 1 from the if statement. i've also tried the same thing using count to simply count the number of 1's, but that too was unsuccessful.

                  Here is a screenshot of a small selection of deliveries. As you can see in the right column, only one post has the '1' flag, as it should. Where there are no Actual Dates it is assumed that the promised date matches the actual date.

                  I've almost managed to solve the issue with only counting posts containing '|10' in their DeliveryID to get the actual number of deliveries (thanks to the previous posts), and not all of the delivery rows by using this expression:

                  sum(SubStringCount(Leveransprec_KEY, '|10'))

                  A strange thing is that with the selection shown above, containing only 9 matching posts, it actually counts all 14 of them. This however is an exception. Most of the time the expression counts the correct amount of deliveries making it hard for me to understand what the logical error is.

                  The second column to the left contains the expression shown in the column header. Why the result of the expression varies I cannot tell, but it sure is strange that the result isnt the same for all posts containing '|10'.


                  Is it me doing all these things wrong, or could there be a logical error to the statements I am using (SubStringCount and SubField)?



                • Compare Dates and count the missmatches

                  Hi Johannes

                  The reason why your expression should be .... Actual delivery Date - promised delivery date <0 instead of <>0 is that where actual delivery date is empty Actual delivery Date - promised delivery date isn't 0 but null (empty). Take a look of the enclosed.

                  What puzzels me is the subfield. I can't get it to be right either but get another result than you.

                  Could you elaborate on when you get substringCount to be 14 instead of 9 ?




                    • Compare Dates and count the missmatches

                      Hello, and thank you for clarifying.

                      When it comes to the strange behaviour of the subfield I am starting to believe that it is due to the weak computers of my university. The file I am working in uses around 1 - 1.3 GB of RAM, and the computers here only have 2. Could it be that the computer skips some of the lines in the table because it just can't keep up?

                      I got the subfield thing to work btw, but only when creating an entirely new field using the same subfield expression in the script. I don't have the file with me right now but I can show the code later.

                      I actually needed this for a competition in business intelligence, which has now ended. But since I hate leaving things undone I would like to get this to work, even though I can't use it anymore.

                      Regarding the substringCount being 14 instead of 9 I'm sorry to say I cannot elaborate. I have since then tried probably 10 different solutions with varied results and cannot seem to get the same result again. All I know is that I have gotten results that are both higher and lower than what they should be, and that they sometimes work for one selection, but not another.