Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Summing Volume based on Timestamp

Although this was posted in a previous discussion, and I have had some great help, I still haven't managed to come up with the right answer.  I have created a cut down worksheet to post here for input.

Basically, I have a table which has read in DATE, DATE and TIME, principal code (PRIN_CODE), a site code (SITE_CODE) and a volume (VOLUME).

And what I need, has to be a fairly standard sort of thing that many application developers would have wrestled with and won.  I need to create a pivot table chart that has three dimensions DATE, PRIN_CODE and SITE_CODE.  The data has either one or multiple snapshots taken throughout the day of the current volume per principal per site. What I need to calculate are two expressions, one calculates the maximum volume per day, per principal per site, and another shows the volume as of the last (or latest) snapshot.

This should be easy for most of you out there . . . I have the table and the calculation for the maximum volume sorted - that was the easy bit.  But for showing the volume as it stood at the end of the day . . . well I am truly stumped.  Toni has had some great suggestions, but as you can see his latest does not seem to do what we had intended:

//Sum the [Storage_Details VOLUME] related to the highest [Storage_Details Actual PROCESSED_DATE] for the current selection

=sum(

//Set expression looks at current data set with the [Storage_Details Actual PROCESSED_DATE] changed to max date value

{$<[Storage_Details Actual PROCESSED_DATE]={

                            //The [Storage_Details Actual PROCESSED_DATE] is a timestamp, hence Max value need to be formatted

                            //the max value is found by aggregating over the chart dimensions, so all dimensions should be added in their correct order

                            "=TimeStamp( Aggr(Max([Storage_Details Actual PROCESSED_DATE]), [Storage_Details PROCESSED_DATE], [Storage_Details PRIN_CODE], [Storage_Details SITE_CODE]) )"

}>} [Storage_Details VOLUME])

The timestamp portion certainly appears to be working, but when combined with the set notation, it seems to return only the first snapshot of stock holdings and not the latest as I would have expected.  And it has some even stranger results - if you constrain the results and selected principal 455 and date 3-Jan-2011, then the last snapshot is three zeroes for the three sites.  I am not sure if this is because there were three sites for maximum, but only two at the end of the day?

There are mega brownie points for whoever can help me with this . . . double points if you are a Mom - happy Mothers' Day.

Regards

Ian

1 Solution

Accepted Solutions
johnw
Champion III
Champion III

1) Yes, wrapping the aggr() in a sum() allows for a total, just like you did with your peak volume.

2) I use the actual processed date because I need a timestamp.  So the idea is that, for each of the three dimensions actually displayed on the report, I consider every timestamp separately.  That's what doing an aggr() with timestamp as the final dimension allows.  For each timestamp, if it is the most recent timestamp for the three dimensions (rank = 1), then I use the Storage_Details VOLUME from that timestamp.

3) If I understood the change, this seems to work.

sum(aggr(if([Storage_Details Actual PROCESSED_DATE]
           =max(total <[Storage_Details PROCESSED_DATE]
                      ,[Storage_Details PRIN_CODE]>
            [Storage_Details Actual PROCESSED_DATE])
           ,[Storage_Details VOLUME])
        ,[Storage_Details PROCESSED_DATE]
        ,[Storage_Details PRIN_CODE]
        ,[Storage_Details SITE_CODE]
        ,[Storage_Details Actual PROCESSED_DATE]))

4) Probably the same answer as #1, but I guess I'm not sure exactly what you're after.  I also don't want to keep creating iteration after iteration of the same basic idea, though.  Tools that may help you are rank()=1, something=max(...), and firstsortedvalue(), combined with appropriate aggr() functions.

5) I don't really have the time today to analyze exactly why an incorrect expression is yielding incorrect results.  Chances are you're doing aggregations at the wrong level or something.  The expression looks similar enough to the one I wrote for #3 that this seems likely the case, as I can see I aggregate a bit differently.

6) I don't know anything about using the menu option for formula auditing in Excel.  When I want to understand a formula step by step, I build it step by step.  Perhaps there is or should be a better way, but that's what I do.

View solution in original post

4 Replies
johnw
Champion III
Champion III

This seems to work for the example data:

aggr(if(rank([Storage_Details Actual PROCESSED_DATE])=1,[Storage_Details VOLUME])

    ,[Storage_Details PROCESSED_DATE],[Storage_Details PRIN_CODE]

    ,[Storage_Details SITE_CODE],[Storage_Details Actual PROCESSED_DATE])

Not applicable
Author

Hi John,

Brilliant . . . I have to agree with you that this should be way easier in such a great tool . . . but boy have I struggled.  Not to be a nuisance, but I am still a tad short of a full yureka moment.  Could you see if you can cast some light on my questions below:

1.  If I enable a total against the PRIN_CODE, I get a volume for the maximum, but a dash for the last daily snapshot.  Is the best solution just to wrap the aggr() function in a sum function?

2.  Now that I have a working expression, can you explain why when you aggr(), you use the [Storage_Details Actual PROCESSED_DATE] as the last dimension?

3.  Seeing the data now, what I am getting is the last snapshot for each site that has held stock during the day for that principal.  However, I suspect, and am currently checking with the management team at the warehouse, that they currently charge on the results of the last snapshot.  So for instance if we select PRIN_CODE=455 and PROCESSED_DATE='3-Jan-2011', the maximum volume is shown in 3 locations, but by the end of the day, stock is only listed in two.  The formula currently still lists the volume for the last snapshot for that location.  What I need it to do is to show zero volume for STGIN on that date as there was no stock in it when the last snapshot was taken.

4.  What would be the easiest way to sum this up per principle per week / month / quarter / year?

5.  This question is so that I understand a little bit more about how QV works, but from other formula I had tried to sum the last snapshot (which logically, at least to me seem workable), for example:

=sum(

if([Storage_Details Actual PROCESSED_DATE]=aggr(Max([Storage_Details Actual PROCESSED_DATE]),[Storage_Details PROCESSED_DATE], [Storage_Details PRIN_CODE],[Storage_Details SITE_CODE]),

  [Storage_Details VOLUME],0

)

)

This resulted in very strange results, whereby almost all of the entries for principals with multiple snapshots was zero, whereas the principles with only a single snapshot were correct.  However, scrolling theough the pivot table, suddenly on the 4-Jan-2011 for 455 for STGIN, a value is listed.  This is totally confusing . . . I cannot seem to grasp how QV is processing the data.

6.  Lastly, one of the things in Excel that increased mt learning exponentially was the menu option for formula (or expression) auditing.  I not only am able to work my way step by step through the formula, but it also gives me huge insights into how Excel actually works.  Wouldn't that functionality greatly enhance not only QV but also developers understanding about how it works?

Thanks

Ian

johnw
Champion III
Champion III

1) Yes, wrapping the aggr() in a sum() allows for a total, just like you did with your peak volume.

2) I use the actual processed date because I need a timestamp.  So the idea is that, for each of the three dimensions actually displayed on the report, I consider every timestamp separately.  That's what doing an aggr() with timestamp as the final dimension allows.  For each timestamp, if it is the most recent timestamp for the three dimensions (rank = 1), then I use the Storage_Details VOLUME from that timestamp.

3) If I understood the change, this seems to work.

sum(aggr(if([Storage_Details Actual PROCESSED_DATE]
           =max(total <[Storage_Details PROCESSED_DATE]
                      ,[Storage_Details PRIN_CODE]>
            [Storage_Details Actual PROCESSED_DATE])
           ,[Storage_Details VOLUME])
        ,[Storage_Details PROCESSED_DATE]
        ,[Storage_Details PRIN_CODE]
        ,[Storage_Details SITE_CODE]
        ,[Storage_Details Actual PROCESSED_DATE]))

4) Probably the same answer as #1, but I guess I'm not sure exactly what you're after.  I also don't want to keep creating iteration after iteration of the same basic idea, though.  Tools that may help you are rank()=1, something=max(...), and firstsortedvalue(), combined with appropriate aggr() functions.

5) I don't really have the time today to analyze exactly why an incorrect expression is yielding incorrect results.  Chances are you're doing aggregations at the wrong level or something.  The expression looks similar enough to the one I wrote for #3 that this seems likely the case, as I can see I aggregate a bit differently.

6) I don't know anything about using the menu option for formula auditing in Excel.  When I want to understand a formula step by step, I build it step by step.  Perhaps there is or should be a better way, but that's what I do.

Not applicable
Author

Hi John,

Thank you very, very much.  Does exactly what I was after and should allow me to build some measures that will give the management team here a good idea of the impact of changing the charging mechanisms for clients.

You have been an amazing help.

Thanks again.

Regards

Ian