Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi QlikCommunity, I know that QlikView table boxes filter duplicated records in such a way that if two records are identical on all points listed in the table box, the table box will only show one line for that record like so:
Records/Values in database might be like this:
User ID Date Minutes Uploaded
123 12/13/2013 50 12/14/2013
123 12/13/2013 50 12/15/2013
(Uploaded field isn't in my qvds, but I am guessing something like it is hiding somewhere or snuck through somehow.)
QlikView TableBox shows this
User ID Date Minutes
123 12/13/2013 50
QlikView Straight Chart shows this
User ID Date Minutes
123 12/13/2013 100
Now, how do I make a QlikView Straight Chart that sums in such a way as to only count duplicated records once? Mine is doing this currently for one group that had some kind of technical issue and might have been reloaded or might have something about two processes feeding the same data table or something. The issue seems to be persistent and consistent in causing my sums to show double the minutes actually logged. I'm hoping for some simple way to make my formula spin its sums of whatever QlikView is doing itself to handle the duplicate records. Thanks!
As Rachel says, it is better to fix in the script. Still on the front end, this may work:
sum(aggr(sum(distinct minutes), user, date))
Use sum(distinct Minutes)
Alternatively use Minutes as a dimension and add an expression: =1. Then hide the expression.
Sorry, forgot to mention the Minutes field in the straight chart is a Sum(Minutes) expression, and if I had more than one day of activity I'd expect it to sum those. I do not want it to be inflated by the duplicate records though. The minutes field in the Table box is just Minutes, and would show whatever values were logged for a particular day by a particular user.
Not quite. I should have given a little more robust example. If I sum distinct minutes, but logged 50 minutes on two different days, it will come out to 50 minutes. (I just tried that on the expression in my script just now.)
I would want to sum the minutes for each unique user and date combination, since there should only be one distinct value within that combination. Formula for that? (We only see one value in the field on our log, but something is quietly creating duplicated records in background somehow.)
If I were you I'd try to get to the bottom of the duplicate fields rather than trying to fix in the charts. You might end up missing something and have spurious results go undetected!
You could add a count field to the table in your load script ('1' AS Count) and then create a straight table which sums the Count field to identify the duplicates. To fix it you will need to add a unique primary key to your table or load distinct values only.
I do have minutes as a dimension on a table box with the breakouts by day, but on the chart where I want to just total up the minutes.. actually, the straight chart doesn't have the date field in it since I am summing from all dates selected.
Straight table is like this:
User Id Minutes
123 100
When I have only the one day selected and only 50 minutes logged on that day.
If I have two days selected and 50 minutes on both of those, then I'd want the straight table to show 100 Minutes total.
Yeah, you're right Rachel. It's on a dashboard and I only see it happening for one group after checking through all the others, so I'm confident the dashboard is actually okay and something is wrong in the processing of the one group. I was just looking for a patch to use in the interim because I don't know how long it will take over-extended engineering team to get to my issue and resolve it. (Right now my workaround would just be export the daily counts to excel and do the sums there since the group is small enough for that.)
As Rachel says, it is better to fix in the script. Still on the front end, this may work:
sum(aggr(sum(distinct minutes), user, date))
Thanks, you nailed it and I've worked enough with Aggr() to understand that part is doing the thing I wanted about separating the minutes by user and date, and the distinct is only counting them once within each date. (Even so I stared at the chart to be sure the minutes all halved themselves when I pushed apply, and I'll audit some more to be sure I didn't just cause ten other problems. )
added helpful here since this expression went in to the correct answer guy's expression, and might've been correct if my initial report had been complete.