Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Cannot count total distinct individuals when filtering by year and/or quarter

I'm pulling data from two datasheets and want to count unique individuals by doing a temporarily resident load. How would I enable the Count(Distinct([people])) to count unique individuals for the entire year, but then also have a quarter filter be able to count(distinct [people])? Thanks in advance!

10 Replies
swuehl
MVP
MVP

You want to do this in the script, right?

How does your data look like, could you post some sample lines of data and your expected result?

In general, I would look into GROUP BY clauses with appropriate calendar fields and then use COUNT(DISTINCT FIELD) aggregation function.

Not applicable
Author

Hi swuehl,

I am using a GROUP BY clause but then I can only filter by Year (you can see below that I removed Quarter from the Group By statement. If I filter by both Year and Quarter, some people get counted twice (once per quarter) and I do not get Total unique individuals.

Here's a general sense of the script:

Tracking:

Load ...

   [F and L Name] as [Partner Name],

   ...

   FROM [table1.xlsx]

Concatenate (Tracking) Load ...

    [F & L  Name] as [Partner Name],

    ...

    FROM [table2.xlsx]

Concatenate (Tracking) Load

  Goal,

  Year,

// Quarter,

  Count(DISTINCT [Partner Name]) as Reach

  Resident Tracking

  Where Goal='Employees volunteer and/or donate' and (Year ='2015' or Year='2016')

  Group by Goal,Year;           //removed: Quarter

I've included a bit of a customized script because this is a part of a much much longer and unruly body of script that I took over from a partially created Dashboard with many highly customized graphs. Thanks!

jagan
Luminary Alumni
Luminary Alumni

HI Serena,

For this type of scenario you should not calculate in the script, you should calculate in the front end by using the COUNT(DISTINCT FIELD).

or you can try like this, count the first transaction/record then this issue will be solved.

Regards,

Jagan.

swuehl
MVP
MVP

Yes, might be better doing this in the front end.

If you want to calculate this in the script, I think you would need the yearly and quarterly reaches separately:

Tracking:

Load ...

   [F and L Name] as [Partner Name],

   ...

   FROM [table1.xlsx]

Concatenate (Tracking) Load ...

    [F & L  Name] as [Partner Name],

    ...

    FROM [table2.xlsx]

Concatenate (Tracking) Load

  Goal,

  Year,

// Quarter,

  Count(DISTINCT [Partner Name]) as YearlyReach

  Resident Tracking

  Where Goal='Employees volunteer and/or donate' and (Year ='2015' or Year='2016')

  Group by Goal,Year;

Concatenate (Tracking) Load

  Goal,

  Year,

Quarter,

  Count(DISTINCT [Partner Name]) as QuarterlyReach

  Resident Tracking

  Where Goal='Employees volunteer and/or donate' and (Year ='2015' or Year='2016')

  Group by Goal,Year, Quarter;         

Not applicable
Author

Hi Jagan,

How do I count first transaction/record (especially with two different data sources)? Say the two activities (data sources) are recording individuals running and walking. Each individual can do one or both of these activities multiple times per quarter and within a year.

I thought about COUNT(DISTINCT [FIELD]) in the front end except that I have to count it on the back end because the number of unique individuals also rolls up into a separate high-level metric that's counting individuals from these two data sheets along with a number of others (e.g. total number of individuals exercising, training and taking classes). The difficulty is that some of the data pulls in unique individuals while others, like the two sheets above, pull in their activity (so one individual can appear multiple times).

jagan
Luminary Alumni
Luminary Alumni

Attach some sample data then it would be easier to understand and provide the solution.

Regards,

jagan.

Anonymous
Not applicable
Author

The reasons why you think you have to do it in the script are in fact exactly the reasons why it is much better to do it on the front end.  No matter how many data sheets or how you slice the calendar, the simple count(distinct Name)  will give you the correct result.  Just give it a try and see for yourself.

qlikviewwizard
Master II
Master II

Hi serenamau

Please attach some sample data. It will help the folks for better understanding and give the answer quickly.

Thank you.

Not applicable
Author

I understand that COUNT(DISTINCT [FIELD]) would be preferable. However, these two datasheets (call then A and B) roll up to two different metrics. First is at the program level (unique individuals from just the two datasheets) and the second high level goal is counted via a front end equation using a SUM({<Goal ={"Supported people"}>} Progress). Each of those "unique individuals" contribute to the "Supported people" goal, along with individuals listed in a half dozen other data sources. In order to capture an accurate value for "Supported people", I would either need to list out the 8 other program names in the SUM portion of the equation then + COUNT(DISTINCT [field]) for the datasheets A and B, or COUNT(DISTINCT [field]) in the script for datasheets A and B. The former seems very clunky, especially because there are multiple charts that are tied to this "Supported people" goal.

I took over this very highly customized set of Dashboards, with a body of script that is not well designed. What I mean by that is that the previous individual concatenated most of the data sheets into a single concatenated table. I do not have the bandwidth to completely redo the entire body of script and all the separate Dashboards. Each data source contributes to one or more Program level goal and an Aggregated company goal. 

@jagan I don't see the attached data. can you resend?

Thanks again for your help!