Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Toronto Sept 9th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Aggregate values table

Hi guys. I'm trying to create a table of aggregate values and I was wondering if you guys had an elegant solution.

I have one table called Tasks. Each task row has a Category field, and a Completed field with values Done or Not Done.

I want to create a derived Categories table that pulls out all the values of the Category field in the Tasks table. This table will obviously link to the Tasks table by the Category name. I want to then create two derived fields.

One field will be called HasCompletedTask, and will evaluate to true if there exists at least one Task row where Completed is Done. Otherwise, false.

The other derived field will be called AllTasksCompleted, and will evaluate to true if ALL task rows of that category are Done. Otherwise, false.

Any ideas? Please let me know if you have any questions. Thank you for your help!

Eric

5 Replies
Not applicable
Author

It is probably easier and better to handle this in your Load using flags.

If you want to do it using Expressions, then you'll probably need the Concat() function. This should get you a comma separated list of all Completed values for that respective Category:

Concat(TOTAL <Categories> Completed)


The All Completed field will be easier than the Has Completed, because Done is a substring in Not Done. For the All Completed, try:

Index(Concat(TOTAL <Categories> Completed), 'Not')=0


I'm not sure of a great way to to the Has Completed. You need to search for Done, but not within a Not Done string.

EDIT: I guess you could try:

Index(Concat(TOTAL <Categories> Completed), ',Done')>0 or
Left(Concat(TOTAL <Categories> Completed), 4) = 'Done'

But that's nowhere near elegant.

Not applicable
Author

I think I would rather do this in the load script using flags. So on every Task row there's a flag Done or Not Done (I could change this to say Open or Closed or the like). I want to then evaluate, for every Category value, does there exist a Task that's Open. For the other field, I'd want to evaluate if there exists a Task that's Closed. What do you think?

Not applicable
Author

What is the source of your data? Are you pulling from QVDs or using SQL to pull directly from a database?

I always seem to think of the SQL method first, because I've been using that longer. It would be something like this (Oracle syntax):

SELECT Tasks.*, Has.HasComp, All.AllComp FROM Tasks,
(SELECT Category, Max(Completed) As HasComp FROM Tasks
WHERE Completed = 'Done' GROUP BY Category) Has,
(SELECT Category, Max(Completed) As AllComp FROM Tasks
WHERE Completed = 'Not Done' GROUP BY Category) All
WHERE Tasks.Category = Has.Category (+)
AND Tasks.Category = All.Category (+)


Something like that. You would end up modifying the select to return 0 for HasComp if it is null and 1 if not. And the same thing for AllComp.

I'm sure you could also handle this using QlikView Load syntax.

Not applicable
Author

I was able to set up a load using an Inline to create some test data. It's kind of long and probably inefficient. I'll also try to upload the sample file, but that functionality hasn't been working on the forum.

Load:

DataLoad:
LOAD * INLINE [
Task,Category,Completed
A,Cat1,Done
B,Cat2,Not Done
C,Cat1,Not Done
D,Cat2,Not Done
E,Cat3,Done
F,Cat3,Done
];
AllComp:
LOAD Category, Sum(0) As AllC
RESIDENT DataLoad
WHERE Completed = 'Not Done'
GROUP BY Category;
HasComp:
LOAD Category, Max(1) As HasC
RESIDENT DataLoad
WHERE Completed = 'Done'
GROUP BY Category;
DataInt:
LOAD Task As Task1, Category As Cat1,Completed As Comp1 RESIDENT DataLoad;
LEFT JOIN LOAD Category As Cat1, AllC RESIDENT AllComp;
LEFT JOIN LOAD Category As Cat1, HasC RESIDENT HasComp;
DROP TABLE DataLoad;
DROP TABLE AllComp;
DROP TABLE HasComp;
Data:
LOAD Task1 As Task, Cat1 As Category, Comp1 As Completed,
if(AllC <> 0, 1, AllC) As AllComplete, if(HasC <> 1, 0, HasC) As HasComplete
RESIDENT DataInt;
DROP TABLE DataInt;


Not applicable
Author

You could do a load of the initial table, then do a distinct load of just the status field in a temp table basically leaving you either one or two values in a single field.

So, in the example data very kindly provided by NMiller, it would be:

DataLoad:

LOAD * INLINE [

Task,Category,Completed

A,Cat1,Done

B,Cat2,Not Done

C,Cat1,Not Done

D,Cat2,Not Done

E,Cat3,Done

F,Cat3,Done

];

CompletedIndexTemp:

LOAD DISTINCT Completed

RESIDENT DataLoad;

From here, I would use a peek in a variable to determine if 'Not Done' exists, and return text based on the result. My preference is always to use a variable for these kinds of things, but this could easily be done in another table as well.

LET CompletedFlag = If(peek('Completed', 0, 'CompletedIndexTemp') = 'Not Done' OR peek('Completed', 1, 'CompletedIndexTemp') = 'Not Done', 'Not Done', 'Done');