Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Trying to use WHERE clause with GROUP BY to exclude records from a SUM

Hi Folks

Please can anyone help?

I've got a table full of order data to summarise:

  • For each order number I need to generate several SUM values - sum of Field1, sum of Field2, etc, across all individual item records within the order.
  • I want to exclude those rows where a specific field is blank from the SUMs (but I want to keep the keep the data in the table)

I'm using a GROUP BY on the Order No field, and

WHERE len(<fieldname>) <> 0

for the exclusion. I've also tried

WHERE <fieldname> = ''

that doesn't seem to work either - the records get summed regardless of the blank field. Any ideas why please?

NB: I actually started off with a NULL rather than a blank - as the result of a left join with a sparsely populated table - and I've used the following code to convert it. If there is a solution for NULL that works better I'd be equally pleased!

NullMap:

MAPPING LOAD

null(),''

AUTOGENERATE 1;

1 Solution

Accepted Solutions
johnw
Champion III
Champion III

So you're generating a summary table? In the script? Well, first, I wouldn't do that if there is ANY way around it. Also, I don't know why your where wouldn't exclude the null values. Still, maybe this alternative would work? It should work just fine for nulls too, and not require a map to convert nulls.

SummaryTable:
LOAD
OrderNo
,sum(if(len(<fieldname>),Field1)) as Field1Sum
,sum(if(len(<fieldname>),Field2)) as Field2Sum
...
RESIDENT OrderItems
GROUP BY OrderNo
;

If you put the len(<fieldname>) in the WHERE clause, then you'd drop the OrderNo from the table completely if all items within the order have blank for that field. It sounds like you still want it in this table, so I think putting the condition in each field may be more appropriate anyway.

View solution in original post

7 Replies
pover
Luminary Alumni
Luminary Alumni

I don't have a full picture of what is going on, but if the values were null you could try:

where not isnull(fieldname)

Regards.

johnw
Champion III
Champion III

So you're generating a summary table? In the script? Well, first, I wouldn't do that if there is ANY way around it. Also, I don't know why your where wouldn't exclude the null values. Still, maybe this alternative would work? It should work just fine for nulls too, and not require a map to convert nulls.

SummaryTable:
LOAD
OrderNo
,sum(if(len(<fieldname>),Field1)) as Field1Sum
,sum(if(len(<fieldname>),Field2)) as Field2Sum
...
RESIDENT OrderItems
GROUP BY OrderNo
;

If you put the len(<fieldname>) in the WHERE clause, then you'd drop the OrderNo from the table completely if all items within the order have blank for that field. It sounds like you still want it in this table, so I think putting the condition in each field may be more appropriate anyway.

Not applicable
Author

Thanks Karl, but I tried that earlier. I think John's answer is going to sort it for me though.

Not applicable
Author

Hi John

Thanks for the help - I'm struggling to understand why Qlikview is doing what it's doing.

To answer your 'But why?' question I'll have to give you a bigger picture. I have to create several boolean metrics (true/false) for an order so they can be reported by our designers. The metrics are defined as true where the sum of some order item fields is greater than the sum of others, and there is a specific last item (by date) in the order. I'm doing this in 3 steps:

  1. getting the sum and 'last item' info for each group of order items
  2. applying the 'greater than' and 'last item' rules to generate each metric in its own table
  3. joining the metrics together by order number and saving the order metrics to QVD

Is there a more natural way to do this? Unfortunately it's a common requirement in our business, so I'll need to do the same sort of thing for several business objects we're reporting on. Any input very gratefully received!!!

As for the fix you suggested, I had to use a slightly modified version:

sum(if(len(trim(<field>))>0,Field1)) as FieldSum

but it does seem to work OK. Something strange seems to have happened since I ran the mapping load code - even though I've removed the code from the script and rerun it, the field is always working as a blank string rather than a null (hence the reason for your suggested code not working 'as is'?). Is there something I have to do to 'undo' the effect of the mapping load rather than just removing it and re-running? I hope that makes sense...

Regards

Graham

johnw
Champion III
Champion III

I think a blank string IS null, so I think your mapping code may have been doing nothing, which is why removing it is also doing nothing.

I think I see now why you'd want to do the aggregation in the script. Your business simply considers these boolean metrics to be order information. There's no breakdown of this data at a lower level, though obviously the supporting data IS available if you look at all the order items. While you probably could calculate the flags on the fly in the screen objects, I'm not thinking of any advantage, and it would slow down the charts and complicate life for the screen designers.

I'm not sure I'm following your 3 steps though, or at least not in enough detail to be able to see if it's a good way, or if I might have alternatives to suggest. Can you maybe post some example data and what your resulting metrics are supposed to look like? The script, or at least the most relevant parts, might help as well.

I'm surprised you needed to trim, and you shouldn't need >0, because >0 is true, while 0 is false. But whatever works.

Not applicable
Author

John

The reason I'm thinking the nulls have been 'converted' to blank strings is that when I was running the join code before the mapping I was seeing a hyphen in the result field where it had no data to join, however now the hyphen is missing (other left joined fields that weren't mapped still have hyphens). Also, I had already tried using len() on the result field before the mapping and it was giving me a hyphen rather than a zero as the result for that too. Like you say - it works now though...

The trim is necessary as len() against our empty string is giving the result 12 (the width of the field in the original table). Is that unusual?

Thanks for the info about >0 = TRUE, I'd never have guessed that - I've only just got used to Qlikview loading a logical TRUE as -1!

It's very kind of you to offer to look through some sample data, but I've been trying to abstract out some of the business jargon/quirks from the info I've given so far and I think it would be unreasonable to expect you to spend the time to understand our data - especially as my script seems to be working fine now.

One last question I would like to ask though: In order to count the number of grouped records with a TRUE logical field, is there a better way than SUMming the negative field values (e.g. "sum(-LogicalField) as cnt_Field")?

Thanks again for all your help.

Kind regards

Graham

johnw
Champion III
Champion III

Well, maybe a blank string isn't null then. I should have just checked. Yeah isnull('') is false, so I guess it isn't null. On the other hand, when I didn't even put a comma or field value in an inline load, it still didn't call it isnull(), so maybe it's just the isnull() bug. Hard to say.

Yeah, 0 is false, and any non-zero is true.

I'd probably -sum(LogicalField) so that I was only applying the sign once. But yeah, I don't think there's a more efficient way. There are perhaps more clear ways, like count(if(LogicalField,OrderItem)) or whatever you're counting. Most of the time, clarity probably wins over pure execution speed, particularly in the script.

QlikView normally autotrims everything as you read it in. You have to actually set a script parameter to prevent that from happening. But maybe it only trims as a last step before establishing the field, thus requiring the trim() if you're doing a len(). That wouldn't surprise me. I tried testing it with an inline load, but apparently if you put ' ' in an inline load, it figures you really do want it to be spaces, so it keeps them all and doesn't trim even as a last step.