Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
CreepyCatLady
Creator
Creator

Count/Group by in load script not matching equivalent function in sheet

I have an aggr function in my sheet that counts the number of machines a user has. The function is 

Count(total <USER> distinct HOST_NAME).

This gives me an accurate count of machines to user; I can verify this by looking at the table sorted by USER and see that the results match the number of times the user shows up in a simple table.

CreepyCatLady_1-1658171215249.png

The problem is that this is a measure in the table, and I cannot filter nor sort on a measure. I also need to be able to create various other charts based on this field. So I need to create this field in my load script. I did this, and the results of the load script are inconsistent with the results of the function. Here is my load script:

TEMP:

LOAD max(ASOFDATE) as TempDate;

 

SQL SELECT *

FROM DB.DB_NAME

;

 

vMaxDate= Date(Peek('TempDate',0,'TEMP'),'DD-MM-YYYY'); 

//The above is to create a variable to show only current date results;

//The database includes historical data.

 

DB_NAME:

 LOAD ASOFDATE,

     HOST_NAME,

     USER

;

 SQL SELECT *

 FROM DB.DB_NAME;

 

 HOST_COUNT:

Load USER,

count(DISTINCT (HOST_NAME)) as HOST_COUNT

resident [DB_NAME]

Group by USER;

This gives the following results (function results included for comparison):

CreepyCatLady_2-1658171704607.png

Can anyone help me determine what I am doing wrong here? The table is filtering on only vMaxDate, so that shouldn't be the problem, but I included that info here anyway in case it is somehow related. 

This is a blocker for an important project of mine so any help would be very appreciated. Thanks.

Labels (1)
3 Replies
Or
MVP
MVP

I don't see anything wrong with your script, so I'd suggest checking the underlying data, loads/SQLs, and data structure.  Note that while you included a section setting a max date variable, that variable isn't shown as actually being used anywhere. Perhaps this is related to your issue?

Temp:
Load * INLINE [
USER, HOST_NAME
111, Host1
111, Host2
111, Host3
145, Host4
145, Host5
BRB, Host6
BRW, Host7
BTJ, Host8
BTS, Host9
];

Load [USER], Count(distinct [HOST_NAME]) as HOST_COUNT
Resident Temp
Group by [USER];

Or_0-1658222524753.png

 

CreepyCatLady
Creator
Creator
Author

I'm pretty sure that is it. I think it's counting every machine the user has had instead of their current machines. 

I tried 

if(Date(ASOFDATE,'DD-MM-YYYY')='$(vMaxDate)',(count (DISTINCT (HOST_NAME)))) as HOST_COUNT

but when I loaded that it gave me an "invalid expression" error. Can you (or anyone) help me with what is wrong with my expression there?

Or
MVP
MVP

As I said, there doesn't appear to be anything wrong with your load expression or script as you posted it. If the data is being further filtered somewhere else, you might nee to apply a similar filter in the Where condition for the load where you count the host names, though.