Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

histogram counts duplicates

Suppose I have two tables joined by id, where one table contains this id as unique identified and second table has multiple records with the same id.

data1:

LOAD * INLINE [

    d1_ID, param1

    A1, 1

    A2, 2

    A3, 2

    A4, 3

    A5, 2

    A6, 4

    A6, 3

    A7, 5

];

data2:

LOAD * INLINE [

    d2_ID, d1_ID, other_param

    1, A1, x

    2, A1, y

    3, A1, z

    4, A2, t

    5, A2, x

    6, A3, y

    7, A4, z

    7, A4, t

    8, A6, x

];



I would like to make a histogram of param1 values, counting each once per d1_ID. But having these two tables joined, it counts '1' three times, as A1 occurs 3 times in data2 table... Any ideas how to make it?


11 Replies
Anonymous
Not applicable
Author

I've realized that there is another part of the script that probably causes the issue... on this dummy data it would be:

Temp:

Load

    'data1' as [data1.Table] ,

    '1'  as [data1.Count] ,

    *

Resident data1;

Outer Join (Temp)

Load

    'data2' as [data2.Table] ,

    '1'  as [data2.Count] ,

    *

Resident data2;

join:

NoConcatenate

Load

  if ( [data1.Table] = 'data1' and isnull([data2.Table])

      , dual('Yes',1)    , dual('No',0) )  as    [data1.Only] ,

  if ( [data2.Table] = 'data2' and isnull([data1.Table])

      , dual('Yes',1)    , dual('No',0)     )  as    [data2.Only] ,

  if ( [data2.Table] = 'data2' and [data1.Table] = 'data1'

      , dual('Yes',1)    , dual('No',0)     )  as    [data1.and.data2] ,

  *

resident Temp ;

Drop Table Temp ;

Drop Table data1;

Drop Table data2;



It allows me to filter all data that occurs in both tables. Now while having this, the histogram gives false result

Anonymous
Not applicable
Author

Finally my colleague found the solution - posting here for anybody who ever meets this issue.

I've added the following script:

join2:

NoConcatenate

Load

    d1_ID

    ,min(param1) as param1_min

Resident join

group by d1_ID;

Then I just put param1_min as the histogram field. That did the magic!