Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Counting distinct values

I have a dataset containing a 'Reference No' field and within each reference no, there is an Item_no.

In qlikview I want to group together the item_no's for each reference no and then create an Item banding. I've tried the code below but get an error.

Load *,

If(Item_count=1,'a.1',

if(Item_count>1 and Item_count <=5,'b.2-5',

if(Item_count>5 and Item_count <=10, 'c.6-10',

if(Item_count>10 and Item_count <=50, 'd.11-50',

if(Item_count>50 and Item_count <=100, 'e.51-100',

if(Item_count>100 and Item_count <=250, 'f.101-250',

if(Item_count>250 and Item_count <=500, 'g.251-500',

if(Item_count>500,'h.501-high','unknown')))))))) as Item_band;

count(distinct(ITEM_NO)) as Item_count;

Load,

Reference_No,

ITEM_NO

I want to do this in the load script rather than in the from end as i want to use the Item_band as a dimension in a chart.

There is probably a very quick way of doing this but not seeing it at the minute.

Thanks

Adrian

1 Solution

Accepted Solutions
sujeetsingh
Master III
Master III

First derieve count as

DataPre:

LOAD

Reference_No,

count(Distinct ITEM_NO) as Item_count

from.....


Data

load

Reference_No,

If(Item_count=1,'a.1',

if(Item_count>1 and Item_count <=5,'b.2-5',

if(Item_count>5 and Item_count <=10, 'c.6-10',

if(Item_count>10 and Item_count <=50, 'd.11-50',

if(Item_count>50 and Item_count <=100, 'e.51-100',

if(Item_count>100 and Item_count <=250, 'f.101-250',

if(Item_count>250 and Item_count <=500, 'g.251-500',

if(Item_count>500,'h.501-high','unknown')))))))) as Item_band;

count(distinct(ITEM_NO)) as Item_count;




View solution in original post

4 Replies
sujeetsingh
Master III
Master III

First derieve count as

DataPre:

LOAD

Reference_No,

count(Distinct ITEM_NO) as Item_count

from.....


Data

load

Reference_No,

If(Item_count=1,'a.1',

if(Item_count>1 and Item_count <=5,'b.2-5',

if(Item_count>5 and Item_count <=10, 'c.6-10',

if(Item_count>10 and Item_count <=50, 'd.11-50',

if(Item_count>50 and Item_count <=100, 'e.51-100',

if(Item_count>100 and Item_count <=250, 'f.101-250',

if(Item_count>250 and Item_count <=500, 'g.251-500',

if(Item_count>500,'h.501-high','unknown')))))))) as Item_band;

count(distinct(ITEM_NO)) as Item_count;




maxgro
MVP
MVP

// make test data

t:

Load

ITEM_NO,

Reference_No

;

Load

Rand() as Rand1,

IterNo() as Reference_No,

RecNo() as ITEM_NO

Autogenerate 100

While Rand()<=0.5 or IterNo()=1;

// band in script

t2:

Load *,

If(Item_count=1,'a.1',

if(Item_count>1 and Item_count <=5,'b.2-5',

if(Item_count>5 and Item_count <=10, 'c.6-10',

if(Item_count>10 and Item_count <=50, 'd.11-50',

if(Item_count>50 and Item_count <=100, 'e.51-100',

if(Item_count>100 and Item_count <=250, 'f.101-250',

if(Item_count>250 and Item_count <=500, 'g.251-500',

if(Item_count>500,'h.501-high','unknown')))))))) as Item_band;

load

Reference_No,

count(distinct(ITEM_NO)) as Item_count

Resident t

group by Reference_No;

Not applicable
Author

Hi Massimo,

That has worked a treat!. Many thanks!

Adrian

MarcoWedel

Hi Adrian,

although your question is already answered, maybe you find this solution helpful too:

Instead of some static nested if statements I created the Item_Bands out of a loaded field (here inline).

I guess you added the letter prefix to your Item_band names due to sorting purposes, so instead I used the dual function to apply numerical values to the Item_Band enabling them to be sorted numericaly.

QlikCommunity_Thread_127959_Pic1.png

QlikCommunity_Thread_127959_Pic2.png

QlikCommunity_Thread_127959_Pic3.png

QlikCommunity_Thread_127959_Pic4.png

tabData:

Load

  Ceil(log(IterNo()))+1 as Reference_No,

  Ceil(Rand()*1000) as ITEM_NO

AutoGenerate 1

While IterNo() <= 2000;

tabSums:

LOAD

  Reference_No,

  Count(DISTINCT ITEM_NO) as Item_count

Resident tabData

Group By Reference_No;

tabClass:

LOAD *,

     AutoNumberHash128(RangeMin, RangeMax) as %ItemBandID;

LOAD RangeMin+1 as RangeMin,

     Alt(Peek(RangeMin)-1, Dual('high', 100000000000000)) as RangeMax

Inline [

RangeMin 

500

250

100

50

10

5

1

-1

];

tabIntMat:

IntervalMatch (Item_count)

LOAD RangeMin, RangeMax

Resident tabClass;

Left Join (tabSums)

LOAD Item_count,

     Dual(RangeMin&'-'&RangeMax, RangeMax) as Item_band,

     AutoNumberHash128(RangeMin, RangeMax) as %ItemBandID

Resident tabIntMat;

DROP Table tabIntMat;

hope this helps

regards

Marco