Solved: Re: Help determining the subset of data that had n... - Qlik Community

Report Inappropriate Content · ‎2013-04-04

Dear all,

We need to find, the “Set” of data that had never had a certain value.

Lets say we have the following input table listing the V_Values that F_Filed has assumed:

F_Field	V_Value
m1	A
m1	B
m2	A
m2	C
m3	B
m4	B
m5	B

Case1) If we want to list all F_Fields that had never got “A” value; the set would be only m2.

Case2) If we want to list the F_Fields that never had a “C”; the set would be m1,m3,m4,m5

The only way we fount to implement this in Qv, was using a “dynamically built” set analysis. For instance, for the case “never been “A”, it would be:

concat({<F_Field-=$(='{'&concat(if(V_Value='A', F_Field),',')&'}') >} Distinct F_Field, ' ')

It can even be used into a calculated dimension:

aggr(concat({<F_Field-=$(='{'&concat(if(V_Value='a', F_Field),',')&'}') >} Distinct F_Field, ' '),F_Field)

As we would expect, this approach worked fine for a relatively low number of different F_Fields (lets say less than 6k different F_Filels), but our production DB has about 1.6 M distinct F_Fields and about 30M rows.

What can we do to achieve our buckets of data ?

Best regards,

Paulo

Anonymous · ‎2013-04-04

Paulo,

This expression look simpler, and works for you set of data (see attached):

=concat({1<F_Field=E(F_Field)>} distinct F_Field, ' ')

I don't expect it to have better performance, give it a try anyway.

Regards,

Michael

View solution in original post

Anonymous · ‎2013-04-04

Paulo,

This expression look simpler, and works for you set of data (see attached):

=concat({1<F_Field=E(F_Field)>} distinct F_Field, ' ')

I don't expect it to have better performance, give it a try anyway.

Regards,

Michael

flipside · ‎2013-04-04

Hi Paulo,

Depending on the number of distinct V_Values you have, the attached may work where you turn the values into binary strings and parse the strings. Zeroes show where a value doesn't exist.

flipside

Report Inappropriate Content · ‎2013-04-04

I think is better to work with qvd .

Store your table into a qvd and create another qvd :

TAB_NEVER:

LOAD

F_Field,

if((sum(if(index(V_Value, 'A' )<>0,1,0))>0,1,0) as FLAG_NEVERA,

if((sum(if(index(V_Value, 'B' )<>0,1,0))>0,1,0) as FLAG_NEVERB,

if((sum(if(index(V_Value, 'C' )<>0,1,0))>0,1,0) as FLAG_NEVERC,

if((sum(if(index(V_Value, 'D' )<>0,1,0))>0,1,0) as FLAG_NEVERD

resident YOURTABLE

group by F_Field;

What is CapsLock change with your example.

hope it helps.

Report Inappropriate Content · ‎2013-04-04

Fantastic Michael. Worked like a charm!

It performs much better. I’ve used E() so few times that had almost forgotten about it.

Just for the record, the statement used to select the Set of F_Fields that had never assumed a specific V_Value was:

Concat({<F_Field=E({1< F_Field={'V_Value'} >} F_Field) >} Distinct F_Field,' ')

Best regards,

Paulo

Help determining the subset of data that had never a certain value