Skip to main content
Woohoo! Qlik Community has won “Best in Class Community” in the 2024 Khoros Kudos awards!
Announcements
Nov. 20th, Qlik Insider - Lakehouses: Driving the Future of Data & AI - PICK A SESSION
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

How to merge/combine multiple slices of a pie chart into one slice/category?

Hey everyone,

I'm attempting to combine various field values into one value on a particular field. For example, if we had a field named fruit and it contained values such as Banana, Strawberry, banana, straw berry, strawberries, blueberries, and Pineapple.

The easiest approach is obviously to go into the data source and quickly modify these values, however that is not currently possible. My pie chart is currently giving me a different slice for each of these field values, however we obviously would like to have all of the strawberries combined into one slice.

I've had a degree of success using set analysis, however a colleague recently suggested using an IF statement in my script. Any help would be greatly appreciated!

1 Solution

Accepted Solutions
MarcoWedel

or joining the group name using a static levenshtein distance limit in the script:

QlikCommunity_Thread_268376_Pic11.JPG

tabFruits:

LOAD *,

    Ceil(Rand()*100) as SomeFruitValue

INLINE [

    fruit

    Banana

    Strawberry

    banana

    straw berry

    strawberries

    blueberries

    Pineapple

];

tabFruitGroups:

LOAD Distinct fruit Resident tabFruits;

Join

LOAD * Inline [

fruit2, fruitGroup

Banana, Yellow Fruit

Pineapple, Yellow Fruit

Strawberry, Berry

Blueberry, Berry

];

Left Join (tabFruits)

LOAD fruit,

    fruitGroup

Resident tabFruitGroups

Where levenshtein(Upper(fruit),Upper(fruit2))<=3;

DROP Table tabFruitGroups;

hope this helps

regards

Marco

View solution in original post

10 Replies
effinty2112
Master
Master

Hi Chris,

Maybe a calculated dimension like:

if(Match(fruit,'Banana', 'Strawberry', 'banana', 'straw berry', 'strawberries', 'blueberries', 'Pineapple'),'Fruit','Not Fruit')

cheers

Andrew

Anonymous
Not applicable
Author

Could you explain the function of Match? Initially, I was attempting to group together my budgets based on a description about them, and the data source had them named with different conventions (e.g. and, And, &).

However, I am also looking for a way to combine values regardless of how similar their names are to generate charts that management would like to see. In the example I wrote, it would be along the lines of "We want to group together all the blueberries with strawberries, but leave the rest as they are"

Are both of these possible via the expression you posted? Thank you for your help!

effinty2112
Master
Master

Hi Chris,

The best explanation of the Match function is to be found in QlikView help, but briefly it returns a number if the value being checked matched one of the values in the list that follows. If it matches the first in the list we get 1, second 2 etc. If no match then 0. It can be used as an if condition, 0 will cause condition to be not met, >0 and the condition is satisfied.

Look into the WildMatch function in the help. This allows to search for similar strings using wildcards.

If fruit = 'strawberry'

match(fruit, 'berry') = 0

wildmatch(fruit, '*berry') = 1


Your best bet might be to build up a table mapping possible field values to  the standardised values you want to use. You can do this in an external file like an Excel file. Look up the function ApplyMap. You can use that to map to correct values in the script.


Good luck


Andrew

MarcoWedel

Hi,

maybe one solution could be to group the fruit names by similarity (e.g. using the Levenshtein distance) without hard coding specific fruit names at all:

QlikCommunity_Thread_268376_Pic1.JPG

QlikCommunity_Thread_268376_Pic2.JPG

QlikCommunity_Thread_267457_Pic6.JPG

QlikCommunity_Thread_267457_Pic4.JPG

QlikCommunity_Thread_267457_Pic3.JPG

QlikCommunity_Thread_267457_Pic5.JPG

tabFruits:

LOAD *,

    Ceil(Rand()*100) as SomeFruitValue

INLINE [

    fruit

    Banana

    Strawberry

    banana

    straw berry

    strawberries

    blueberries

    Pineapple

];

tabFruitGroups:

LOAD Distinct fruit Resident tabFruits;

Join

LOAD fruit as fruit2

Resident tabFruitGroups;

Join

LOAD fruit,

    fruit2,

    levenshtein(Upper(fruit),Upper(fruit2)) as LevenshtDistU

Resident tabFruitGroups;

refer to:

Levenshtein Algorithm

string matching with fuzzy, trigram (n-gram), levenshtein, etc.

for the implementation of the required levenshtein function.

I guess in a productive application the grouping of the fruit names should be done in the script to really get all relevant names into one group and probably name the group by its most frequent member.

The front end aggregation in this example does not deliver useful results for all selected levenshtein distances.

hope this helps

regards

Marco

Anonymous
Not applicable
Author

Thank you!

This will definitely help in terms of cleaning data. I don't suppose there is any way to use the Levenshtein distance  to group based on different criteria is there? For example, say you wanted to group Pineapple and Banana (yellow fruits) despite their vastly different names.

Is this still possible using the methodology you have outlined? It sounds like it would be more applicable to do this through the script editor as you mentioned towards the end of your post, but as a new user I'm not sure the exact syntax for grouping field values.

MarcoWedel

Hi,

maybe one solution might be to create a table that defines the different groups and check for similarities using the same algorithm:

QlikCommunity_Thread_268376_Pic7.JPG

QlikCommunity_Thread_268376_Pic8.JPG

QlikCommunity_Thread_268376_Pic9.JPG

QlikCommunity_Thread_268376_Pic10.JPG

tabFruits:

LOAD *,

    Ceil(Rand()*100) as SomeFruitValue

INLINE [

    fruit

    Banana

    Strawberry

    banana

    straw berry

    strawberries

    blueberries

    Pineapple

];

tabFruitGroups:

LOAD Distinct fruit Resident tabFruits;

Join

LOAD * Inline [

fruit2, fruitGroup

Banana, Yellow Fruit

Pineapple, Yellow Fruit

Strawberry, Berry

Blueberry, Berry

];

Join

LOAD fruit,

    fruit2,

    levenshtein(Upper(fruit),Upper(fruit2)) as LevenshtDistU

Resident tabFruitGroups;

hope this helps

regards

Marco

MarcoWedel

or joining the group name using a static levenshtein distance limit in the script:

QlikCommunity_Thread_268376_Pic11.JPG

tabFruits:

LOAD *,

    Ceil(Rand()*100) as SomeFruitValue

INLINE [

    fruit

    Banana

    Strawberry

    banana

    straw berry

    strawberries

    blueberries

    Pineapple

];

tabFruitGroups:

LOAD Distinct fruit Resident tabFruits;

Join

LOAD * Inline [

fruit2, fruitGroup

Banana, Yellow Fruit

Pineapple, Yellow Fruit

Strawberry, Berry

Blueberry, Berry

];

Left Join (tabFruits)

LOAD fruit,

    fruitGroup

Resident tabFruitGroups

Where levenshtein(Upper(fruit),Upper(fruit2))<=3;

DROP Table tabFruitGroups;

hope this helps

regards

Marco

Anonymous
Not applicable
Author

Andrew,

I appreciate the help. I looked through the documentation for the match() function, and I managed to combine the fields in question. However, I was unable to define more than 2 fields for the grouping. In the end, I ended up with 2 groups:

              1. Group containing all desired field values that should be grouped

              2. All other fields

I tried to find a way to combine the IF/MATCH/WILDMATCH functions such that I could define multiple fields, however I kept receiving an error in my expression. Do you have any suggestions on how I may modify the function in a manner such as:

    

     if(Match(fruit,'Banana', 'Pineapple'), 'Yellow Fruit', Match(fruit, 'Strawberry', 'strawberries'), 'Red Fruit', Matc(fruit, 'blueberries'),'Blue Fruit','Not Fruit')

Anything with functionality like this would be a huge help, thank you.

Anonymous
Not applicable
Author

THANK YOU!!! This got me to where I was wanting to go. I'm going to have to look farther into the script to figure out how the function works. I had attempted originally to use just the inline table and a left join, however it wouldn't work until I created the Levenshtein function. Thank you again!!