Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Toronto Sept 9th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Drummer
Contributor II
Contributor II

Combinations of data from multiple fields

Hello. I hope someone can help.

I have a requirement to create a filter based on any combination of data from 39 fields. I realise that this is a lot of combinations, and will try and get this down to, say, up to 8 in each.

I managed to get this working in a POC using just 4 fields. Here's the code:

MultiDims_Temp:
LOAD
Field1,
Field2,
Field3,
Field4,
RecNo() AS SourceNum,
IterNo() AS IterationNum,
Pick(IterNo(),
Field1,
Field2,
Field3,
Field4,
Field1&' / '& Field2,
Field1&' / '& Field3,
Field1&' / '& Field4,
Field2&' / '& Field3,
Field2&' / '& Field4,
Field3&' / '& Field4,
Field1 &' / '& Field2&' / '& Field3,
Field1&' / '& Field2&' / '& Field4,
Field1&' / '& Field3&' / '& Field4,
Field3&' / '& Field4&' / '& Field2,
Field1&' / '& Field2&' / '& Field4&' / '& Field3
) AS FieldCombo
RESIDENT ComboTable
WHILE IterNo() <= 70; 

When I try and increase this to pick 4 combinations from the 39 dims, I have tens of thousands of field combinations, and it takes hours to load.

 

Can anyone suggest a different way to do this? I was thinking perhaps having a table with the field names which the users can select from, and to try and use these selected fields as a type of dynamic filter which the user could then select the values from. I just can't quite figure out how to do it though.

 

Any assistance will be gratefully received.

Thanks.

 

 

Labels (1)
10 Replies
srchagas
Creator III
Creator III

Ok its little bit complicate to understand, can you try to make sample so we can understand better what are you  trying to do ?

vincent_ardiet_
Specialist
Specialist

Not sure neither to understand the purpose. Worst case, a concatenation of all fields should be enough using stars in the research. 

marcus_sommer

I assume that your data is structured within a crosstable-logic. If so, you should transform then into a normal data-structure and afterwards you may add some grouping/categorizing on them:

The Crosstable Load - Qlik Community - 1468083

and you won't need the field-combination anymore.

Drummer
Contributor II
Contributor II
Author

Hello @srchagas @vincent_ardiet_ @marcus_sommer  and many thanks for attempting to work out what I'm trying to do.

I'm sorry for the late response - I've been working on other projects.

Here's (I hope) a better description of the issue and what I am attempting to do.

I have several tables which have been combined to make something similar to the table below. NB - I have 39 different fields which I need to take into account. I have included 8 here (Type, Location, AttributeAge, PersonAge, Bonus, Colour, Rating and T&C), to give an idea of my data structure as it stands. For now I will refer to these as dimensions. The other fields are the ID and Year (which join to other tables) and Cost which I need to use to summarise the data.

Reference Year Type Location AttAge PersonAge Bonus Colour Rating T&C Cost
1 2022 New LONDON 5 43 0 Red A A 912
2 2022 Ongoing MANCHESTER 8 40 1 Green D D 299
3 2022 Ongoing BIRMINGHAM 15 55 5 Silver Z D 1098
4 2022 New DURHAM 3 38 1 Black C- C 3997

5

 

2022 New NOTTINGHAM 8 38 5 Red B- A 210
6 2022 Ongoing LEICESTER 15 45 17 Green Z A 248
7 2022 Ongoing BIRMINGHAM 8 51 2 Silver B D 5433
8 2022 Ongoing LONDON 4 58 2 Black Z F 4322
9 2022 Ongoing PENDLE 8 44 6 Red D F 438
10 2022 New LONDON 5 55 2 Green C C 4843
11 2022 New MANCHESTER 2 47 6 Silver C B 432
12 2022 New BIRMINGHAM 11 49 1 Black D D 575
13 2022 Ongoing DURHAM 8 60 6 Red B+ F 33
14 2022 Ongoing NOTTINGHAM 2 64 2 Green C+ A -144
15 2022 Ongoing LEICESTER 3 35 2 Silver C F 544
16 2022 Ongoing BIRMINGHAM 6 58 14 Black E D 944
17 2022 Ongoing LONDON 5 51 12 Red C A -2320
18 2022 Ongoing PENDLE 9 41 2 Green A D 2664
19 2022 Ongoing DERBY 6 45 11 Silver B F 422
20 2022 Ongoing YORK 7 45 2 Black A A 822

 

My aim is to find any/all unique combinations of the dimensions and summarize each combination based on the Cost field. I then need to discard any combinations of data where the sum(Cost) < 5k. At this point, the Reference (id) field doesn't need to be included in the Group By, but I will need a way to find every Reference which falls into the group of data, and join it so that the integrity between the other tables is kept in place.

I have this working for 4 of the dimensions (see code in original post), but I now have an additional 35 dimensions which I need to bring into this scenario - which is making the original code unworkable as it stands.

Here's an idea of what I hope to see (based on some combinations of the data above):

Year Type Location AttAge PersonAge Bonus Colour Rating T&C DimCombination TOTAL
2022 New               New,,,,,,, 11959
2022 New London             New,London,,,,,, 6955
2022 New London 5           New,London,5,,,,, 6955
2022 New London 5 43 0 Red A A New,London,5,43,0,Red,A,A 912
2022 New London 5 55 2 Green C C New,London,5,55,2,Green,C,C 6043
2022 Ongoing LONDON             Ongoing,LONDON,,,,,, 2002
2022 Ongoing LONDON 4 58 2 Black Z F Ongoing,LONDON,4,58,2,Black,Z,F 4322
2022 Ongoing LONDON 5 51 12 Red C A Ongoing,LONDON,5,51,12,Red,C,A -2320
2022 New   5           New,,5,,,,, 6955
2022 New         Black     New,,,,,Black,, 4572
2022           Black     ,,,,,Black,, 10660

 

For scope - we have approx 20k references (all of which can have null values in any of the 39 dimensions. All records will have a unique Reference (id) and Year (the same reference will belong in multiple years), and each will have a Cost (which can be positive or negative).

I have attached a sample Qlikview qvw (with the working code from my initial post) for anyone that is able to take a look.

Thanks to all - your effort is much appreciated.

marcus_sommer

It's not really clear for me what do you want to achieve. The following is probably not what you asked for but I could imagine it as helpful to define the requirements more purposeful.

This means to load the data with a crosstable-prefix like hinted above to create a table with year and reference as normal dimensions and a CATEGORY and VALUE field from the crosstable-part. Then within the UI you may use an expression like:

concat(distinct  CATEGORY, ' + ')

and then playing with various selections and/or wrapping it an aggr() to create a calculated dimension from it, maybe per:

aggr(concat(distinct  CATEGORY, ' + '), YEAR)

and to it then:

sum(Costs)

vincent_ardiet_
Specialist
Specialist

If your purpose is to be able to build a single filter object which contains all values from your 39 fields, do you really need to concatenate them?
I mean, for example, you have:

Field1 Field2 Field3 Field4
John Houston Chevrolet Green
Jake Dallas Chevrolet Blue
Will Austin Ford Yellow

 

If your have a field containing:

SearchField
John
Jake
Will
Houston
Dallas
...

 

Would it fit your need?

If this is the case, maybe the easiest is to do a loop across your 39 fields with something like:
FilterTable:
Load IterationNum, Field1 as SearchField Resident MultiDims_Temp;

 

 

Drummer
Contributor II
Contributor II
Author

@vincent_ardiet_ @marcus_sommer  - many thanks for your responses.

@vincent_ardiet_  - you ask: "If your purpose is to be able to build a single filter object which contains all values from your 39 fields, do you really need to concatenate them?"

I don't want all values from the 39 fields in one field to filter on. I want all combinations of the 39 fields where the cost >=5K to be in one field to filter on. So using the data from your response (I have added a Cost field):

Field1 Field2 Field3 Field4 Cost
John Houston Chevrolet Green 5433
Jake Dallas Chevrolet Blue 4843
Will Austin Ford Yellow 2664

 

I would expect to see:

DimCombination CostTotal
Chevrolet 10276
Chevrolet / Green 5433
Chevrolet / Green / Houston 5433
Green 5433
Houston 5433
Houston / Chevrolet 5433
Houston / Green 5433
John 5433
John / Chevrolet 5433
John / Chevrolet / Green 5433
John / Green 5433
John / Houston 5433
John / Houston / Chevrolet 5433
John / Houston / Chevrolet / Green 5433
John / Houston / Green 5433

 

As you can see Chevrolet has a higher Cost value as this is listed in 2 rows and has been summed and grouped by the DimCombination field.

What I'm trying to do is find all combinations of 39 fields where the sum of cost is >= 5000. The user needs to see all of the relevant 39 fields in one filter so that they can click on that filter to see the individual records associated with that combination of the 39 (or fewer) fields.

The qvw attached shows it working with 4 fields.

To try and simplify - imagine I wanted to find out every combination of 5 numbers that added up to 10:

1, 2, 3, 4, 5 (these are our fields)

1+2+3+4

2+2+2+4

1+3+3+3

1+4+5

5+5

etc...

Many thanks for the ongoing help.

Drummer.

marcus_sommer

I must admit that I have serious doubts about the usefulness of the requested solution. Assuming that all field-combinations are created the user would get a filter-field with millions of entries. Then applying various selections against this field + probably some other fields + filtering against (aggregated) measure-values (which are massive redundant) could provide valuable insights?

How does the business do the job nowadays? With which tools and how are the data there prepared? And what do they do exactly with it to extract an insight and what do they do with it? I suggest you let you show it in detail. I wouldn't be surprised if the real requirement is quite different to what they have originally requested (I have experienced it quite often).

Drummer
Contributor II
Contributor II
Author

Hi @marcus_sommer  - many thanks for the response.

At the moment, the customer has to make educated guesses about the combinations of the 39 fields which could make up a cost value of >=5K. They have the 39 individual fields on the dashboard and filter each value in each field in combinations that they hope may work.

Although I agree that there will be millions of combinations in the filter, they will use these to find patterns in the data, and to do additional slicing and dicing to see how they can improve products for their customers.

We talked at length about this with them, and this is exactly what they require. At some point in the future, this data will be used for a machine-learning project for predictive analytics.

Thanks, again, for your input.

Drummer.