Skip to main content
Announcements
Introducing Qlik Answers: A plug-and-play, Generative AI powered RAG solution. READ ALL ABOUT IT!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Categorize a field with values

Hi Guys,

I have a field with values between o and 50,000

I need to category it, like:

Less than 100

Less than 250

Less than 500

and so on...

I know two ways to do that:

1.- concatenating several if statements.

2. Creating an auxiliary table with two fields, one with all posxible values, the second one with the category for each value (using in line or external file).

The issue is that, in te first case I have to deal with too much if statements.

In the second case, I have to list ALL possible values (up to 50,000 or even more)..

Is there any other option?

Thanks in advance,

Aldo.

1 Solution

Accepted Solutions
jonathandienst
Partner - Champion III
Partner - Champion III

Aldo

I think IntervalMatch is what you are after. Here is an example to clarify:

Step 1: set up the intervals table. This example sets up 5 named (dual) values for 5 intervals together with the interval minimum and maximum values.

NOTMOVEBANDS:
LOAD Dual(BandName, DayMin) AS NMoveBand,
DayMin,
DayMax
INLINE
[
BandName, DayMin, DayMax
'Active', 0, 0
'<2 Weeks', 1, 14
'2 - 4 weeks', 14, 28
'4 - 8 weeks', 28, 56
'>8 weeks', 56,
];


Step 2: Load the fact table with your field values

Step 3: Perform the interval match. The example matches the LastMovePeriod field to the intervals defined by DayMin and DayMax.

LEFT JOIN(.... your fact table here ....)
INTERVALMATCH (LastMovePeriod) LOAD DayMin, DayMax RESIDENT NOTMOVEBANDS;


That's it. This will result in a synthetic key. If you prefer to remove the synthetic key (not really necessary), then add something like:

LEFT JOIN(.... your fact table here ....)
LOAD DayMin,
DayMax,
NMoveBand
RESIDENT NOTMOVEBANDS;

DROP FIELDS DayMin, DayMax;
DROP TABLES NOTMOVEBANDS;


This last script adds the NMoveBand field (the matched dual value) to the fact table and then deletes the DayMin/DayMax values and the NOTMOVEBANDS intervals table.

Hope that helps.

Jonathan

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein

View solution in original post

5 Replies
Not applicable
Author

Where are you getting the data from? if this is a database server there are always options to put it there. For example MSSQL has the case statement where you could do something as the following:

SELECT Value

, CASE

WHEN Value BETWEEN 0 AND 99 THEN 'Less then 100'

WHEN Value BETWEEN 100 AND 199 THEN 'Less then 200'

ELSE '200 or more'

END AS ValueCategory

FROM Table

It is basically the same as an if-statement in your script. Is it not that the case statement is quite a bit more concise.

Not applicable
Author

I believe CASE WHEN doesn't work with data.

Not applicable
Author

Now you lost me a bit, but it definitely works in the SQL part of the load statement. See the below example:

Values:

LOAD Value

, CategoryValue

SQL SELECT Value

CASE WHEN .... THEN.... ELSE.... END AS CategoryValue

FROM Table

Ofcourse if your data is not coming from a database server this won't work. Another option could be to create a data-island with 3 columns (from, to, description) and then by using a set-analyse get the right description. Though problems like you describe i would always fix as close to the source as possible.

jonathandienst
Partner - Champion III
Partner - Champion III

Aldo

I think IntervalMatch is what you are after. Here is an example to clarify:

Step 1: set up the intervals table. This example sets up 5 named (dual) values for 5 intervals together with the interval minimum and maximum values.

NOTMOVEBANDS:
LOAD Dual(BandName, DayMin) AS NMoveBand,
DayMin,
DayMax
INLINE
[
BandName, DayMin, DayMax
'Active', 0, 0
'<2 Weeks', 1, 14
'2 - 4 weeks', 14, 28
'4 - 8 weeks', 28, 56
'>8 weeks', 56,
];


Step 2: Load the fact table with your field values

Step 3: Perform the interval match. The example matches the LastMovePeriod field to the intervals defined by DayMin and DayMax.

LEFT JOIN(.... your fact table here ....)
INTERVALMATCH (LastMovePeriod) LOAD DayMin, DayMax RESIDENT NOTMOVEBANDS;


That's it. This will result in a synthetic key. If you prefer to remove the synthetic key (not really necessary), then add something like:

LEFT JOIN(.... your fact table here ....)
LOAD DayMin,
DayMax,
NMoveBand
RESIDENT NOTMOVEBANDS;

DROP FIELDS DayMin, DayMax;
DROP TABLES NOTMOVEBANDS;


This last script adds the NMoveBand field (the matched dual value) to the fact table and then deletes the DayMin/DayMax values and the NOTMOVEBANDS intervals table.

Hope that helps.

Jonathan

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
Not applicable
Author

Great, exacly what I was looking for.

Thanks guys.

Aldo.