Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Guys,
I have a field with values between o and 50,000
I need to category it, like:
Less than 100
Less than 250
Less than 500
and so on...
I know two ways to do that:
1.- concatenating several if statements.
2. Creating an auxiliary table with two fields, one with all posxible values, the second one with the category for each value (using in line or external file).
The issue is that, in te first case I have to deal with too much if statements.
In the second case, I have to list ALL possible values (up to 50,000 or even more)..
Is there any other option?
Thanks in advance,
Aldo.
Aldo
I think IntervalMatch is what you are after. Here is an example to clarify:
Step 1: set up the intervals table. This example sets up 5 named (dual) values for 5 intervals together with the interval minimum and maximum values.
NOTMOVEBANDS:
LOAD Dual(BandName, DayMin) AS NMoveBand,
DayMin,
DayMax
INLINE
[
BandName, DayMin, DayMax
'Active', 0, 0
'<2 Weeks', 1, 14
'2 - 4 weeks', 14, 28
'4 - 8 weeks', 28, 56
'>8 weeks', 56,
];
Step 2: Load the fact table with your field values
Step 3: Perform the interval match. The example matches the LastMovePeriod field to the intervals defined by DayMin and DayMax.
LEFT JOIN(.... your fact table here ....)
INTERVALMATCH (LastMovePeriod) LOAD DayMin, DayMax RESIDENT NOTMOVEBANDS;
That's it. This will result in a synthetic key. If you prefer to remove the synthetic key (not really necessary), then add something like:
LEFT JOIN(.... your fact table here ....)
LOAD DayMin,
DayMax,
NMoveBand
RESIDENT NOTMOVEBANDS;
DROP FIELDS DayMin, DayMax;
DROP TABLES NOTMOVEBANDS;
This last script adds the NMoveBand field (the matched dual value) to the fact table and then deletes the DayMin/DayMax values and the NOTMOVEBANDS intervals table.
Hope that helps.
Jonathan
Where are you getting the data from? if this is a database server there are always options to put it there. For example MSSQL has the case statement where you could do something as the following:
SELECT Value
, CASE
WHEN Value BETWEEN 0 AND 99 THEN 'Less then 100'
WHEN Value BETWEEN 100 AND 199 THEN 'Less then 200'
ELSE '200 or more'
END AS ValueCategory
FROM Table
It is basically the same as an if-statement in your script. Is it not that the case statement is quite a bit more concise.
I believe CASE WHEN doesn't work with data.
Now you lost me a bit, but it definitely works in the SQL part of the load statement. See the below example:
Values:
LOAD Value
, CategoryValue
SQL SELECT Value
CASE WHEN .... THEN.... ELSE.... END AS CategoryValue
FROM Table
Ofcourse if your data is not coming from a database server this won't work. Another option could be to create a data-island with 3 columns (from, to, description) and then by using a set-analyse get the right description. Though problems like you describe i would always fix as close to the source as possible.
Aldo
I think IntervalMatch is what you are after. Here is an example to clarify:
Step 1: set up the intervals table. This example sets up 5 named (dual) values for 5 intervals together with the interval minimum and maximum values.
NOTMOVEBANDS:
LOAD Dual(BandName, DayMin) AS NMoveBand,
DayMin,
DayMax
INLINE
[
BandName, DayMin, DayMax
'Active', 0, 0
'<2 Weeks', 1, 14
'2 - 4 weeks', 14, 28
'4 - 8 weeks', 28, 56
'>8 weeks', 56,
];
Step 2: Load the fact table with your field values
Step 3: Perform the interval match. The example matches the LastMovePeriod field to the intervals defined by DayMin and DayMax.
LEFT JOIN(.... your fact table here ....)
INTERVALMATCH (LastMovePeriod) LOAD DayMin, DayMax RESIDENT NOTMOVEBANDS;
That's it. This will result in a synthetic key. If you prefer to remove the synthetic key (not really necessary), then add something like:
LEFT JOIN(.... your fact table here ....)
LOAD DayMin,
DayMax,
NMoveBand
RESIDENT NOTMOVEBANDS;
DROP FIELDS DayMin, DayMax;
DROP TABLES NOTMOVEBANDS;
This last script adds the NMoveBand field (the matched dual value) to the fact table and then deletes the DayMin/DayMax values and the NOTMOVEBANDS intervals table.
Hope that helps.
Jonathan
Great, exacly what I was looking for.
Thanks guys.
Aldo.