enumeration case with powers of two and "bitwise &... - Page 3 - Qlik Community

ioannagr · ‎2024-05-19

Hello all, so there's this challenge this time.

I was given this enumeration for the field "ChangeType", which stores 0,1, then powers of 2 like : 0,1, 2, 4, ,8, 16, 32, ... etc. The thing is the db table creator uses this enumeration like this- If they enter number 25 then there are three change types f & e & b because 25=16+8+1, if they enter number 10 there are two changes e & c because 10=8+2. So for example a person with change type 10 should be counted twice, once for reason 8 and once for reason 2. All I want is to count each person as many times as there are change types for them.

MapChangeType:
Mapping load * inline [
ChangeType, ChangeType1
0,'a'
1,'b.'
2,'c.'
4,'d'
8,'e'
16,'f'
32,'g'];

I contacted the db guy and they told me they wrote an sql script with bitwise operator & like

CREATE PROCEDURE [dbo].[sp_Updates]

@Year INT,

@UpdateSource INT

AS

BEGIN

SET NOCOUNT ON;

INSERT INTO UpdateTable

SELECT @Year, @UpdateSource, 8, COUNT(*) AS e

FROM Table2

WHERE PreviousYear < @Year

AND Year = @Year

AND (@UpdateSource IS NULL OR UpdateSource = @UpdateSource)

AND ChangeType & 8 > 0

END

How could I do the equivalent for Qlik Sense?

This is a much needed step for my further analysis and to be honest, I have no clue what to do so your help will be enormous.

Thank you in advance

Oleg_Troyansky

Well, in this case you need to extract each digit from the decimal number and reassemble the complete string.

In the data load script, you can do it with the help of the WHILE load and the function IterNo(), going through the length of each binary number and generating new rows for each digit.

Then, once you have each individual ChangeType in a separate data row, you can either keep them separate or aggregate them again, constructing the desired string.

Cheers,

Oleg Troyansky
Call me for expert support, performance tuning, advanced training and more!

marcus_sommer

No, I didn't meant Temp0 else Temp1. Your Temp0 is superfluous and is wasting RAM because it creates a table with the lengths of 268 M records + distinct field-values and in Temp1 it's repeated + creating 268 M of large and distinct strings in a heavily nested if-loop ... and your next intended step is loading the mapping resident from this large table and with it duplicating the RAM consumption ... and there is further no drop tables included before you access the sql.

By the usual common data-sets it's often no big issue to use such un-optimized load-approaches but by larger data-sets and many distinct field-values it's beneficial to mandatory to slice the tasks in n layers and loading incremental and to keep attention to the number of records/field-values and their size.

Therefore my suggestion to slice the task per loops and creating qvd's for the next layer. I could imagine that a mapping-table from such a single qvd may go in the direction of about 20 GB which is rather too much for a 32 GB environment (OS + Qlik services + further parts of this load + other stuff will also need RAM) but by 64+ GB it should be working and practicably enough. Be aware that you could re-use such qvd without re-creating it again and again - and further that you couldn't drop a mapping-table.

Personally I love mappings and use them for nearly all kind of matching - also n huge and heavily nested ones but I think I would avoid it this scenario and calculating it on the fly probably within n if-loops which seems to be your preferred way. Here a small example of the idea which is based on my first example above:

load *, if(mid(Bin, L - 0, 1) = '1', 'a', '') & if(mid(Bin, L - 1, 1) = '1', 'b', '') &
if(mid(Bin, L - 2, 1) = '1', 'c', '') & if(mid(Bin, L - 3, 1) = '1', 'd', '') &
if(mid(Bin, L - 4, 1) = '1', 'e', '') & if(mid(Bin, L - 5, 1) = '1', 'f', '') as C;
load *, len(Bin) as L;
load recno() - 1 as Int, text(num(recno() - 1, '(bin)')) as Bin autogenerate pow(2, 9);

This means this kind of if-loop is directly used within the preceding of the sql-load. Of course you need to extend it to the end of the alphabet and some logic to add a delimiter (+) between the chars ... abd probably there are further measurements (outsourcing the logic in a parametrized variable, pick(match()) approaches ...) to make it more elegant.

ioannagr

Hello everyone!

With the help of you all, I am so so so close to achieve my final desired result. Only thing I'm missing is taking into account Int 0 and its change.

BinaryData:
LOAD
recno() - 1 as Int,
text(num(recno() - 1, '(bin)')) as Bin
AUTOGENERATE pow(2, 6);

// Step 2: Calculate the length of each binary string
BinaryLength:
LOAD
Int,
Bin,
len(Bin) as L
RESIDENT BinaryData;

//Step 3: Create the field C based on conditions for each digit in the binary string
BinaryWithC:
LOAD
Int,
Bin,
L,
if(mid(Bin, L - 0, 1) = '1', 'a'&'+', '') &
if(mid(Bin, L - 1, 1) = '1', 'b'&'+', '') &
if(mid(Bin, L - 2, 1) = '1', 'c'&'+', '') &
if(mid(Bin, L - 3, 1) = '1', 'd'&'+', '') &
if(mid(Bin, L - 4, 1) = '1', 'e'&'+', '') &
if(mid(Bin, L - 5, 1) = '1', 'f'&'+', '') as C
RESIDENT BinaryLength;

Remove_PLUS:
LOAD
Int,
Bin,
Left(C, Len(C)-1) as ChangeType
RESIDENT BinaryWithC;

// Drop intermediate tables to clean up
DROP TABLE BinaryLength;
DROP TABLE BinaryData;

// Step 4: Create Sub_changes table with appropriate changes in separate lines
Sub_changes:
LOAD
Int,
Bin,
SubField(ChangeType, '+') as Change
RESIDENT Remove_PLUS;

DROP TABLE BinaryWithC;
DROP TABLE Remove_PLUS;

Also instead of directly applying a+, b+,etc, I'm thinking of applying a Map with ApplyMap(). I want to remind you that my mapping table looks like this:
MapChangeType:
Mapping load * inline [
ChangeType, ChangeType1
0,'a'
1,'b.'
2,'c.'
4,'d'
8,'e'
16,'f'
32,'g'];

but since Int 0 can't be shown as binary with digit 1, it disappears completely.

Could you please help me achieve my goal?

It works very quickly 🙂

marcus_sommer

The bin-representation of 0 isn't 1 but it has a return which is 0 and by querying L=1 twice it could be fetched, maybe like:

...
if(mid(Bin, L - 0, 1) = '0', 'a'&'+', '') &
if(mid(Bin, L - 0, 1) = '1', 'b'&'+', '') &
if(mid(Bin, L - 1, 1) = '1', 'c'&'+', '')
....

Beside of this I considered a classical mapping-approach again and think it's applicable without creating extremely huge mapping-tables - just by slicing the lookup-values into n chunks. This isn't possible with the numeric value but with the string-representation and the way how it's translated into letters which is just taking the position of 1 to derive the letter. This means for example the first four chars and their translation look like:

1111
dcba

and for the next four chars it would be:

1111
hgfe

or in other words the logic didn't change only the letters are different in regard to their position. Therefore a mapping mustn't contain pow(2, 28) of records else it might be 2 * pow(2, 14) or 4 * pow(2, 7) which would make a significantly difference. Of course the applymap() call must consider the appropriate string-part which may look like:

applymap('m', L & '|' & 1 & '|' & mid(text(num(MyNumber, '(bin)')), L - 14, 14), '') &
applymap('m', L & '|' & 2 & '|' & mid(text(num(MyNumber, '(bin)')), 1, 14), '') as C

The mid() extraction is just a hint to the idea and needs probably some adjusting. The blue-part of the lookup-key is to fetch the right translation from a single mapping (by using several mapping-tables the green mapping-reference would be need to set appropriate) and the red-part is aimed to include the + char delimiter-logic in regard to L within the mapping to avoid the needs respectively the efforts to clean/prepare the final output - but it will of course increase the complexity.

If this approach is better as the on-the-fly calculation? Probably not - but possible and surely not unappealing ...

enumeration case with powers of two and "bitwise &" case

Data Load Editor

Developers

Qlik Sense

Script