32 Replies Latest reply: Nov 16, 2011 6:02 PM by John Witherspoon

# Inside field calculation?

Hi all,

i have a table where one of the field contain claculation formula based on fields of the table

```id, field1, field2, calc
1 , 1      , 10    , field1/field2
2 , 2      , 20    , field1*field2
3 , 3      , 30    , field1+field2
```

Is it possible to resident load this table in script and add new field which is based on calculation of "calc" field?

I've played with evaluate() function but the result is always null.

Thanks!

Stefan

• ###### Re: Inside field calculation?

see the attached file

where i have took the expression in resident table

tab1:
id, field1, field2
1 , 1      , 10
2 , 2      , 20
3 , 3      , 30
4 , 4      , 30
];

if(rowno()=1,field1/field2,
if(rowno()=2,field1*field2,
if(rowno()=3,field1+field2,if(rowno()=4,field1-field2)))) as calc
resident tab1;

drop table tab1;

• ###### Inside field calculation?

Hi Sunil,

thanks for your answer but i'm looking for more general solution (if is possible) because the table in the first post is a sample table and the real data may contain many and different calculations and many fields included.

Thanks!

Stefan

• ###### Inside field calculation?

Hi Stefan.

How do you determine which fields are to be used in the calculation and what the calculations is?

In your sample in the first post, what determines that field 1 is divided by field 2 as opposed to multiplied by field 2 or added to field 2 on the first record?

John.

• ###### Re: Inside field calculation?

Hi John,

i don't determine nothinng. The "calc" field is a database field and the formula is stored in it.

Regards!

Stefan

• ###### Re: Inside field calculation?

Stefan,

that's an interesting question and I was attempted to say, yes you can do this using evaluate() function.

I remembered I played around with something similar few weeks ago (http://community.qlik.com/message/135222) and I just wanted to adapt that to your requirements. I couldn't make it to work again. Then I noticed I was now using QV10SR3 and I remembered using QV9 from time to time - and surprisingly, my linked demo script just works fine in QV9SR6.

Then I continued working on QV9, trying to integrate your requirements. To my surprise, I couldn't get some assumingly easy expressions like 'field1' to work either, it always return NULL, as you mentioned.

I could access the data for already read in rows by using the peek() method, but this is not what you required. Hm, why this? Is the expression evaluated in a different context where the actual current input row fields are not accessible, like in a global context?

I tried to investigate on this, but couldn't find any information in the help or the forum. Almost all examples are using quite easy expressions where they don't use current input row field values, most of the time examples like '6*8' or something like this.

I then noticed that the problem might be similar to http://community.qlik.com/thread/15247 and the statement Hector made. Not sure about this, I believe it has to do with QV's internal structure (latter issue probably with how and when expressions are (pre-)parsed and evaluated, which I believe will probably not be done on record base).

Then I played around with other functions, trying to evaluate() them on script load, something like:

Table2:

if(IsNull(evaluate(calc))=-1,'no',evaluate(calc)) as Result;

id, field1, field2, calc

1 , 1      , 10    , field1/field2

2 , 2      , 20    , field1*field2

3 , 3      , 30    , field1+field2

4 , 4      , 40    , date(today())

5 , 5    , 50   , "peek('field1',-1)*peek('field2',-1)"

7 , 7   ,60       , rowno()

8 , 8    , 70   , rowno()*field1

];

and I again noticed differences between QV 9 and 10, not only with the peek() function evaluation, but also with the rowno().

Finally, I was just about giving up, I remembered the "easy ones" using something like

evaluate('8*6')

so what about replacing the field values in the string to evaluate before evaluating?

And in fact, I got this to work for both QV9 and QV10:

Table2:

if(IsNull(evaluate(calc2))=-1,'no',evaluate(calc2)) as Result;

Replace(Replace(calc,'field1',text(field1)),'field2',text(field2)) as calc2;

id, field1, field2, calc

1 , 1      , 10    , field1/field2

2 , 2      , 20    , field1*field2

3 , 3      , 30    , field1+field2

];

Not sure if this meets your requirement, but I think I am at my wits' end.

Best regards,

another Stefan

• ###### Re: Inside field calculation?

Hi Stefan,

What about using a variable with fields as parameters? Is that acceptable given your data model and possible values?

```SET fxEval = Evaluate(\$1 & KeepChar(\$2, '+-/*') & \$3);

LOAD *, \$(fxEval(field1, calc, field2)) AS calc2 INLINE [
id, field1, field2, calc
1 , 1      , 10    , field1/field2
2 , 2      , 20    , field1*field2
3 , 3      , 30    , field1+field2
];

```

Hope that helps.

Miguel Angel Baeyens

BI Consultant

Comex Grupo Ibérica

• ###### Re: Inside field calculation?

Thanks Stefan i've ednded with the same situation

Thanks Miguel using your solution actually show some results! But is there any way to adapt the variable to use randon number of fileds?

Thanks!

Stefan

• ###### Inside field calculation?

Stefan,

the replace method will also work with more than two fields and if not every field is used in the calc string (the replace just won't have any effect for that field). So just extend the expression with more replace() functions up to maximum number of fields you need (well, I hope you are not using too many, and honestly I don't know if restrictions apply for too many functions embedded into other functions, just give it a try.

I think replacing the field values is probably more flexible with regard to calc function structure, for example you will have not problems with such 'sqr(field1)' or 'field2 / field1' or 'field1 / (field2 + field1)'.

Hope his helps,

Stefan

• ###### Inside field calculation?

Hi all,

why not store this in an Excel file and load the calc field?

;-)

- Ralf

• ###### Re: Inside field calculation?

I've started with replacing method Stefan. Just to make it work but as far as i know my bosses they will offer the same qvw file to other of our clients and there the filed names are not totally the same. Thats why i'm looking for some solution that will save me some time in future

Thanks!

Stefan

• ###### Re: Inside field calculation?

Indeed, Excel is calculating the formulas, but you migh know..

• ###### Re: Inside field calculation?

Btw. if you want to abstract (map) your qvw fields from the customer fields w/o changing the qvw file you could provide an external script (qvs via include) with the mapping which will be customized on the client side.

• ###### Inside field calculation?

Ralf,

I haven't known that something plain like 'field1+field2' will be a correct excel expression indicating the sum of columns titled field1 and field2 row based. But I surely am no Excel expert.

Could you elaborate on that?

Stefan, where does your data come from? Could your data source somehow handle the calculation itself? I believe some probably could do.

Regards,

Stefan

• ###### Re: Inside field calculation?

At this stage I clearly have to ask what is the use case? As I've understand so far it is the abstraction from a client´s datamodel. The calculated fields is only one approach as a workaround, right?

• ###### Inside field calculation?

tab1:

`Load * inline [`

`id, field1, field2, calc1 , 1      , 10    , field1/field22 , 2      , 20    , field1*field23 , 3      , 30    , field1+field2``];`

tab2:

if(Rowno()=1,field1/field2,if(Rowno()=2,field1*field2,if(Rowno()=1,field1+field2))) as Calc

residdent Tab1;

drop table Tab1;

or

tab1:

`Load * inline [`

`id, field1, field2, calc1 , 1      , 10    , field1/field22 , 2      , 20    , field1*field23 , 3      , 30    , field1+field2``];`

and in expression

if(Rowno()=1,field1/field2,if(Rowno()=2,field1*field2,if(Rowno()=1,field1+field2)))

• ###### Re: Inside field calculation?

Ralf,

the idea is that all the fields are loaded from the db. including the calculation field. All folrmulas are stored in this field. And for particular rows this formula must override the regular calculation. And the formula and fields inside it may vary and they are user defined and may be changed by the end user of the application any time.

Some developer gave the user option to enter this formulas and now the customer want to have this in qlikview and some charts to be made.

Stefan

• ###### Re: Inside field calculation?

If there is no simple way in QV I would prefer to do this in the script with dynamic SQL. But to show how this could work I would need more details: the data model and the script.

- Ralf

• ###### Inside field calculation?

did u see my post

• ###### Inside field calculation?

Sunil Chauhan wrote:

did u see my post

You explicitly tie a row number to a calculation.  But as discussed, the calculations, the rows they will appear on, and even the possible fields are not something we know ahead of time.

• ###### Inside field calculation?

Stefan,

This looks like working.  Functionality is correct, but it could be cleaner, I think.  Anyway, you can start from here and improve:

data:
LOAD *, rowno() as Row INLINE [
f1, f2, calc
1, 5, "f1 + f2"
4, 10, "f1 * f2"
15, 5, "f1 / f2"];

Let R=peek('Row');

for r=1 to \$(R)

LET E=peek('calc', \$(r)-1,'data');

tab:
f1,
f2,
calc,
evaluate(\$(E)) as Result
RESIDENT data
WHERE Row=\$(r);

Next r

DROP TABLE data;

• ###### Inside field calculation?

Very well done Michael!

Could you explain why passing the string to evaluate via a variable makes a difference here? Or is this something we have to accept as invented by QlikTech engineers?

Regards

the other Stefan

• ###### Re: Inside field calculation?

Here's another working solution for arbitrary calculations.  I suspect that the performance would be higher than the looping solution, though I've done no testing on large data sets to prove it.  Also, it's dangerous to allow something like this with no validation of the calculations, as any data entry error by the user can crash out the script.  Validating calculations is a huge and I'd say separate topic.  Hopefully that's being done on the data entry side, which is where I think it belongs.  Calculations must of course follow QlikView's syntax and be possible to evaluate at script execution time, so in that sense, they aren't entirely arbitrary.

Data:
,recno() as ID // you can just use the key if your record has a key
INLINE [
F1, F2, Calc
1,  5, F1 + F2
4, 10, F1 * F2
15, 5, F1 / F2
];

Calcs:
concat(chr(39) & Calc & chr(39),',',CalcSequence) as CalcNames
,concat(         Calc          ,',',CalcSequence) as Calcs
;
recno() as CalcSequence
,text(fieldvalue('Calc',recno())) as Calc
AUTOGENERATE fieldvaluecount('Calc')
;
LET vCalcs = 'pick(match(Calc,' & peek('CalcNames') & '),' & peek('Calcs') & ')';

DROP TABLE Calcs;

LEFT JOIN (Data)
ID
,\$(vCalcs) as Result
RESIDENT Data
;

EDIT: Swuehl, you could probably do something similar to pull off your solution with an arbitrary list of fields.  I suspect you could generate your list of replaces by detecting the field names using \$Table and \$Field.  I don't know if that would be simpler or more complex than the generated pick(match()).  One advantage is that an arbitrary list of fields may well have fewer members than an arbitrary list of calculations, and thus give you a shorter expression.  That might improve performance, and help avoid any expression length bugs if any were to surface.

• ###### Inside field calculation?

Hi John,

that's another nice solution without any evaluate() call, I need to start remembering the pick/match method using variables.

Since you are building a variable based on calc field values from all records, do you know of any limitiations regarding max variable length (or expression length of the expanded pick function)?

Stefan hasn't mentioned the common calc field expression length or expected number of records, so maybe that's not a problem anyway.

Regards,

swuehl

• ###### Inside field calculation?

I'm not aware of any intended or published maximum variable or expression length.  That said, I've crashed out earlier versions of QlikView with huge expressions while specifically looking for bugs related to expression size.  I haven't tried to test for variable size bugs, but they could exist as well.

So it's one of those things that SHOULD work, and probably won't be a problem in practice.  BUT it might reasonably be expected to fail in some versions due to QlikView bugs if the variable/expression gets too big, whatever too big might be.

• ###### Re: Inside field calculation?

John Witherspoon wrote:

Swuehl, you could probably do something similar to pull off your solution with an arbitrary list of fields.  I suspect you could generate your list of replaces by detecting the field names using \$Table and \$Field.  I don't know if that would be simpler or more complex than the generated pick(match()).  One advantage is that an arbitrary list of fields may well have fewer members than an arbitrary list of calculations, and thus give you a shorter expression.  That might improve performance, and help avoid any expression length bugs if any were to surface.

I decided to follow up on this.  With a little modification, it's simpler than my pick(match()) solution and works just fine.  I again expect performance to be high.  You could skip ID and Calc in the replace for slightly better performance at the cost of complexity.  And as mentioned, I think this variable expression is less likely to hit any bugs related to size.  See attached.  Script below.

Data:
,recno() as ID // you can just use the key if your record has a key
INLINE [
F1, F2, Calc
1,  5, F1 + F2
4, 10, F1 * F2
15, 5, F1 / F2
];

LET vCalcs = repeat('replace(',nooffields('Data')) & 'Calc';

FOR I = 1 TO nooffields('Data')
LET vCalcs = vCalcs & ',' & chr(39) & fieldname(I,'Data') & chr(39) & ',' & fieldname(I,'Data') & ')';
NEXT

LEFT JOIN (Data)
ID
,evaluate(\$(vCalcs)) as Result
RESIDENT Data
;

EDIT:  If you know your field list ahead of time, you can of course just put the replace in the initial load to save additional time.  That should be the normal case, I'd think, as I try to never load fields I don't know the name of.  Generating the expression like the above may still make sense from the standpoint of simplifying maintenance, though, as you then only have to list the fields in one place instead of two.  Of course, if people maintaining it later can't even understand what the loop and second load are doing, that might be an argument for just listing all the fields explicitly, but I don't personally tend to cater to people who can't read code, since the people maintaining this should be professional programmers.  I might not expect anyone to come up with this from scratch, but I'd expect them to be able to understand it, even if it took referencing QlikView documentation.

• ###### Inside field calculation?

Stefan,

Apparently this is how it works.

BTW, in my version evaluate() is not needed.  Simply \$(E) as Result is enough.

Same true about evaluate in the later versions, I assume.

• ###### Inside field calculation?

Alas, evaluate() is still necessary in the replace() example I posted above (testTableDrivenCalculations8).

• ###### Inside field calculation?

@Michael:

Ah, even more simple, I like your solution!

@Ralf:

No, I don't think it's a bug with FieldValue/FieldIndex, but not sure and most people will probably tend to tell it a bug (after spending hours debugging their apps like me ).

The manual uses some subtle differences in wording between FieldValue ("Returns the field value found in position n of the field fieldname") and e.g. Peek ("

Returns the contents of the fieldname in the record specified by row in the input table tablename

"), but I think noone can understand that.

A clear word could really help sometimes (and an example).

• ###### Re: Inside field calculation?

Michael's solution does a resident load for every single row of the table.  A million rows means a million loads X a million rows.  That would be impossibly slow, and it would create a HUGE log file.  It took 48 seconds to load just 5000 rows on my machine:

Data:
recno() as ID
,recno() as Row // separate because an ID wouldn't normally match the row number
,ceil(rand()*10) as F1
,ceil(rand()*10) as F2
,'F1 ' & pick(ceil(rand()*4),'+','-','*','/') & ' F2' as Calc
AUTOGENERATE 5000
;

LET R=peek('Row');

FOR r=1 TO \$(R)
LET E=peek('Calc',\$(r)-1,'Data');
Tab:
LOAD ID, F1, F2, Calc, \$(E) as Result
RESIDENT Data
WHERE Row=\$(r);
NEXT r

DROP TABLE Data;

I can't test Ralf's solution because I'm not going to build a database.  Again, for a million rows, there are a million loads X a million rows.  HOWEVER, those loads are just a quick primary key read on a database.  It SHOULD perform reasonably, but in my opinion puts an unnecessary load on the database management system, plus again would create a huge log file.

I'm betting I can crash my pick(match()) solution by adding a + recno() to the load and going to enough rows.  If it doesn't crash, it'll at least get slow based on the gigantic pick(match()) I'll be generating.  OK, it was still working at 100,000 rows, each with a unique calculation, but it took 142 seconds to load.  There MAY be a point where it crashes, but it becomes impractically slow before it gets to that point.  It might be a workable solution for some data sets (data sets with a fairly limited number of calculations), but it's not what I'd choose as a generic solution.

Data:
recno() as ID
,ceil(rand()*10) as F1
,ceil(rand()*10) as F2
,'F1 ' & pick(ceil(rand()*4),'+','-','*','/') & ' F2 + ' & recno() as Calc
AUTOGENERATE 100000
;

Calcs:
concat(chr(39) & Calc & chr(39),',',CalcSequence) as CalcNames
,concat(          Calc          ,',',CalcSequence) as Calcs
;
recno() as CalcSequence
,text(fieldvalue('Calc',recno())) as Calc
AUTOGENERATE fieldvaluecount('Calc')
;
LET vCalcs = 'pick(match(Calc,' & peek('CalcNames') & '),' & peek('Calcs') & ')';

DROP TABLE Calcs;

LEFT JOIN (Data)
ID
,\$(vCalcs) as Result
RESIDENT Data
;

And finally we have Stefan's solution, modified by me to generate the required replace() statement automatically.  Even with each calculation being unique, it loads a million rows in 13 seconds, two million rows in 25 seconds, and three million rows in 37 seconds.  It's fast and looks to be o(n), as I'd expect.  Of what we've seen so far, I think it's the most practical solution for real world data sets.

Data:
recno() as ID
,ceil(rand()*10) as F1
,ceil(rand()*10) as F2
,'F1 ' & pick(ceil(rand()*4),'+','-','*','/') & ' F2 + ' & recno() as Calc
AUTOGENERATE 3000000
;

LET vCalcs = repeat('replace(',nooffields('Data')) & 'Calc';

FOR I = 1 TO nooffields('Data')
LET vCalcs = vCalcs & ',' & chr(39) & fieldname(I,'Data') & chr(39) & ',' & fieldname(I,'Data') & ')';
NEXT

LEFT JOIN (Data)
ID
,evaluate(\$(vCalcs)) as Result
RESIDENT Data
;

• ###### Re: Inside field calculation?

Hi Stefan,

maybe this is a more suitable solution:

Calculations:

select ID, CALC from xyz_test;

LET rowText = NoOfRows('Calculations'); // get the total number of rows in Calculations table

for i=1 to \$(rowText) // loop through every Calculations row

let vCalc = FieldValue('CALC',\$(i));

let vId = FieldValue('ID',\$(i));

if len(trim('\$(vCalc)')) > 0 then

Results:

select ID, FIELD1, FIELD2, \$(vCalc) as RESULT from xyz_test where id = \$(vId);

end if

next

- Ralf

Some more explanation: This pushes the calculation into the database where it was edited. User could have use database operators or functions which could maybe unknown in QlikView. This also gives the possibility to use sub-selects or UDFs at the database level..

• ###### Inside field calculation?

This is a really nice evolving thread!

@Ralf: Some time ago I spent some hard hours debugging some strange behaviour. I traced it down to my usage of FieldValue, which - in a case like above - will not work as expected if your field values are non unique (i.e. multiple rows with same CALC expression). FieldValue is indexing the compressed column with unique values. You will notice the difference if you add one more row with repeated CALC expression.

So it might be better to use peek('CALC',\$(i),'Calculations') with i running from 0 to NoOfRows-1.

A lot of solutions to one small problem!

Regards,

Stefan

• ###### Re: Inside field calculation?

Good point Stefan (@swuhl), I didn't knew this! Is this a bug with FieldValue?

Calculations:

select ID, CALC from xyz_test;

LET rowText = NoOfRows('Calculations'); // get the total number of rows in Calculations table

for i=1 to \$(rowText) // loop through every Calculations row

let vId = FieldValue('ID',\$(i));

let vCalc = peek('CALC',\$(i) -1,'Calculations');

if len(trim('\$(vCalc)')) > 0 then

Results:

select ID, FIELD1, FIELD2, \$(vCalc) as RESULT from xyz_test where id = \$(vId);

end if

next

- Ralf

Message was edited by: Ralf Becher

..better to use peek here