Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Adding a derived field to an existing table. Left join or...?

Hi all. I've got a table with about 50 fields. In my script, after the table is loaded, and some other dimension tables are loaded, I want to create some derived fields based on the data that I've loaded. I've been getting weird results. I tried doing something like this:

LEFT JOIN (MyTable) LOAD

ID,

if(...) AS DerivedField1,

if(...) AS DerivedField2

RESIDENT MyTable;

Is this a good approach? As I said, I've been getting strange results, like where the data doesn't come out right according to the logic. Or sometimes it creates a ton of duplicate rows. So how do you guys create derived fields after the fact?

Cheers,

Eric

3 Replies
Not applicable
Author

The syntax looks good. The question about logic and depends on what you really need.

Not applicable
Author

You can do a load of load...don´t use a join.....if i understood you...something like this...

LOAD
YEAR(xxx) AS YEAR,
MONTH(xxx) AS MONTH,
IF(xxx='xyz',True,false) AS yyyy,
*;
LOAD
xxxx,
xxxx
SQL SELECT
xxxx,
xxxx ;

Not applicable
Author

You have three options:

1) Left Join the table to itself like you did in your example. Which does look right, I would guess if your getting duplicate rows the "ID" is not truely a unique identifer or was duplicated earlier before this left join. When trouble shooting I always recommend output using trace the rowcount of the resulting table after each Join.

2) Temp load and then Resident load. Example:

Table1_Tmp:
Load Field1, Field2;
SQL Select Field1, Field2 From Table1;

Table1:
Noconcatenate
Load *,
Field1 + Field2 AS Field3
Resident Table1_Tmp;
Drop Table Table1_Tmp;

Both 1 & 2 are most useful when you must Join two or more tables in memory before all the fields needed for your calculation are present in one table. However, if all the fields present in the original table loaded I recommend using proceeding loads maybe even multiple. Example:

Table1:
Load *,
SubTotal + Tax AS Total
Load *,
SubTotal * 0.06 AS Tax
Load Labor,
Material,
Labor + Material AS SubTotal;
SQL Select Labor, Material From Table1;

Its kind of funny that most code executes top down but load statements like these kind of go bottom up. Whenever possible I would use mulitple loads chained on one select before I would consider options 1 or 2 above. As far as performance goes I would not know which is faster option 1 or 2.