So basically, here is the question.
I have a fact table called "Transaction", which is quite large (~100m records). In it there's a field called "ID.#KEY", which has fewer than a hundred unique values and is mixed in type (numeric & string). When I try to recreate ID.#KEY with a join (it doesn't matter whether I use FIELDVALUE in combination with FIELDVALUECOUNT, or LOAD DISTINCT):
temp_ID:
LOAD DISTINCT
ID.#KEY
RESIDENT Transaction;
LEFT JOIN (Transaction)
LOAD *,
ID.#KEY AS ID_2.#KEY
RESIDENT temp_ID;
DROP TABLE temp_ID;
The result is what I'd expect it to be - there are no duplication issues whatsoever - but the file size increases 3 times. If I instead duplicate the field with one more plain resident load, there are no issues. Any ideas?
The main cause is probably the behaviour of fieldvalue(), which returns only a single part of a field value rather than both parts - this then leads to side effects regarding which field-value information is stored where and in which way.
It's standard in Qlik that values are stored (where possible) as dual() values with both a string and a numeric interpretation. Accessing a field via fieldvalue() returns only the second (usually numeric) part without the string interpretation, which often results in a place-holder being shown instead of the real value.
Using such values as join keys may impact how the metadata of the field is stored. For example, a consistently formatted date field is stored with 8 bytes per (numeric) value plus a few extra bytes of formatting information per field. If the date field instead has n different formats, the format information is no longer stored at column level; instead the format pattern is stored per value, alongside the values.
Nevertheless, I wouldn't expect a roughly 3 times bigger file size from just 100 unique values - somehow your approach seems to trigger an unoptimized processing.
You may play a bit with wrapping the fieldvalue() in num() - rather unsuitable for mixed types - and/or text(), and/or with changing the field-creation order - i.e. creating the fieldvalue() result as ID2 first and only then duplicating it as ID for the join ... although I would tend to change the logic for duplicating a field altogether and switch to a mapping approach.
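As a small illustration of the dual() behaviour described above - field and variable names here are invented for illustration, and this is only a sketch:

```qlik
// Build a field whose values are duals with an explicit
// text part and numeric part.
DualDemo:
LOAD dual(TextPart, NumPart) AS MixedKey
INLINE [
TextPart, NumPart
Jan, 1
Feb, 2
];

// fieldvalue() addresses the distinct symbol table of the field;
// index 1 returns the first distinct value. How that value is then
// rendered depends on which interpretation (text/numeric) survives
// in the given context.
LET vFirst = fieldvalue('MixedKey', 1);
TRACE First value of MixedKey: $(vFirst);
```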
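The mapping approach mentioned above could look roughly like this - an untested sketch that duplicates the key per row without any join (table names as in the original question):

```qlik
// Mapping table: each distinct key value mapped to itself.
ID_Map:
MAPPING LOAD DISTINCT
    [ID.#KEY],
    [ID.#KEY]
RESIDENT Transaction;

// Duplicate the key in a resident load instead of a join.
Transaction2:
NOCONCATENATE LOAD *,
    ApplyMap('ID_Map', [ID.#KEY]) AS [ID_2.#KEY]
RESIDENT Transaction;

DROP TABLE Transaction;
RENAME TABLE Transaction2 TO Transaction;
```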
Hi @Bchur ,
It is quite hard for me to understand what you are trying to do, and the comparison you are making, without code examples (scenario A vs. scenario B, etc.).
From what you said, I can imagine you are comparing these two scenarios:
Scenario A
temp_ID:
LOAD DISTINCT
ID.#KEY
RESIDENT Transaction;
LEFT JOIN (Transaction)
LOAD *,
ID.#KEY AS ID_2.#KEY
RESIDENT temp_ID;
DROP TABLE temp_ID;
Scenario B
LEFT JOIN (Transaction)
LOAD DISTINCT
ID.#KEY,
ID.#KEY AS ID_2.#KEY
RESIDENT Transaction;
Is that correct?
A few other questions:
One suggestion:
I highly recommend adjusting the field types, especially for key fields. Do not link fields with mixed types.
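One way to enforce a single type on a key field before linking - a hypothetical sketch only, with invented source path and field name:

```qlik
// Force the key to a pure text representation so every value
// has the same type before it is used as a link/join key.
Transaction:
LOAD *,
    Text([ID.#KEY]) AS [ID.#KEY_text]
FROM [lib://Data/Transaction.qvd] (qvd);
```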
Regards,
Mark Costa
Read more at Data Voyagers - datavoyagers.net
Follow me on my LinkedIn | Know IPC Global at ipc-global.com
Hey @marksouzacosta, I'll clarify - this is the thing causing issues:
The subroutine results reflect what I see in the QMC; as you can see, the QVF grew approx. 3 times, while the row count for the Transaction table stayed the same. What doesn't cause issues:
So the issue must be in that specific field - its mixed type and the join. But what I don't understand is how the QVF size can grow while the row count stays the same :).