Solved: Cut reload time - Qlik Community

mato32188 · ‎2017-09-11

Hi guys,

I just want to collect some of your ideas how to solve one simple transformation.

Source Data:

ID	Date	Attribute1	Attribute2	Attribute3
1	01-01-2017	active	NULL	NULL
1	01-01-2017	NULL	1	NULL
1	01-01-2017	NULL	NULL	8
1	02-01-2017	active	NULL	NULL
1	02-01-2017	NULL	7	NULL
1	02-01-2017	NULL	NULL	100
2	01-01-2017	active	NULL	NULL
2	01-01-2017	NULL	2	NULL
2	01-01-2017	NULL	NULL	9
2	02-01-2017	inactive	NULL	NULL
2	02-01-2017	NULL	20	NULL
2	02-01-2017	NULL	NULL	50

Output should be:

ID	Date	Attribute1	Attribute2	Attribute3
1	01-01-2017	active	1	8
1	02-01-2017	active	7	100
2	01-01-2017	active	2	9
2	02-01-2017	inactive	20	50

Transformation is done for millions of rows, so actually I am trying to find the option for shortest reload time.

Thank you very much.

BR

Martin

ECG line chart is the most important visualization in your life.

andrei_delta · ‎2017-09-11

Glad i could help.

My pleasure,

Andrei

View solution in original post

Anonymous · ‎2017-09-11

Hard to say without knowing what is happening - Could you share the load script and the reload log.

techvarun · ‎2017-09-11

Load Null as 0 and Active as 1 and Inactive as 2. The search will become faster

sam_grounds · ‎2017-09-11

Hi Martin,

If the data doesn't change after the fact, I would suggest an 'incremental' reload. This will speed it up massively.

An example for your case, load the data where the date is greater than a certain date and concatenate old data already loaded into a QVD.

Eg.

LET vDate = Date(Today()-1,'DD-MM-YYYY');

Data:

LOAD * FROM dataSource

WHERE Date >= '$(vDate)'

CONCATENATE (Data) LOAD * FROM C:\DataLocation\Data.qvd (qvd)

WHERE Date < '$(vDate)'; //may or may not need this WHERE statement depending on how your load works, if there is no date crossover from the stored data and the new data then you won't.

STORE Data Into C:\DataLocation\Data.qvd (qvd);

...This way you won't be transforming old data each time you reload, it will simply add to the old set.

Hope this helps.

Sam

mato32188 · ‎2017-09-11

Hi Bill,

doing just mapping load for each attribute where not isnull(Attribute)and then applying it into next table on distinct combination of ID and Date.

MAP_attribute1:

mapping load

Autonumber(ID&'│'&Date),

attribute1

Resident XXX

where not isnull(Attribute1);

Load DISTINCT

Autonumber(ID&'│'&Date),

ApplyMap(...) as AttrbuteValue

Resident XXX;

...

Thanks.

Martin

ECG line chart is the most important visualization in your life.

andrei_delta · ‎2017-09-11

Hello,

Don;t use string for values, it's bad for the memory...

I made it like this:

LOAD ID, Date,

only(if(Attribute1='active',-1,if( Attribute1='inactive',0))) as Attr1,

only(Attribute2) as Attribute2,

only(Attribute3) as Attribute3

FROM

[.\Book1.xlsx]

(ooxml, embedded labels, table is Sheet1)

Group by ID, Date;

Group by also consumes resources but it depends on how much data you have.

Hope it helps,

mato32188 · ‎2017-09-11

Hi Andrei,

thanks for advice. Sure, I am using INT instead of CHAR, that was just an example of data set. Group by looks like the best option here.

Thank you.

Martin

ECG line chart is the most important visualization in your life.

andrei_delta · ‎2017-09-11

Glad i could help.

My pleasure,

Andrei