I just want to collect some of your ideas how to solve one simple transformation.
Output should be:
Transformation is done for millions of rows, so actually I am trying to find the option for shortest reload time.
Thank you very much.
Solved! Go to Solution.
If the data doesn't change after the fact, I would suggest an 'incremental' reload. This will speed it up massively.
An example for your case, load the data where the date is greater than a certain date and concatenate old data already loaded into a QVD.
LET vDate = Date(Today()-1,'DD-MM-YYYY');
LOAD * FROM dataSource
WHERE Date >= '$(vDate)'
CONCATENATE (Data) LOAD * FROM C:\DataLocation\Data.qvd (qvd)
WHERE Date < '$(vDate)'; //may or may not need this WHERE statement depending on how your load works, if there is no date crossover from the stored data and the new data then you won't.
STORE Data Into C:\DataLocation\Data.qvd (qvd);
...This way you won't be transforming old data each time you reload, it will simply add to the old set.
Hope this helps.
doing just mapping load for each attribute where not isnull(Attribute)and then applying it into next table on distinct combination of ID and Date.
where not isnull(Attribute1);
ApplyMap(...) as AttrbuteValue
Don;t use string for values, it's bad for the memory...
I made it like this:
LOAD ID, Date,
only(if(Attribute1='active',-1,if( Attribute1='inactive',0))) as Attr1,
only(Attribute2) as Attribute2,
only(Attribute3) as Attribute3
(ooxml, embedded labels, table is Sheet1)
Group by ID, Date;
Group by also consumes resources but it depends on how much data you have.
Hope it helps,