Hello,
I have a table with 30 value groups reported in a single row, and each group is composed of 5 fields, e.g.:
Key1, Key2, Field101, Field201, Field301, Field401, Field501, Field102, Field202, Field302, Field402, Field502, ... Field530
I need to distribute each group of 5 fields into its own row. It is a QVD with several million records. For now I do it with a loop, but it has to read the QVD 30 times and that takes a long time. Do you know a way to avoid reading it 30 times?
FOR i = 1 to 30
LET vNum = Num ($(i), '00');
Concatenate (TableName)
LOAD Key1,
Key2,
Field1$(vNum) as Field1,
Field2$(vNum) as Field2,
Field3$(vNum) as Field3,
Field4$(vNum) as Field4,
Field5$(vNum) as Field5
FROM [$(vPath)FileName.qvd] (qvd)
Where Field1$(vNum) <> 0;
NEXT
I'm not looking for delta/incremental load solutions.
Thanks,
Regards.
Thanks Marco, this one has become my first option. I'll run some tests and come back with the results.
I will also use Rob's code; this table will be exported to a QVD, so I think I can avoid that discussion this time.
I made some tests: I created a 1.2 GB QVD with 10M rows and all 150 fields (30 groups of 5 fields) plus 2 keys (generator attached).
I couldn't get an optimized load with a Where Exists clause to work, but doing an optimized load in a loop (without Where) and then a resident load of that table was the fastest (see the sketch below).
Optimized loop and resident Where: a bit less than 3 minutes.
Loop with Where: 6 minutes.
SubField: 45 minutes. It processes all 300M rows.
Generic: about 3 hours (cancelled after 2 hours). It creates a row for each field (10M rows * 30 groups * 5 fields), making at least 2 passes over the 1,500M rows.
I didn't bother to make clean tests; the computer was doing other, not very demanding, tasks.
Attached are the different tests I made.
Edit: Just to comment that I tried to simulate real data by giving a lower probability of '1' in Field1 for higher groups (group 30 is less likely than group 29), so in my data only 28.8M rows had Field1 <> 0.
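For reference, a minimal sketch of that fastest variant (optimized QVD loop, then one resident pass with the filter). The path and field names are taken from the first post rather than from the attached test scripts, so treat it as an illustration only:
// 30 optimized QVD reads: no Where clause, only field renaming
FOR i = 1 TO 30
LET vNum = Num($(i), '00');
tabWide:
LOAD Key1,
Key2,
Field1$(vNum) as Field1,
Field2$(vNum) as Field2,
Field3$(vNum) as Field3,
Field4$(vNum) as Field4,
Field5$(vNum) as Field5
FROM [$(vPath)FileName.qvd] (qvd); // same field names every pass, so the loads auto-concatenate
NEXT
// one resident pass applies the filter that would have broken the optimized QVD read
tabLong:
NoConcatenate
LOAD *
Resident tabWide
Where Field1 <> 0;
DROP Table tabWide;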
Many thanks, Ruben, for making this comparison of the various approaches. I think it will be very useful in similar cases.
- Marcus
One example of combining the loop with a generic field-name extraction:
tabData:
LOAD * FROM [$(vPath)FileName.qvd] (qvd);
tabFields:
CrossTable (Field,Value)
LOAD 1, * Resident tabData Where RecNo()=1;
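// tabFields: one row per field name of tabData; the first two rows are assumed to be the key fields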
tabLoads:
LOAD Concat(Field&' as '&Left(Field,2),', ') as Fields,
Mid(Field,3) as Group
Resident tabFields
Where RecNo()>2
Group By Mid(Field,3);
Join
LOAD Concat(Field,', ') as Keys
Resident tabFields
Where RecNo()<=2;
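// empty starter table so the loop below can always Concatenate; tempField is dropped at the end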
tabResult:
LOAD 1 as tempField AutoGenerate 0;
FOR Each vGroup in FieldValueList('Group')
LET vLoad = Lookup('Keys','Group',vGroup,'tabLoads')&','&Lookup('Fields','Group',vGroup,'tabLoads');
Concatenate (tabResult)
LOAD $(vLoad),
'$(vGroup)' as Group
Resident tabData;
NEXT
DROP Tables tabData, tabFields, tabLoads;
DROP Field tempField;
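For group '01', for example, $(vLoad) should expand to something like
K1, K2, F101 as F1, F201 as F2, F301 as F3, F401 as F4, F501 as F5
so each pass of the loop loads the five fields of one group under the common names F1..F5 (field names as in the test data generator).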
Marco
... and another one, looping per row:
tabData:
LOAD K1,
K2,
Pick(IterNo(),F101,F102,F103,F104,F105,F106,F107,F108,F109,F110,F111,F112,F113,F114,F115,F116,F117,F118,F119,F120,F121,F122,F123,F124,F125,F126,F127,F128,F129,F130) as Field1,
Pick(IterNo(),F201,F202,F203,F204,F205,F206,F207,F208,F209,F210,F211,F212,F213,F214,F215,F216,F217,F218,F219,F220,F221,F222,F223,F224,F225,F226,F227,F228,F229,F230) as Field2,
Pick(IterNo(),F301,F302,F303,F304,F305,F306,F307,F308,F309,F310,F311,F312,F313,F314,F315,F316,F317,F318,F319,F320,F321,F322,F323,F324,F325,F326,F327,F328,F329,F330) as Field3,
Pick(IterNo(),F401,F402,F403,F404,F405,F406,F407,F408,F409,F410,F411,F412,F413,F414,F415,F416,F417,F418,F419,F420,F421,F422,F423,F424,F425,F426,F427,F428,F429,F430) as Field4,
Pick(IterNo(),F501,F502,F503,F504,F505,F506,F507,F508,F509,F510,F511,F512,F513,F514,F515,F516,F517,F518,F519,F520,F521,F522,F523,F524,F525,F526,F527,F528,F529,F530) as Field5
FROM [$(vPath)FileName.qvd] (qvd) While IterNo() <= 30;
still not very fast though ...
Marco
Thanks Marco, as you say, the optimized loop version is still the fastest and, in my opinion, the CrossTable-generic one is still the neatest.