Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi folks,
I have a high level case at least for me,
I have a load script a with a big string which i need to divide several times depends on count :
Table:
load
[big string],
subfield([big_string], '&', 1) as field 1,
subfield([big_string], '&', 2) as field 2,
subfield([big_string], '&', 3) as field 3,
subfield([big_string], '&', 4) as field 4,
subfield([big_string], '&', 5) as field 5,
subfield([big_string], '&', 6) as field 6,
..............
resident temp_table;
So i know have many i have '&' and know how many times i need to to use subfield,
i want to do it fully automatically, so if i have number 10 i know that should be 10 fields with subfield if 2 there will be 2 fields with subfield, one idea that comes to my mind is using loop but how later attached that column to main column i don't know.
I can do in a very simple way to create 50 fields with 50 subfield, but i would like to have it more flexible
Thank you in advance
@Micki Try below. This will create dynamic formula for maximum count of delimiter in your data which is always better than running loop over actual data set which is not good practice
let vActual_Field_Name = 'big_string'; // specify big string field name
let vDelimiter = '&'; // specify big string delimeter
let vField_Name_After_Split = 'Field'; // specify big string field name
Data:
load *,
SubStringCount(big_string, '&')+1 as Count_delimiter
inline [
big_string,Dimension1,Dimension2
Field1&Field2&Field3&Field4,a,c
Field1&Field2&Field3&Field4&Field5,b,d
];
Max_Count:
LOAD max(Count_delimiter) as max_count_delimiter;
LOAD FieldValue('Count_delimiter',RecNo()) as Count_delimiter
AutoGenerate FieldValueCount('Count_delimiter');
let vMax_Count_Delimiter = Peek('max_count_delimiter',0,'Max_Count');
// create dynamic subfield formula
for i=1 to $(vMax_Count_Delimiter)
Formula:
LOAD 'subfield('& '$(vActual_Field_Name)'& ','& chr(39)& '$(vDelimiter)'& chr(39)& ',' & $(i) & ')' & ' as ' & '[' & '$(vField_Name_After_Split)' & ' ' & $(i) & ']' as Formula
AutoGenerate 1;
NEXT
Formula_Final:
NoConcatenate
LOAD Concat(Formula,','&chr(10)) as Formula
Resident Formula;
DROP Table Formula;
let vFormula = Peek('Formula',0,'Formula_Final');
DROP Table Formula_Final;
Final_table:
LOAD *,
$(vFormula)
Resident Data;
DROP Table Data;
@Micki One automatic solution with loop
you can remove the last loop but I keep it to combine the tables.
Data:
load *,rowno() as Idtmp, SubStringCount(big_string, '&') as counttmp inline [
big_string,Dimension1,Dimension2
Field1&Field2&Field3&Field4,a,c
Field1&Field2&Field3&Field4&Field5,b,d
];
LET NumRows=NoOfRows('Data');
FOR i=0 to $(NumRows)-1
LET Vbig_string=Peek('big_string',$(i),'Data');
LET VIdtmp=Peek('Idtmp',$(i),'Data');
LET Vcounttmp=Peek('counttmp',$(i),'Data')+1;
FOR j=1 to $(Vcounttmp)
load big_string,subfield(big_string,'&',$(j)) as Field$(j) resident Data where Idtmp=$(VIdtmp);
NEXT;
NEXT;
CombinedGenericTable:
Load distinct big_string,Dimension1,Dimension2 resident Data;
drop table Data;
FOR i = NoOfTables()-1 to 0 STEP -1
LET vTable=TableName($(i));
IF WildMatch('$(vTable)', 'Data*') THEN
LEFT JOIN ([CombinedGenericTable]) LOAD distinct * RESIDENT [$(vTable)];
DROP TABLE [$(vTable)];
ENDIF
NEXT i
Input:
output:
@Micki Try below. This will create dynamic formula for maximum count of delimiter in your data which is always better than running loop over actual data set which is not good practice
let vActual_Field_Name = 'big_string'; // specify big string field name
let vDelimiter = '&'; // specify big string delimeter
let vField_Name_After_Split = 'Field'; // specify big string field name
Data:
load *,
SubStringCount(big_string, '&')+1 as Count_delimiter
inline [
big_string,Dimension1,Dimension2
Field1&Field2&Field3&Field4,a,c
Field1&Field2&Field3&Field4&Field5,b,d
];
Max_Count:
LOAD max(Count_delimiter) as max_count_delimiter;
LOAD FieldValue('Count_delimiter',RecNo()) as Count_delimiter
AutoGenerate FieldValueCount('Count_delimiter');
let vMax_Count_Delimiter = Peek('max_count_delimiter',0,'Max_Count');
// create dynamic subfield formula
for i=1 to $(vMax_Count_Delimiter)
Formula:
LOAD 'subfield('& '$(vActual_Field_Name)'& ','& chr(39)& '$(vDelimiter)'& chr(39)& ',' & $(i) & ')' & ' as ' & '[' & '$(vField_Name_After_Split)' & ' ' & $(i) & ']' as Formula
AutoGenerate 1;
NEXT
Formula_Final:
NoConcatenate
LOAD Concat(Formula,','&chr(10)) as Formula
Resident Formula;
DROP Table Formula;
let vFormula = Peek('Formula',0,'Formula_Final');
DROP Table Formula_Final;
Final_table:
LOAD *,
$(vFormula)
Resident Data;
DROP Table Data;
An alternative to the already mentioned loop-approaches might be to load these big-string with a from_field approach (alternatively you may store this field within a txt and load then from there) like the following:
t1: load *, rowno() as R inline [
s
a&b&c
1&2&3
];
t2:
load *, rowno() as R;
load @1, @2, @3
from_field (t1, s) (txt, utf8, no labels, delimiter is '&', msq);
join(t1) load * resident t2;
drop tables t2;
- Marcus
That was great @marcus_sommer
question though - it didnt behave the way i expected it with the preload:
t2:
load *, rowno() as R;
load @1, @2, @3
from_field (t1, s) (txt, utf8, no labels, delimiter is '&', msq);
it created a table, <1> (i guess the table name came from the field @1) then it proceeded to do the preload. can you pls comment on that? that means these two statements created two tables. is there a way to get around that?
It's not quite clear for me what you mean with creating two tables. The first table t1 is just to get data - within a real scenario there will be probably already a resident-table for it - and the second table is loading from this field quite similar as if it would be an external txt-file.
Nevertheless the feature of from_field is a bit tricky (at least I don't use it often and therefore I struggle sometimes with it and therefore I keep it usually as simple as possible) - therefore I didn't try to add [s] as a join-key else used the rowno() as key created in the preceeding-part (by larger datasets the join with a number would be certainly faster as with the big-string ).
Please elaborate your issue more detailed if this didn't helped enough.
- Marcus
Dear @Kushal_Chawda and @Taoufiq_Zarra , @marcus_sommer ,
for me the best solutions the one provided by @Kushal_Chawda , faster and safer.
thank you again so much ALL of you, for your time and help, it's really helps me a lot and i able to improve my scripting skills.
subfield without third parameter