Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all!
I'm trying to import a text file that is semicolon delimited. The only problem is that some lines spill over to the next line which creates a problem that the end of the line gets registered on a new column. For example like this:
Id;Dateline;UserId;FullName;Email;
AAA;BBB;CCC;DDD;EEE;
VVV;WWW;XXX;YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
YYYYY;ZZZ;
In the "Id" column that gives the values:
AAA
VVV
YYYYY
But I only want it to get:
AAA
VVV
Is there any way of doing this or do we need to restructure our data?
I usually deal with this by making two passes over the file. First pass, combine the multi-lines into a single line. Than a second pass to read the individual fields.
-Rob
Thanks for your answer Rob!
How do you make the first pass where you combine the multi lines into a single line?
/Hampus
I would probably do like this:
- Read the file with each record in a single field by specifying a delimiter not present in your data
- Pass through the records and count semicolons in each field. If it's not 4, concatenate to previous record
- Split up the fields with SubField()
If the file is not too big I might approach it by combining all the data into one big string and then dividing into new rows by every fifth ";". The individual fields could then be extracted as Ralf suggested with SubField() or LOAD FROM_FIELD. Here's a script coded as one preceding load. Final table is just one field, "OneRow". This would handle wrapping at any point on any field.
FixedData:
LOAD
mid(BigField, start, (end-start)) as OneRow
;
LOAD
BigField,
Index(BigField, ';',(IterNo()-1)*5)+1 as start,
Index(BigField, ';',IterNo()*5) as end
WHILE (SubStringCount(BigField, ';') / 5) >= IterNo()
;
LOAD
concat(Raw,'',RecNo()) as BigField
;
LOAD * INLINE [
Raw
AAA;BBB;CCC;
DDD;EEE;
VVV;WWW;XXX;YYYYYYYYYYYYYY
YYYYY;ZZZ;
AAA;BBBBBBBB
BBB;CCC;DDD;EEE;
AAA;BBB;CCC;DDD;ZZZ;
] (delimiter is '|')
;
-Rob