Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
ollyhughes1982
Contributor
Contributor

Extract multiple lines from long cell string

Hi,

I have a straight table with two columns, shown in the image.

Multiple Offences Example.png

I have the below script, which is currently extracting ‘Sales_Code’ (left-hand column in the straight table, above) from cells which have a long string of information (‘Entry’ is the field with the long string – ‘Entry’ column I the straight table, above):

trim(TextBetween(Entry, 'SALES ID: ', 'DATE:')) as Sales_Code,

I have included an example of one of the cell strings, below:

 

ADDRESS OF PERSON:

Post.Code:

Tel. No.: 

ODE: REDACTED

SDE: REDACTED

Vehicle Details:

VRM: REDACTED

COLOUR: REDACTED

MAKE: REDACTED

MODEL/CC: REDACTED

TYPE: Car

DATE OF 1ST REG:

SALES ID: DE86027

DATE: REDACTED

TIME: REDACTED

LOCATION:  REDACTED

 

This is working fine for strings where I only have one ‘Sales _Code’, as per above (RC86027), but I am now getting some cells where the string has multiple (usually between two and four) ‘Sales _Code’ entries. I am not sure how to extract these. My script is bringing back blanks (‘-’) for these multiple ones – so I then miss two entries. I need to come up with a script that can cater for these multiple ones (e.g. If there are two, as per below, then have two rows in the table – both with the first column properly populated) If this is possible, please? I have included an example of a multiple ‘Sales _Code’ cell string below:

 

ADDRESS OF PERSON: REDACTED

Post.Code: REDACTED

Tel. No.: 

ODE: REDACTED

SDE: REDACTED

Vehicle Details:

VRM: REDACTED

COLOUR: REDACTED

MAKE: REDACTED

MODEL/CC: REDACTED

TYPE: REDACTED

DATE OF 1ST REG: REDACTED

SALES ID: SA88509

SALE_TYPE: REDACTED

SALES ID: AS88975

DATE: REDACTED

TIME: REDACTED

LOCATION:  REDACTED

 

You can see that in this example there are two ‘Sales_Code’ entries (SA88509 & AS88975). This then gives me a ‘-‘ in the left-hand column of my straight table, as it cannot cater for the two codes.

Any help appreciated!

Thanks in advance!

Labels (1)
2 Replies
marcus_sommer

I suggest another approach by resolving the entire cell-content by using subfield() and afterwards you may pick the wanted data and/or clean them further and/or whatever, maybe something in this way:

load *, rowno() as RowNo where match(FieldDescription, 'Sales ID');
load *, subfield(YourFieldLine, ':', 1) as FieldDescription, subfield(YourFieldLine, ':', 1) as Fieldvalue;
load *, recno() as RecNo, subfield(YourField, chr(10)) as YourFieldLine from Source;

ollyhughes1982
Contributor
Contributor
Author

Thanks. Will give this a try!