Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I've an unusual request but I'm almost there and just need some help getting over the final hurdle. I have a file containing a few thousand records that contains strings I need to split in two. The thing is, the strings have varying lengths and no clear distinctive character I can use to separate them, apart from an upper case character appearing after the first character. As the strings were originally concatenations of two strings with nothing in between, I've managed to split them using a series of loops searching for 2 capital letters and returning the number of the subfield of the first word in the string that contains 2 upper case letters and 1 or more lower case records.
Example:
Original | Subfield_Split | Header | Description |
---|---|---|---|
Route A - DavidReturn journey | 4 | Route A - David | Return journey |
James B - Driver - Part Time.No more than 4 hours round trip | 7 | James B - Driver - Part Time. | No more than 4 hours round trip |
I understand it's strange but I need to display as close to the Header & Description fields as I can. I've already worked out the subfield I need to split the string on, now I need to find a way to display it.Do I create another series of loops that concat the correct number of subfields together or is there a better way?
So the first field would be:
Subfield(Original, ' ', 1) & ' ' & Subfield(Original, ' ', 2) & ' ' & Subfield(Original, ' ', 3) & ' ' & Subfield(Original, ' ', 4) AS Header,
Subfield(Original, ' ', 4) & ' ' & Subfield(Original, ' ', 5) & ' ' & Subfield(Original, ' ', 6) AS Description
And the second would be:
Subfield(Original, ' ', 1) & ' ' & Subfield(Original, ' ', 2) & ' ' & Subfield(Original, ' ', 3) & ' ' & Subfield(Original, ' ', 4) & ' ' & Subfield(Original, ' ', 5) & ' ' & Subfield(Original, ' ', 6) & ' ' & Subfield(Original, ' ', 7) AS Header,
Subfield(Original, ' ', 7) & ' ' & Subfield(Original, ' ', 😎 & ' ' & Subfield(Original, ' ', 9) & ' ' & Subfield(Original, ' ', 10)........etc. AS Description
Right. Got it sorted in the end. It's not the most elegant but it works and it only needs to run once a month.
Instead of return the number of the SUBFIELD where the original JOIN happens, I returned the SUBFIELD itself (e.g. instead of 4, return DavidReturn AS WORD_JOIN) and then use that to display the first part of the Hearer string and the last part of the Description string.
Example:
Subfield(Original, WORD_JOIN, 1) AS Header
Subfield(Original, WORD_JOIN, 2) AS Description
Header Description
'Route A - ' ' journey'
Then I use SUBFIELD again to search for the second capital letter in the WORD_JOIN string:
Left(WORD_JOIN, 1) &
Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(
Right(WORD_JOIN, Len(WORD_JOIN)-1)
, 'A', 1), 'B', 1), 'C', 1), 'D', 1), 'E', 1), 'F', 1), 'G', 1), 'H', 1), 'I', 1), 'J', 1), 'K', 1), 'L', 1), 'M', 1), 'N', 1), 'O', 1), 'P', 1), 'Q', 1), 'R', 1), 'S', 1), 'T', 1), 'U', 1), 'V', 1), 'W', 1), 'X', 1), 'Y', 1), 'Z', 1) AS Header_Missing
Then I use SUBFIELD again of RIGHT with this to get the rest of the string:
SUBFIELD(WORD_JOIN, Header_Missing, 2) AS Description Missing
OR
Right(WORD_JOIN, LEN(WORD_JOIN)-LEN(Header_Missing)) AS Description
Add these two to the end/start of the Header/Description and I have the answer!
Is Subfield_Split a column coming available in your database or is that just created for demonstration purposes here?
Are these the only two cases? If not, there is probably no way to fix this (because of the large number of possibly different cases).
Moreover, the example code in your post doesn't split the concatenated words into a header and a description part. In the first example, Header = "Route A - DavidReturn" and Description = "DavidReturn journey". The same with the second example. Header = "James B - Driver - Part Time.No" and Description = "Time.No more than 4 hours round trip"
Is there no way you can get hold of part of the string (like the left or left part from some other field in the original data source)?
Peter
Tip: better start by first splitting the concatenated strings on the last hyphen. The text to be cut into pieces now becomes smaller and - maybe - easier to handle.
I created it in the script using a series of (complex) loops to work out where the concatenation occurs so it's there and correct for each row.
Hi,
These are not the only two cases, there are loads more, all different but share the same type of join. I'm aware of the results you described but I'm working out a way to counter act that behavior. I just had a eureka moment so watch this space (hopefully).
Would you be able to share what you have done so far. May be we can leverage what you have and expand on your logic
Right. Got it sorted in the end. It's not the most elegant but it works and it only needs to run once a month.
Instead of return the number of the SUBFIELD where the original JOIN happens, I returned the SUBFIELD itself (e.g. instead of 4, return DavidReturn AS WORD_JOIN) and then use that to display the first part of the Hearer string and the last part of the Description string.
Example:
Subfield(Original, WORD_JOIN, 1) AS Header
Subfield(Original, WORD_JOIN, 2) AS Description
Header Description
'Route A - ' ' journey'
Then I use SUBFIELD again to search for the second capital letter in the WORD_JOIN string:
Left(WORD_JOIN, 1) &
Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(Subfield(
Right(WORD_JOIN, Len(WORD_JOIN)-1)
, 'A', 1), 'B', 1), 'C', 1), 'D', 1), 'E', 1), 'F', 1), 'G', 1), 'H', 1), 'I', 1), 'J', 1), 'K', 1), 'L', 1), 'M', 1), 'N', 1), 'O', 1), 'P', 1), 'Q', 1), 'R', 1), 'S', 1), 'T', 1), 'U', 1), 'V', 1), 'W', 1), 'X', 1), 'Y', 1), 'Z', 1) AS Header_Missing
Then I use SUBFIELD again of RIGHT with this to get the rest of the string:
SUBFIELD(WORD_JOIN, Header_Missing, 2) AS Description Missing
OR
Right(WORD_JOIN, LEN(WORD_JOIN)-LEN(Header_Missing)) AS Description
Add these two to the end/start of the Header/Description and I have the answer!
Thanks Sunny but I just figured it out. Typing out the scenario in my original question got me thinking and all sorted now!
Awesome. Thanks for sharing how you managed to resolved it. I am sure this will be helpful for others in the future
Glfrtbl... Sorry. Well, if it works.
Maybe you can use a more elegant technique for the middle part, like a call to function
...FindOneOf(WORD_JOIN, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 2)...
to obtain the index of the second capital letter. Based on that index, you can split off the front using
...Left(WORD_JOIN, FindOneOf( ... )-1)...
and the back using
...Mid(WORD_JOIN, FindOneOf( ... ))...
Best,
Peter