In other programming languages I've used, similar functions eat the consecutive delimiters and that was my expectation with this language.
I agree that the next character after the first delimiter is a space and that is what it returns.
I disagree with how QV has implemented the SubField function. I believe it should return the first non-delimiter character.
You may disagree with how QlikView has implemented subfield, but it works according to the documentation and is unlikely to change.
You can have mulltiple characters as the delimiter and subfield works correctly, but subfield does not support variable length delimiters. If the delimiter length varies, then you will need to correct this before passing the data to the subfield command.
I cannot comment on whether using replace is more efficient than trim/mid/index, but replace is simpler to implement, especially where numerous delimiters may exist.
I do have to back of somewhat on my sweeping statement about eating delimiters.
I would like and expect a split / subfield command to treat consecutive 'space' delimiters as one delimiter, but I do not want the command to treat consecutive non-space delimiters (such as commas) as one delimiter.
Can you give me an example of a data processing language that eats consecutive delimiters? Let me translate your example into something that for Subfield() is 100% identical:
This tells me that there are 6 fields in your string and that three of them are blank. It would be awfully difficult to split a line like this if SubField would exhibit the behavior you prefer.
BTW I cannot reproduce your results. Whatever I do, SubField() never returns spaces if a space itself is the delimiter. See also Rob's comment below.
Subfiled is doing its job as it should.
What surprised me that TRIM does not do the job.
It "should" (I expected that as Excel do it this way) cut all additional spaces after each word but is only removing empties for whole sentence.
If TRIM was working as in Excel there was no need to use additional functions.
In QlikView TRIM removes leading and trailing spaces from a string. It does not remove spaces within a string.
In Excel, trim removes all spaces from a string except for a single space between words.
QlikView is not excel, both have a trim function, but they produce different results.
The F1 help defines the syntax and function of an expression, different languages have subtle differences between similar commands, you cannot assume that functionality will be identical in different programming languages.
QlikView Trim is doing its job as it should, but this is not the same as trim in Excel..
Are you sure it returns an actual space and not an empty string? My tests show that it returns an empty string, which is what I would expect. I use a mapping table to squeeze out extra blanks when I need to.
MAPPING LOAD repeat(' ',recno()+1) , ' '
AutoGenerate 10 // Squeeze up to 11 blanks
LOAD subfield( MapSubstring('SpaceMap', myfield), ' ',2) as ....
as John stated it's human entered data, I tried to extend your solution to work with any number of spaces (for John's users with itchy fingers ;-)
tabStrings: LOAD *, RecNo() as ID Inline [ string abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz abcdef MyData zzzz aa bb cc aa bb cc aa bb cc aaa bb cc aaa bb cc ]; SpaceMap: MAPPING LOAD Repeat(' ',IterNo()+1) , ' ' While IterNo()<MaxSpaces; LOAD Max(IterNo) as MaxSpaces; LOAD IterNo() as IterNo Resident tabStrings While Index(string, Repeat(' ',IterNo())) or IterNo()=1; Left Join (tabStrings) LOAD Distinct string, SubField(MapSubString('SpaceMap', string),' ',2) as SubString2 Resident tabStrings;
hope this helps
As I mentioned in a post in the middle of this conversation, this is human entered data. The bulk of the strings only have single spaces, but a few have more than one space between the fields.
I have to process about 250K rows to see if there is data in the field and if so, if the data is valid.
What I have provided is a subset of the processing I'm doing on each string.