Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I have a column with texts that is used to classify specific codes. These texts are all in upper case, but I need to convert them in order to have only the first character in upper case.
However, I have some exceptions in which there are acronyms that should be all in upper case, so the CAPITALIZE() function will not work only by itself.
Is there a way to have a formula in the script that can only apply the capitalize function for words with more than 3 characters?
Examples of the expected output.
Thanks
You could do this using either Subfield(), followed by if(Len(SplitText)<=3,Upper(SplitText),SplitText), and then concatenate the results again, or you achieve the same thing by looping through each individual line based on the number of spaces it contains. If your strings are similar to those in your example - no more than three "words" per line - you could just spell out Subfield(Text,' ',1) & ' ' & Subfield(Text,' ',2) & ' ' & Subfield(Text,' ',3) with the same if condition as above.
You could do this using either Subfield(), followed by if(Len(SplitText)<=3,Upper(SplitText),SplitText), and then concatenate the results again, or you achieve the same thing by looping through each individual line based on the number of spaces it contains. If your strings are similar to those in your example - no more than three "words" per line - you could just spell out Subfield(Text,' ',1) & ' ' & Subfield(Text,' ',2) & ' ' & Subfield(Text,' ',3) with the same if condition as above.
Many thanks for the quick reply and guidance! I checked and only had texts with 4 separate "words" per line, so I came up with the following formula based on your second suggestion:
if(SubStringCount(TXT_PROFITCENTER2,' ') + 1 = 1, // One word
if(Len(SubField(TXT_PROFITCENTER2,' ',1)) <=3 ,TXT_PROFITCENTER2,Capitalize(TXT_PROFITCENTER2)),
if(SubStringCount(TXT_PROFITCENTER2,' ') + 1 = 2, // Two words
if(len(Subfield(TXT_PROFITCENTER2,' ',1)) <=3 , subfield(TXT_PROFITCENTER2,' ',1),Capitalize(subfield(TXT_PROFITCENTER2,' ',1)))&' '&
if(len(Subfield(TXT_PROFITCENTER2,' ',2)) <=3 , subfield(TXT_PROFITCENTER2,' ',2),Capitalize(subfield(TXT_PROFITCENTER2,' ',2))),
if(SubStringCount(TXT_PROFITCENTER2,' ') + 1 = 3, // Three words
if(len(Subfield(TXT_PROFITCENTER2,' ',1)) <=3 , subfield(TXT_PROFITCENTER2,' ',1),Capitalize(subfield(TXT_PROFITCENTER2,' ',1)))&' '&
if(len(Subfield(TXT_PROFITCENTER2,' ',2)) <=3 , subfield(TXT_PROFITCENTER2,' ',2),Capitalize(subfield(TXT_PROFITCENTER2,' ',2)))&' '&
if(len(Subfield(TXT_PROFITCENTER2,' ',3)) <=3 , subfield(TXT_PROFITCENTER2,' ',3),Capitalize(subfield(TXT_PROFITCENTER2,' ',3))),
if(len(Subfield(TXT_PROFITCENTER2,' ',1)) <=3 , subfield(TXT_PROFITCENTER2,' ',1),Capitalize(subfield(TXT_PROFITCENTER2,' ',1)))&' '&
if(len(Subfield(TXT_PROFITCENTER2,' ',2)) <=3 , subfield(TXT_PROFITCENTER2,' ',2),Capitalize(subfield(TXT_PROFITCENTER2,' ',2)))&' '&
if(len(Subfield(TXT_PROFITCENTER2,' ',3)) <=3 , subfield(TXT_PROFITCENTER2,' ',3),Capitalize(subfield(TXT_PROFITCENTER2,' ',3)))&' '&
if(len(Subfield(TXT_PROFITCENTER2,' ',4)) <=3 , subfield(TXT_PROFITCENTER2,' ',4),Capitalize(subfield(TXT_PROFITCENTER2,' ',4)))
)
)
)
as TXT_PROFITCENTER3,
It may not be the most "elegant" solution, but it is working as expected.
Do you have a simpler alternative that would be flexible enough to accommodate a scenario if there are more than "4" words in the future?
As mentioned in my original post, a more robust solution would be to use Subfield(Text,' ') to split the text into rows and then re-concatenate them while applying Upper() as required, or using a loop to check each row N number of times where N is the count of whitespaces in the string.
Thanks again! I'm not very experienced with Qlik, so thanks for your patience. I tried to apply your subfield and re-concatenate solution. I'm not sure if there is an underlying assumption that requires the creation of temporary tables to do the splitting and concatenate steps, I assumed yes. The following worked and it overcame the need to do the copy and paste approach.
That looks right to me. I think this approach is likely to be easier to read and also provide better performance compared to a loop that doesn't split the line with a temporary table, though the latter would also be possible.