Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
toddbuss
Creator
Creator

finding dynamic text sequence within a larger text field

I have an ugly database with a text narrative.  Within the narrative, there is a reference to a patient number.  The patient number is always in the format:

A00000

A is any letter A-Z (either capital or lowercase) or any digit

0 is any numeric character. 

Total characters is 6.

Examples:

A123456, 112112,  b11234

I think I need to use the findoneof() function, and finding 5 consecutive numeric characters, and then using that position within the string and the Mid() function to select the 6 character patient number.  The coding I'm imagining is so inelegant that I hesitate to begin.  How do I find the first numeric character without searching separately for 1, then 2, then 3, etc.

any help is appreciated.

Thanks,

Todd

5 Replies
joseduque
Partner - Contributor III
Partner - Contributor III

Hi,

If you want only the number and eliminate the text, you can use the function KeepChar:

Keepchar(Patient_Number,'0123456789')

or use PurgeChar

Purgechar(Patient_Number,'ABCDEF.......') Alphabet

Hope This Helps,

Jose

toddbuss
Creator
Creator
Author

Thanks, Jose.  The problem is there may be additional numbers referenced within the original text.   I think I'll have to use a for/next loop to go character-by-character and look for consecutive numeric values.

marcus_sommer

I think there are two alternatives to your loop over all chars from the fieldvalues. One could be a RegEx approach like described here: http://www.qlikfix.com/2010/10/18/regular-expressions-in-the-load-script/ and the other to use a mapsubstring() maybe similar to this one:

MapTemp:

load F1 & iterno() while iterno() <= 999999;

load chr(64 + recno()) as F1 autogenerate 26;

This won't be a small table but it wouldn't be too huge to be usable. Whereby I'm not sure if you really need to create this as a large generic table - your patient numbers should be already available in any fact- or dimension-table. In this case your mapping-table would be look like:

Map:

mapping load distinct [Patient Number], '<' & [Patient Number] & '>' from YourTable;

TableWithFreeText:

load *, textbetween(mapsubstring('Map', FreeText), '<', '>') as [Patient Number] from TableWithFreeText;

Depending on the text you might need a different delimiter for textbetween() which definitely exists within the text, for example a chr(1) should be quite unlikely.

- Marcus

sasiparupudi1
Master III
Master III

May be something like this..

T1:

Load * Inline

[

TTT

asasasA123456dbddrrs,

112112, 

121232ab112341asasas

];

Left Join(T1)

Load

TTT,

IF(IsNUm(Mid(TTT,IterNo()+1,1)) AND IsNUm(Mid(TTT,IterNo()+2,1)) And IsNUm(Mid(TTT,IterNo()+3,1))And IsNUm(Mid(TTT,IterNo()+4,1)) And IsNUm(Mid(TTT,IterNo()+5,1)) and Istext(Mid(TTT,IterNo(),1)),IterNo(),0) as IndexOfPatNo

Resident T1

While(IterNo() <=Len(TTT));

Final:

Noconcatenate Load

TTT,

Mid(TTT,IndexOfPatNo,7) as PatientNumber,

IndexOfPatNo

Resident T1

;

toddbuss
Creator
Creator
Author

thanks to everyone for the suggestions.  After soliciting the help of the OIT department here, they decided to create the new field in a step prior to my loading script.