Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
EDehzn
Contributor III
Contributor III

String extraction

Hi,

I am trying to extract the code which starts with 'PA9' from a string in column OTHERS as below and make a new column/field for the code.

 

OTHERS

~PA9235 testing for qlik

PA92222 testing

test123~PA92325~test

 

due to the unfixed code length and location, i am unable to use subfield() for this.

 

Is there any other way for this?

 

Labels (2)
1 Solution

Accepted Solutions
cwolf
Creator III
Creator III

The easiest way is a JScript function using a regular expression:

Macro:

function GetPA(s){
	var m = s.match(/PA9\d*/);
	return m==null ? "" : m[0];
}

 

Script:

load
Text,
GetPA(Text) as PA
Inline [
Text                       
~PA9235 testing for qlik   
PA92222 testing            
test123~PA92325~test       
some string without code   
qlik~PA92233 No.12445      
test PA92222 /test         
PA92124 /HEG/0000 test     
PA932145 /RFG/4567 qwrty
PA99921 BILL NO 1234 3456 4
PA89921 BILL NO 1234 3456 4
JSriptPA97264018Macro
PAPPA95547P911236
];

 

Result:

cwolf_0-1643751653182.png

- Christian

View solution in original post

9 Replies
MarcoWedel

assuming the code consists of only 'PA9' and the following numeric characters, i.e. it ends with any non-numeric character, maybe you could use

 

If(Index(OTHERS,'PA9'),'PA9'&Left(SubField(OTHERS,'PA9',2),Index(SubField(OTHERS,'PA9',2),PurgeChar(SubField(OTHERS,'PA9',2),'0123456789'))-1)) as Code

 

hope this helps

Marco

EDehzn
Contributor III
Contributor III
Author

Hi Marco,

This works well but there were still codes which were not extracted correctly in some case such as:

qlik~PA92233 No.12445

test PA92222 /test

MarcoWedel

so not all the codes of interest actually start with 'PA9' like you specified ... ?


EDehzn
Contributor III
Contributor III
Author

Hi Marco,

 

Sorry for the confusion. I have edited the codes. These codes do start from PA9, however some of the extraction failed when the codes ends like these:

 

qlik~PA92233 No.12445

test PA92222 /test

 

MarcoWedel

The expression seems to work with these values as well. 

Can you post an example application that demonstrates "OTHERS" values not working with this expression?

Thanks

 

MarcoWedel_0-1643306783225.png

 

 

table1:
LOAD *,
     If(Index(OTHERS,'PA9'),'PA9'&Left(SubField(OTHERS,'PA9',2),Index(SubField(OTHERS,'PA9',2),PurgeChar(SubField(OTHERS,'PA9',2),'0123456789'))-1)) as Code 
Inline [
    OTHERS
    ~PA9235 testing for qlik
    PA92222 testing
    test123~PA92325~test
    some string without code
    qlik~PA92233 No.12445
    test PA92222 /test
];

 

 

 

 

 

 

 

EDehzn
Contributor III
Contributor III
Author

Hi Marco, here are some examples where the code have some issues:

PA92124 /HEG/0000 test

PA932145 /RFG/4567 qwrty

PA99921 BILL NO 1234 3456 4

 

Apologies for the dummy data but i can assure u that the string sequence and type is the same

 

 

MarcoWedel

Hi,

I didn't notice I used a wrong function here at one point.

Can you try

If(Index(OTHERS,'PA9'),'PA9'&Left(SubField(OTHERS,'PA9',2),FindOneOf(SubField(OTHERS,'PA9',2),PurgeChar(SubField(OTHERS,'PA9',2),'0123456789'))-1)) as Code

instead?

 

cwolf
Creator III
Creator III

The easiest way is a JScript function using a regular expression:

Macro:

function GetPA(s){
	var m = s.match(/PA9\d*/);
	return m==null ? "" : m[0];
}

 

Script:

load
Text,
GetPA(Text) as PA
Inline [
Text                       
~PA9235 testing for qlik   
PA92222 testing            
test123~PA92325~test       
some string without code   
qlik~PA92233 No.12445      
test PA92222 /test         
PA92124 /HEG/0000 test     
PA932145 /RFG/4567 qwrty
PA99921 BILL NO 1234 3456 4
PA89921 BILL NO 1234 3456 4
JSriptPA97264018Macro
PAPPA95547P911236
];

 

Result:

cwolf_0-1643751653182.png

- Christian

MarcoWedel

same with VBS RegEx:

 

MarcoWedel_0-1643823799404.png

 

MarcoWedel_1-1643823833225.png

 

Function RegExFind(str)
  set RE = New RegExp
  RE.Pattern = "(PA9\d*)"
  RegExFind = RE.Execute(str)(0).SubMatches(0)  
End Function

 

table1:
LOAD *,
     RegExFind(OTHERS) as Code
Inline [
    OTHERS
    ~PA9235 testing for qlik
    PA92222 testing
    test123~PA92325~test
    some string without code
    qlik~PA92233 No.12445
    test PA92222 /test
    PA92124 /HEG/0000 test
    PA932145 /RFG/4567 qwrty
    PA99921 BILL NO 1234 3456 4
];

 

hope this helps

Marco