tExtractRegexFields not working

Hello!

I want to split one column into several fields. Example values are 21.A-01-BTA and 21.A03-01-BTA. The split I want is:

BDel: 21

Group: A

UnderGroup: if any digits follow the Group letter, they go to UnderGroup. So for the first example it will be null, and for the second it will be 03.

Remaining string1: 01

Remaining string2: BTA

 

I tried to use tExtractRegexFields with the following expression, but I get no values:

"([0-9][0-9]).([A-Z])([0-9][0-9])?-([0-9])-([A-Z])" 

-- Used '?' since undergroup might or might not be present for the group.

What is the correct syntax for this?

 

Regards

Priyadarshini

 

 

 

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

Another alternative is to use this regex:

 

"^"+
"([0-9]{2}).([A-Z])([0-9][0-9])?-([0-9][0-9])-([A-Z]{3})" +
".*"

Make sure you create at least 5 columns in your schema.
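To see why the expression from the question returns nothing, note that `([0-9])` and `([A-Z])` each match exactly one character, while the data has two digits (01) and three letters (BTA) in those positions. The following is a minimal sketch in plain `java.util.regex`, not in Talend itself; the class and method names are hypothetical, and the dot is escaped here as an assumption about intent so that it matches only a literal ".".

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSplitDemo {
    // Expression from the question, unchanged.
    static final Pattern ORIGINAL = Pattern.compile(
        "([0-9][0-9]).([A-Z])([0-9][0-9])?-([0-9])-([A-Z])");

    // Pattern from the answer as one string literal, with the dot escaped
    // (assumption: only a literal '.' should separate BDel and Group).
    static final Pattern FIELDS = Pattern.compile(
        "^([0-9]{2})\\.([A-Z])([0-9][0-9])?-([0-9][0-9])-([A-Z]{3}).*");

    // Returns the five captured groups, or null when the input does not match.
    // An optional group that did not participate in the match is null.
    static String[] split(String input) {
        Matcher m = FIELDS.matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] out = new String[m.groupCount()];
        for (int i = 0; i < out.length; i++) {
            out[i] = m.group(i + 1);
        }
        return out;
    }

    public static void main(String[] args) {
        // The original expression finds no match in either sample value.
        System.out.println(ORIGINAL.matcher("21.A-01-BTA").find());   // false
        System.out.println(ORIGINAL.matcher("21.A03-01-BTA").find()); // false

        System.out.println(Arrays.toString(split("21.A-01-BTA")));   // [21, A, null, 01, BTA]
        System.out.println(Arrays.toString(split("21.A03-01-BTA"))); // [21, A, 03, 01, BTA]
    }
}
```

The optional third group is what yields a null UnderGroup for 21.A-01-BTA.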

 

 


6 Replies
Anonymous
Not applicable
Author

@priyadarshiniv 

 

Please refer to the details below for parsing the data. Note that I have considered only the happy path, so you will have to test with various conditions and add the necessary null checks and string-length checks. Solutions for both are already available on Stack Overflow, so I am not covering that aspect and leave it to you as a hands-on exercise.


 

The Java expressions are listed below.

 

var1 -> row1.input.substring(row1.input.indexOf(".") + 1, row1.input.indexOf("-")).replaceAll("\\D+", "")

BDel -> row1.input.substring(0, row1.input.indexOf("."))
Group -> row1.input.substring(row1.input.indexOf(".") + 1, row1.input.indexOf("-")).replaceAll("[^A-Za-z]+", "")
UnderGroup -> Var.var1.equals("") ? null : Var.var1
R_string1 -> row1.input.substring(row1.input.indexOf("-") + 1, row1.input.indexOf("-", row1.input.indexOf("-") + 1))
R_string2 -> row1.input.substring(row1.input.indexOf("-", row1.input.indexOf("-") + 1) + 1)
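These expressions assume every value contains a "." followed by two "-" characters. When a delimiter is missing or out of order, `substring` throws a `StringIndexOutOfBoundsException`. A guarded version might look like the sketch below; the class and method names are hypothetical, and unlike the expressions above it searches for the first "-" only after the ".", which is an assumption about the intended format.

```java
import java.util.Arrays;

class SubstringSplitDemo {
    // Returns {BDel, Group, UnderGroup, R_string1, R_string2}, or null when the
    // input is null or lacks ".", "-", "-" in that order.
    static String[] split(String input) {
        if (input == null) {
            return null;
        }
        int dot = input.indexOf('.');
        // Look for the first dash only after the dot, so that dash1 > dot.
        int dash1 = input.indexOf('-', dot + 1);
        int dash2 = (dash1 < 0) ? -1 : input.indexOf('-', dash1 + 1);
        if (dot < 0 || dash1 < 0 || dash2 < 0) {
            return null; // unexpected format: skip instead of throwing
        }
        String middle = input.substring(dot + 1, dash1);      // e.g. "A" or "A03"
        String digits = middle.replaceAll("\\D+", "");        // digits only
        String letters = middle.replaceAll("[^A-Za-z]+", ""); // letters only
        return new String[] {
            input.substring(0, dot),           // BDel
            letters,                           // Group
            digits.isEmpty() ? null : digits,  // UnderGroup
            input.substring(dash1 + 1, dash2), // R_string1
            input.substring(dash2 + 1)         // R_string2
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("21.A-01-BTA")));   // [21, A, null, 01, BTA]
        System.out.println(Arrays.toString(split("21.A03-01-BTA"))); // [21, A, 03, 01, BTA]
        System.out.println(Arrays.toString(split("A-01.21")));       // null
    }
}
```

In a tMap, the same index checks could be placed in a routine so that the output expressions stay short.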

Hope you are happy with the resolution. Please spare a minute to give kudos and mark the topic as resolved 🙂

 

Warm Regards,
Nikhil Thampi


Anonymous
Not applicable
Author

Thank you so much @nthampi !!! I will check it today! 

 

Regards

Priya


Anonymous
Not applicable
Author

Hello @nthampi 

When I use this I get this error!

String index out of range: -4

 

 

Anonymous
Not applicable
Author

Thank you @dgm01 for your reply! I need help with one more thing. Instead of having Remaining_1 and Remaining_2, I want everything after UnderGroup to go in as one part. I tried this, but it gives the wrong result:

"^"+

"([0-9]{2})?(\\.[A-Z])?([0-9][0-9])?(-[0-9][0-9])?(-[.]*)?" +

".*"  

For 21.A03-01-BTA it gives me:

BDel: 21

Group: A

UnderGroup: 03

Rest1: 01

It doesn't capture the last part, -BTA. Maybe "-" is not treated as a character. What should the expression be?

 

Regards

Priya

 

Anonymous
Not applicable
Author

Hello @priyadarshiniv

Please try this:

 

"^"+
"([0-9]{2})?(\\.[A-Z])?([0-9][0-9])?(-[0-9][0-9])?" +
"(.*)" 

Don't forget to create at least 5 columns in the schema
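The earlier attempt lost -BTA because `[.]` inside a character class matches only a literal dot, so `(-[.]*)` captures the dash alone. Also, in plain `java.util.regex`, groups written as `(\\.[A-Z])` and `(-[0-9][0-9])` include the "." and "-" in the captured text. If everything after UnderGroup should land in a single field, one possible variant keeps the delimiters outside the groups and uses four groups. This is only a sketch checked outside Talend; the pattern and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RestAsOneFieldDemo {
    // Hypothetical variant: BDel, Group, optional UnderGroup, then one group
    // holding everything after the first dash.
    static final Pattern FOUR_FIELDS = Pattern.compile(
        "^([0-9]{2})\\.([A-Z])([0-9]{2})?-(.*)$");

    // Returns {BDel, Group, UnderGroup, Rest}, or null when there is no match.
    static String[] split(String input) {
        Matcher m = FOUR_FIELDS.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    // Shows what "-[.]*" consumes at the start of the given text.
    static String dashThenDots(String text) {
        Matcher m = Pattern.compile("-[.]*").matcher(text);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(dashThenDots("-BTA"));                    // -
        System.out.println(Arrays.toString(split("21.A03-01-BTA"))); // [21, A, 03, 01-BTA]
        System.out.println(Arrays.toString(split("21.A-01-BTA")));   // [21, A, null, 01-BTA]
    }
}
```

With this variant the schema would need four columns instead of five.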



inputString:
expected Result:

Then I will help you write the regex