DarinAfni
Contributor III

Regex Replace Is Not Working

Hello,

This is a new problem. I've been using Talend 6.3 for a few years and I've been using this same approach but for some reason the results have changed.

I have a regular situation where I have to take a series of letters from a table column and insert a dash between every 2 characters.

So - DU02ASUD47S1FN2N becomes DU-02-AS-UD-47-S1-FN-2N

I do this with a very simple REGEX REPLACE function from a SQL query in tVerticaInput or tAs400Input component.

REGEXP_REPLACE(TableColumn, '(.{2}(?!$))', '\1-')

Very recently Talend Open Studio for Data Integration (which has not changed - it's the same version I've always been running) now returns this output:

0695b00000OAhbGAAT.png

I can't figure out why this odd "blob" or block is being returned instead of the actual characters, save for the last 2. I've also tried this differently by using an expression in a tMap field with the EREPLACE routine, StringHandling.EREPLACE(row1.newColumn,"(.{2}(?!$))","\1-"), but I get the same result: blocks, with only the last 2 characters rendered correctly.
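
One thing that may be worth ruling out (a guess, not something confirmed in this thread): as far as I know, both a tMap expression and the query field of an input component are Java string expressions in Talend, and in a Java string literal \1 is an octal escape for the control character U+0001, not a regex back-reference. That would give output of exactly this shape: an unprintable character plus a dash for every pair, with only the last 2 characters intact. The sketch below uses plain String.replaceAll as a stand-in for the routine, which is an assumption about how EREPLACE behaves; the class and method names are made up for illustration.

```java
class DashInsertDemo {

    // Hypothetical helper: inserts a dash after every 2 characters except the last pair.
    // In a Java replacement string, "$1" is the reference to capture group 1.
    static String insertDashes(String value) {
        return value.replaceAll("(.{2}(?!$))", "$1-");
    }

    // Mirrors the replacement string from the post. The Java compiler turns "\1"
    // into the single control character U+0001, so each pair is replaced by
    // that character followed by a dash instead of the pair itself.
    static String withOctalEscape(String value) {
        return value.replaceAll("(.{2}(?!$))", "\1-");
    }

    public static void main(String[] args) {
        String input = "DU02ASUD47S1FN2N";

        System.out.println(insertDashes(input)); // DU-02-AS-UD-47-S1-FN-2N

        String broken = withOctalEscape(input);
        // Control characters are invisible or drawn as blocks, so report them explicitly.
        System.out.println("contains U+0001: " + (broken.indexOf('\u0001') >= 0));
        System.out.println("ends with last pair: " + broken.endsWith("2N"));
    }
}
```

If this does turn out to be the cause, the Java-side call would need "$1-", and a SQL string embedded in a Java expression would need the backslash doubled ('\\1-') so that the database actually receives '\1-'.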

Does anyone know why this might be happening given that the install hasn't changed? Could this be a memory problem?

8 Replies
Anonymous
Not applicable

Can you show the data before it has been altered? If the installation hasn't changed, the data must have. Maybe the encoding of the data you are receiving has changed?

DarinAfni
Contributor III
Author

I'll do my best to show what I'm talking about.

 

If I run a query in a SQL Editor then it does exactly what I want it to do. Such as this:

 

0695b00000OAt44AAD.png 

If I then move into Talend and run the exact same query from tVerticaInput, the results are as you see below.

 

0695b00000OAt57AAD.png 

I hope those images come through in a way that is readable. There is no reason the data in those tables should render fine in a SQL editor and then behave like this in Talend. It makes no sense, but that's what is happening.

 

Darin

DarinAfni
Contributor III
Author

I should also specify that I'm the Vertica DBA. I know very well that the data types of those fields in those tables have not changed.

 

The only thing that changed about this environment is that the Windows machine that hosts Talend was moved to a new data center. The VM was backed up and restored on new hardware, but the OS, Talend version, and hardware specs are all identical. It was just lifted and shifted.

Anonymous
Not applicable

I'm afraid this sounds like it is more to do with the move of the VM rather than the job itself. I'd contact the VM provider you are using and explain these symptoms. If nothing has changed at all with Talend, the job or the JDK being used, my first course of action would be to look at what has changed.

ddduser1643959971
Contributor

Hi,

I have a CSV file:

name; FirstName ; numero;adresse

1 a;b;12;xx

2 c;b;13;yy

3 x;y;47;zz

4 e;r;45;tt

 

I want to identify the duplicate rows by FirstName (here rows 1 and 2 both have FirstName = b), and I want to update the CSV file to add a new column "status" with the value NotUnique for those rows, like this:

 

name; FirstName ; numero;adresse;status

1 a;b;12;xx;NotUnique

2 c;b;13;yy;NotUnique

3 x;y;47;zz;unique

4 e;r;45;tt;unique

 

 

 

Can someone help me?

Thank you very much in advance.

 

 

gjeremy1617088143

Have you tried just reading the file without the regex?

gjeremy1617088143

Hi, have you tried using the tUniqRow component?
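
For the CSV question, the flagging logic itself can also be sketched in plain Java, independent of any Talend component. This is only an illustration under stated assumptions: the file is semicolon-separated, the first line is the header, every row has the key column, and the leading row numbers from the post are not part of the data. The class name, method name, and keyIndex parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DuplicateFlagger {

    // Appends a "status" column: "NotUnique" when the value in column keyIndex
    // occurs in more than one data row, otherwise "unique".
    // Assumes lines.get(0) is the header and all rows are well-formed.
    static List<String> flag(List<String> lines, int keyIndex) {
        List<String> out = new ArrayList<>();
        if (lines.isEmpty()) {
            return out;
        }

        // First pass: count how often each key value appears.
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 1; i < lines.size(); i++) {
            String key = lines.get(i).split(";", -1)[keyIndex].trim();
            counts.merge(key, 1, Integer::sum);
        }

        // Second pass: copy each row and append its status.
        out.add(lines.get(0) + ";status");
        for (int i = 1; i < lines.size(); i++) {
            String row = lines.get(i);
            String key = row.split(";", -1)[keyIndex].trim();
            out.add(row + ";" + (counts.get(key) > 1 ? "NotUnique" : "unique"));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "name; FirstName ; numero;adresse",
                "a;b;12;xx",
                "c;b;13;yy",
                "x;y;47;zz",
                "e;r;45;tt");
        // Column 1 is FirstName in this sample.
        flag(lines, 1).forEach(System.out::println);
    }
}
```

Two passes are used because a row cannot be marked NotUnique until all rows have been seen; the first occurrence of a duplicated value has to be flagged as well.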


Anonymous
Not applicable

Hi @duser dduser​,

 

Can you please create a new question about your issue? It is not related to the issue above, and mixing the two can be quite confusing for people looking for answers.

 

Regards

 

Richard