Anonymous
Not applicable

Removing question marks "?" in Talend

I have several rows that consist entirely of question marks. Some sample data is pasted below:

 

id	text
1328qdfjhase	This is a text
1038qdfjhase	???? ??  ????
1114qdfjhase	This is also text
1455qdfjhase	Another text
1376qdfjhase	Extra text

I want to get rid of the second row, as it contains only question marks and the data is of no use to me. I tried using the EREPLACE function in tMap to replace the question marks with blanks:

StringHandling.EREPLACE(out3.text,"?","")

Next, I plan to filter out the rows that are blank. However, I am getting this error at the tMap component:

 

Exception in component tMap_1
java.util.regex.PatternSyntaxException: Dangling meta character '?' near index 0
?
^
	at java.util.regex.Pattern.error(Pattern.java:1955)
	at java.util.regex.Pattern.sequence(Pattern.java:2123)
	at java.util.regex.Pattern.expr(Pattern.java:1996)
	at java.util.regex.Pattern.compile(Pattern.java:1696)
	at java.util.regex.Pattern.<init>(Pattern.java:1351)
	at java.util.regex.Pattern.compile(Pattern.java:1028)
	at java.lang.String.replaceAll(String.java:2223)
	at routines.StringHandling.CHANGE(StringHandling.java:96)
	at routines.StringHandling.EREPLACE(StringHandling.java:189)
	at local_project.clean_crmjl2_0_1.Clean_CRMJL2.tFileInputExcel_1Process(Clean_CRMJL2.java:4743)
	at local_project.clean_crmjl2_0_1.Clean_CRMJL2.runJobInTOS(Clean_CRMJL2.java:7478)
	at local_project.clean_crmjl2_0_1.Clean_CRMJL2.main(Clean_CRMJL2.java:7335)

Can anyone help?

 

 

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

Sorry, I was on vacation. I don't know why, but this worked in the tMap expression builder instead. I think the issue was something else, though I am not sure what. I am now taking the input from Excel files instead of CSV. Could it be because of encoding?

StringHandling.EREPLACE(out3.text,"?","")

 


9 Replies
cterenzi
Specialist

As the error message suggests, a question mark is a meta character in pattern strings. You get around this by escaping it with a backslash. Because the Java string literal is interpreted before it is used as a pattern, the backslash itself must be escaped too, so you have to type "\\?"
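To make the escaping concrete, here is a minimal plain-Java sketch (outside Talend). The stack trace above shows EREPLACE ending up in String.replaceAll, so the same regex rules apply there. The class and method names are hypothetical, chosen only for illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class, for illustration only.
class QuestionMarkEscaping {

    // "\\?" in Java source is the two-character regex \? (a literal question mark).
    static String stripWithEscapedRegex(String text) {
        return text.replaceAll("\\?", "");
    }

    // Pattern.quote builds a pattern that matches its argument literally.
    static String stripWithQuotedPattern(String text) {
        return text.replaceAll(Pattern.quote("?"), "");
    }

    // String.replace treats its arguments as plain text, so nothing needs escaping.
    static String stripWithLiteralReplace(String text) {
        return text.replace("?", "");
    }

    // Returns true when the given string is rejected as a regular expression.
    static boolean isInvalidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String sample = "???? ??  ????";
        // Brackets make the remaining blanks visible.
        System.out.println("[" + stripWithEscapedRegex(sample) + "]");
        System.out.println("[" + stripWithQuotedPattern(sample) + "]");
        System.out.println("[" + stripWithLiteralReplace(sample) + "]");
        System.out.println("bare ? is invalid: " + isInvalidRegex("?"));
    }
}
```

All three variants leave only the blanks from the sample row, which is why a trim and an empty check are still needed afterwards.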

Anonymous
Not applicable
Author

I tried that and it's not removing the question-mark rows for me.

Anonymous
Not applicable
Author

Have you tried something like row5.newColumn.replaceAll("\\?", "") ?

cterenzi
Specialist

So, the string replacement will only make that value blank.  It won't remove the entire row from the data flow.  For that you'll need to filter using a tFilter component or a tMap.  If you trim() the text after replacing all of the question marks, you can set up an output filter like:

!rowX.text.isEmpty()

to only pass through records that aren't empty (assuming you don't have other empty values you want to preserve).
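As a rough plain-Java sketch of that filter condition (the class and method names are hypothetical, and rowX.text is represented by a method parameter), the replace, trim, and empty check can be combined into one predicate:

```java
// Hypothetical helper class, for illustration only.
class BlankAfterStripFilter {

    // True when the text still has content after the question marks
    // are removed and the surrounding blanks are trimmed.
    static boolean keepRow(String text) {
        String stripped = text.replaceAll("\\?", "").trim();
        return !stripped.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(keepRow("This is a text"));
        System.out.println(keepRow("???? ??  ????"));
    }
}
```

This sketch assumes the text is never null; a null value would throw a NullPointerException on the replaceAll call.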

Anonymous
Not applicable
Author


@douglaszickuhr wrote:

Have you tried something like row5.newColumn.replaceAll("\\?", "") ?


I am now getting a new error:

Exception in component tMap_1
java.lang.NullPointerException
	at local_project.clean_crmjl2_0_1.Clean_CRMJL2.tFileInputExcel_1Process(Clean_CRMJL2.java:4743)
	at local_project.clean_crmjl2_0_1.Clean_CRMJL2.runJobInTOS(Clean_CRMJL2.java:7477)
	at local_project.clean_crmjl2_0_1.Clean_CRMJL2.main(Clean_CRMJL2.java:7334)
Anonymous
Not applicable
Author

It seems that the value is null. Are you sure that column has a value?

Are your components connected correctly?

Please paste a screenshot of your job here.
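If the column really is null for some rows (the thread does not confirm this), calling replaceAll on it would raise exactly this kind of NullPointerException. A minimal null-safe sketch in plain Java follows; the names are hypothetical, and dropping null rows in the filter is an assumption about what is wanted.

```java
// Hypothetical helper class, for illustration only.
class NullSafeStrip {

    // Leaves null untouched instead of calling replaceAll on it.
    static String stripQuestionMarks(String text) {
        return text == null ? null : text.replaceAll("\\?", "");
    }

    // Assumption: rows whose text is null are dropped along with the blank ones.
    static boolean keepRow(String text) {
        return text != null && !text.replaceAll("\\?", "").trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(stripQuestionMarks(null));
        System.out.println(keepRow(null));
        System.out.println(keepRow("Another text"));
    }
}
```

In a tMap expression the same guard can be written inline as a ternary, for example row5.newColumn == null ? null : row5.newColumn.replaceAll("\\?", "").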

TRF
Champion II

tMap is all you need:

[Screenshot: 0683p000009LvC4.png]

Here is the expression I used to filter output rows:

!(StringHandling.BTRIM(row109.text.replaceAll("\\?*", ""))).equals("")

StringHandling.BTRIM is there to remove any extra blanks in the text.

And the result (note the last line, which contains "?" but also other characters, so it is kept in the result):

Starting job test at 22:00 23/06/2017.

[statistics] connecting to socket on port 3599
[statistics] connected
1328qdfjhase|This is a text
1114qdfjhase|This is also text
1455qdfjhase|Another text
1376qdfjhase|Extra text
999999999999|An extra ??? ?? ???? text to keep
[statistics] disconnected
Job test ended at 22:00 23/06/2017. [exit code=0]

Hope this helps.
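For anyone who wants to check the expression outside Talend, here is a small plain-Java sketch that applies the same test to the sample rows. StringHandling is a Talend routine and is not available in plain Java, so String.trim() stands in for BTRIM here; that substitution, and the class and field names, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone check, for illustration only.
class SampleRowsFilter {

    // Sample rows from the thread, plus the extra mixed row from the run above.
    static final String[][] SAMPLE = {
        {"1328qdfjhase", "This is a text"},
        {"1038qdfjhase", "???? ??  ????"},
        {"1114qdfjhase", "This is also text"},
        {"1455qdfjhase", "Another text"},
        {"1376qdfjhase", "Extra text"},
        {"999999999999", "An extra ??? ?? ???? text to keep"}
    };

    // Same shape as the tMap filter, with trim() standing in for StringHandling.BTRIM.
    static boolean keepRow(String text) {
        return !text.replaceAll("\\?*", "").trim().equals("");
    }

    // Returns the kept rows as "id|text" lines.
    static List<String> filter(String[][] rows) {
        List<String> kept = new ArrayList<>();
        for (String[] row : rows) {
            if (keepRow(row[1])) {
                kept.add(row[0] + "|" + row[1]);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        filter(SAMPLE).forEach(System.out::println);
    }
}
```

The pattern "\\?*" also matches the empty string between characters, but since the replacement is empty the result is the same as with "\\?".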

TRF
Champion II

@Enthusiast, does this help or not?
Please let us know, and mark the case as solved if it does.