Aish123
Contributor

How to remove junk ASCII characters from a text file

Hi,

 

I have a text file with multiple rows, and each row has a different schema definition.

Along with this, there is a lot of junk data in the form of ASCII characters that needs to be removed.

 

Can anyone please help with how this can be achieved in a file with a multi-schema structure? The junk characters are not specific: both the characters and their positions keep changing.

 

I found a regular expression in the Talend community that may help with this task:

row1.inputField.replaceAll("\\D", "")

 

However, my input component for reading the text file is tFileInputMSPositional, in which I have defined the schemas for all the rows. When I try to connect it to tReplace, only one row can be selected at a time.

 

Please suggest how to achieve this using tReplace or tMap.

7 Replies
manodwhb
Champion II

@Aish123, I suggest you read each entire row of the file as a single column, use the expression below to remove the ASCII characters, generate a new file, and then read that file with tFileInputMSPositional.

row1.inputField.replaceAll("\\D", "")
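
A note on the expression above: in Java regular expressions, \\D matches every non-digit character, so it strips letters and punctuation as well as junk and leaves only digits. The following is a minimal sketch comparing it with one possible alternative that removes only characters outside the printable ASCII range. The class name, method names and the sample string are hypothetical, and the printable-ASCII rule is an assumption about what counts as junk in this file.

```java
final class RegexDemo {
    // Expression quoted in the thread: removes every non-digit character.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // Assumed alternative: remove only characters outside printable ASCII
    // (space, 0x20, through tilde, 0x7E).
    static String printableAsciiOnly(String s) {
        return s.replaceAll("[^\\x20-\\x7E]", "");
    }

    public static void main(String[] args) {
        // Hypothetical sample with a control character and a non-ASCII character.
        String sample = "AB\u000112,CD\u00ff";
        System.out.println(digitsOnly(sample));         // prints 12
        System.out.println(printableAsciiOnly(sample)); // prints AB12,CD
    }
}
```

Either expression can be used the same way in a tMap or tJavaRow expression; which character class is right depends on what the data is allowed to contain.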

 

Aish123
Contributor
Author

Hi,

 

It would be really helpful if you could tell me how to read a single row as a single column.

manodwhb
Champion II

@Aish123, you can try reading it with tFileInputRaw.
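
For illustration, here is a plain-Java sketch of the "read each row as one column, clean it, write a new file" step, roughly what a tJava or custom routine could do. As far as I know, tFileInputRaw returns the whole file content as one value, while tFileInputFullRow emits each row as a single column, so the latter may be the closer fit; treat that as something to verify in your Talend version. The class name, method name and the cleaner passed in are hypothetical. ISO-8859-1 is an assumption, chosen because it maps every byte to a character, so junk bytes cannot cause a decoding error.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.UnaryOperator;

final class FullRowCleaner {
    // Streams the input line by line, applies the cleaner to each full row,
    // and writes the result, so the whole file is never held in memory.
    static void cleanFile(Path in, Path out, UnaryOperator<String> cleaner) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.ISO_8859_1);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(cleaner.apply(line));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: FullRowCleaner <input> <output>");
            return;
        }
        // Assumed cleaning rule: replace anything outside printable ASCII with a space.
        cleanFile(Paths.get(args[0]), Paths.get(args[1]),
                line -> line.replaceAll("[^\\x20-\\x7E]", " "));
    }
}
```

The cleaned output file can then be read by tFileInputMSPositional as before. Note that readLine treats a stray carriage return inside a row as a line break, so rows containing that particular junk character would need separate handling.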

Aish123
Contributor
Author

Hi,

 

Can you please show me this by creating a simple job for better understanding?

manodwhb
Champion II

@Aish123, can you share a sample file?

Aish123
Contributor
Author

Hi, please find the attached sample file.


special_character.txt
Aish123
Contributor
Author

Hi @rhall,

 

Can you please help me with this issue?

 

I have a text file with around 5 lakh (500,000) rows.

Sets of rows belong together as one case; about 13-14 lines make up each case.

These 14 rows have different schema patterns, and every row has a different number of columns.

Row 1 - 70 columns

Row 2 - 40 columns, and so on.

These rows contain multiple special characters, but their positions are not fixed.

Some special characters, such as . , : -, should remain part of our data.

Apart from these, there are special characters that need to be replaced with a blank because they are junk data.

Can you tell me how I can achieve this? After cleaning this file, I have to create an XML file as the output.
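
One way to express the rule described above (keep . , : - and normal data, blank out everything else) is an allow-list regular expression. This is only a sketch: the class name, the method name and the sample row are hypothetical, and the assumption that valid data consists solely of ASCII letters, digits, spaces and those four punctuation marks needs to be checked against the real file. Junk is replaced with a space rather than deleted, so fixed-width positions are not shifted for a positional reader.

```java
import java.util.regex.Pattern;

final class JunkCleaner {
    // Assumption: letters, digits, spaces and . , : - are valid data;
    // every other character is treated as junk.
    private static final Pattern JUNK = Pattern.compile("[^A-Za-z0-9 .,:\\-]");

    // Replaces each junk character with one space.
    static String clean(String row) {
        return JUNK.matcher(row).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Hypothetical row containing a control character and a section sign.
        String row = "ID:42\u0002 NAME-A.B,\u00a7X";
        String cleaned = clean(row);
        System.out.println(cleaned);
        System.out.println(cleaned.length() == row.length()); // prints true
    }
}
```

In a Talend job the same pattern could be applied to the full-row column in a tMap or tJavaRow before the row is split by schema; if the data legitimately contains other characters (for example / or accented letters), they would have to be added to the character class.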