Aish123
Contributor

How to remove ASCII characters from a text file

Hi,

 

I have a text file with multiple rows, and each row has a different schema definition.

The file also contains a lot of junk data in the form of ASCII characters that need to be removed.

How can this be done in a file with multiple schema structures? The junk characters are not specific: both the characters and their positions keep changing.

I found a regular expression in the Talend community that may help with this task:

row1.inputField.replaceAll("\\D", "")
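
For reference, here is a small standalone sketch of what that expression does in plain Java. The class and method names (`NonDigitDemo`, `digitsOnly`) and the sample string are made up for illustration. Note that `\D` matches every character that is not a digit, so letters and punctuation are removed along with any junk characters.

```java
public class NonDigitDemo {
    // Same expression as in the post: \D matches any character that is not 0-9.
    static String digitsOnly(String input) {
        return input.replaceAll("\\D", "");
    }

    public static void main(String[] args) {
        // Hypothetical sample value containing a control character (U+0001).
        String sample = "ID:42 Name=Ann\u0001 7";
        System.out.println(digitsOnly(sample)); // prints 427
    }
}
```

If the rows contain text that must be kept, this expression is probably too aggressive on its own.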

 

However, my input component for reading the text file is "tFileInputMSPositional", in which I have defined the schemas for all the rows. When I try to connect it to "tReplace", only one row can be selected at a time.

Please suggest how to achieve this using "tReplace" or "tMap".

7 Replies
manodwhb
Champion II

@Aish123, I suggest reading each entire row of the file as a single column, applying the expression below to remove the ASCII characters, writing the result to a new file, and then reading that new file with tFileInputMSPositional.

row1.inputField.replaceAll("\\D", "")
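
Outside of Talend components, the same idea (treat each row as one string, clean it, write a new file) can be sketched in plain Java roughly as follows. This is only an illustration under some assumptions: the class name `WholeLineCleaner`, the method `cleanFile`, and the command-line file paths are hypothetical, and ISO-8859-1 is chosen only so that arbitrary junk bytes do not cause a decoding error. The cleaning rule is passed in as a parameter, so the expression above can be swapped for a different one.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.UnaryOperator;

public class WholeLineCleaner {
    // Reads the input line by line, applies the cleaning rule to each whole
    // line, and writes the cleaned line to the output file.
    static void cleanFile(Path in, Path out, UnaryOperator<String> rule) throws IOException {
        // Assumption: ISO-8859-1 maps every byte to one character, so
        // unexpected bytes in the input never abort the read.
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.ISO_8859_1);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(rule.apply(line));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java WholeLineCleaner dirty.txt clean.txt
        Path in = Paths.get(args[0]);
        Path out = Paths.get(args[1]);
        // The rule here is the expression from this thread; it keeps digits only.
        cleanFile(in, out, line -> line.replaceAll("\\D", ""));
    }
}
```

The cleaned output file could then be read with tFileInputMSPositional as suggested.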

 

Aish123
Contributor
Author

Hi,

 

It would be really helpful if you could tell me how to read a single row as a single column.

manodwhb
Champion II

@Aish123, you can try reading it with tFileInputRaw.

Aish123
Contributor
Author

Hi,

 

Can you please show me this by creating a simple job for better understanding?

manodwhb
Champion II

@Aish123, can you share a sample file?

Aish123
Contributor
Author

Hi, please find the attached sample file.


special_character.txt
Aish123
Contributor
Author

Hi @rhall , 

 

Can you please help me with this issue?

I have a text file with around 5 lakh (500,000) rows.

Sets of rows together form one case; each case consists of about 13-14 lines.

These 14 rows have different schema patterns, and every row has a different number of columns:

Row 1 - 70 columns

Row 2 - 40 columns, and so on.

These rows contain multiple special characters, but their positions are not fixed.

Some special characters, such as ". , : -", should remain part of our data.

Apart from these, there are other special characters that need to be replaced with blank, as they are junk data.

Can you tell me how I can achieve this? After cleaning the file, I have to create an XML file as the output.
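
One possible way to express the "keep these characters, blank out everything else" rule is a negated character class. The sketch below is an assumption-laden illustration, not a confirmed solution for the attached file: the class name `JunkFilter`, the method `keepAllowed`, and the sample string are hypothetical, the allowed set (ASCII letters, digits, space, and `. , : -`) is a guess at what counts as valid data, and "blank" is interpreted as a single space so that each row keeps its length and positional columns do not shift.

```java
import java.util.regex.Pattern;

public class JunkFilter {
    // Anything that is NOT a letter, digit, space, or one of . , : -
    // is treated as junk (assumed allowed set; adjust to the real data).
    private static final Pattern JUNK = Pattern.compile("[^A-Za-z0-9 .,:\\-]");

    // Replaces each junk character with one space, so the line length
    // (and therefore every fixed column position) stays the same.
    static String keepAllowed(String line) {
        return JUNK.matcher(line).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Hypothetical sample row with a control character and a stray '#'.
        String dirty = "AB\u0001C: 1.5,x-2#";
        String clean = keepAllowed(dirty);
        System.out.println("[" + clean + "]");                  // prints [AB C: 1.5,x-2 ]
        System.out.println(dirty.length() == clean.length());   // prints true
    }
}
```

Because this is ordinary `String`/regex handling, the same pattern should also work in the expression style used earlier in this thread, for example `row1.inputField.replaceAll("[^A-Za-z0-9 .,:\\-]", " ")`, applied to the whole row before it is split by schema.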