Re: How to remove ascii character from a text file - Qlik Community

Aish123 · ‎2020-05-13

Hi,

I have a text file with multiple rows, and each row has a different schema definition.

Along with this, there is a lot of junk data in the form of ASCII characters that needs to be removed.

Can anyone please help with how this can be achieved in a file with multiple schema structures? The characters are not specific: they keep changing, and their location keeps changing too.

I got a reference to a regular expression from the Talend community which can help achieve this task:

row1.inputField.replaceAll("\\D", "")

But my input component for reading the text file is "tFileInputMSPositional", in which I have defined the schemas for all the rows. When I try to connect it to "tReplace", only one row can be selected at a time.

Please suggest how to achieve this using "tReplace" or "tMap".

manodwhb · ‎2020-05-13

@Aish123, I suggest you read each entire row of the file as a single column, use the expression below to remove the ASCII characters, generate a new file, and then read that file with tFileInputMSPositional.

row1.inputField.replaceAll("\\D", "")
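A note for readers trying this expression: in Java regular expressions `\\D` matches every character that is not a digit, so the call above also strips letters, spaces and punctuation, not only junk. The sketch below contrasts it with a pattern that drops only non-printable characters. It assumes the junk consists of characters outside printable ASCII (0x20 to 0x7E), which the thread does not confirm; the class and method names are hypothetical and not part of Talend.

```java
public class RegexCheck {
    // What "\\D" does: removes every character that is not a digit 0-9.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // Assumption: "junk" means anything outside printable ASCII (0x20-0x7E).
    static String printableOnly(String s) {
        return s.replaceAll("[^\\x20-\\x7E]", "");
    }

    public static void main(String[] args) {
        // Two control characters (0x01 and 0x02) mixed into ordinary text.
        String sample = "AB\u000112,3\u0002";
        System.out.println(digitsOnly(sample));    // prints 123
        System.out.println(printableOnly(sample)); // prints AB12,3
    }
}
```

If the letters and punctuation in the rows matter, the second form (or a stricter whitelist) is probably closer to what is wanted.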

Aish123 · ‎2020-05-13

Hi,

It would be really helpful if you could tell me how to read a single row as a single column.

manodwhb · ‎2020-05-13

@Aish123, you can try reading it with tFileInputRaw.
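To illustrate the "one row as one column" idea outside the Studio, here is a rough plain-Java sketch: each line is read as a single string, cleaned, and written to a new file that a positional reader could consume afterwards. This is not how any Talend component is implemented. The class name, method names and the printable-ASCII pattern are assumptions, and ISO-8859-1 is chosen only because it decodes any byte without error.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LineCleaner {
    // Assumption: junk is anything outside printable ASCII (0x20-0x7E).
    static String cleanLine(String line) {
        return line.replaceAll("[^\\x20-\\x7E]", "");
    }

    // Reads the input line by line, treating each whole line as one field,
    // and writes the cleaned line to the output file.
    static void cleanFile(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.ISO_8859_1);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(cleanLine(line));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: LineCleaner <input file> <output file>");
            return;
        }
        cleanFile(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

One caveat for positional data: deleting characters shortens the line and shifts every field after the deletion, so it is worth checking whether the schema offsets were defined against the raw or the cleaned line.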

Aish123 · ‎2020-05-13

Hi,

Could you please show me this by creating a simple job, for better understanding?

manodwhb · ‎2020-05-13

@Aish123, can you share a sample file?

Aish123 · ‎2020-05-13

Hi, please find the attached sample file.

special_character.txt

Aish123 · ‎2020-05-18

Hi @rhall,

Can you please help me with this issue?

I have a text file with around 5 lakh (500,000) rows.

Sets of rows belong together as one case; about 13-14 lines make up each case.

These 14 rows have different schema patterns, and every row has a different number of columns.

Row 1 - 70 columns

Row 2 - 40 columns, and so on.

These rows contain multiple special characters, but their positions are not fixed.

Some special characters, like ". , : -", should remain part of our data.

But apart from these, there are other special characters that need to be replaced with a blank, as they are junk data.

Can you tell me how I can achieve this? After cleaning this file, I have to create an XML file as the output.
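One way to express the rule described here (keep the characters . , : - and replace other special characters) is a whitelist character class. The sketch below is only an illustration. It assumes the valid data consists of ASCII letters, digits and spaces plus those four punctuation marks, and the class name, method name and sample row are made up. The same `replaceAll` call could presumably be applied to a String column in a tMap expression or a tJavaRow.

```java
public class WhitelistCleaner {
    // Assumption: anything that is not a letter, digit, space or one of . , : -
    // is junk. Inside a character class "." is literal; "-" is escaped.
    private static final String JUNK = "[^A-Za-z0-9 .,:\\-]";

    // Replaces each junk character with the given replacement
    // ("" to delete it, " " to blank it out).
    static String clean(String row, String replacement) {
        return row.replaceAll(JUNK, replacement);
    }

    public static void main(String[] args) {
        String row = "ID:42#Name@John, Doe~2020-05-18$"; // made-up sample row
        System.out.println(clean(row, ""));  // prints ID:42NameJohn, Doe2020-05-18
        System.out.println(clean(row, " ")); // prints ID:42 Name John, Doe 2020-05-18 (plus a trailing space)
    }
}
```

Replacing each junk character with a single space rather than an empty string keeps every line at its original length for single-byte data, which may matter if the rows are later parsed by position.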

Talend Data Integration

v7.x