Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
is it possible to eliminate the html-tags from a column that contains following values:
<html><body><br><font color="#000080"><b>test:</b></font> <br><br><font color="#000080"><b>draft</b></font> </font></body></html>
I want that only test and draft is being showed in the column after removing the tags.
Thanks for your help!
Best regards,
Aylin
Hi,
this is an example how to strip HTML tags:
Data:
LOAD RecNo() as RecNo, Field
FROM <your source file>;
HtmlTag_Map:
MAPPING LOAD DISTINCT '<' & TextBetween('<' & SubField(Field, '<', IterNo()),'<','>') & '>' as HtmlTag, '' as Substitute
Resident Data
While NOT IsNull(SubField(Field, '<', IterNo()));
JOIN(Data) LOAD RecNo, MapSubstring('HtmlTag_Map', Field) as Field_Cleansed
Resident Data;
It looks a bit complex but it's easy to adapt. There are three steps:
1. Load the field with HTML tags
2. Create a Mapping Table containing all in data existing HTML tags
3. Use MapSunstring to replace the existing HTML tags with empty string
- Ralf
PS: this example is also released on my GiiHub/Gist account
In addition, I recognized some spaces between tags in your example: </font>_<br> and </font>_</font>
To get rid of this it should be fixed before with a replace like this: replace(Field, '> <', '><')