Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Column that contains HTML

Hello,

is it possible to eliminate the html-tags from a column that contains following values:


<html><body><br><font color="#000080"><b>test:</b></font> <br><br><font color="#000080"><b>draft</b></font> </font></body></html>

I want that only test and draft is being showed in the column after removing the tags.

Thanks for your help!

Best regards,

Aylin

2 Replies
rbecher
MVP
MVP

Hi,

this is an example how to strip HTML tags:

Data:

LOAD RecNo() as RecNo, Field

FROM <your source file>;

HtmlTag_Map:

MAPPING LOAD DISTINCT '<' & TextBetween('<' & SubField(Field, '<', IterNo()),'<','>') & '>' as HtmlTag, '' as Substitute

Resident Data

While NOT IsNull(SubField(Field, '<', IterNo()));

JOIN(Data) LOAD RecNo, MapSubstring('HtmlTag_Map', Field) as Field_Cleansed

Resident Data;

It looks a bit complex but it's easy to adapt. There are three steps:

1. Load the field with HTML tags

2. Create a Mapping Table containing all in data existing HTML tags

3. Use MapSunstring to replace the existing HTML tags with empty string

- Ralf

PS: this example is also released on my GiiHub/Gist account

Astrato.io Head of R&D
rbecher
MVP
MVP

In addition, I recognized some spaces between tags in your example: </font>_<br> and </font>_</font>

To get rid of this it should be fixed before with a replace like this: replace(Field, '> <', '><')

Astrato.io Head of R&D