Anonymous
Not applicable

How to remove html text?

Hi,

I have created an ODBC connection, and some of the fields contain HTML code. Do you know an easy way to remove the HTML code so that only the plain text is displayed in QlikView?

I have found the following:

1. Go to Tools > Edit Module and add the following:

Function stripHTML(strHTML)
  'Strips the HTML tags from strHTML
  Dim objRegExp, strOutput
  Set objRegExp = New Regexp
  objRegExp.IgnoreCase = True
  objRegExp.Global = True
  objRegExp.Pattern = "<(.|\n)+?>"

  'Replace all HTML tag matches with the empty string
  strOutput = objRegExp.Replace(strHTML, "")

  'Replace all < and > with &lt; and &gt;
  strOutput = Replace(strOutput, "<", "&lt;")
  strOutput = Replace(strOutput, ">", "&gt;")

  stripHTML = strOutput    'Return the value of strOutput
  Set objRegExp = Nothing
End Function

2. In Edit Script, after the field, type:

replace(replace(stripHTML([content/properties/Your filed name])
        ,'&#58;',':')
        ,'&#160;',' ') as newcleanfiledname,

But this didn't work for me. Any idea?

Thanks,

4 Replies
jonathandienst
Partner - Champion III

It depends on how complicated the HTML text is. If it's a fairly consistent pattern, you may be able to get away with using the built-in string handling functions.

If that is not practical, a module function using regexes should work, but yours clearly requires some debugging. Test the module with sample data from your source, sprinkling it with MsgBox calls to see what it is doing and where it may be going wrong.
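As a rough sketch of that kind of debugging, the macro below runs the same pattern against a made-up sample string and shows a MsgBox for each match and for the final result. The Sub name DebugStripHTML and the sample text are hypothetical, not from the original post; it assumes only the standard VBScript RegExp object that the posted function already uses.

```vbscript
' Hypothetical debugging macro (name and sample text are illustrative only).
' Paste into Tools > Edit Module next to the existing function.
Sub DebugStripHTML
    Dim objRegExp, sample, matches, m

    ' Made-up sample; replace with a real value copied from your source field
    sample = "<p>Test, this is a <b>test</b>.</p>"

    Set objRegExp = New RegExp
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    objRegExp.Pattern = "<(.|\n)+?>"   ' same pattern as in the posted function

    ' Step 1: how many tags does the pattern find?
    ' For this sample there are four: <p>, <b>, </b>, </p>
    Set matches = objRegExp.Execute(sample)
    MsgBox "Tags matched: " & matches.Count

    ' Step 2: show each match so an unexpected one stands out
    For Each m In matches
        MsgBox "Matched: " & m.Value
    Next

    ' Step 3: show the result in brackets so stray whitespace is visible
    MsgBox "After replace: [" & objRegExp.Replace(sample, "") & "]"

    Set objRegExp = Nothing
End Sub
```

If the message boxes show the tags you expect and a clean result, the regex itself is probably fine, and the problem is more likely in how the function is called from the load script or in what the source field really contains.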

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
jonathandienst
Partner - Champion III

These two lines may be the wrong way round:

'Replace all < and > with &lt; and &gt;
  strOutput = Replace(strOutput, "<", "&lt;")
  strOutput = Replace(strOutput, ">", "&gt;")

'The HTML will contain &lt; and &gt;
  strOutput = Replace(strOutput, "&lt;", "<")
  strOutput = Replace(strOutput, "&gt;", ">")
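Putting that suggestion together, a minimal sketch of a variant that strips the tags and then decodes entities (instead of encoding them) might look like this. The function name stripHTMLDecoded, the Null guard, and the extra &nbsp;, &#160; and &amp; replacements are my assumptions, not part of the original post; extend the list to whatever entities actually appear in your data.

```vbscript
' Hypothetical variant of the posted function (name is illustrative only):
' removes tags first, then turns common entities back into plain characters.
Function stripHTMLDecoded(strHTML)
    Dim objRegExp, strOutput

    ' Defensive assumption: treat a Null input as an empty string
    If IsNull(strHTML) Then
        stripHTMLDecoded = ""
        Exit Function
    End If

    Set objRegExp = New RegExp
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    objRegExp.Pattern = "<(.|\n)+?>"   ' same tag pattern as the original

    ' Remove every tag before decoding, so that a decoded "<" or ">"
    ' is never mistaken for the start or end of a tag
    strOutput = objRegExp.Replace(strHTML, "")

    ' Decode entities into the characters they stand for
    strOutput = Replace(strOutput, "&nbsp;", " ")
    strOutput = Replace(strOutput, "&#160;", " ")
    strOutput = Replace(strOutput, "&lt;", "<")
    strOutput = Replace(strOutput, "&gt;", ">")
    ' Decode &amp; last, so "&amp;lt;" becomes the text "&lt;" and not "<"
    strOutput = Replace(strOutput, "&amp;", "&")

    stripHTMLDecoded = strOutput
    Set objRegExp = Nothing
End Function
```

In the load script it would be called the same way as the original function, for example stripHTMLDecoded([content/properties/Your filed name]) as newcleanfiledname, and the nested replace() calls for &#160; would then no longer be needed.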

jonathandienst
Partner - Champion III

>>replace(replace(stripHTML([content/properties/Your filed name])


This field name looks like something from an XML load. If the HTML is well formed, then loading it as XML is another way to read the data, and you will not need the module functions.

Anonymous
Not applicable
Author

OK, this is what I have. For example, one of my fields contains the following:

<p>Test, this is a test for qlikview.$nbsp;</p>

<p> This is another test for qlikview~530</p>

How can I modify the code to remove the HTML code?