Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
sudhakar_budde
Creator
Creator

Remove HTML code from email message field

Hi ,

I need to remove html code as well some styling attributes from the email message coming from web connector. I need to do this within Qlik Sense scripting:
I am using suggestions from the below blog posts:


https://community.qlik.com/t5/QlikView-App-Development/Removing-HTML-markup-code/td-p/466050/highlig...

I am wondering what I am doing wrong.

Data:

LOAD RecNo() as RecNo,
HTML_description

FROM [$(vPath)RawData\xyz.qvd] (qvd);

HtmlTag_Map:

MAPPING LOAD DISTINCT '<' & TextBetween('<' & SubField(HTML_description, '<'),'<','>') & '>' as HtmlTag, '' as Substitute

Resident Data;

Left JOIN(Data) LOAD RecNo,HTML_description, MapSubstring('HtmlTag_Map', HTML_description) as Field_Cleansed

Resident Data;

Exit Script;

I am also trying  similar code suggested here, but this is working as expected but it will be taking hours to load thousands of rows:

https://community.qlik.com/t5/QlikView-Deployment/strip-HTML-codes-from-a-field-during-load-script/m...

Script:

==================

tab1:

LOAD RecNo() as RecNo,
HTML_description

FROM [$(vPath)RawData\xyz.qvd] (qvd);

For vRow = 1 to NoOfRows('tab1')
Let vID = Peek('RecNo',vRow-1,'tab1');
Let vtext = Peek('HTML_description',vRow-1,'tab1');

do while len(TextBetween(vtext, '<', '>'))>0vtext=Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace(vtext, TextBetween(vtext, '<', '>'), ''), '<>', ''),TextBetween(vtext, '{', '}'),''),'{}',''),'&ns;',''), '&nsp;',''),'v\:*',''),'o\:*', ''),'w\:*',''),'..shape','');


loop;

tab2:
load
RecNo,
HTML_description,
'$(vtext)' as text
Resident tab1
Where RecNo=$(vID);

next vRow;

drop Table tab1;

Exit Script;

 

Any suggestions please?

 

Thanks

SB

 

 

Labels (1)
  • HTML

6 Replies
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

sudhakar_budde
Creator
Creator
Author

Hi Rob,

Thanks for your message.

Sorry, I am a bit confused here! 

The data is already fetched via a web connector' Qlik Microsoft Dynamics CRM Connector'. How can I use the Regex again? Can you please elaborate on this?

Also, in the list of the web connectors (where Qlik Microsoft Dynamics CRM Connector is installed) , there is no Regex connector available! Do I need to install it? I am not familiar with this yet. Please help me how to get the solution to my problem (main post).

The code I posted is fetching one record per second. If there are at least 1000 of records it will be taking a lot of time to complete the transformation!

Thanks 

SB

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Is this QlikVIew or Qlik Sense?  If QlikView you can use:

Qlikview Cookbook: Strip HTML From Text http://qlikviewcookbook.com/recipes/download-info/strip-html-from-text/

That referenced example will be rather fast.  

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

sudhakar_budde
Creator
Creator
Author

Hi Rob,

Thanks for your reply.

Yeah, sorry this is for Qlik Sense scripting only! I didn't find a Qlik Sense scripting forum, that's why I posted here.

Is your QlikView script not compatible with Qlik Sense?

Thanks

SB

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

QlikVIew allows you to create script functions in the Module (Ctrl-m), Qlik Sense does not.  To create additional functions in Qlik Sense script, you have to use a connector or use a server side extension. Both require some setup. 

-Rob

sudhakar_budde
Creator
Creator
Author

Hi Rob,

Thanks for your response on my question.

I explored the Regex connector, but unfortunately I don't have the knowledge of creating a Regular expression. I tried to seek some help but not lucky!

 

I have used your script to remove the HTML code from a field. But I am getting a 'General script error' ! I couldn't figure out what I am doing wrong. Can you please help me? 

The script that I am using is:


Email_Message_Table:
LOAD
RecNo() as RecNo
,HTML_description as HTML_Message

Resident Table;

//>>>>>>>>>>>>>>>>>>>>>>>> Mapping Loads <<<<<<<<<<<<<<<<<<<<<<<<<//

HtmlTag_Map:

MAPPING LOAD DISTINCT '<' & TextBetween('<' & SubField(HTML_Message, '<', IterNo()),'<','>') & '>' as HtmlTag, '' as Substitute
Resident Email_Message_Table
While NOT IsNull(SubField(HTML_Message, '<', IterNo()));

Special_Char_Map:

MAPPING LOAD DISTINCT '{' & TextBetween('{' & SubField(HTML_Message, '{', IterNo()),'{','}') & '}' as HtmlTag1, '' as Substitute
Resident Email_Message_Table
While NOT IsNull(SubField(HTML_Message, '{', IterNo()));


ReplaceTag_Map:
mapping Load * inline [
HTML_Message, Substitute
table,''
&#13;,''
&#10;,''
&#13;,''
&#10;,''
body, ''
td,''
.shape,''
$<dir=rtl>$,''
.directional_text_wrapper ,''
&nbsp;,''
&amp;,'&'
&quot;,''
&quote;,''
&gt;,''
&lt;,''
&#43;,''
div.outlookmessageheader, ''
print{hs,''
.mozemailheaders,''
mailcontainerbody,''
*.location,''
.rtl,''
.info,''
.h2,''
h2,''
.location address, ''
.location, ''
.header, ''
.file,''
.createTD,''
.updateTD,''
.var,''
.mapicon,''
#entryInfo,''
.adr,''
a:hover adress,''
a:hover,''
span,''
*,''
file-name,''
$*,''
v\:*,''
o\:*,''
u\:*,''
w\:,''
<!--,''
/ Font Definitions /,''
@font-face,''
/ Style Definitions / ,''
p.MsoNormal,''
i.MsoNormal,''
div.MsoNormal,''
a:link,''
.MsoHyperlink,''
a:visited,''
.MsoHyperlinkFollowed,''
p.msonormal0,''
li.msonormal0,''
div.msonormal0,''
p.emailquote,''
li.emailquote,''
div.emailquote,''
.E-mailStijl20,''
.MsoChpDefault,''
@page WordSection1,''
div.WordSection1,''
-->,''
/ Font Definitions /,''
/ Style Definitions /,''
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.d">,''
-name,''
p.MsoTitle, li.MsoTitle, div.MsoTitle,''
.TitleChar,''
.Heading2Char,''
.EmailStyle20,''
p.Default, li.Default, div.Default,''
/ List Definitions /,''
@list l0,''
:level1,''
:level2,''
:level3,''
:level4,''
:level5,''
:level6,''
:level7,''
:level8,''
:level9,''
] ;


// >>>>>>>>>>>>>>> Transformation <<<<<<<<<<<<<<<<<<<<<<<<<<<<<< //

JOIN(Email_Message_Table)

LOAD
RecNo,
MapSubstring('HtmlTag_Map', HTML_Message) as Description
Resident Email_Message_Table;

 

 

Thanks 

SB