Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi ,
I need to remove html code as well some styling attributes from the email message coming from web connector. I need to do this within Qlik Sense scripting:
I am using suggestions from the below blog posts:
I am wondering what I am doing wrong.
Data:
LOAD RecNo() as RecNo,
HTML_description
FROM [$(vPath)RawData\xyz.qvd] (qvd);
HtmlTag_Map:
MAPPING LOAD DISTINCT '<' & TextBetween('<' & SubField(HTML_description, '<'),'<','>') & '>' as HtmlTag, '' as Substitute
Resident Data;
Left JOIN(Data) LOAD RecNo,HTML_description, MapSubstring('HtmlTag_Map', HTML_description) as Field_Cleansed
Resident Data;
Exit Script;
I am also trying similar code suggested here, but this is working as expected but it will be taking hours to load thousands of rows:
Script:
==================
tab1:
LOAD RecNo() as RecNo,
HTML_description
FROM [$(vPath)RawData\xyz.qvd] (qvd);
For vRow = 1 to NoOfRows('tab1')
Let vID = Peek('RecNo',vRow-1,'tab1');
Let vtext = Peek('HTML_description',vRow-1,'tab1');
do while len(TextBetween(vtext, '<', '>'))>0vtext=Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace(vtext, TextBetween(vtext, '<', '>'), ''), '<>', ''),TextBetween(vtext, '{', '}'),''),'{}',''),'&ns;',''), '&nsp;',''),'v\:*',''),'o\:*', ''),'w\:*',''),'..shape','');
loop;
tab2:
load
RecNo,
HTML_description,
'$(vtext)' as text
Resident tab1
Where RecNo=$(vID);
next vRow;
drop Table tab1;
Exit Script;
Any suggestions please?
Thanks
SB
I think you can use the Regex connector?
-Rob
Hi Rob,
Thanks for your message.
Sorry, I am a bit confused here!
The data is already fetched via a web connector' Qlik Microsoft Dynamics CRM Connector'. How can I use the Regex again? Can you please elaborate on this?
Also, in the list of the web connectors (where Qlik Microsoft Dynamics CRM Connector is installed) , there is no Regex connector available! Do I need to install it? I am not familiar with this yet. Please help me how to get the solution to my problem (main post).
The code I posted is fetching one record per second. If there are at least 1000 of records it will be taking a lot of time to complete the transformation!
Thanks
SB
Is this QlikVIew or Qlik Sense? If QlikView you can use:
Qlikview Cookbook: Strip HTML From Text http://qlikviewcookbook.com/recipes/download-info/strip-html-from-text/
That referenced example will be rather fast.
-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com
QlikVIew allows you to create script functions in the Module (Ctrl-m), Qlik Sense does not. To create additional functions in Qlik Sense script, you have to use a connector or use a server side extension. Both require some setup.
-Rob
Hi Rob,
Thanks for your response on my question.
I explored the Regex connector, but unfortunately I don't have the knowledge of creating a Regular expression. I tried to seek some help but not lucky!
I have used your script to remove the HTML code from a field. But I am getting a 'General script error' ! I couldn't figure out what I am doing wrong. Can you please help me?
The script that I am using is:
Email_Message_Table:
LOAD
RecNo() as RecNo
,HTML_description as HTML_Message
Resident Table;
//>>>>>>>>>>>>>>>>>>>>>>>> Mapping Loads <<<<<<<<<<<<<<<<<<<<<<<<<//
HtmlTag_Map:
MAPPING LOAD DISTINCT '<' & TextBetween('<' & SubField(HTML_Message, '<', IterNo()),'<','>') & '>' as HtmlTag, '' as Substitute
Resident Email_Message_Table
While NOT IsNull(SubField(HTML_Message, '<', IterNo()));
Special_Char_Map:
MAPPING LOAD DISTINCT '{' & TextBetween('{' & SubField(HTML_Message, '{', IterNo()),'{','}') & '}' as HtmlTag1, '' as Substitute
Resident Email_Message_Table
While NOT IsNull(SubField(HTML_Message, '{', IterNo()));
ReplaceTag_Map:
mapping Load * inline [
HTML_Message, Substitute
table,''
,''
,''
,''
,''
body, ''
td,''
.shape,''
$<dir=rtl>$,''
.directional_text_wrapper ,''
,''
&,'&'
",''
"e;,''
>,''
<,''
+,''
div.outlookmessageheader, ''
print{hs,''
.mozemailheaders,''
mailcontainerbody,''
*.location,''
.rtl,''
.info,''
.h2,''
h2,''
.location address, ''
.location, ''
.header, ''
.file,''
.createTD,''
.updateTD,''
.var,''
.mapicon,''
#entryInfo,''
.adr,''
a:hover adress,''
a:hover,''
span,''
*,''
file-name,''
$*,''
v\:*,''
o\:*,''
u\:*,''
w\:,''
<!--,''
/ Font Definitions /,''
@font-face,''
/ Style Definitions / ,''
p.MsoNormal,''
i.MsoNormal,''
div.MsoNormal,''
a:link,''
.MsoHyperlink,''
a:visited,''
.MsoHyperlinkFollowed,''
p.msonormal0,''
li.msonormal0,''
div.msonormal0,''
p.emailquote,''
li.emailquote,''
div.emailquote,''
.E-mailStijl20,''
.MsoChpDefault,''
@page WordSection1,''
div.WordSection1,''
-->,''
/ Font Definitions /,''
/ Style Definitions /,''
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.d">,''
-name,''
p.MsoTitle, li.MsoTitle, div.MsoTitle,''
.TitleChar,''
.Heading2Char,''
.EmailStyle20,''
p.Default, li.Default, div.Default,''
/ List Definitions /,''
@list l0,''
:level1,''
:level2,''
:level3,''
:level4,''
:level5,''
:level6,''
:level7,''
:level8,''
:level9,''
] ;
// >>>>>>>>>>>>>>> Transformation <<<<<<<<<<<<<<<<<<<<<<<<<<<<<< //
JOIN(Email_Message_Table)
LOAD
RecNo,
MapSubstring('HtmlTag_Map', HTML_Message) as Description
Resident Email_Message_Table;
Thanks
SB