Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I've got a webquery that generates an XML document in the browser. I'm using this as a web file data source in QlikView, and it works as expected, pulling in the XML schema and data. However, one of the fields is full of HTML markup, and I'm not sure the best way to get it all out. Since the XML is generated dynamically on an internet site, it never hits the server file system and so I can't do anything on that end. Also, the HTML is pretty extensive, with lots of things like:
<TD STYLE="BORDER-BOTTOM: black 0.5pt solid; BORDER-LEFT: black 0.5pt solid; BACKGROUND-COLOR: white; WIDTH: 208pt; HEIGHT: 12.75pt;">
So it's not as simple as a few replace statements to strip <p> and </p>.
Any ideas?
Great, thanks so much Ralf. I was planning on doing that but wanted to check whether I was missing a trick or building something very inefficient!
It could work in another way, too:
Load the Fields
Load the Map for Field1 into Map1
Load the Map for Field2 into Map2
Load the Map for Field3 into Map3
Join/Strip with 3x different MapSubstr()
- Ralf
I'm getting an error when i try this: MapSubString function takes a constant map as first parameter
I'm loading a table from a stored procedure like so:
CRMActivity:
REPLACE SQL EXEC sp_getdata '$(vEmail)';
HtmlTag_Map:
MAPPING LOAD DISTINCT '<' & TextBetween('<' & SubField(description, '<', IterNo()),'<','>') & '>' as HtmlTag, '' as Substitute
Resident CRMActivity
While NOT IsNull(SubField(description, '<', IterNo()));
JOIN(CRMActivity) LOAD activityid, description, objectname, ownername, subject, MapSubstring('HtmlTag_Map', description) as Field_Cleansed
Resident CRMActivity;
The fields returned from the stored procedure are activityid, description, objectname, ownername, subject. I'm trying to clean the description field of html.
Can anyone see what I'm doing wrong?
Strange error, never got that. Can you load the mapping table into a normal table to see what's in it:
HtmlTag_Map:
LOAD DISTINCT '<' & TextBetween('<' & SubField(description, '<', IterNo()),'<','>') & '>' as HtmlTag, '' as Substitute
Resident CRMActivity
While NOT IsNull(SubField(description, '<', IterNo()));
I tried this and it appears the mapping table is blank. Any suggestions?
Can you give an example what's in table CRMActivity?
Here's one of the records:
[1000145,
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"><HTML>
<META NAME="Generator" CONTENT="MS Exchange Server version 14">
<TITLE></TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<BR>
<P><FONT SIZE=2>Hello Jackson,<BR>
<BR>
<BR>
Any idea what the following error means:<BR>
<BR>
<BR>
Error: Disconnected: No supported methods available<BR>
<BR>
</BODY>
</HTML>
, Email, User1, RE: Support Case-30084]
Maybe we have to purge the line breaks before:
HtmlTag_Map:
MAPPING LOAD DISTINCT '<' & TextBetween('<' & SubField(PurgeChar(description, chr(13)&chr(10)), '<', IterNo()),'<','>') & '>' as HtmlTag, '' as Substitute
Resident CRMActivity
While NOT IsNull(SubField(description, '<', IterNo()));
JOIN(CRMActivity) LOAD activityid, description, objectname, ownername, subject, MapSubstring('HtmlTag_Map', PurgeChar(description, chr(13)&chr(10))) as Field_Cleansed
Resident CRMActivity;
Same error:
Error in expression:MapSubString function takes a constant map as first parameter
JOIN(CRMActivity) LOAD activityid, description, objectname, ownername, subject, MapSubstring('HtmlTag_Map', PurgeChar(description, chr(13)&chr(10))) as Field_Cleansed
Resident CRMActivity
Thanks for trying. I'm pulling my hair out
Just to clear this up. The mapping works. Even through it throws that error in the script, the field_cleansed column contains the text without the html. Its just that it throws that error every time a partial refresh is done.