Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi guys,
I have a field in SQL with HTML tags.
Is there a way to remove these?
<div>Twin needle stitching detailing. Double fabric hood with self coloured cords. Kangaroo pouch pocket with small hidden opening for earphone cord feed. Ribbed cuffs and hem. Worldwide Responsible Accredited Production (WRAP) certified production. 50 great colours. Soft cotton faced fabric creates ideal printing surface. <br />
<br />
<b>Washing Instructions:</b><br />
Machine wash 30°. Do not bleach. Tumble dry low heat. Low iron. Do not dry clean</div>
Iconic school sweatshirt. Twin needle coverseams on neck. Raglan sleeve cuffs and waistband for extra durability. Herringbone back neck tape. Spotshield stain resistant coating for fabric protection. Fade resistant. <br > <br><b>Washing Instructions:</b><br> Suitable for 40°C wash and tumble dry <br>
I'd also like to remove °
These are just a couple of examples!!
Thanks
I've also discovered a field like this!
<div><b>
<ul style="list-style-type:disc">
<li>Tubular construction for stability</li>
<li>1x1 rib trim with elastane</li>
<li>Herringbone neck tape</li>
</ul>
</b> <br />
<b>Fabric:</b><br />
50% Combed ringspun cotton, 50% Polyester <br />
<br />
<b>Weight:</b><br />
295gsm</div>
Is this going to be impossible? Haha
Have a look on the qv table-wizard which could natively read html and provided a preview and within the extended are options to read the attribut-values, too.
- Marcus
Hi Marcus,
What if the data is stored in a SQL table. Can I still do this? If so, how?
Many thanks!
The way I see it, you have two options (if you want to avoid messy string mangling of the text):
In both cases, you will find it easier to develop the XML load if you manually create a file containing XML in the same schema. Once that load is working, you can use the generated script as a template. Qlikview requires the XML to be well-formed, so partial tags and any constructs not complying with XML rules will prevent the data from loading.
FROMFIELD is a little used feature and there is almost no documentation and very little even here on the community. I have managed to get that working once, but it was some time ago and I no longer have that code. Perhaps someone reading this can offer some asistance.
Another option came to mind after posting the above. VBScript supports regular expressions, and you can call VBScript functions from the load script. You could then parse the text with regular expressions, which will make for much simpler string handling.
Calling vbscript functions is a performance killer, though, so I would only consider this if the size of your data set is not large.
Search the commuity for vbscript and regular expressions / regex for the code.
This might be helpful: https://www.qlik2go.net/en/2014/08/08/b/
- Marcus
See Qlikview Cookbook: Strip HTML From Text http://qlikviewcookbook.com/recipes/download-info/strip-html-from-text/
for a working example of using Regular Expressions to remove html tags.
-Rob
Thanks for your advice guys. I actually found this in on another thread which seems to work really well !
Data:
first 200
LOAD
RecNo() as RecNo,
SKU,
Description_IncHTML
From
Data\nop_product_data.qvd (qvd)
where
not wildmatch(SKU,'ML*')
;
HtmlTag_Map:
MAPPING LOAD DISTINCT '<' & TextBetween('<' & SubField(PurgeChar(Description_IncHTML, chr(13)&chr(10)), '<', IterNo()),'<','>') & '>' as HtmlTag, '' as Substitute
Resident Data
While NOT IsNull(SubField(Description_IncHTML, '<', IterNo()))
;
JOIN(Data) LOAD RecNo, MapSubstring('HtmlTag_Map', Description_IncHTML) as Field_Cleansed
Resident Data
;
I am however still seeing carriage returns
Warm and comfortable fleece jacket. Shaped to fit. Full zip front with cadet collar. Two zipped closed front pockets. Two interior pockets. Adjustable base drawcord. Easy re-label system.
Washing Instructions:
Machine wash 40°. Gentle setting. Do not bleach. Iron low. Do not tumble dry. Do not dry clean
Is there a way to remove these? I'd also like to remove the "°" too !
Thanks for all your help guys. I have found a solution here (if anyone is interested)
Data loading with extra characters - need to trim
This worked perfectly for me !