Skip to main content
Woohoo! Qlik Community has won “Best in Class Community” in the 2024 Khoros Kudos awards!
Announcements
Nov. 20th, Qlik Insider - Lakehouses: Driving the Future of Data & AI - PICK A SESSION
cancel
Showing results for 
Search instead for 
Did you mean: 
SAITEJA
Contributor II
Contributor II

HTML Tags

Hello,

I have qvd file in which one field consists various html tags how to remove them so that we can clean data

13 Replies
QlikTom
Employee
Employee

Sorry for the delay, I just now was able to get back to the community. 

I took your sample data and used this script:
*I stripped out carriage returns, and replaced them with a space character, otherwise the data will not render in a table. 
simply remove the replace function if you want to maintain them and have other plans for displace. 

If you are receiving any kind of error server, I recommend you contact support and they should be able to diagnose the issue. 

SampleData_Temp:
LOAD
    "Field",
    ID
FROM [lib://DataFolder/ExcelData/Community/comm71215.xls]
(biff, embedded labels, table is Sheet1$);


//load all possible HTML tags that exist in field, set replacement as ''
TagMap_Temp:
LOAD DISTINCT 
	'<' & TextBetween('<' & SubField("Field", '<'),'<','>') & '>' as TagMatch
    , '' as Replacement
Resident SampleData_Temp;


//concatenate hand type html entities into temp table (This would happen automatically without concatinate syntax)
//This inline table can be used to replace any HTML entities with their unicode counterpart if required. 
Concatenate (TagMap_Temp)
LOAD 
	* 
    INLINE [
TagMatch,	Replacement
'&#58;',	':'
];

//create a mapping table of all potential tags, and their replacements from the temp table
TagMap:
MAPPING LOAD 
	* 
Resident TagMap_Temp;

//drop TagMap_Temp, as it is no longer needed
Drop Table TagMap_Temp;


//NoConcatenate prevents automatic contatination with source table
NoConcatenate
//use MapSubString Funciton to remove all possible tags
FinalResult:
load 
	Replace(MapSubString('TagMap',"Field"), Chr(10), ' ') as "Field",
    ID
resident SampleData_Temp;

drop table SampleData_Temp;

Data load progress

qliktom_0-1591108780495.png

Result in a table.

qliktom_1-1591109053556.png

 

 

 

 

SAITEJA
Contributor II
Contributor II
Author

Thanks a lot 😊

Felipe_Violin
Contributor II
Contributor II

Congratulations.
Worked perfectly.

Thanks
Just a complement in mysql I needed to use another function in sql before to work.

https://stackoverflow.com/questions/3678980/is-there-a-mysql-function-to-decode-html-entities

Fran_by
Contributor III
Contributor III

Cool solution!