Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
SAITEJA
Contributor II
Contributor II

HTML Tags

Hello,

I have qvd file in which one field consists various html tags how to remove them so that we can clean data

13 Replies
QlikTom
Employee
Employee

Sorry for the delay, I just now was able to get back to the community. 

I took your sample data and used this script:
*I stripped out carriage returns, and replaced them with a space character, otherwise the data will not render in a table. 
simply remove the replace function if you want to maintain them and have other plans for displace. 

If you are receiving any kind of error server, I recommend you contact support and they should be able to diagnose the issue. 

SampleData_Temp:
LOAD
    "Field",
    ID
FROM [lib://DataFolder/ExcelData/Community/comm71215.xls]
(biff, embedded labels, table is Sheet1$);


//load all possible HTML tags that exist in field, set replacement as ''
TagMap_Temp:
LOAD DISTINCT 
	'<' & TextBetween('<' & SubField("Field", '<'),'<','>') & '>' as TagMatch
    , '' as Replacement
Resident SampleData_Temp;


//concatenate hand type html entities into temp table (This would happen automatically without concatinate syntax)
//This inline table can be used to replace any HTML entities with their unicode counterpart if required. 
Concatenate (TagMap_Temp)
LOAD 
	* 
    INLINE [
TagMatch,	Replacement
'&#58;',	':'
];

//create a mapping table of all potential tags, and their replacements from the temp table
TagMap:
MAPPING LOAD 
	* 
Resident TagMap_Temp;

//drop TagMap_Temp, as it is no longer needed
Drop Table TagMap_Temp;


//NoConcatenate prevents automatic contatination with source table
NoConcatenate
//use MapSubString Funciton to remove all possible tags
FinalResult:
load 
	Replace(MapSubString('TagMap',"Field"), Chr(10), ' ') as "Field",
    ID
resident SampleData_Temp;

drop table SampleData_Temp;

Data load progress

qliktom_0-1591108780495.png

Result in a table.

qliktom_1-1591109053556.png

 

 

 

 

SAITEJA
Contributor II
Contributor II
Author

Thanks a lot 😊

Felipe_Violin
Contributor II
Contributor II

Congratulations.
Worked perfectly.

Thanks
Just a complement in mysql I needed to use another function in sql before to work.

https://stackoverflow.com/questions/3678980/is-there-a-mysql-function-to-decode-html-entities

Fran_by
Contributor III
Contributor III

Cool solution!