Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I have qvd file in which one field consists various html tags how to remove them so that we can clean data
Sorry for the delay, I just now was able to get back to the community.
I took your sample data and used this script:
*I stripped out carriage returns, and replaced them with a space character, otherwise the data will not render in a table.
simply remove the replace function if you want to maintain them and have other plans for displace.
If you are receiving any kind of error server, I recommend you contact support and they should be able to diagnose the issue.
SampleData_Temp:
LOAD
"Field",
ID
FROM [lib://DataFolder/ExcelData/Community/comm71215.xls]
(biff, embedded labels, table is Sheet1$);
//load all possible HTML tags that exist in field, set replacement as ''
TagMap_Temp:
LOAD DISTINCT
'<' & TextBetween('<' & SubField("Field", '<'),'<','>') & '>' as TagMatch
, '' as Replacement
Resident SampleData_Temp;
//concatenate hand type html entities into temp table (This would happen automatically without concatinate syntax)
//This inline table can be used to replace any HTML entities with their unicode counterpart if required.
Concatenate (TagMap_Temp)
LOAD
*
INLINE [
TagMatch, Replacement
':', ':'
];
//create a mapping table of all potential tags, and their replacements from the temp table
TagMap:
MAPPING LOAD
*
Resident TagMap_Temp;
//drop TagMap_Temp, as it is no longer needed
Drop Table TagMap_Temp;
//NoConcatenate prevents automatic contatination with source table
NoConcatenate
//use MapSubString Funciton to remove all possible tags
FinalResult:
load
Replace(MapSubString('TagMap',"Field"), Chr(10), ' ') as "Field",
ID
resident SampleData_Temp;
drop table SampleData_Temp;
Data load progress
Result in a table.
Thanks a lot 😊
Congratulations.
Worked perfectly.
Thanks
Just a complement in mysql I needed to use another function in sql before to work.
https://stackoverflow.com/questions/3678980/is-there-a-mysql-function-to-decode-html-entities
Cool solution!