Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello
I have a field in a table that contains HTML tags
I want to strip these HTML tags from the field during load script
How can I accomplish this?
Hi!
Here it is:
tab1:
load * inline
[id, html
1,'<tag1>text1<tag2>text2</tag2></tag1><tag3>text3<tag3>'
2, '<tag4>text4<tag5>text5</tag5><tag6>text6</tag6></tag4>'
];
For vRow = 1 to NoOfRows('tab1')
Let vID = Peek('id',vRow-1,'tab1');
Let vtext = Peek('html',vRow-1,'tab1');
do while len(TextBetween(vtext, '<', '>'))>0
vtext=Replace(vtext, TextBetween(vtext, '<', '>'), '');
vtext=Replace(vtext, '<>', '');
loop;
tab2:
load
id,
html,
'$(vtext)' as text
Resident tab1
Where id=$(vID);
next vRow;
drop Table tab1;
Happy New Year!
Hi!
Table name not allowed in join load script:
Left Join(CASES_PEGA)temp:
Load Distinct
CASE_ID
,MapSubstring('HtmlTag_Map', PYDESCRIPTION) as Field_Cleansed
Resident CASES_PEGA;
Hi!
Use this functions TextBetween, Replace and PurgeChar in iterative loop for field value
Here is my solution:
1. Find tags:
TextBetween('<tag>text', '<', '>')
Returns 'tag'. Use it as search text in replace function below.
2. Replace tags to nothing:
Replace('<tag>text', TextBetween('<tag>text', '<', '>'), '')
Returns '<>text'
3. Remove all <, > chars
PurgeChar ( '<>text', '<>' ) Returns text
what did you mean by "in iterative loop for the field value"
can you give me sample script?
Hi!
Here it is:
tab1:
load * inline
[id, html
1,'<tag1>text1<tag2>text2</tag2></tag1><tag3>text3<tag3>'
2, '<tag4>text4<tag5>text5</tag5><tag6>text6</tag6></tag4>'
];
For vRow = 1 to NoOfRows('tab1')
Let vID = Peek('id',vRow-1,'tab1');
Let vtext = Peek('html',vRow-1,'tab1');
do while len(TextBetween(vtext, '<', '>'))>0
vtext=Replace(vtext, TextBetween(vtext, '<', '>'), '');
vtext=Replace(vtext, '<>', '');
loop;
tab2:
load
id,
html,
'$(vtext)' as text
Resident tab1
Where id=$(vID);
next vRow;
drop Table tab1;
Happy New Year!
thank you for your prompt reply
but the thing is that my table is large and adding a loop to the script makes reload takes a lot of time
isn't there a quicker approach?
I tried this approach but I'm getting a General Script Error:
HtmlTag_Map:Mapping
LOAD DISTINCT
'<' & TextBetween('<' & SubField(PYDESCRIPTION, '<', IterNo()),'<','>') & '>' as HtmlTag,
'' as Substitute
Resident CASES_PEGA
While NOT IsNull(SubField(PYDESCRIPTION, '<', IterNo()));
Left Join(CASES_PEGA)
temp:
Load Distinct
CASE_ID
,MapSubstring('HtmlTag_Map', PYDESCRIPTION) as Field_Cleansed
Resident CASES_PEGA;
Hi!
Table name not allowed in join load script:
Left Join(CASES_PEGA)temp:
Load Distinct
CASE_ID
,MapSubstring('HtmlTag_Map', PYDESCRIPTION) as Field_Cleansed
Resident CASES_PEGA;
Hi,
Thanks for your code to strip the html from a field.
This is working as expected. But unfortunately, it is taking 1 second per record to load the data!
As I have more than 50000 records it is going to taking hours to complete the load.
Is there any quick solution for this please?
To avoid looping, I have tried the below code as well this is not working either!
tab1:
LOAD RecNo() as RecNo,
Html_Field
,Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace(Html_Field, TextBetween(Html_Field, '<', '>'), ''), '<>', ''),TextBetween(Html_Field, '{', '}'),''),'{}',''),'&ns;',''), '&nsp;',''),'v\:*',''),'o\:*', ''),'w\:*',''),'..shape','') as Cleaned_Field
FROM [xyz.qvd] (qvd);
I also tried the code in the below link, this is not working either!
https://community.qlik.com/t5/QlikView-App-Development/Removing-HTML-markup-code/td-p/466050
Is there any other solution please?
Thanks
SB
Worked for me, thank you.