Not applicable

How to get

Hey,

I'm loading a text field from a SQL database, but its content is stored as HTML (char).

How can I get rid of all these tags in the script?

Examples of the tags: <p>, </p>, </strong>

I've been trying to use these functions in the script:

PurgeChar(Free_Text,TextBetween(Free_Text,'<','>'))

But it only handles the first tag between the < and >. I need to loop in the script, but I didn't manage to do it.

I hope someone can help me!

Thanks

7 Replies
vgutkovsky
Master II

Camella,

QlikView is not really well-suited for data scrubbing, so the ideal solution would be to introduce an intermediate step to remove these tags. If you need to do this in QV, you'll be forced to use something like this (which is messy):


TextBetween(mytext, '>', '<',substringcount(mytext, '>' )/2)


This will work because all HTML tags need to be opened and closed, and whatever is left must be your actual text. On the other hand, if the programmer simply forgets to close one tag, the whole thing breaks...
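To see what this expression picks out, here is a rough JavaScript emulation of it. The helper names `textBetween`, `subStringCount` and `innerText` are hypothetical, and the emulation assumes that TextBetween returns the text after the n-th occurrence of the first delimiter, up to the next occurrence of the second one (only positive n is handled here).

```javascript
// Rough emulation of TextBetween(text, d1, d2, n) for n >= 1.
// Assumption: an empty string is returned when a delimiter is not found.
function textBetween(text, d1, d2, n) {
  let start = -1;
  for (let i = 0; i < n; i++) {
    start = text.indexOf(d1, start + 1);
    if (start === -1) return "";
  }
  start += d1.length;
  const end = text.indexOf(d2, start);
  return end === -1 ? "" : text.slice(start, end);
}

// Hypothetical stand-in for SubStringCount(text, sub).
function subStringCount(text, sub) {
  return text.split(sub).length - 1;
}

// The expression from the post: take the text after the middle ">".
function innerText(html) {
  return textBetween(html, ">", "<", subStringCount(html, ">") / 2);
}

const nested = "<p><strong>this is a test</strong></p>";
const twoParagraphs = "<p>first</p><p>second</p>";

console.log(innerText(nested));        // "this is a test"
console.log(innerText(twoParagraphs)); // "" (the middle ">" is followed directly by "<")
```

Under these assumptions the expression returns a single segment, so it suits a field whose text sits inside one set of nested tags, but not a field with several tagged pieces side by side.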

Regards,

Not applicable
Author

Thank you, but it doesn't work. 😞

It only works for the last string enclosed in "<" and ">".

This is why I need a loop, and I don't know how to do that in the script when I load the data.

Oleg_Troyansky
Partner Ambassador/MVP

Camells,

I'm not sure what exactly you mean by LOOP... You can implement traditional loops in the script using something like this:

FOR i = 0 to 10
    ... more statements here...
NEXT

Searching the Help Section for the word LOOP shows a short list of a few available options: DO...LOOP, FOR EACH ... NEXT and FOR ... NEXT

For your purpose, however, I'd recommend using a VBScript function to strip off HTML tags and using the result of that function in the QlikView load. Here is one example, from the available online sources, of stripping off HTML tags:

http://www.4guysfromrolla.com/webtech/042501-1.shtml
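As a minimal sketch of the kind of loop such a function needs, here is a version that works without regular expressions. It is written in JavaScript rather than VBScript, the name `stripTags` is hypothetical, and it is not the code from the linked article: it simply cuts out the first "<...>" span again and again until none is left.

```javascript
// Strip tags by repeatedly removing the first "<...>" span until none remain.
function stripTags(text) {
  let result = text;
  while (true) {
    const open = result.indexOf("<");
    if (open === -1) break;                 // no more tags
    const close = result.indexOf(">", open + 1);
    if (close === -1) break;                // unmatched "<": leave the rest untouched
    result = result.slice(0, open) + result.slice(close + 1);
  }
  return result;
}

const sample = "<p>first</p><p><strong>second</strong></p>";
console.log(stripTags(sample)); // "firstsecond"
```

Each pass removes at least two characters or stops, so the loop always terminates. An unmatched "<" is left in place instead of swallowing the rest of the text.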

vgutkovsky
Master II

I agree with Oleg. I'm not sure why the solution I posted is not working. I just tested it with this string and it works fine:


TextBetween('<p><strong>this is a test</strong></p>', '>', '<',substringcount('<p><strong>this is a test</strong></p>', '>' )/2)


Not applicable
Author

I think the reason it didn't work is that QlikView doesn't support my language.

Anyway, many thanks, dear Oleg and Vlad.

Not applicable
Author

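Hi,

You could do this, for example, from a batch file with the command line below. Note the g flag, which removes every tag on a line rather than only the first one.

sed -e "s/<[^>]*>//g" file.html

There is a sed for Windows, for example here: http://gnuwin32.sourceforge.net/packages/sed.htm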
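In sed, as in most regular expression tools, a substitution without the g flag replaces only the first match on each line, so the flag is what decides whether all tags are removed. The same distinction can be sketched in JavaScript; the variable names below are only for illustration.

```javascript
const line = "<p><strong>this is a test</strong></p>";

// Without the g flag only the first tag is removed.
const firstOnly = line.replace(/<[^>]*>/, "");

// With the g flag every tag is removed.
const allTags = line.replace(/<[^>]*>/g, "");

console.log(firstOnly); // "<strong>this is a test</strong></p>"
console.log(allTags);   // "this is a test"
```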

-Alex

Not applicable
Author

Hi,

If you create macros in JScript instead of VBScript, there is full regular expression support.


// Remove every HTML tag from a row; the g flag makes the replacement global.
function dropTags(row) {
    return row.replace(/<[^>]*>/gi, "");
}


The loading script becomes


raw_html:
LOAD @1 as row
FROM 1.txt
(txt, codepage is 1252, no labels, delimiter is '\n', no quotes);

no_tags:
LOAD dropTags(row) as clean_row
resident raw_html;


-Alex