Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
data_RN
Creator
Creator

Deleting text between characters

Hello All,

I having been searching all over trying to find the right formula to purge text between specific characters and I cannot find the right way to do it 😕

Essentially I am displaying a text field that for some reason is pulling in a bunch of extra text between <> and I just want to purge it.

I can identify the first group that I want to remove (see the second column below), but I can't figure out how to purge it.. Note, there is an extra <> at the end of the string that I also want to remove.  Can I do this within the app or is it something I should remove while loading the file?  If so, how would I do that?

Thanks as always!

 

[Program Description]TextBetween([Program Description],'<','>')
<div class="ExternalClassC09E7B5015894C68B16FD07CFDE66515">​Goal of program is to transform the current stuff into new stuff.</div.>div class="ExternalClassC09E7B5015894C68B16FD07CFDE66515"

 

1 Solution

Accepted Solutions
QlikTom
Employee
Employee

The remaining characters are HTML Entities (here is an example)
https://www.w3.org/MarkUp/html-spec/html-spec_13.html

You likely don't want to remove them, but rather translate them to what they represent (a colon, or a quotation mark, etc). With this method, you can replace them with whatever you like however. 

Since creating a table of all possible html entities would likely be overkill, just add ones you find to an inline load such as the code below.  An inline load just involves you typing the data in. Just press enter to create a new row. fields are separated by a comma.

 

[query-2]:
LOAD
    "Program Title",
    "Planisware Code",
    "Program Description",
    "Value Proposition",
    "Program Status",
    "Initial Bus Case Complete?"
FROM [lib://Desktop/Book1.xlsx]
(ooxml, embedded labels, table is [query-2]);



//load all possible HTML tags that exist in field, set replacement as ''
TagMap_Temp:
LOAD DISTINCT 
	'<' & TextBetween('<' & SubField("Program Description", '<'),'<','>') & '>' as TagMatch
    , '' as Replacement
Resident [query-2];
//concatenate hand type html entities into temp table (This would happen automatically without concatinate syntax)
Concatenate (TagMap_Temp)
LOAD 
	* 
    INLINE [
TagMatch,	Replacement
'&#58;',	':'
];

//create a mapping table of all potential tags, and their replacements from the temp table
TagMap:
MAPPING LOAD 
	* 
Resident TagMap_Temp;

//drop TagMap_Temp, as it is no longer needed
Drop Table TagMap_Temp;


//NoConcatenate prevents automatic contatination with source table
NoConcatenate
//use MapSubString Funciton to remove all possible tags
FinalResult:
load 
    "Program Title",
    "Planisware Code",
	MapSubString('TagMap',"Program Description") as "Program Description",
    "Value Proposition",
    "Program Status",
    "Initial Bus Case Complete?"
resident [query-2];

drop table [query-2];

 

2020-03-17_10-50-49.png

 

View solution in original post

16 Replies
sunny_talwar

How about if you do this

Table:
LOAD TextBetween([Program Description], '>', '<') as [Program Description];
LOAD * INLINE [
    Program Description
    '<div class="ExternalClassC09E7B5015894C68B16FD07CFDE66515">​Goal of program is to transform the current stuff into new stuff.</div.>'
];
QlikTom
Employee
Employee

The example code below should demonstrate how you can use the MapSubString function to achieve the results you desire.  

 

 

//Load example row
example_temp:
load * inline [
Program Description
<div class="ExternalClassC09E7B5015894C68B16FD07CFDE66515">Goal of program is to transform the current stuff into new stuff.</div.>
];

//create a mapping table of all potential tags, with a replacement value of ''
TagMap:
MAPPING LOAD DISTINCT 
	'<' & TextBetween('<' & SubField([Program Description], '<'),'<','>') & '>' as TagMatch
    , '' as Replacement
Resident example_temp;

//use MapSubString Funciton to remove all possible tags
FinalResult:
load 
	[Program Description],
	MapSubString('TagMap',[Program Description]) as Program_Description_Cleansed
resident example_temp;

drop table example_temp;

 

 

2020-03-16_10-21-52.png

 

How does this work?
Take a look at the result of the mapping table below, by creating this table, and combining it with the MapSubString function, we can find and replace every instance of html tags in your string. Hope this helps!

2020-03-16_10-25-05.png

data_RN
Creator
Creator
Author

Thanks to both of you.  I am fairly new to Qlik Sense and still trying to figure it out, so my apologies for my nativity.  I have seen a few posts with "load * inline", but I am not that clear how to use it. I am using the Data Load editor and able to manage the functionality at a high level, but  I haven't used this command before. 

It seems like both of these suggestions would be valid options, but I am not quite sure how to implement it.  I will have about 100 unique entries with similar labeling problems.  How do you use inline load in this case?  

sunny_talwar

You mentioned you used 

 

TextBetween([Program Description], '<', '>') as [Program Description]

 

Wherever you used that... try to use this

TextBetween([Program Description], '>', '<') as [Program Description]

Just swapped the signs around 

QlikTom
Employee
Employee

Whenever you see "Load * Inline [....];", it is just for demonstrative purposes.  In practice, you would just use your actual table in that place. Since we don't have your actual data, we use an Inline Load to make up a fake table. 

a Resident Load allows you to load data from a table you have already loaded into memory.  

a Mapping Load creates a table with  key and value pair, that you can use certain functions with to replace or add values. 

If you haven't already, check out https://learning.qlik.com/ There are lots of free modules to get you started. When you want to go deeper, there are additional learning options available there as well. 

If you are unfamiliar with a term like "inline", you can look it up on https://help.qlik.com/en-US/sense and you will get a description of what it is, and how it is used. 

data_RN
Creator
Creator
Author

Thank you to both!

I was thinking that I should have attached a file to show you because indeed I am just linking to an excel spreadsheet. 

I tried Sunny's method and as much as I wanted it to work, sometimes there are multiple ">" and extra characters are left in place 😕

I attached a simple version of what I am trying to do.  Can I still use the Resident & Mapping load on a linked Excel?  

sunny_talwar

Is this your problem?

Capture.png

QlikTom
Employee
Employee

[query-2]:
LOAD
    "Program Title",
    "Planisware Code",
    "Program Description",
    "Value Proposition",
    "Program Status",
    "Initial Bus Case Complete?"
FROM [lib://Desktop/Book1.xlsx]
(ooxml, embedded labels, table is [query-2]);

//create a mapping table of all potential tags, with a replacement value of ''
TagMap:
MAPPING LOAD DISTINCT 
	'<' & TextBetween('<' & SubField("Program Description", '<'),'<','>') & '>' as TagMatch
    , '' as Replacement
Resident [query-2];

//NoConcatenate prevents automatic contatination with source table
NoConcatenate
//use MapSubString Funciton to remove all possible tags
FinalResult:
load 
    "Program Title",
    "Planisware Code",
	MapSubString('TagMap',"Program Description") as "Program Description",
    "Value Proposition",
    "Program Status",
    "Initial Bus Case Complete?"
resident [query-2];

drop table [query-2];

 

That code works produces this result:

2020-03-17_8-20-46.png

 

data_RN
Creator
Creator
Author

 Hi Sunny,

Yes!  That is exactly the problem.