Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
Sagar_Apte
Contributor III
Contributor III

Find search string within text in script

Hi All,

I need to build the logic in script for below data scenario

ID       text_description                                     search_string
1        abc;N:xyz;abc;N:pqr;K:123;N:123;   N
2        B:abc;xyz;abc;B:123;xyz;N:123;         B
3        123;C:test;C:234;abc;pqr;B:xyz;         C

text_description always starts with some texts and end with ';' and in between texts are also separated by ';'.

I want to search value of search_string column in text_description column and extract the part of string with below logic

Let's take a example of ID =1
text_description is abc;N:xyz;abc;N:pqr;K:123;N:123;
search_string is 'N'


Now I need to search 'N' followed by ':' in text_description and extract the text between 'N:'& ';'

'N:' could be anywhere in the string and could be repeated multiple times. Hence for each occurrence we need to extract the texts

Output for ID=1 will be
ID       text_description                                        String
1        abc;N:xyz;abc;N:pqr;K:123;N:123;       xyz;pqr

Similar logic should work for other search string.

This is fake data as sample data will be difficult to share but in actual scenario format of the data will be same.

Any help will be really appreciated.

2 Solutions

Accepted Solutions
Kushal_Chawda

another approach if you don't want to increase your data using interno() and group it later just to have simple load statements

T1:
load ID,
    trim(text_description) as text_description,
    trim(search_string) as search_string
    SubStringCount(trim(text_description),trim(search_string)&':') as count_search_string
FROM Data;

Max:
Load max(count_search_string) as max_count_search_string;
Load FieldValue('count_search_string',RecNo()) as count_search_string
AutoGenerate FieldValueCount('count_search_string');

Let vMaxCountSearchString = Peek('max_count_search_string',0,'Max');

let vTextDescriptionFieldName ='text_description'; // text description field name in your data

let vSearchStringFieldName = 'search_string';  // search string field name in your data

Let vFormula1 = '(TextBetween($(vTextDescriptionFieldName),$(vSearchStringFieldName)&'
                 &chr(39)&':'&chr(39)&','&chr(39)&';'&chr(39)&')&';

for i=2 to '$(vMaxCountSearchString)'

let vFormula1 = '$(vFormula1)' &chr(39)&';'&chr(39)&'&TextBetween($(vTextDescriptionFieldName),
                $(vSearchStringFieldName)&'&chr(39)&':'&chr(39)&','&chr(39)&';'&chr(39)&',$(i))&';

Next

let vFormula2 = mid('$(vFormula1)',1,len('$(vFormula1)')-1)&')';

let vFormula = 'left('&'$(vFormula2)'&','&
               'len('&'$(vFormula2)'&')'&' - ('&'$(vMaxCountSearchString)'&'- count_search_string))';

let vFormula1 =Null();
let vFormula2 =Null();

T2:
Load *,
    $(vFormula) as String
Resident T1;

Drop Table T1;

 

View solution in original post

Saravanan_Desingh

With Group By.

tab1:
LOAD *, TextBetween(text_description,search_string&':',';',IterNo()) As Word,
		IterNo() As StrIx
While IterNo() <= SubStringCount(text_description,search_string&':');
LOAD * INLINE [
ID, text_description, search_string
1, abc;N:xyz;abc;N:pqr;K:123;N:123;, N
2, B:abc;xyz;abc;B:123;xyz;N:123;, B
3, 123;C:test;C:234;abc;pqr;B:xyz;, C
];

Left Join(tab1)
LOAD ID, Concat(Word,';',StrIx) As String
Resident tab1
Group By ID;

View solution in original post

7 Replies
Saravanan_Desingh

One Solution is.

tab1:
LOAD *, TextBetween(text_description,search_string&':',';',IterNo()) As Word
While IterNo() <= SubStringCount(text_description,search_string&':');
LOAD * INLINE [
ID, text_description, search_string
1, abc;N:xyz;abc;N:pqr;K:123;N:123;, N
2, B:abc;xyz;abc;B:123;xyz;N:123;, B
3, 123;C:test;C:234;abc;pqr;B:xyz;, C
];

 

Saravanan_Desingh

Output.

commQV39.PNG

Kushal_Chawda

another approach if you don't want to increase your data using interno() and group it later just to have simple load statements

T1:
load ID,
    trim(text_description) as text_description,
    trim(search_string) as search_string
    SubStringCount(trim(text_description),trim(search_string)&':') as count_search_string
FROM Data;

Max:
Load max(count_search_string) as max_count_search_string;
Load FieldValue('count_search_string',RecNo()) as count_search_string
AutoGenerate FieldValueCount('count_search_string');

Let vMaxCountSearchString = Peek('max_count_search_string',0,'Max');

let vTextDescriptionFieldName ='text_description'; // text description field name in your data

let vSearchStringFieldName = 'search_string';  // search string field name in your data

Let vFormula1 = '(TextBetween($(vTextDescriptionFieldName),$(vSearchStringFieldName)&'
                 &chr(39)&':'&chr(39)&','&chr(39)&';'&chr(39)&')&';

for i=2 to '$(vMaxCountSearchString)'

let vFormula1 = '$(vFormula1)' &chr(39)&';'&chr(39)&'&TextBetween($(vTextDescriptionFieldName),
                $(vSearchStringFieldName)&'&chr(39)&':'&chr(39)&','&chr(39)&';'&chr(39)&',$(i))&';

Next

let vFormula2 = mid('$(vFormula1)',1,len('$(vFormula1)')-1)&')';

let vFormula = 'left('&'$(vFormula2)'&','&
               'len('&'$(vFormula2)'&')'&' - ('&'$(vMaxCountSearchString)'&'- count_search_string))';

let vFormula1 =Null();
let vFormula2 =Null();

T2:
Load *,
    $(vFormula) as String
Resident T1;

Drop Table T1;

 

Kushal_Chawda

If you don't have huge data set then use what @Saravanan_Desingh  suggested. It is simple and easy to understand. Just to add on if you want the output as ';' separated then use concat(String,';') either as measure in chart or in script using group by on further load.

Saravanan_Desingh

With Group By.

tab1:
LOAD *, TextBetween(text_description,search_string&':',';',IterNo()) As Word,
		IterNo() As StrIx
While IterNo() <= SubStringCount(text_description,search_string&':');
LOAD * INLINE [
ID, text_description, search_string
1, abc;N:xyz;abc;N:pqr;K:123;N:123;, N
2, B:abc;xyz;abc;B:123;xyz;N:123;, B
3, 123;C:test;C:234;abc;pqr;B:xyz;, C
];

Left Join(tab1)
LOAD ID, Concat(Word,';',StrIx) As String
Resident tab1
Group By ID;
Saravanan_Desingh

Output.

commQV40.PNG

Sagar_Apte
Contributor III
Contributor III
Author

Thank you. Both work fine as expected