Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Categorization Based on A long text filed

Dear

in my database a have a field which contain a long text description and in the description i have some hashtags like #Against_HR_Policy, #Not_valid complaint and I need to find a way to generate a new column which contains the all hashtags only so i could use it as a new filter

example

Original Field

CaseDescription
1this is to inform you that this case is #Not_valid_complaint
2this is to inform you that this case is case is #Against_HR_Policy
CaseDescription Category
1this is to inform you that this case is #Not_valid complaint#Not_valid
2this is to inform you that this case is case is #Against_HR_Policy#Against_HR_Policy

The description might have more than one hashtags

Regards,

7 Replies
kaushiknsolanki
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi,

     Try this.

Load *,Mid(Field,index(Field,'#'),Len(Field)-index(Field,'#')+1) as Cat inline [

Field

this is to inform you that this case is #Not_valid complaint

this is to inform you that this case is case is #Against_HR_Policy

];

Regards,

Kaushik Solanki

Please remember to hit the 'Like' button and for helpful answers and resolutions, click on the 'Accept As Solution' button. Cheers!
Not applicable
Author

works only with one hashtags in the description

CaseDescriptionCategory
1this is to inform you that this case is #Not_valid complaint #Against_HR_Policy#Not_valid
2this is to inform you that this case is case is #Against_HR_Policy#Against_HR_Policy

should result in

CaseDescriptionCategory
1this is to inform you that this case is #Not_valid complaint #Against_HR_Policy#Not_valid
2this is to inform you that this case is case is #Against_HR_Policy#Against_HR_Policy
1this is to inform you that this case is case is #Against_HR_Policy #Against_HR_Policy#Against_HR_Policy

it also includes all the words after the hashtag

kaushiknsolanki
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi,

     Try this.

DATA:

Load *,Mid(Field,index(Field,'#',1),

  if(SubStringCount(Field,'#')=1, Len(Field)-Index(Field,'#',1)+1,if(SubStringCount(Field,'#')=2, Index(Field,'#',2)-Index(Field,'#',1)))) as Cat inline [

Field

this is to inform you that this case is #Not_valid complaint #Against_HR_Policy

this is to inform you that this case is case is #Against_HR_Policy

];

Concatenate

Load *,Mid(Field,index(Field,'#',2),

  if(SubStringCount(Field,'#')=1, Len(Field)-Index(Field,'#',1)+1,if(SubStringCount(Field,'#')=2, Index(Field,'#',2)-Index(Field,'#',1)))) as Cat inline [

Field

this is to inform you that this case is #Not_valid complaint #Against_HR_Policy

this is to inform you that this case is case is #Against_HR_Policy

];

DATA2:

LOad Field as Description, Cat as Category

Resident DATA where len(Cat)<>0;

Drop table DATA;

Regards,

Kaushik Solanki

Please remember to hit the 'Like' button and for helpful answers and resolutions, click on the 'Accept As Solution' button. Cheers!
Not applicable
Author

It is working fine except

Issues:

All the text after the hash is included until we have another hash

     we need to have only the word that has the hash as part of it

'will it become easier if we have the word between two hashes #here# ?'

The current output contains some null category

kaushiknsolanki
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi,

    I have tested it.

    Have a look at the application attached.

Please remember to hit the 'Like' button and for helpful answers and resolutions, click on the 'Accept As Solution' button. Cheers!
Not applicable
Author

Check my comments in the Script and applications

kaushiknsolanki
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi,

    Yes because the expression is written for only 2 #, if you want you can further add one more if statement in that and make it useful for 3#.

Regards,

Kaushik Solanki

Please remember to hit the 'Like' button and for helpful answers and resolutions, click on the 'Accept As Solution' button. Cheers!