Qlik Community

QlikView Scripting

ben2roberts
Contributor

Identify Emoji / Unicode >x in Text

Hi everyone, help appreciated on this one. I have searched far and wide to no avail.

I am building a simple WhatsApp chat analyser in QlikView and I want to be able to tell which messages have emoji in them.

The input data is a text file exported from WhatsApp that I have imported into Excel, which makes the messages look something like this:

Did you see the game last night? ðŸ”

I.e. each emoji shows up as a run of special (non-ASCII) characters instead of the emoji itself.

I then load the Excel file into QlikView.

My research so far has uncovered that these special characters have a Unicode value greater than 128, so I was trying to find a way to loop through the characters in the text and look for those with a value > 128 (noting that Ord('character') gives the Unicode value).

My aim is to create a column that shows if a message has emoji so I can count the number of messages containing emoji.

Thanks,

Ben

Accepted Solutions
MVP

Re: Identify Emoji / Unicode >x in Text

Maybe something along these lines

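// Sample data; Recno() gives each message a unique ID
Input:
LOAD Recno() as ID, * INLINE [
WhatsappText
Did you see the game last night? ðŸ”
Standard Text
];

// Flag = 1 if any character in the message has Ord > 128, otherwise 0
LEFT JOIN (Input)
LOAD ID, Max(OrdFlag) as Flag GROUP BY ID;
LOAD *, If(Ord>128,1,0) as OrdFlag;
// The While clause emits one row per character of each message
LOAD ID, Iterno() as CharNo, Ord(Mid(WhatsappText,Iterno(),1)) as Ord, Mid(WhatsappText,Iterno(),1) as Char
RESIDENT Input
While Iterno() <= Len(WhatsappText);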
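
To get from the Flag column to the message count the question asks for, one option is a small summary load placed after the script above. This is only a sketch and not part of the original answer; the table name EmojiSummary and the field names MessagesWithEmoji and TotalMessages are made up for illustration.

```
// Hypothetical summary table: a single row holding the two counts
EmojiSummary:
LOAD Sum(Flag) as MessagesWithEmoji,   // messages with at least one character whose Ord > 128
     Count(ID) as TotalMessages        // all messages in Input
RESIDENT Input;
```

In a chart or text object, Sum(Flag) on its own should give the same count and will follow the current selections. One caveat: a message with empty text should produce no rows in the While loop, so its Flag stays null after the join and it is not counted as containing emoji.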

ben2roberts
Contributor

Re: Identify Emoji / Unicode >x in Text

Excellent - works! Thanks!