Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
krishna20
Specialist II
Specialist II

Substring Count

Hi Folks,

I have data like below and expected output as given.

I am using substring count to achieve this, but not getting expected result. Somewhere going wrong, Could anyone please look into this and advise.

Error_CountCount
.,Kalyan2
Uday@Gmail.com1
Uday.,2
Kalyan_krishna1
Gopi232
(krishna1

 

Load* ,
If(
( Substringcount(Error_Count, '.')
+Substringcount(Error_Count, ',')
+ Substringcount(Error_Count, '@')
+ Substringcount(Error_Count, '_')
+ Substringcount(Error_Count, '2')
+ Substringcount(Error_Count, '3')
+ Substringcount(Error_Count, '(')
),1,0) as Count;

 

Regards

Krishna

Labels (3)
1 Solution

Accepted Solutions
m_woolf
Master II
Master II

Get rid of the IF and the 0,1:

Substringcount(Error_Count, '.')
+ Substringcount(Error_Count, ',')
+ Substringcount(Error_Count, '@')
+ Substringcount(Error_Count, '_')
+ Substringcount(Error_Count, '2')
+ Substringcount(Error_Count, '3')
+ Substringcount(Error_Count, '(') as Count

View solution in original post

5 Replies
Vegar
MVP
MVP

Maybe this approach using KeepChar() could help you?

LOAD 
	Error_Count,
	Count as [Desired count], 
	len(KeepChar(Error_Count,'.,@_23(')) as [Calculated count] 
inline [
	Error_Count	Count
	.,Kalyan	2
	Uday@Gmail.com	1
	Uday.,	2
	Kalyan_krishna	1
	Gopi23	2
	(krishna	1
] (delimiter is '	');

 

Notice that the "Uday@Gmail.com" counts 2 not 1 as in your table as it contains both . and @. 

image.png

m_woolf
Master II
Master II

Get rid of the IF and the 0,1:

Substringcount(Error_Count, '.')
+ Substringcount(Error_Count, ',')
+ Substringcount(Error_Count, '@')
+ Substringcount(Error_Count, '_')
+ Substringcount(Error_Count, '2')
+ Substringcount(Error_Count, '3')
+ Substringcount(Error_Count, '(') as Count

krishna20
Specialist II
Specialist II
Author

Hi vegar,

Thanks for your reply. Actually the substring count flag is not showing right sum . Please look into below script and scenario and advise.

Load 

If(Match(Left([Error_Count],1),'.',',','\','1','2','3','4','5','6','7','8','9','0','@','*','#','$','!','%','^','&','(',')',';','?','+','=','{','}','[',']','~','`','/',':','<','>','_')
or Match(Mid([Error_Count],1),',','\','1','2','3','4','5','6','7','8','9','0','@','*','#','$','!','%','^','&','(',')',';','?','+','=','{','}','[',']','~','`','/',':','<','>','_')
or Match(Right([Error_Count],1),'\','1','2','3','4','5','6','7','8','9','0','@','*','#','$','!','%','^','&','(',')','?','+','=','{','}','[',']','~','`','/',':','<','>','_'),
( Substringcount([Error_Count], '.')
 +Substringcount([Error_Count], ',')
+ Substringcount([Error_Count], '@')
+ Substringcount([Error_Count], '1')
+ Substringcount([Error_Count], '2')

+ Substringcount([Error_Count], '<')

+ Substringcount([Error_Count], '>')

+ Substringcount([Error_Count], '_')
as , ,'.,\1234567890@*#$!%^&()?+={}[]~`/:<>_'))) as Error_Count;

 

It creates a flag with different numbers.

Comm_Flag.PNG

But When I sum this flag in straight table, the calculation

Comm_Error.PNG

The count suppose to show 6 but it's showing 234 and many are there like this. 

The Highlighted name has ,.<@.> and count should show as 6.

Vegar
MVP
MVP

By the looks of your screenshot it looks strange. Could it be that you have multiple transactions with that email adress? (234/6 =39)

I also noticed something odd in your expression, is the yellow marking made by me correct? 

image.png

 

You could also try to break out parts of our expression in order to identify which part of the expression is generating the large value. 

 

/Vegar

krishna20
Specialist II
Specialist II
Author

Yes , the else condition was wrong and it's modified. But the logic which have written to find the junk characters is not working properly.