agigliotti
Partner - Champion

Churn modelling - prediction meaning

Hi,
I'm building predictions to anticipate which customers we're about to lose (churn modelling).
There are several cases I can't understand: the ML algorithm predicts yes (churned = yes), but the confidence for that prediction (0.39885264042926) is lower than the confidence for churned = no (0.60114735957074).
What is the meaning of this prediction?
An example is attached.
Please let me know.

Many thanks in advance for your time.

Best Regards

4 Replies
Kyle_Jourdan
Employee

In a perfectly balanced dataset (a 50/50 class split), a prediction threshold of 0.50 may produce the best F1 score. Real-world datasets are often imbalanced, however, so AutoML automatically tunes the threshold to optimize the F1 score:

https://help.qlik.com/en-US/cloud-services/Subsystems/Hub/Content/Sense_Hub/AutoML/scoring-binary-cl...

What you’re seeing is a threshold that has been set lower than the probability of “yes” for that record, which results in a “yes” (true) prediction.
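
To illustrate the idea of tuning a threshold for F1, here is a minimal Python sketch. It is not AutoML's actual implementation: the toy labels, the probabilities, the candidate grid, and the helper names `f1_at_threshold` and `best_threshold` are all assumptions made for illustration.

```python
def f1_at_threshold(y_true, p_yes, threshold):
    """F1 score when every record with P(yes) above the threshold is labelled yes."""
    tp = fp = fn = 0
    for actual, p in zip(y_true, p_yes):
        predicted = p > threshold
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, p_yes, candidates):
    """Return the candidate threshold with the highest F1 (first one wins ties)."""
    return max(candidates, key=lambda t: f1_at_threshold(y_true, p_yes, t))


# Hypothetical, imbalanced toy data: 3 churners out of 10 customers.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
p_yes = [0.81, 0.46, 0.41, 0.43, 0.31, 0.26, 0.21, 0.16, 0.11, 0.06]

candidates = [i / 100 for i in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95
tuned = best_threshold(y_true, p_yes, candidates)

print("tuned threshold:", tuned)
print("F1 at tuned threshold:", round(f1_at_threshold(y_true, p_yes, tuned), 3))
print("F1 at 0.50:", round(f1_at_threshold(y_true, p_yes, 0.50), 3))
```

On this toy data the best F1 is reached at a threshold below 0.50, so a record whose probability of “yes” is under 50% can still be labelled “yes”.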

agigliotti
Partner - Champion
Author

Hi @Kyle_Jourdan ,

Thanks for your support.

Actually that customer is purchasing products so he's an active customer.

Is the algorithm saying you are about to lose that customer?

Could you help me explain this prediction to an inexperienced analyst or business user?

Kyle_Jourdan
Employee

@agigliotti 

Yes, this customer is predicted to churn. If you look at the help site link I sent, you will see how to view the threshold metric determined for your model. Based on that, anything with a probability above this threshold is predicted as a “true” (or in your case “yes”) outcome. Anything below it is a “false” (or “no”). 

Sometimes, however, it is better to look at the raw probability relative to others rather than the definitive prediction, as you should really treat someone with a 39.5% probability the same as someone with a 39.7% probability, even if the threshold is 39.6% and one is predicted “yes” and one “no”. 
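
As a rough sketch of that point, the Python snippet below compares the hard label with a ranking by raw probability. The customer IDs, the probabilities, the 0.396 threshold (taken from the 39.6% in the example above), and the risk-band cut-offs in `risk_band` are hypothetical values chosen for illustration, not anything AutoML produces.

```python
THRESHOLD = 0.396  # hypothetical threshold, as in the 39.6% example


def hard_label(p_yes, threshold=THRESHOLD):
    """Definitive prediction: yes only if the probability is above the threshold."""
    return "yes" if p_yes > threshold else "no"


def risk_band(p_yes):
    """Coarse risk band; the cut-offs are arbitrary and only for illustration."""
    if p_yes >= 0.70:
        return "high"
    if p_yes >= 0.30:
        return "medium"
    return "low"


# Hypothetical customers with their predicted probability of churn.
customers = [("C001", 0.397), ("C002", 0.395), ("C003", 0.820), ("C004", 0.100)]

# Rank by raw probability so the riskiest customers come first.
ranked = sorted(customers, key=lambda c: c[1], reverse=True)

for customer_id, p in ranked:
    print(customer_id, p, hard_label(p), risk_band(p))
```

C001 and C002 get different hard labels but land in the same band and sit next to each other in the ranking, which is usually the more useful view for deciding whom to contact first.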

agigliotti
Partner - Champion
Author

Hi @Kyle_Jourdan ,

In my churn model, the algorithm is "CatBoost Classification", the threshold is 0.398, and the F1 value is 0.863.
If I understood correctly, that record is a false positive (FP).
In this scenario (churn modelling), what is the real added value of this ML model for a client or prospect?
How can a business user use the predictions to extract valuable information and identify churn drivers, so that the company can anticipate customer churn?

@Steven_Pressland @Chris_Mabardy @Kyle_Jourdan @marcel_olmo @Cassandra_Nunley @hardinscott