An overview of scatter plots and how users can apply them within their own apps.
Throughout my blog entries, we have taken a tour through the catalog of charts that Qlik has to offer. We have covered bar charts, line charts, pie charts, and Sankeys. Today we’re going to dive into one of the lesser-known, but still powerful, charts: scatter plots.
Scatter plots are used to show the relationship between two quantitative variables. A scatter plot is usually made of three elements: the X axis, the Y axis, and a point marking a data point's value on both axes. Additional information can be shown on the chart through the size of the data points. In Qlik Sense, these data points are called ‘bubbles’.
How can you use a Scatter plot chart to visualize your data?
To demonstrate the capabilities of a scatter plot, we’ll look to an example found in the CRM app. This app was developed to showcase data including sales, numbers of customers and opportunities. This app would help a manager of a company see what parts of their business are doing well, and which need improvement.
Above we have the scatter plot built for this app, along with the chart's ‘Advanced options’ view, which gives a clearer picture of which data is being shown and how. The X axis is # of Customers and the Y axis is Opportunity Amount. Together, these axes tell us that the higher and further to the right a data point sits, the better it is for the company (more money, more customers), and the lower and further to the left, the worse (less money, fewer customers). Additionally, the name of the Sales Person is assigned to the ‘Bubbles’ in this chart. Finally, the size of the bubbles shows the Amount of Opportunities won.
What information can we gain from this example?
A manager looking at this chart would quickly be able to determine who the top and bottom performers are, and in what way. At a glance, the manager could see two outliers, Gonzalo Geary and Val Conforto, for two different reasons. According to our chart, Gonzalo is adept at gaining customers, with around 230 (double that of his closest competitor), and has a larger number of opportunities won than his fellow salespeople. Val, conversely, does not have as many customers as Gonzalo, but she makes the most of the customers she does have, ranking highest in opportunity amount.
That is the power of the scatter plot: it gives users insight into how data points behave across two metrics at once. If the manager had only looked at Opportunity Amount, they might think of Gonzalo as an average salesperson, and the same could be said for Val if she were viewed only through the lens of # of Customers. Instead, the scatter plot allows the manager to see how these individuals excel, and where they require additional assistance or training.
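To make that point concrete outside of Qlik, here is a minimal Python sketch that ranks salespeople on each metric separately. Apart from Gonzalo's roughly 230 customers, every figure below is invented for illustration and is not taken from the CRM app, and "Salesperson C" and "Salesperson D" are placeholder names.

```python
# Illustrative figures only: apart from Gonzalo's ~230 customers, these
# numbers are made up and do not come from the CRM app.
sales = {
    "Gonzalo Geary": {"customers": 230, "opportunity_amount": 400_000},
    "Val Conforto": {"customers": 90, "opportunity_amount": 900_000},
    "Salesperson C": {"customers": 115, "opportunity_amount": 450_000},
    "Salesperson D": {"customers": 100, "opportunity_amount": 380_000},
}


def rank_by(metric):
    """Return salesperson names ordered from highest to lowest on one metric."""
    return sorted(sales, key=lambda name: sales[name][metric], reverse=True)


by_customers = rank_by("customers")
by_amount = rank_by("opportunity_amount")

# Print each person's position on both metrics side by side.
for name in sales:
    print(
        f"{name}: #{by_customers.index(name) + 1} by customers, "
        f"#{by_amount.index(name) + 1} by opportunity amount"
    )
```

With these made-up numbers, each single-metric ranking hides one of the two standouts, which is the gap the scatter plot closes by showing both metrics at once.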
Hopefully this blog entry has led to a few ideas of how you can use scatter plots to visualize your own data. How can you use scatter plots to help you or your company? Is there something I might have missed? Leave it in a comment down below.
@MattSmart Do Qlik scatters have the ability to smooth min/max ranges for the axis? In my attempts to use scatter charts over the years, there's often been the issue that a small handful of outliers makes the entire chart unreadable. A custom axis can sometimes help, but is less than ideal when the values aren't always known in advance and/or when using Alternative measures which may not share the same value range.
I've also noticed Qlik scatter charts will combine points into squares when there is a large number of values (at least, I think that's why) - I wasn't able to find a way to disable that, and it's often not desirable behavior. Can it be disabled?
As far as I am aware, there is no way to add the smoothing effect you are looking for. If I am understanding correctly, this could also leave out crucial data and give an unclear visualization of the data selected. I think the best approach to solving your problem would be, as you stated, adding a min or max for the data being shown, but as you said, you would incur the problem of perhaps not knowing the data in advance. You might go the route of using an expression to show only the middle 10-1000 values. That way you do not get the outliers, and it would adapt to your data. This can be done under the bubble dimension.
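To illustrate the idea of trimming outliers in a way that adapts to the data, here is a rough Python sketch. It is not a Qlik expression, and it swaps the rank-based "middle values" idea for percentile cut-offs; the function name `trim_outliers` and the 5th/95th percentile defaults are assumptions chosen for the example.

```python
from statistics import quantiles


def trim_outliers(values, lower_pct=5, upper_pct=95):
    """Keep only the values between two percentiles (inclusive).

    The cut-offs are computed from the data itself, so they adapt when
    the value range is not known in advance.
    """
    # quantiles(..., n=100) returns 99 cut points; index i - 1 is the
    # i-th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [v for v in values if low <= v <= high]


# Made-up data: 100 ordinary values plus one extreme outlier.
values = list(range(1, 101)) + [10_000]
kept = trim_outliers(values)
print(f"kept {len(kept)} of {len(values)} values, max is now {max(kept)}")
```

The trade-off raised above still applies: anything outside the chosen percentiles disappears from view, so the cut-offs should be picked with that in mind.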
Please let me know if this helps or if I am misunderstanding what you are trying to configure.
Thanks, @MattSmart . I don't necessarily agree - this is in the same vein as, say, a bar chart showing a down arrow if the value exceeds the (preset) range. Sometimes things like this are the best way to visualize an outlier without breaking the entire chart, since real-life will often have just one or two values throwing the scale off. You could have down arrows, you could have a message explaining one or more values exceed the scale (click to select or click to show with the non-smoothed scale), etc. That said, I also understand the complexity and I'm sure it's not as easy to solve as just waving a magic wand (or just asking about it here on Community).
Here's an example of what I'm talking about. This is an actual dataset, and I think you'll agree this chart is not useful except to communicate the fact that an outlier exists - the bottom-left dot is a single outlier and the other dot is hundreds of data points in the 0-100% range.
In any case, I was hoping there was something I was overlooking. Appreciate the post regardless of this particular issue!