Ouadie
Employee

If you have been learning about Qlik AutoML or looking for examples to get started, you might have only come across Binary Classification problems (such as customer churn or employee retention). In this post, we will solve a different type of problem with Qlik AutoML using a Regression model.

What is Regression, and Why Does It Matter?

Regression is a type of supervised learning used to predict continuous outcomes like housing prices, sales revenue, or stock prices. In industries such as real estate, understanding the factors driving prices can guide better decision-making. For example, predicting house values based on income levels, population, and proximity to the ocean helps realtors and developers target key markets and optimize pricing strategies.
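To make the idea of predicting a continuous outcome concrete, here is a minimal sketch of the simplest regression there is: a straight line fitted with ordinary least squares on a single feature. The numbers are made up for illustration and are not taken from the housing dataset, and this is not how Qlik AutoML trains its models; it only shows what "learning a numeric relationship from examples" means.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Made-up illustrative values (income in arbitrary units, value in dollars).
median_income = [2.0, 3.0, 4.0, 5.0]
house_value = [100_000.0, 150_000.0, 200_000.0, 250_000.0]

slope, intercept = fit_line(median_income, house_value)

# Predict a continuous value for an income level the model has not seen.
predicted = slope * 6.0 + intercept
print(f"predicted value: {predicted:,.0f}")
```

A classifier would answer "yes or no" for each record; a regression model like this one returns a number on a continuous scale.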

In the upcoming sections, we go through how to build and deploy a regression model using Qlik AutoML to predict house prices using the common California Housing Dataset.

Step 1: Defining the Problem

Before creating the AutoML experiment, let’s define the core elements of our use case:

  • Trigger: New houses or listing entries are added to the dataset.
  • Target: Predict the house's value.
  • Features: Latitude, longitude, median age, total rooms, total bedrooms, population, households, median income, and proximity to the ocean.
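The problem definition above can be written down as a small schema check before uploading anything. This is only a sketch: `median_house_value` is the target named later in this post, but the feature column names below are assumptions based on the commonly distributed version of the dataset, and `missing_columns` is a hypothetical helper. Verify the names against the header of your own housing_train.csv.

```python
import csv
import io

TARGET = "median_house_value"  # target column named in this post

# Assumed column names; check them against your CSV header.
FEATURES = [
    "latitude", "longitude", "housing_median_age", "total_rooms",
    "total_bedrooms", "population", "households", "median_income",
    "ocean_proximity",
]


def missing_columns(csv_text, required):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [name for name in required if name not in header]


# Hypothetical header line standing in for the real file contents.
sample_header = (
    "longitude,latitude,housing_median_age,total_rooms,total_bedrooms,"
    "population,households,median_income,median_house_value,ocean_proximity\n"
)
header_without_target = sample_header.replace("median_house_value,", "")

training_gaps = missing_columns(sample_header, FEATURES + [TARGET])
no_target_gaps = missing_columns(header_without_target, FEATURES + [TARGET])
print(training_gaps, no_target_gaps)
```

To run the check on a real file, pass its contents instead of the sample string, for example `missing_columns(open("housing_train.csv").read(), FEATURES + [TARGET])`.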

Step 2: AutoML

The California Housing dataset is split into two data files: a Training (historical) file, housing_train.csv, and an Apply (new) file, housing_test.csv.

Start by uploading these files to your Qlik Cloud tenant.

(The files are attached at the end of the blog post)

Creating the AutoML Experiment

  1. Start a New Experiment:
    • In your Qlik Cloud tenant, click on Create → ML Experiment
  2. Select Your Dataset:
    • Choose housing_train.csv as your dataset. AutoML will automatically identify columns as features and recommend their types.
  3. Set the Target Variable:
    • Choose median_house_value as the target for prediction.
    • Ensure all relevant features are selected, and adjust any feature types if needed.
  4. Run the Experiment:
    • Click Run Experiment and let AutoML analyze the data. After a few minutes, you'll see the initial results, including SHAP values and model performance metrics.
    • You can also take a look at the Compare and Analyze tabs for more advanced details.

      Screenshot 2025-01-17 162639.png

      Screenshot 2025-01-17 163123.png
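If the model performance metrics are new to you, the sketch below computes three scores commonly used for regression models (MAE, RMSE and R2) by hand on a few made-up values. The function name and the numbers are illustrative assumptions; AutoML calculates its metrics for you, so this is only meant to show what the scores measure.

```python
from math import sqrt


def regression_metrics(actual, predicted):
    """Return MAE, RMSE and R2 for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n          # average size of the error
    ss_res = sum(e * e for e in errors)            # unexplained variation
    rmse = sqrt(ss_res / n)                        # penalizes large errors more
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total variation
    r2 = 1 - ss_res / ss_tot                       # share of variation explained
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Made-up values, in thousands of dollars.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

metrics = regression_metrics(actual, predicted)
print(metrics)
```

Lower MAE and RMSE mean smaller errors in the units of the target, while an R2 closer to 1 means the model explains more of the variation in house values.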

Deploying the AutoML Model

  • Choose the top-performing model from the experiment results.
  • Click on Deploy

    Screenshot 2025-01-17 163302.png

Creating Predictions

Once in the Deployment screen, add the Apply dataset, create a Prediction, and make sure to select SHAP and Coordinate SHAP as files to be generated. We will use these later on in our Qlik Sense Analytics app to gain explainability insights.

Screenshot 2025-01-17 163601.png
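As I understand it, the two outputs hold the same per-feature contributions in different shapes: a wide layout with one column per feature, and a long ("coordinate") layout with one row per record and feature, which is easier to chart. The sketch below illustrates that reshaping on made-up values. The key names (`row_id`, `feature`, `shap_value`) are hypothetical and the actual column names in the generated files may differ.

```python
def to_long(rows, id_key):
    """Unpivot wide SHAP rows into one record per (id, feature) pair."""
    return [
        {"id": row[id_key], "feature": name, "shap_value": value}
        for row in rows
        for name, value in row.items()
        if name != id_key
    ]


# Made-up wide rows: one column per feature, values are SHAP contributions.
wide = [
    {"row_id": 1, "median_income": 42000.0, "ocean_proximity": -8000.0},
    {"row_id": 2, "median_income": -15000.0, "ocean_proximity": 12000.0},
]

long_rows = to_long(wide, "row_id")
for record in long_rows:
    print(record)
```

In the long shape, the feature name becomes a dimension you can drop straight into a bar chart, which is why it is convenient for the explainability visuals built in the next step.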

Step 3: Creating the Qlik Sense Analytics App

Now it’s time to visualize the predictions:

  1. Load the Predictions:

    • Navigate to the Catalog and locate the newly created Housing_test_Prediction.parquet file.
      Click Create Analytics App.

      Screenshot 2025-01-17 163927.png

    • Add additional data, including SHAP and Coordinate SHAP files as well as the apply dataset. 

      Screenshot 2025-01-17 164256.png

  2. Build the Dashboard:

    • Create visualizations such as:
      • A SHAP ranking to highlight the most influential features.
      • A histogram showing the distribution of predicted house values.
      • A map with gradient colors to visualize house prices by location.

    You can experiment with different visualization types to explore the data from multiple perspectives. 

    Screenshot 2025-01-17 170511.png
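For readers who want to see the logic behind two of these charts, here is a minimal sketch under stated assumptions. A SHAP ranking is commonly built from the mean absolute SHAP value per feature, and a histogram simply counts predictions per value range. The data, key names and helper functions are hypothetical; in the app itself you would build these with Qlik Sense chart expressions rather than Python.

```python
from collections import Counter, defaultdict


def shap_ranking(long_rows):
    """Rank features by mean absolute SHAP value, most influential first."""
    totals, counts = defaultdict(float), Counter()
    for row in long_rows:
        totals[row["feature"]] += abs(row["shap_value"])
        counts[row["feature"]] += 1
    means = {feature: totals[feature] / counts[feature] for feature in totals}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    bins = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(bins.items()))


# Made-up long-format SHAP rows and predicted house values.
shap_rows = [
    {"feature": "median_income", "shap_value": 42000.0},
    {"feature": "ocean_proximity", "shap_value": -8000.0},
    {"feature": "median_income", "shap_value": -15000.0},
    {"feature": "ocean_proximity", "shap_value": 12000.0},
]
predictions = [120000.0, 180000.0, 210000.0, 450000.0]

ranking = shap_ranking(shap_rows)
bins = histogram(predictions, 100_000)
print(ranking)
print(bins)
```

Taking the absolute value matters for the ranking: a feature that pushes some prices up and others down would otherwise average out to near zero and look unimportant.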

Understanding the results:

Based on the Qlik AutoML model, we can see how features such as income level and ocean proximity influence predicted house prices.

For more inspiration on how you can use your predictions within your Qlik Sense Apps or in your embedded use cases, check out my previous blog posts:
