Ouadie
Employee

With fewer than 50 days to go before the 2026 World Cup kicks off across the US, Canada, and Mexico, I wanted to share a project I've been working on that brings together a few pieces of the Qlik platform that I think work really well together: Choose Your Champion 2026.

It's a web app where anyone can fill out their World Cup bracket, get AI-powered predictions from Qlik Predict for every possible matchup in the tournament, explore historical World Cup data, and compete on a leaderboard as the competition unfolds.

You can try it here: https://webapps.qlik.com/choose-your-champion-2026/index.html#/ 


The app is powered by Qlik: Qlik Cloud Analytics for the data model and historical analysis, Qlik Predict for the matchup predictions, and various Qlik APIs to wire everything into a React front-end.

In this post, I'll walk through how the predictions work under the hood, because that was the most interesting piece to build.

What's in the app:

Choose Your Champion is broken into four parts:

  • Build a bracket: Pick your group stage winners, advance teams through the knockout rounds, and lock in your champion.

 

  • Check the predictions: For every possible matchup in the tournament, the app surfaces a Qlik Predict generated win probability for each team plus a draw probability. When you're unsure about a matchup, you can pull up the prediction and use it to decide which team advances.


  • Explore historical World Cup data: The app includes various visualizations to help you uncover insights from past tournaments, such as goals, top scorers, host nation performance, and biggest upsets, all powered by the associative engine.


  • Leaderboard: As real matches get played in June and July, submitted brackets are scored automatically and players are ranked in the leaderboard table.


Under the hood: how the predictions work

This was the fun part. The goal was simple: given two national teams, predict the outcome of a hypothetical match (team A wins / draw / team B wins). The work that makes the predictions actually useful is mostly in the data, not the model (thanks to no-code ML with Qlik Predict).

1. The training dataset

I started with every international football match result from 1872 to March 2026. There's a well-maintained open dataset on GitHub (credit: martj42/international_results) that gets updated after every international window and contains about 49,000 matches in total.

From that raw history, I built a training dataset focused on the modern era (2010 onwards) and only competitive matches (qualifiers, continental tournaments, World Cup finals). Friendlies got filtered out because they're noisy: teams often don't play their A squads, and the stakes don't match what happens in a real tournament.
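
As a rough illustration of that filtering step, here is a minimal Python sketch. It is not the code used for the project. The column names (date, tournament), the "Friendly" label, and the sample rows are assumptions made for the example, and the teams are placeholders rather than real results.

```python
import csv
import io
from datetime import date

# Assumption: friendlies are labelled "Friendly" in a "tournament" column.
FRIENDLY_LABELS = {"Friendly"}
MODERN_ERA_START = date(2010, 1, 1)


def is_training_match(row: dict) -> bool:
    """Keep competitive matches played on or after the modern-era cutoff."""
    played_on = date.fromisoformat(row["date"])
    return played_on >= MODERN_ERA_START and row["tournament"] not in FRIENDLY_LABELS


def filter_matches(csv_text: str) -> list:
    """Parse CSV text and return only the rows that pass the filter."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if is_training_match(row)]


# Made-up sample rows with placeholder teams, only to exercise the filter.
SAMPLE = """date,home_team,away_team,home_score,away_score,tournament
2006-07-09,Team A,Team B,1,1,FIFA World Cup
2014-07-13,Team C,Team D,1,0,FIFA World Cup
2019-03-26,Team E,Team D,0,1,Friendly
2021-09-01,Team F,Team G,2,0,FIFA World Cup qualification
"""

kept = filter_matches(SAMPLE)
for row in kept:
    print(row["date"], row["tournament"])
```

The pre-2010 match and the friendly are dropped, leaving the two competitive modern-era rows.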

That left me with around 9,400 training rows, each representing a real historical match with a known result, enriched with 27 features describing both teams' state going into that match:

  • Elo ratings for both teams
  • FIFA rankings and points, snapshotted as of the match date
  • Rolling 10-match form per team: win rate, goals for, goals against, goal difference
  • Head-to-head history in the last 10 meetings
  • Context flags: neutral venue, tournament tier, cross-confederation
  • World Cup pedigree: a score rewarding teams for deep runs in past tournaments, with more recent success weighted heavier
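
To make "state going into that match" concrete, here is a hedged Python sketch of the rolling 10-match form feature. The key point is that each match only sees results from earlier matches, so the outcome being predicted never leaks into its own features. The Match class, the per-match averaging, and the use of None for teams with no history are assumptions for illustration, not the exact definitions behind the dataset.

```python
from collections import defaultdict, deque
from dataclasses import dataclass

WINDOW = 10  # number of previous matches that count as "form"


@dataclass
class Match:
    team_a: str
    team_b: str
    goals_a: int
    goals_b: int


def form_snapshot(history) -> dict:
    """Summarise recent (goals_for, goals_against) pairs as per-match averages."""
    n = len(history)
    if n == 0:
        # No prior matches: leave the features empty rather than guessing.
        return {"win_rate": None, "goals_for": None,
                "goals_against": None, "goal_diff": None}
    goals_for = sum(gf for gf, _ in history)
    goals_against = sum(ga for _, ga in history)
    wins = sum(1 for gf, ga in history if gf > ga)
    return {
        "win_rate": wins / n,
        "goals_for": goals_for / n,
        "goals_against": goals_against / n,
        "goal_diff": (goals_for - goals_against) / n,
    }


def add_rolling_form(matches):
    """Return (match, form_a, form_b) tuples; matches must be sorted by date."""
    recent = defaultdict(lambda: deque(maxlen=WINDOW))
    rows = []
    for m in matches:
        # Snapshot first, so the current result is not part of its own features.
        rows.append((m, form_snapshot(recent[m.team_a]), form_snapshot(recent[m.team_b])))
        recent[m.team_a].append((m.goals_a, m.goals_b))
        recent[m.team_b].append((m.goals_b, m.goals_a))
    return rows


demo = add_rolling_form([
    Match("X", "Y", 2, 0),
    Match("X", "Z", 1, 1),
    Match("Y", "X", 0, 3),
])
print(demo[2][2])  # form of team X going into its third match
```

The deque with maxlen=WINDOW discards the oldest result automatically once a team has more than 10 matches on record.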

 

2. ML Experiment

Once the training CSV was in shape, I uploaded it to Qlik Predict, pointed at the result column as the target, and let it do its thing. This is where Qlik Predict really shines: zero code needed. No Python notebooks, no sklearn, no hyperparameter grids to tune. You just upload your data, pick a target, and it does the heavy lifting, with full explainability on the outcomes and what drives the predictions.

Qlik Predict runs multiple algorithms in parallel (LightGBM, CatBoost, XGBoost, Random Forest, and a few others), tunes their hyperparameters, and picks the best performer by F1 score.
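
For readers less familiar with the metric, here is a small Python sketch of F1 for a three-class target like this one. It assumes macro averaging (the unweighted mean of the per-class F1 scores); the post does not say which averaging Qlik Predict applies, and the labels and predictions below are invented for the example.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


LABELS = ["team_a_win", "draw", "team_b_win"]

# Invented outcomes: the model never gets a draw right.
y_true = ["team_a_win", "team_a_win", "draw", "team_b_win", "team_b_win", "draw"]
y_pred = ["team_a_win", "team_a_win", "team_a_win", "team_b_win", "draw", "team_b_win"]

score = macro_f1(y_true, y_pred, LABELS)
print(round(score, 3))
```

In this toy example the per-class scores are 0.8, 0.0, and 0.5, so a class that is hard to predict pulls the average down sharply even when the others look reasonable.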


On my first run, I left all the columns in the dataset checked, including the team name columns (team_a, team_b). When I looked at the SHAP importance chart afterward, team_b and team_a were ranking as the #2 and #3 most influential features, meaning the model was essentially learning "team X usually wins" rather than learning from the engineered features.

I created a new version, went back to the Data tab, unchecked the team name columns and a few date fields (which were also ranking higher than they should), and re-ran the experiment. Qlik Predict automatically dropped several more low-importance features during training, leaving a clean, focused feature set. The F1 score barely changed (it stayed at ~0.50), but the SHAP chart now showed the model leaning on exactly the signals we want:

  1. elo_diff
  2. rank_diff
  3. is_neutral
  4. h2h_team_a_advantage
    etc...
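
Since elo_diff ends up as the top signal, here is a minimal Python sketch of a textbook Elo update and the resulting difference feature. It assumes a fixed K-factor of 20 and ignores the home advantage, goal-margin, and tournament weightings that football-specific Elo variants typically add, so treat it as an illustration rather than the rating source used for the dataset.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected result for team A on a 0..1 scale (win = 1, draw = 0.5)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return both teams' new ratings after a match.

    score_a is 1.0 if team A won, 0.5 for a draw, 0.0 if team A lost.
    k is an assumed K-factor; real football ratings vary it by competition.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical ratings, not real team values.
team_a_elo, team_b_elo = 1900.0, 1500.0
elo_diff = team_a_elo - team_b_elo  # the feature fed to the model
print(elo_diff, round(expected_score(team_a_elo, team_b_elo), 3))
print(update_elo(1500.0, 1500.0, 1.0))
```

A 400-point gap corresponds to an expected score of about 0.91 for the stronger side, which is why a single difference column can carry so much of the signal.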


A few other calls that mattered:

  • Filtering to competitive matches only. A friendly between a top side's B squad and a mid-tier opponent tells you almost nothing about what happens in a World Cup group stage game.
  • Exponential decay on World Cup pedigree. A deep run in 1970 still counts, but less than one in 2022.
  • Removing rows with too many missing features. FIFA rankings don't go back to the 90s for every team, so some rows had to get dropped.
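
The exponential decay on World Cup pedigree could look something like the Python sketch below. The points per round and the 16-year half-life are hypothetical values picked for the example; the post does not give the actual weights.

```python
import math

# Hypothetical points for the furthest round a team reached.
ROUND_POINTS = {
    "group": 1,
    "round_of_16": 2,
    "quarter_final": 4,
    "semi_final": 6,
    "final": 8,
    "winner": 10,
}

HALF_LIFE_YEARS = 16.0  # assumption: a result loses half its weight every 16 years


def pedigree_score(finishes: dict, as_of_year: int,
                   half_life: float = HALF_LIFE_YEARS) -> float:
    """Sum decayed points over past tournaments.

    finishes maps tournament year -> furthest round reached.
    Tournaments after as_of_year are ignored so future results cannot leak in.
    """
    decay_rate = math.log(2) / half_life
    return sum(
        ROUND_POINTS[stage] * math.exp(-decay_rate * (as_of_year - year))
        for year, stage in finishes.items()
        if year <= as_of_year
    )


old_run = pedigree_score({1970: "final"}, as_of_year=2026)
recent_run = pedigree_score({2022: "final"}, as_of_year=2026)
print(round(old_run, 3), round(recent_run, 3))
```

Both runs still contribute, but the same finish counts for far more when it happened four years ago than when it happened fifty-six years ago.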

 

3. The apply dataset

Training gives you a model. To use it, you need an apply dataset containing the new rows you want predictions for.

For Choose Your Champion, I generated every possible pairing of the 48 qualified teams, which comes out to 1,128 unique matchups. Each row has the same 27 features as the training dataset, but computed as a current snapshot: each team's Elo today, their current FIFA ranking, their most recent 10-match form, and so on.
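
The 1,128 figure is simply the number of unordered pairs among 48 teams (48 × 47 / 2). A minimal Python sketch of generating those rows follows; the team names and Elo values are placeholders, and only one of the 27 features is shown.

```python
from itertools import combinations


def build_apply_rows(teams, current_elo):
    """One row per unordered pair of teams, with features from today's snapshot."""
    rows = []
    for team_a, team_b in combinations(sorted(teams), 2):
        rows.append({
            "team_a": team_a,
            "team_b": team_b,
            # Illustrative feature only; the real apply dataset has 27 columns.
            "elo_diff": current_elo[team_a] - current_elo[team_b],
        })
    return rows


# Placeholder teams and ratings, not the real qualified nations.
teams = [f"Team {i:02d}" for i in range(1, 49)]
current_elo = {team: 1500 + 10 * i for i, team in enumerate(teams)}

apply_rows = build_apply_rows(teams, current_elo)
print(len(apply_rows))  # 1128
```

Using combinations rather than a nested loop guarantees each pairing appears once, with no team matched against itself.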

I fed that into the deployed model and got back a probability distribution for every matchup: P(team_a_win), P(draw), P(team_b_win).


The web app

The web app is a React front-end that connects to the Qlik tenant over anonymous access via @qlik/api, so users never see a login screen or have to authenticate against a tenant. The bracket UI pulls predictions from the Qlik Sense data model, so whenever a user opens a matchup, they're looking at data straight from Qlik.

For the historical World Cup section, I used a mix of @qlik/embed components when I needed a quick, ready-to-use chart, and custom nebula.js + picasso.js visualizations when I needed more control over the styling to match the app's look and feel. Both approaches work against the same underlying Qlik Analytics app, so everything stays consistent and governed in one place.


A few takeaways

If you're thinking about building something similar, a few things worth keeping in mind:

Spend the time on feature engineering. The difference between a model that predicts noise and one that predicts football is almost entirely in the features. Qlik Predict handles algorithm selection and tuning well, but it can only work with what you feed it.

The integration is where Qlik Predict pays off. Once a model is deployed, scoring a new dataset and pulling scores back into a Qlik Cloud Analytics app takes one load script. No Python services to maintain, no separate MLOps platform to stand up, no JSON plumbing between systems. Having data prep, modeling, predictions, and analytics all living end to end in one platform is what made this project come together fast!

Go fill out your bracket

The World Cup starts June 11, so there's plenty of time to get your bracket in and earn your spot on the leaderboard before kickoff. If you're curious about how any of this was built, leave a comment or reach out to me directly!

And if you want to learn more about Qlik Predict and start using it, visit: https://www.qlik.com/us/products/qlik-predict 

P.S. I have attached both the Training and Apply datasets if you'd like to use them in your own Qlik Predict experiment.


Thank you!