Data Science algorithms implemented as a Python SSE

Nabeel_Asif
Employee

Last Update: Apr 2, 2021 4:25:34 AM

Updated By: Nabeel_Asif

Created date: Jan 8, 2019 11:06:44 PM

Project page: https://github.com/nabeel-oz/qlik-py-tools

Qlik's advanced analytics integration provides a path to making modern data science algorithms more accessible to the wider business audience. This project is an attempt to show what's possible.

This repository provides a server side extension (SSE) for Qlik Sense built using Python. The intention is to provide a set of functions for data science that can be used as expressions in Qlik.

Sample Qlik Sense apps are included and explained so that the techniques shown here can be easily replicated.

The implementation includes:

  • Supervised Machine Learning : Implemented using scikit-learn, the go-to machine learning library for Python. This SSE implements the full machine learning flow from data preparation, model training and evaluation, to making predictions in Qlik. In addition, models can be interpreted using Skater.
  • Unsupervised Machine Learning : Also implemented using scikit-learn. This provides capabilities for dimensionality reduction and clustering.
  • Deep Learning : Implemented using Keras and TensorFlow. This SSE implements the full flow of setting up a neural network, training and evaluating it, and using it to make predictions. Deep Learning models can be used for sequence predictions and complex timeseries forecasting.
  • Named Entity Recognition : Implemented using spaCy, an excellent Natural Language Processing library that comes with pre-trained neural networks. This SSE allows you to use spaCy's models for Named Entity Recognition or retrain them with your data for even better results.
  • Association rules : Implemented using Efficient-Apriori. Association Rules Analysis is a data mining technique to uncover how items are associated to each other. This technique is best known for Market Basket Analysis, but can be used more generally for finding interesting associations between sets of items that occur together, for example, in a transaction, a paragraph, or a diagnosis.
  • Clustering : Implemented using HDBSCAN, a high performance algorithm that is great for exploratory data analysis.
  • Time series forecasting : Implemented using Facebook Prophet, a modern library for easily generating good quality forecasts. Now with the ability to use multiple regressors as input.
  • Seasonality and holiday analysis : Also using Facebook Prophet.
  • Linear correlations : Implemented using Pandas.
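As a rough illustration of the last item, a linear (Pearson) correlation between two measures can be computed as shown here. This is a minimal sketch in plain Python and not the SSE's own code: the project uses Pandas for this, and the function name `pearson` and the sample figures are assumptions made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical sample data, not taken from the sample apps
ad_spend = [10, 20, 30, 40, 50]
sales = [12, 24, 33, 46, 55]
r = pearson(ad_spend, sales)
```

A value of `r` close to 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or none.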

For more information refer to the project page on GitHub.

For more information on Qlik Server Side Extensions see qlik-oss.

Disclaimer: This project has been started by me in a personal capacity and is not supported by Qlik.

Comments
maxsheva
Creator II

Hi, thanks for sharing this project!

I am using the time series forecasting template, which is implemented with Facebook Prophet.

Could you please clarify how I can correctly add Product as a column on the Trend Predictions sheet?

When I add such an extra column, the Prophet procedure takes all possible values in the y parameter and then generates the same forecast for each product.

How can I add Product to the table and retrieve the correct prediction for each product?

Many thanks!

Nabeel_Asif
Employee

Hi @maxsheva , the Prophet function in this SSE is only designed to take in a measure and a datetime dimension. So it won't work with additional dimensions in the table or chart object.

However, the function produces the forecast in the context of any selections in the app. So if you select a product and then a region for example, the forecast will be produced in real-time for that product and region.

A function to generate forecasts by a dimension could be created, but if you have 10 products that would mean the forecasting algorithm needs to run 10 times for the chart, which would result in a long wait time for the user.

For such cases a better way would be to pre-calculate the forecasts in the load script. This needs an enhancement which I'll add to the next release of PyTools.
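The per-dimension approach described above can be sketched roughly as follows. This is only an illustration in plain Python: `naive_forecast` is a stand-in (the mean of the last few points), not Prophet and not a PyTools function, and the row layout and the names `forecast_by_dimension` and `product` are assumptions. The point it shows is that the forecaster runs once per distinct product, which is why doing the work ahead of time during the load is preferable to repeating it whenever a chart is recalculated.

```python
from collections import defaultdict


def naive_forecast(values, periods, window=3):
    """Stand-in forecaster: repeat the mean of the last `window` values."""
    if not values:
        raise ValueError("no history to forecast from")
    recent = values[-window:]
    level = sum(recent) / len(recent)
    return [level] * periods


def forecast_by_dimension(rows, periods):
    """Run the forecaster once per product and return {product: forecast}."""
    history = defaultdict(list)
    # Sort by date so each product's series is in chronological order
    for row in sorted(rows, key=lambda r: r["ds"]):
        history[row["product"]].append(row["y"])
    # One forecaster call per distinct product value
    return {p: naive_forecast(ys, periods) for p, ys in history.items()}


# Hypothetical data: two products with three monthly observations each
rows = [
    {"product": "A", "ds": "2019-01-01", "y": 10.0},
    {"product": "A", "ds": "2019-02-01", "y": 12.0},
    {"product": "A", "ds": "2019-03-01", "y": 14.0},
    {"product": "B", "ds": "2019-01-01", "y": 100.0},
    {"product": "B", "ds": "2019-02-01", "y": 90.0},
    {"product": "B", "ds": "2019-03-01", "y": 80.0},
]
forecasts = forecast_by_dimension(rows, periods=2)
```

With a real forecasting model in place of the stand-in, the cost of each call is much higher, so the total run time grows with the number of products.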

maxsheva
Creator II

Hi @Nabeel_Asif ,

Thank you for the information.

I would like to try the pre-calculated solution before the next release, as this is quite an urgent case for me. I would much appreciate it if you could share the part of the code I should use in the script to produce predictions for different dimensions and then present the results in the UI.

Thank you!

Nabeel_Asif
Employee

@maxsheva I've published the new release. Get it on GitHub and refer to the documentation here: https://github.com/nabeel-oz/qlik-py-tools/blob/master/docs/Prophet.md#precalculating-forecasts-in-t...

 

maxsheva
Creator II

Hi @Nabeel_Asif ,

I really appreciate the quick response and the new release!

I will try this solution and get back to you with feedback.

Many thanks

Yevhenii

maxsheva
Creator II

Hi @Nabeel_Asif ,

I downloaded the sample app "Sample_App_Forecasting_Simple.qvf" and created an xlsx file with the same structure, containing the Emergency department data.

It stops with the error "Field 'ds' not found".

error.JPG

It fails at the step where the Response table generates the forecast.

I suspect something goes wrong when executing this part of the code: "Extension PyTools.Prophet(temp{ds, y, args});"

It is even highlighted in the editor.

Capture.JPG

I have correctly installed all the necessary Python libraries, and it worked fine with the on-the-fly calculation.

 

Could you please take a look at this error?

 

maxsheva
Creator II

Hi @Nabeel_Asif ,

Is there any chance you could take a look at the error mentioned in my previous post?

I would much appreciate your help.

Nabeel_Asif
Employee

@maxsheva if you look at the load statement for the temp table above you'll see I have a field called Month Start which I rename to ds. If you have a different name for your date field you'd have to update the script accordingly.
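The renaming step can be pictured with a small Python sketch. This only mimics the idea and is not Qlik load script or PyTools code: the helper `rename_field` and the measure name `Visits` are hypothetical, and only the field names `Month Start`, `ds` and `y` come from this thread. The forecasting call expects fields named `ds` and `y`, so the source fields need to be mapped to those names first.

```python
def rename_field(rows, old, new):
    """Return copies of the rows with the field `old` renamed to `new`."""
    renamed = []
    for row in rows:
        if old not in row:
            # Mirrors the kind of failure seen when a field name does not match
            raise KeyError(f"Field {old!r} not found")
        renamed.append({(new if k == old else k): v for k, v in row.items()})
    return renamed


# Hypothetical source rows; "Visits" is an assumed measure name
source = [
    {"Month Start": "2019-01-01", "Visits": 120},
    {"Month Start": "2019-02-01", "Visits": 135},
]
temp = rename_field(rename_field(source, "Month Start", "ds"), "Visits", "y")
```

If the date field in the source had a different name, the first argument of the rename would need to change to match it.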

maxsheva
Creator II

Hi @Nabeel_Asif ,

In the script, the ds field is renamed correctly.

The script loads the data, including the ds field, if I add 'Exit Script' before the load from the extension PyTools.Prophet.

Capture1.JPG

Capture2.JPG

 

How can I share my app with you?

 

Nabeel_Asif
Employee

@maxsheva The error isn't related to the SSE, so it must be due to some difference in your load script or to reverse engineering the data source.

I've just added the original data source to GitHub and updated the sample app. You can just grab those and run a reload with that.

If you're using Qlik Sense Enterprise, the data is attached to the app. If using Qlik Sense Desktop you will need to create a data connection named AttachedFiles and place the data source in the folder for that connection.
