Data Science algorithms implemented as a Python SSE

Employee

Project page: https://github.com/nabeel-oz/qlik-py-tools

Qlik's advanced analytics integration provides a path to making modern data science algorithms more accessible to the wider business audience. This project is an attempt to show what's possible.

This repository provides a server side extension (SSE) for Qlik Sense built using Python. The intention is to provide a set of functions for data science that can be used as expressions in Qlik.

Sample Qlik Sense apps are included and explained so that the techniques shown here can be easily replicated.

The implementation includes:

  • Supervised Machine Learning : Implemented using scikit-learn, the go-to machine learning library for Python. This SSE implements the full machine learning flow from data preparation, model training and evaluation, to making predictions in Qlik. In addition, models can be interpreted using Skater.
  • Unsupervised Machine Learning : Also implemented using scikit-learn. This provides capabilities for dimensionality reduction and clustering.
  • Named Entity Recognition : Implemented using spaCy, an excellent Natural Language Processing library that comes with pre-trained Neural Networks. This SSE allows you to use spaCy's models for NER or retrain them with your data for even better results.
  • Association rules : Implemented using Efficient-Apriori. Association Rules Analysis is a data mining technique to uncover how items are associated with each other. This technique is best known for Market Basket Analysis, but can be used more generally for finding interesting associations between sets of items that occur together, for example, in a transaction, a paragraph, or a diagnosis.
  • Clustering : Implemented using HDBSCAN, a high performance algorithm that is great for exploratory data analysis.
  • Time series forecasting : Implemented using Facebook Prophet, a modern library for easily generating good quality forecasts.
  • Seasonality and holiday analysis : Also using Facebook Prophet.
  • Linear correlations : Implemented using Pandas.
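
To make the association rules item above more concrete, here is a minimal sketch of the two measures such an analysis is built on, support and confidence. It is written in plain Python for illustration and does not use or reproduce the Efficient-Apriori API; the transactions, function names and thresholds are all assumptions.

```python
from itertools import combinations

# Hypothetical transactions; each set is one basket of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs, transactions):
    """Share of the transactions containing lhs that also contain rhs."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Brute-force search for single-item rules lhs -> rhs above both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is too rare to yield a rule
        for lhs, rhs in ((a, b), (b, a)):
            rule_confidence = confidence({lhs}, {rhs}, transactions)
            if rule_confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, rule_confidence))
    return rules

rules = pair_rules(transactions)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

A dedicated library avoids the brute-force enumeration used here, which only scales to a handful of items.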

For more information refer to the project page on GitHub.

For more information on Qlik Server Side Extensions see qlik-oss.

Disclaimer: This project has been started by me in a personal capacity and is not supported by Qlik.

Comments
maxsheva
Contributor II

Hi @Nabeel_Asif,

Thanks for the suggestion. I have grabbed both the app and the data file, but the result is the same error.

[Screenshot: Capture1.JPG]

I suppose there could be a missing or incorrectly installed Python library, or some other issue related to the extension.

Could you please check the log of Qlik-Py-Start?

Employee

If you're still getting the error it looks like you're not using the latest version of the SSE.

The Qlik load script fails saying that there is no field called 'ds' at the point where the SSE returns the results. There is definitely a field called 'ds' returned in release 4.0 when you pass load_script=true to the Prophet function. This was not the case with release 3.9 and earlier.

maxsheva
Contributor II

@Nabeel_Asif,  many thanks!

It works with a new version of the SSE.

Let me adapt the script for another dataset and I will provide feedback.

Much appreciated!

maxsheva
Contributor II

Hi @Nabeel_Asif,

I have tried to integrate my own data into the solution. I am able to execute it and get forecast results.

However, I cannot understand how the 'freq' parameter works, e.g. freq=D (W, M, MS, Y).

I see that the yhat forecast is best when freq=D, but it is still roughly 20% lower than the real numbers. I could of course multiply the result by 1.2, but I am wondering whether there is any option to adjust it using built-in Prophet parameters.

dubdev
New Contributor

Hi @Nabeel_Asif, I'm new to using analytics with Qlik and Python. Are there functions in your extension for binary classification, such as kNN or SVM/SVC? Is it possible to do binary classification with the stock functions of this extension? I'd be grateful for any advice.

Employee

@dubdev, yes, this SSE has functions that support both classification and regression. Most of the algorithms from the scikit-learn library are supported.

For usage information please head over to the project's GitHub repository: https://github.com/nabeel-oz/qlik-py-tools 

Employee

@maxsheva, the freq parameter is based on the granularity of your data, so there is only one correct option for a given dataset, e.g. D if you have daily data.
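
As a rough illustration of matching freq to the granularity of the data, the sketch below guesses a code from the typical gap between consecutive dates. It is only an assumption-laden helper in plain Python, not part of the SSE or of Prophet: the function name `guess_freq` and the day-count thresholds are made up for this example, and it only covers the codes mentioned in the question (D, W, M, MS, Y).

```python
from datetime import date, timedelta
from statistics import median

def guess_freq(dates):
    """Guess a frequency code (D, W, M, MS or Y) from the typical gap in days."""
    ordered = sorted(set(dates))
    if len(ordered) < 2:
        raise ValueError("need at least two distinct dates")
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    typical = median(gaps)
    if typical == 1:
        return "D"
    if typical == 7:
        return "W"
    if 28 <= typical <= 31:
        # MS means month start, M means month end.
        return "MS" if all(d.day == 1 for d in ordered) else "M"
    if 365 <= typical <= 366:
        return "Y"
    raise ValueError(f"typical gap of {typical} days does not match D, W, M, MS or Y")

# Hypothetical date columns at different granularities.
daily = [date(2019, 1, 1) + timedelta(days=i) for i in range(10)]
weekly = [date(2019, 1, 7) + timedelta(weeks=i) for i in range(10)]
month_starts = [date(2019, month, 1) for month in range(1, 13)]

print(guess_freq(daily), guess_freq(weekly), guess_freq(month_starts))
```

If the guess does not match what you expected, that usually points to gaps or duplicates in the date field rather than to a different freq being the better choice.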

The forecast will not align perfectly with historical values as that would be overfitting the model to a sample of data. However, there are a few ways to adjust the output explained here: https://github.com/nabeel-oz/qlik-py-tools/blob/master/docs/Prophet.md

evanplancaster
New Contributor III

@Nabeel_Asif first off, this SSE is amazing! Very well-documented, and very well-implemented. Thank you so much for it!

I have two questions:

1) Regarding Prophet forecasting, are there any plans to incorporate additional regressors into the mix? https://facebook.github.io/prophet/docs/seasonality,_holiday_effects,_and_regressors.html#additional...

2) For those who might not feel comfortable enough with Qlik's load scripting to want to go to the trouble of understanding how to feature engineer, train, cross-validate, tune hyper-parameters, and select the best model all inside Qlik, is there a way to do all that "grunt work" outside of Qlik, and then, having deployed the model in the proper location, use the sklearn.Predict function you've developed to pass a dataset in Qlik to that model? This would be a great feature to have for those of us who already have models out in production that we built using other tools and who don't want to rebuild them just so we can see the results in Qlik. (And yes, I know that we could just take our results from a pre-built model and shove them in a table on a database somewhere and pull them into Qlik, but the on-the-fly capabilities of SSEs are what we're really after here.)

Again, great job on this, and I'm so thankful to see that you are actively enhancing it and helping us all get it up and running!

Employee

Hi @evanplancaster, thanks for the compliments.

I did think about implementing the additional regressors option for Prophet, but felt restricted by a current limitation of SSEs, which is that a function cannot have a variable number of arguments. I guess I could create a new SSE function that allows for just one additional regressor, or come up with a scheme for passing multiple regressors using concatenation. I'll have a think.
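
The concatenation scheme could look something like the sketch below. This is not code from the SSE: the function names and the '|' separator are assumptions for illustration. The idea is that Qlik would send one string per row (for example, two hypothetical fields joined with the & operator and a '|' between them), and the Python side would split each string back into one numeric column per regressor.

```python
def pack_regressors(values, sep="|"):
    """Join one row's regressor values into a single string argument."""
    return sep.join(str(v) for v in values)

def unpack_regressors(packed_rows, sep="|"):
    """Split packed strings back into one list of floats per regressor."""
    if not packed_rows:
        return []
    rows = [[float(v) for v in row.split(sep)] for row in packed_rows]
    widths = {len(row) for row in rows}
    if len(widths) != 1:
        raise ValueError(f"rows have inconsistent regressor counts: {sorted(widths)}")
    # Transpose the rows so that each inner list holds one regressor's values.
    return [list(column) for column in zip(*rows)]

# Two rows, each carrying two hypothetical regressors in a single argument.
packed = [pack_regressors([1.5, 0]), pack_regressors([2.0, 1])]
columns = unpack_regressors(packed)
print(packed)
print(columns)
```

The width check matters because a fixed-arity function cannot tell Python how many regressors to expect, so a malformed row would otherwise shift values into the wrong column.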

On your second question, the models built using the SSE have a bit more in them than a standard sklearn model. They consist of a sklearn pipeline that needs to handle pre-processing (OHE, scaling, etc.), evaluation metrics from cross-validation, and meta-data to interpret features, their data types and how they need to be pre-processed. As I type this, I realize it should be possible to take an existing sklearn pipeline and add metadata to it so the model becomes easier to use with Qlik. So you've given me two things to think about!
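
A minimal sketch of that idea, assuming nothing about how the SSE actually stores its models: an existing fitted estimator is wrapped together with the metadata needed to order and convert the values arriving from Qlik. The class names, the metadata fields and the stand-in `SumModel` are all hypothetical; a real sklearn pipeline may also expect a DataFrame rather than a list of lists.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

@dataclass
class FeatureSpec:
    name: str                   # feature name as sent from Qlik
    cast: Callable[[str], Any]  # how to convert the incoming value

@dataclass
class QlikReadyModel:
    """Hypothetical wrapper: a fitted estimator plus feature metadata."""
    estimator: Any              # anything with a predict(rows) method
    features: List[FeatureSpec]
    metrics: Dict[str, float] = field(default_factory=dict)

    def predict(self, records):
        """records: list of dicts keyed by feature name, values as received."""
        rows = [
            [spec.cast(record[spec.name]) for spec in self.features]
            for record in records
        ]
        return self.estimator.predict(rows)

class SumModel:
    """Stand-in for a fitted pipeline so the sketch runs without sklearn."""
    def predict(self, rows):
        return [sum(row) for row in rows]

model = QlikReadyModel(
    estimator=SumModel(),
    features=[FeatureSpec("age", float), FeatureSpec("visits", int)],
)
# Values arrive as strings and in arbitrary key order; the metadata fixes both.
predictions = model.predict([{"visits": "3", "age": "41.5"}])
print(predictions)
```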
