Qlik's advanced analytics integration provides a path to making modern data science algorithms more accessible to the wider business audience. This project is an attempt to show what's possible.
This repository provides a server side extension (SSE) for Qlik Sense built using Python. The intention is to provide a set of functions for data science that can be used as expressions in Qlik.
Sample Qlik Sense apps are included and explained so that the techniques shown here can be easily replicated.
The implementation includes:
Supervised Machine Learning : Implemented using scikit-learn, the go-to machine learning library for Python. This SSE implements the full machine learning flow, from data preparation through model training and evaluation to making predictions in Qlik. In addition, models can be interpreted using Skater.
Unsupervised Machine Learning : Also implemented using scikit-learn. This provides capabilities for dimensionality reduction and clustering.
Named Entity Recognition : Implemented using spaCy, an excellent Natural Language Processing library that comes with pre-trained neural networks. This SSE allows you to use spaCy's models for NER or retrain them with your own data for even better results.
Association rules : Implemented using Efficient-Apriori. Association Rules Analysis is a data mining technique to uncover how items are associated with each other. This technique is best known for Market Basket Analysis, but can be used more generally for finding interesting associations between sets of items that occur together, for example, in a transaction, a paragraph, or a diagnosis.
Clustering : Implemented using HDBSCAN, a high-performance algorithm that is great for exploratory data analysis.
Time series forecasting : Implemented using Facebook Prophet, a modern library for easily generating good-quality forecasts.
Seasonality and holiday analysis : Also using Facebook Prophet.
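To make the association rules idea from the list above more concrete, here is a minimal sketch in plain Python of the two measures such an analysis typically relies on: support (how often items occur together) and confidence (how often the consequent appears when the antecedent does). This is not the SSE's implementation and does not use Efficient-Apriori; the `transactions` data, the `support` and `pair_rules` helpers, and the thresholds are all hypothetical and chosen only for illustration. It also considers only single-item rules, whereas a real Apriori implementation handles larger itemsets.

```python
from itertools import combinations

# Hypothetical transactions, invented purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for single-item rules."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is too rare to be interesting
        # Check the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


rules = pair_rules(transactions)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

With this toy data, the butter and milk pair is dropped because it appears in only one of the four transactions, while rules between bread and butter and between bread and milk pass both thresholds.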