I started developing a Data Science project at the company I currently work for. While looking for open source solutions that could integrate with Qlik, I found PyTools. This Server Side Extension provides algorithms for advanced analytics in Qlik Sense, making data science more accessible to business areas.
The Qlik Server Side Extension (SSE) is built on a set of Python algorithms and exposes them as functions that can be used in Qlik Sense expressions. Because the project is open source, anyone can customize the existing algorithms or add new ones as needed.
Along with this project, I am applying the concept of Data Literacy, with a focus on teaching business areas the importance of reading and writing data. This way, company employees can make more confident, data-driven decisions. Improving analytical and statistical skills has been one of the biggest challenges so far.
This release includes the following implementations:
Supervised Machine Learning: Implemented using scikit-learn (a Python library). The SSE implements the full machine learning flow: data preparation, model training, evaluation and prediction. Models can also be interpreted using Skater.
Unsupervised Machine Learning: Also implemented using scikit-learn.
Segmentation: Implemented using HDBSCAN, a high-performance clustering algorithm suited to exploratory data analysis.
Forecasting: Implemented using Facebook Prophet, a modern library that makes it easier to generate high-quality forecasts with good performance.
Seasonality and holidays analysis: Also implemented using Facebook Prophet.
Correlation: Implemented using Pandas.
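To give a feel for the correlation item, here is a minimal sketch of a Pearson correlation computed directly in Pandas, outside Qlik. The data and column names are made up for illustration, and this is not PyTools' internal code, only the kind of Pandas call such a function could rely on.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "ad_spend": [10.0, 20.0, 30.0, 40.0, 50.0],
    "revenue":  [25.0, 45.0, 65.0, 85.0, 105.0],  # exactly 2 * ad_spend + 5
    "returns":  [5.0, 4.0, 3.0, 2.0, 1.0],        # falls as ad_spend rises
})

# Pearson correlation between two columns.
pearson = df["ad_spend"].corr(df["revenue"], method="pearson")

# Full pairwise correlation matrix (Pearson by default).
corr_matrix = df.corr()

print(f"ad_spend vs revenue: {pearson:.3f}")
print(corr_matrix.round(3))
```

Because revenue is an exact linear function of ad_spend in this toy data, the coefficient is 1, while ad_spend and returns are perfectly negatively correlated. Real business data will land somewhere in between.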
About the setup, development and presentation process:
Setting up PyTools on the local machine, testing the extension, and studying and customizing the available algorithms. In this step it is important to install Python and package versions that are compatible with each other (pystan, pandas, scipy, prophet, etc.).
Configuring PyTools on the local Qlik Sense server, initially in the development environment and then in production.
Creating relational models and developing metrics, facts and dimensions in SQL Server and Qlik Sense to meet business demands.
Developing dashboards with standard Qlik functionality and the PyTools extension.
Developing a Qlik Mart to optimize data loads in the created apps (backlog).
Using NPrinting to schedule dashboard triggers for user groups (backlog).
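For the first setup step above, it can help to confirm which package versions are actually installed before wiring the extension into Qlik Sense. The sketch below uses only the Python standard library. The package list is an assumption based on the libraries mentioned in this article, and the helper name installed_versions is hypothetical. Check the PyTools documentation for the exact versions it expects.

```python
from importlib import metadata

# Assumed package list, taken from the libraries named in this article.
PACKAGES = ["pystan", "pandas", "scipy", "prophet", "scikit-learn", "hdbscan"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:15} {version or 'NOT INSTALLED'}")
```

Running this inside the same virtual environment that the SSE uses shows at a glance whether a package is missing or at an unexpected version.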
Algorithms and their expressions:
Clustering: This algorithm uses the following expression
Remember, using PyTools as a base is a great way to start a Data Science project. Its base algorithms are solid and can be customized to your needs, so you can work on Data Literacy education within the enterprise environment without a large upfront investment.