My company has multiple databases (PostgreSQL, Microsoft SQL Server, IBM Db2 LUW). My ETL workflow has been:
1. Write a SQL query to get the appropriate data table, which I usually just export as CSV (Extract).
2. Use Python (NumPy, pandas, TensorFlow) for transformation, complex calculations, and analysis, with the results also exported as CSV (Transform).
3. Load the resulting CSV files into Qlik Cloud apps to display in sheets for BI end users (Load).
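To make the workflow concrete, here is a rough sketch of the three steps chained into one script. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for the real PostgreSQL / SQL Server / Db2 sources, and the `sales` table, the `share` calculation, the output file name, and all function names are hypothetical placeholders for the real queries and the real pandas/TensorFlow logic.

```python
import csv
import sqlite3


def extract(conn, query):
    """Step 1 (Extract): run the SQL query and return rows as dictionaries."""
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def transform(rows):
    """Step 2 (Transform): placeholder for the real analysis code."""
    # Hypothetical calculation: each row's share of the total amount.
    total = sum(row["amount"] for row in rows)
    return [{**row, "share": row["amount"] / total} for row in rows]


def load(rows, csv_path):
    """Step 3 (Load): write the CSV file that a Qlik app would then read."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return csv_path


def run_pipeline(conn, query, csv_path):
    """Chain the three steps so a single call replaces the manual routine."""
    return load(transform(extract(conn, query)), csv_path)


def demo_connection():
    """Assumption: in-memory SQLite stand-in for the company databases."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 30.0), ("south", 70.0)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    out = run_pipeline(
        demo_connection(),
        "SELECT region, amount FROM sales",
        "sales_for_qlik.csv",  # hypothetical output file name
    )
    print(f"wrote {out}")
```

Once the steps are wrapped in one entry point like `run_pipeline`, any scheduler or orchestrator can call it after each database reload; the open part is how the resulting file reaches Qlik Cloud.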
Currently, I have to go through these steps manually whenever the database is reloaded. I know I can create data connections for the databases in Qlik (step 1), but I have no idea how to handle steps 2 and 3. I would still prefer to do the analysis and transformation in Python rather than in Qlik's script editor. I am thinking of Apache Airflow as a one-stop shop for the ETL process. Can you advise me on this? As far as I know, the server-side extension (SSE) option is not available for Qlik Cloud SaaS (tenant.url.qlikcloud.com). I am thinking of something similar to this:
You can install the Qlik Platform SDK to connect to Qlik Sense analytics applications from Python and access data for use in embedded applications and machine learning models. So basically you can load the data through a database data connection, then use Python to do advanced analytics.
I have looked into Qlik Data Integration, but by the look of it our company would need to subscribe to a data warehouse service (Amazon S3) before we could even get started. I guess the question is: is the Platform SDK just a band-aid solution that would lack robustness in the long term, and would it be better to adopt Qlik Data Integration now?