White Paper - SQAD: SAP, Qlik, Attunity and Databricks Integration
This document was created to show the integrations between Qlik, Databricks and SAP, with a focus on supporting Databricks’ new Delta Lake offering. The use case documented below is based on SAP IDES Sales and Distribution data stored in an R/4 Azure system. The raw ECC transaction data is moved from SAP to Databricks using Attunity Replicate. A quick description:
Attunity Replicate empowers organizations to accelerate data replication, ingest and streaming across a wide range of heterogeneous databases, data warehouses and Big Data platforms. Used by hundreds of enterprises worldwide, Attunity Replicate moves your data easily, securely and efficiently with minimal operational impact. Find out more at https://www.qlik.com/us/products/attunity-replicate
The base tables were then transformed into a data mart schema with Attunity Compose for Data Lakes, preparing the data for processing by the Databricks ML engine. After processing, the transformed data is landed in Delta Lake format, a new feature of Compose 6.5.
Attunity Compose for Data Lakes automates the data pipeline to create analytics-ready data. By automating data ingest, Hive schema creation, and continuous updates, organizations realize faster value from their data lakes. Find out more at https://www.qlik.com/us/products/attunity-compose-data-lakes
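To make the data mart step concrete, the sketch below shows the kind of denormalization Compose automates: joining two replicated SAP SD base tables, here stand-ins for VBAK (sales order headers) and VBAP (order items), into one flat fact table. This is a minimal illustrative sketch in plain Python, not Compose itself; the field names and sample rows are simplified assumptions.

```python
# Replicated base tables (simplified): VBAK-like headers and VBAP-like items.
headers = [  # order number, customer, order date
    {"VBELN": "0001", "KUNNR": "C100", "ERDAT": "2019-04-01"},
    {"VBELN": "0002", "KUNNR": "C200", "ERDAT": "2019-04-02"},
]

items = [  # order number, item number, material, quantity
    {"VBELN": "0001", "POSNR": "10", "MATNR": "M-01", "KWMENG": 5},
    {"VBELN": "0001", "POSNR": "20", "MATNR": "M-02", "KWMENG": 2},
    {"VBELN": "0002", "POSNR": "10", "MATNR": "M-01", "KWMENG": 1},
]

def build_fact_table(headers, items):
    """Join each item row to its order header, yielding one flat fact row per item."""
    by_order = {h["VBELN"]: h for h in headers}
    return [{**by_order[i["VBELN"]], **i} for i in items]

fact = build_fact_table(headers, items)
```

Each resulting fact row carries both header attributes (customer, date) and item attributes (material, quantity), which is the analytics-ready shape the downstream ML step consumes.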
Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark™ and big data workloads. What makes Delta Lake unique is that, unlike most Hadoop file formats, it supports inserts, updates and deletes, which are core to working with SAP ECC transactional systems. Find out more about Delta Lake at https://delta.io/ . In our use case, Databricks runs a series of machine learning algorithms to predict delivery status based on multiple factors in the data mart. More about Databricks:
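Why inserts, updates and deletes matter here: replicating an ECC system means applying an ordered stream of change records, not just appending files. The sketch below models those semantics with a plain Python dict keyed by primary key; in the real pipeline this is a Delta MERGE on Databricks, and the operation codes and row shapes are illustrative assumptions.

```python
def apply_changes(table, changes):
    """Apply ordered change records (op, key, row) to a keyed table."""
    for op, key, row in changes:
        if op == "I":            # insert a new row
            table[key] = row
        elif op == "U":          # update an existing row in place
            table[key].update(row)
        elif op == "D":          # delete -- impossible with append-only formats
            table.pop(key, None)
    return table

orders = {"0001": {"status": "OPEN"}}
changes = [
    ("U", "0001", {"status": "SHIPPED"}),  # order shipped
    ("I", "0002", {"status": "OPEN"}),     # new order created
    ("D", "0001", None),                   # order archived/deleted
]
apply_changes(orders, changes)
```

An append-only target could only record the "I" events; the "U" and "D" events are exactly what Delta Lake's transactional support makes possible for continuously replicated ECC data.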
Databricks’ mission is to accelerate innovation for its customers by unifying Data Science, Engineering and Business. Databricks provides a Unified Analytics Platform powered by Apache Spark for data science teams to collaborate with data engineering and lines of business to build data products.
Qlik Sense is used as the data integration engine to combine the data from the raw ECC tables with the machine learning output from Delta Lake, correlate it using the Qlik Indexing Engine, and then visualize the combined data set.
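The final combination step can be sketched as a simple key-based association: raw ECC delivery rows are matched to the ML-predicted delivery status on the document number. This is an illustrative Python sketch of the data relationship, not the actual Qlik load script; the field names and sample values are assumptions.

```python
# Raw ECC delivery rows (simplified) and ML output keyed by document number.
raw_deliveries = [
    {"VBELN": "8001", "route": "R-NORTH"},
    {"VBELN": "8002", "route": "R-SOUTH"},
]
predictions = {"8001": "ON_TIME", "8002": "LATE"}

# Associate each delivery with its predicted status for visualization.
combined = [
    {**row, "predicted_status": predictions.get(row["VBELN"], "UNKNOWN")}
    for row in raw_deliveries
]
```

Keying both sources on the same document number is what lets the visualization layer slice raw transactional attributes and model output together.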