Qlik Community


Replicate Synapse / ADLS target - using GZIP files

John_Roberts
Contributor

One of the key pieces in our performance story is the transfer rate between on-prem and Azure.

Currently the data flow goes:

              Netezza db (data is highly compressed)
              Qlik Replicate server load files (data is not compressed)
              Network pipe (on-prem to Azure datacentre)
              ADLS (data is not compressed)
              Synapse tables (data is compressed)

Has your engineering team considered gzipping the Qlik Replicate server load files?

The data flow would then be:

              Netezza db (data is highly compressed)
              Qlik Replicate server load files (gzipped)
              Network pipe (on-prem to Azure datacentre)
              ADLS (gzipped)
              Synapse tables (data is compressed)
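
To make the proposed gzip step concrete, here is a minimal Python sketch of compressing a load file before it crosses the network. It is only an illustration of the idea, not how Replicate works internally: the helper name gzip_load_file, the file name LOAD00000001.csv and the sample rows are all hypothetical.

```python
import gzip
import os
import shutil
import tempfile


def gzip_load_file(src_path: str, level: int = 6) -> str:
    """Compress src_path into src_path + '.gz' and return the new path.

    Hypothetical helper: stands in for whatever step would compress a
    load file before it is uploaded to ADLS.
    """
    dst_path = src_path + ".gz"
    # Stream the file through gzip so large load files are never held in memory.
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb", compresslevel=level) as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Made-up delimited data standing in for a real load file.
        csv_path = os.path.join(tmp, "LOAD00000001.csv")
        with open(csv_path, "w") as f:
            for i in range(100_000):
                f.write(f"{i},customer_{i % 50},2022-01-01,ACTIVE\n")

        gz_path = gzip_load_file(csv_path)
        raw = os.path.getsize(csv_path)
        packed = os.path.getsize(gz_path)
        print(f"raw={raw} bytes, gzipped={packed} bytes, ratio={raw / packed:.1f}x")
```

The ratio you get depends entirely on the data, so it is worth running something like this against a real load file before drawing conclusions.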

 

Based on my reading, the PolyBase engine supports loading .gz files directly into Synapse.

I realize this may affect the parallelization of the load, but in our experience the Synapse load takes 20-35 seconds per 1 GB file, while the network transfer takes about 100 times as long. Giving up some parallelism might therefore be a compromise worth making for a significant throughput boost.
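
As a rough back-of-the-envelope check of that trade-off, the sketch below estimates end-to-end time per file with and without compression. The 20 seconds per GB load time and the 100x network factor come from the figures above; the 4x compression ratio is purely an assumption for illustration, and the model ignores the CPU time spent gzipping and assumes the Synapse load time per GB of raw data stays the same for gzipped input, which may not hold.

```python
def end_to_end_seconds(size_gb: float,
                       load_s_per_gb: float,
                       network_factor: float,
                       compression_ratio: float = 1.0) -> float:
    """Estimate network transfer time plus Synapse load time for one file.

    size_gb           -- uncompressed file size in GB
    load_s_per_gb     -- Synapse load time per GB of uncompressed data
    network_factor    -- how many times slower the network move is than the load
    compression_ratio -- assumed uncompressed/compressed size ratio (1.0 = no gzip)
    """
    network_s_per_gb = load_s_per_gb * network_factor
    # Only the bytes on the wire shrink; the load is modelled as unchanged.
    network_s = (size_gb / compression_ratio) * network_s_per_gb
    load_s = size_gb * load_s_per_gb
    return network_s + load_s


if __name__ == "__main__":
    plain = end_to_end_seconds(1.0, load_s_per_gb=20, network_factor=100)
    # 4.0 is a hypothetical compression ratio, not a measured one.
    gzipped = end_to_end_seconds(1.0, load_s_per_gb=20, network_factor=100,
                                 compression_ratio=4.0)
    print(f"uncompressed: {plain:.0f} s, gzipped: {gzipped:.0f} s, "
          f"speedup: {plain / gzipped:.2f}x")
```

Because the network leg dominates, even a large slowdown in the load step would be outweighed by a modest compression ratio under these assumptions.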

Please ask your team whether they have explored this idea. It would be very easy to implement or prototype to see whether it yields performance improvements.

1 Comment
Shelley_Brennan
Employee

Thank you for the suggestion. We have considered this for PolyBase, but we are also looking to implement COPY instead of PolyBase and will probably support optional compression with the COPY method in a future release.

Status changed to: Open - Collecting Feedback