Compression of Replicate CDC Partitions

ragesh
Contributor II

We have been experiencing a performance issue with Compose CDC: it takes a long time to begin the data load from the landing zone to the storage zone. On analysis, we found that as the number of Replicate CDC partition files increases, the time taken by Compose CDC also increases. The suggestion we received was to compress these partition files and merge them into a single file. It was also recommended to perform this compression at least biweekly, so that Compose performance stays optimized.

Currently this requires a series of steps to be run manually on each of the Replicate tasks. It would be nice to have this automated as a task, similar to the Compactor in Compose, so that we can schedule it and save a lot of manual effort.
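To give a rough idea of the kind of manual merge-and-compress step described above, here is a minimal Python sketch. It is not the procedure Qlik recommended and does not use any Replicate or Compose API. It assumes the partition files are plain line-delimited text files (matched here by a hypothetical `*.csv` pattern) sitting in a directory on a local or mounted file system; the function name `merge_partition_files` and the output name `merged.csv.gz` are made up for illustration. Files stored on HDFS or registered in Hive would need the platform's own tooling instead.

```python
import gzip
from pathlib import Path


def merge_partition_files(partition_dir, output_name="merged.csv.gz", pattern="*.csv"):
    """Concatenate the small files in one partition directory into one gzip file.

    Assumption: every source file is line-delimited text without a header row.
    Returns the list of source files that were merged. Nothing is deleted here;
    removing the originals is left to the caller after verifying the result.
    """
    partition = Path(partition_dir)
    sources = sorted(p for p in partition.glob(pattern) if p.is_file())
    if not sources:
        return []

    target = partition / output_name
    tmp = partition / (output_name + ".tmp")

    # Write to a temporary file first so a failed run never leaves a
    # half-written merged file under the final name.
    with gzip.open(tmp, "wb") as out:
        for src in sources:
            data = src.read_bytes()  # partition files are assumed to be small
            out.write(data)
            # Keep records from adjacent files on separate lines.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")

    tmp.replace(target)
    return sources
```

A call such as `merge_partition_files("/path/to/one/partition")` could then be placed in a scheduler (for example a biweekly cron job) for each task's partition directories, which is the part this idea asks to have built into the product.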

4 Comments
Tzachi_Nissim
Employee

Hi ragesh,

Thank you for your input. We delivered a different solution to the same problem, which revolves around dropping partitions that have already been processed. The current solution is provided in Replicate and still requires a script to be run to complete it, but we are planning to complete this in Compose as well, so that processed partitions are dropped automatically (with the ability for the user to define settings around this).

I'm setting this to "collecting feedback" since your suggestion is around compression rather than deletion. This is not currently planned, but I am opening this for wider feedback as well as for your response to the deletion option.

Regards,

Tzachi

Status changed to: Open - Collecting Feedback
mgarret2
Contributor III

We do not use Compose; however, we have the same need for Replicate to better manage the partitions it creates on Hadoop, as it can generate a significant number of small files, which creates performance issues. Additionally, over time we can end up with tables that have too many partitions, which also impacts performance. We do not want to delete the data, but partition compression and consolidation are required.

FritzC
Partner - Contributor

Hi Tzachi,

We are using Compose and testing on a CDP 7.1.4.x platform, and would like the partitions to be dropped automatically. Can you perhaps share how we can achieve this within Compose?

Kind regards,

Fritz

TimGarrod
Employee

This suggestion was for Qlik Compose for Data Lakes, which is no longer a supported product.  

Qlik Compose (its replacement) has live view features, and different processing patterns. 

Closing and archiving this due to relevancy.

Status changed to: Closed - Archived