Utilizing the Provisioning Zone when HIVE projects were historically used
We have historically implemented Compose HIVE projects, which do not have the Provisioning Zone functionality. We also have issues with our large HDS tables: the current and full views over them are not usable from our downstream tools.
One option we are exploring is creating daily snapshots of the large tables in order to bypass the current/full views.
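As a rough sketch of that snapshot idea, the script below builds a dated CREATE TABLE AS SELECT statement that freezes today's state of a current view into a plain table. The schema and table names (`hds.orders__current`, `snapshots`) and the JDBC URL are placeholder assumptions, not anything Compose generates; the script only prints the statement, with the actual submission left as a comment.

```shell
#!/bin/sh
# Hypothetical sketch: materialize a daily snapshot of a large HDS table so
# downstream tools read a plain table instead of the current/full view.
# All object names below are placeholders for your own schemas/tables.

SNAP_DATE=$(date +%Y%m%d)
SRC_VIEW="hds.orders__current"               # assumed name of the current view
SNAP_TABLE="snapshots.orders_${SNAP_DATE}"   # one table per day

# CTAS statement that freezes today's state of the view.
SQL="CREATE TABLE ${SNAP_TABLE} STORED AS PARQUET AS SELECT * FROM ${SRC_VIEW}"

# Sketch only: print the statement. In production you would submit it, e.g.:
#   beeline -u "jdbc:hive2://hive-server:10000/default" -e "$SQL;"
echo "$SQL"
```

Old snapshot tables would need to be dropped on a schedule, and downstream tools pointed at the latest dated table (or a view over it).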
Version 9 has been touted to solve some of the performance issues, so another option is to wait for version 9.
1. Can one create a decoupled Spark Compose job that only creates a Provisioning Zone, and keep the creation of the Storage Zones in the HIVE project? If this is technically possible, is it a good idea?
2. Is there a similar "exit" in the Compose HIVE step where one could potentially hook in a "snapshot" insert statement into a new schema?
Hi @Corne - what cluster/Hadoop platform are you on? And do you have the ability to enable Hive ACID transactions?
#1 - You cannot take a Spark project and use a Hive-based storage layer. The projects are built to handle the end-to-end processing.
#2 - There is a Command Task which can run anything on the Compose server. This could be a script or executable that connects to Hive and runs additional statements. The Command Task can be hooked to a Compose processing task via the Workflow feature.
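A minimal sketch of such a Command Task script is below, assuming it is registered as a Command Task and chained after the storage task in a Workflow. The partitioned snapshot table, source view, and JDBC URL are all placeholder assumptions; `set -e` makes the script exit non-zero if a step fails, which is how the Workflow would see the task as failed. The sketch prints the beeline invocation rather than executing it.

```shell
#!/bin/sh
# Hypothetical Command Task script: write today's state of a current view
# into a dated partition of a pre-created snapshot table.
# Object names and the JDBC URL are placeholders for your environment.
set -e  # a non-zero exit signals failure back to the Compose Workflow

SNAP_DATE=$(date +%Y-%m-%d)
SQL="INSERT OVERWRITE TABLE snapshots.orders PARTITION (snap_date='${SNAP_DATE}') SELECT * FROM hds.orders__current"
CMD="beeline -u jdbc:hive2://hive-server:10000/default -e"

# Sketch only: print the command that would run. Drop the 'echo' to execute.
echo $CMD "$SQL;"
```

Using INSERT OVERWRITE into a partition keeps the script rerunnable for the same day, since a retry simply rewrites that day's partition instead of duplicating rows.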