Hi,
We are struggling with our current QlikView (QV) ETL process and source-data storage, and are looking to move to a Hadoop or AWS solution. But first we have a few questions:
1) Our source data is all Excel files, which are imported and stored directly as QVDs.
2) These QVDs are then reloaded, and various ETL statements are run against them.
3) The final QVDs are then stored and loaded into the client-facing 'Live' dashboard.
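To make the shape of the pipeline concrete, the three steps above could be sketched like this (Python used purely as illustration; the function names, sample data, and placeholder transform are all made up, not our real logic):

```python
# Minimal sketch of our current three-stage flow (all names invented).
def extract(excel_rows):
    """Stage 1: Excel rows land in a raw store (today: QVDs)."""
    return list(excel_rows)

def transform(raw_rows):
    """Stage 2: reload the raw store and apply ETL statements."""
    return [(region, amount * 2) for region, amount in raw_rows]  # placeholder

def load(final_rows):
    """Stage 3: persist the final output for the 'Live' dashboard."""
    return {region: amount for region, amount in final_rows}

final = load(transform(extract([("North", 100), ("South", 75)])))
print(final)  # → {'North': 200, 'South': 150}
```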
My main question is: how, within Hadoop, would we carry out the various ETL statements? I've heard of Pig, Hive, and Sqoop, but I'm unsure which is the best fit. These transformations also need to run on a monthly basis. How would that be scheduled?
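To illustrate the kind of monthly transformation I mean, here is a rough sketch in SQL, which is roughly what I understand a Hive (HiveQL) job would look like. SQLite is used here purely as a runnable stand-in for Hive, and the table names, columns, and data are invented; my assumption is that in Hadoop this SQL would live in a script run by something like cron or Oozie each month:

```python
# Sketch only: SQLite stands in for Hive. The table/column names and
# sample data are invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Staging table: rows as they might land after ingesting the Excel extracts.
cur.execute("CREATE TABLE staging_sales (region TEXT, month TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO staging_sales VALUES (?, ?, ?)",
    [("North", "2015-01", 100.0),
     ("North", "2015-01", 250.0),
     ("South", "2015-01", 75.0)],
)

# The 'ETL statement': aggregate staging data into a final,
# dashboard-ready table.
cur.execute("""
    CREATE TABLE final_sales AS
    SELECT region, month, SUM(amount) AS total_amount
    FROM staging_sales
    GROUP BY region, month
""")

for row in cur.execute("SELECT * FROM final_sales ORDER BY region"):
    print(row)
# → ('North', '2015-01', 350.0)
# → ('South', '2015-01', 75.0)
```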
Once all the ETL steps have run, where and how is the final data stored, and how is it then loaded into our client-facing app?
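By way of example of the kind of hand-off I'm imagining (entirely hypothetical: the file layout, column names, and data are invented), the final table could be dumped to a flat file such as CSV that the dashboard layer then picks up:

```python
# Hypothetical hand-off: write a final result set as CSV for the
# client-facing app to load (schema and data invented).
import csv
import io

final_rows = [
    ("North", "2015-01", 350.0),
    ("South", "2015-01", 75.0),
]

buf = io.StringIO()  # stand-in for a real file or object-store upload
writer = csv.writer(buf)
writer.writerow(["region", "month", "total_amount"])
writer.writerows(final_rows)
print(buf.getvalue())
```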
Many thanks for any advice!
Phil