Anonymous
Not applicable

Data Architecture best practice

We are building a new enterprise solution, and the data size is huge: ~60 GB of raw data. What should the ideal data model be?

1. Should we do the back-end processing (i.e. roll-ups, filters, aggregation, etc.) in a back-end tool such as Alteryx, Python, or Teradata, and use QlikSense just as a front-end reporting tool, or should we do the entire processing in the QlikSense ETL layer?

2. Is it best practice to use a star schema as the QlikSense back-end design, or can we also follow a snowflake schema, with the decision being scenario-dependent? What is the best data model for handling large data sets?

3. The number of users is ~2,000.

2 Replies
tresesco
MVP

Abhishek,

Architecture best practices depend on numerous things beyond just data size and number of users. However, to answer your points:

1. A proper ETL tool would of course be a better choice than doing everything in Qlik. I have not worked with Alteryx, so I don't know its capabilities or whether it could be called a proper ETL tool. If your budget permits, you can always go for specialist tools for better performance. And yes, Qlik is not specialized in ETL. (A minimal sketch of this split follows after point 2.)

2. In the real world, star schema modelling is really difficult. Theoretically, star schema is said to be best, but in practice I would say: try to keep your data model as converged as possible, i.e. with as few tables and association hops as you can manage. (See the flattening sketch below.)
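
To illustrate the split described in point 1, here is a minimal back-end sketch in Python/pandas, since Python was one of the tools mentioned. It is only a sketch under assumptions: the file sales_raw.csv, its column names, and the cut-off date are hypothetical placeholders, not anything from this thread. The idea is that the heavy roll-up happens outside Qlik, and QlikSense only loads the small extract.

# back_end_rollup.py: hypothetical pre-aggregation job run outside Qlik.
import pandas as pd

# Read the raw extract (in practice this might come from Teradata or flat files).
raw = pd.read_csv("sales_raw.csv", parse_dates=["order_date"])

# Filter first, then roll up to the grain the dashboards actually need,
# so QlikSense never has to load the full ~60 GB of raw rows.
recent = raw[raw["order_date"] >= "2015-01-01"]
rollup = (
    recent
    .groupby(["order_date", "region", "product_id"], as_index=False)
    .agg(units=("quantity", "sum"), revenue=("amount", "sum"))
)

# Write a compact extract for the QlikSense load script to pick up as-is.
rollup.to_csv("sales_rollup.csv", index=False)

The design point is simply that filtering and aggregating before the data reaches Qlik shrinks both reload time and the RAM footprint for the ~2,000 users hitting the app.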
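
And a sketch of the "keep the model converged" advice in point 2: collapsing a snowflaked dimension chain (product -> category) into one flat star-schema dimension in the back end, so the Qlik engine has fewer association hops to resolve. Again, the table and column names here are hypothetical.

# flatten_dimensions.py: hypothetical snowflake-to-star flattening step.
import pandas as pd

# Snowflaked lookups: product points to category via category_id.
product = pd.read_csv("product.csv")    # product_id, product_name, category_id
category = pd.read_csv("category.csv")  # category_id, category_name

# Denormalize into one wide dimension so the final model is a simple star:
# one fact table plus flat dimension tables.
dim_product = product.merge(category, on="category_id", how="left")

dim_product.to_csv("dim_product.csv", index=False)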

Many other aspects come into play, such as:

Hardware architecture (clustering if needed), RAM availability, enabling/disabling NUMA/hyper-threading, multi-layer processing... to mention a few.

Anonymous
Not applicable
Author

Thanks Tresesco 🙂