Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
valpassos
Creator III
Creator III

Qlik Sense as an ETL tool

Hi community,

I'm into Qlik Sense for about a year now, as a consultant, and there are still some aspects about the tool that yields me some suspicion.
I'm evangelized to say that Qlik Sense works as an ETL tool, but I hear a lot that just because it can work as one, it doesn't mean it should be used like one.

I never really understood what exactly should be done upstream QS. Which business logic are we talking about?

The architecture of QS says to extract all tables into QVDs, then perform all the transformations and associations and apply all the business rules upon those QVDs.
So, if I have lots of relationships between tables in the database, like:

 

...
FROM
 table1 t1,
 table2 t2,
 table3 t3,
 table4 t4
WHERE 
 t1.id = t2.id
 and t3.categoryId = 10
 and t3. = t1.Id
 and (t1.statusId = 2 or t1.statusId = 3)
 and t1.id = t4.id;


In QS, according with the best practices, I should extract tables 1, 2, 3 e 4 each one to a QVD (one QVD for each table), and then perform all the WHERE clauses within the QS script, though joins and whatever.

But, all the WHERE clauses are business rules, right? Shouldn't they be done in the database? If I'd need this information in other system or tool, I would need to replicate these rules in that new system.

And in the case of validation, isn't this schema worse if I want to perform data validation? If I need to trace the origin of a problem, I'd need to validate each business rule applied in QS script, and that could be cumbersome.

I'm really just looking for some testimonies from professionals who have great experience working with QS as an ETL tool. What should and shouldn't be done in the QS script?

Thanks!

Labels (2)
4 Replies
passionate
Specialist
Specialist

If you know that the query output will yield all required fields. then you can create a view at database side and just pull that in QS.

Best approach to do transformation logic in Qlik project is as below: 1 being best

1. Database in view

2. Qlik Script

3. Qlik Front end

If you can achieve what you want in view then go ahead with that.

Doing everything in Qlik give you advantage of flexibility.

valpassos
Creator III
Creator III
Author

Hi @passionate !

Thanks for your input.

So, database views: is that normally the best approach even when we are dealing with millions and millions of records on a daily basis?

What are the practical differences between creating a view in the database and extracting the tables one by one and then apply all the required associations in Qlik script? And whose decision should this be? The Qlik developer or the client/stakeholder DBA? (database administrator)

Looking forward for more testimonies of Qlik professionals regarding this topic!

MynhardtBurger
Contributor III
Contributor III

My view is that:

  • If the query tables will only by used in that specific query
  • No other model can reuse the underlying table data
  • Structure of the final result won't change

Then pushing the calculation to the database server will likely yield the best overall performance.

If however you are still in the early development phase (where your table structures change regularly) or the raw tables will be reused within your environment, then dumping the raw tables to QVDs first is best. This saves resources (cpu/memory/network) on the database server side, while allowing you flexibility to transform and reuse existing data as needed.

You will find that you end up with a mixture of using database views and dumping the raw tables to qvd's before doing your own transformations. It is about optimizing the use of your resources. Eliminate duplication and performing the calculations where you have spare cpu/memory.

Sticking to the normal 3 stage process is a good rule of thumb, but there will always be exceptions.

Sometimes you have competent people on the data source side who knows exactly how to structure the queries to give you reliable data. In that case you are lucky and put reliance on the DBA to generate the necessary views, or to provide info on the transformations required. In other cases the people on the data source side aren't willing or able to provide your with the necessary views/queries. In that case you end up having to figure things out yourself which means you end up doing almost all the transformations within qlik script.

valpassos
Creator III
Creator III
Author

Hi @MynhardtBurger!

That was a very clear explanation. Thank you for sharing your experience 🙂

So far I've concluded this:
It is very rare to start a project with some business rules defined, because typically the client wants to see EVERYTHING, or at least wants the possibility to see everything, so he won't filter out anything at a database level. Which means we always end up dumping the raw tables to QVDs first and then we figure things out for ourselves, as you say.

This is what prompt me to ask what normally is the approach taken by a consultant. I can see now that is not a right/wrong answer.

Thank you once again!