A common situation in Business Intelligence is that you have data in different data sources. It could e.g. be that you have several data systems within your company or that you have some data in an Excel spreadsheet in addition to the data in your database.
In any case, you want to load data from several sources and view them in a coherent way. This is sometimes referred to as merging data or blending data.
Not all BI tools can do this – you sometimes have to rely on external tools or SQL to do this prior to loading the data into your BI tool. QlikView, however, can do this easily.
If you have two different database systems, you need two different connect strings in the script:
ODBCCONNECTTO<Database 1>;
SQLSELECT * FROM TableA;
ODBCCONNECTTO<Database 2>;
SQLSELECT * FROM TableB;
At any place in the script, a SELECT statement will use the latest CONNECT string. In addition, LOAD statements will load data from files, disregarding the CONNECT statement. This way you can merge data from any number of databases and any number of files. Simple!
The next question is how to merge the two tables. In principle there are two ways to do this: Concatenating them (a union) or linking them (joining). Which one to use, depends on the situation.
Concatenation should be used, if you have two tables with basically the same type of entity, but different data sets; for example “Customers in Europe” and “Customers in North America”. In this case, you want both tables to be merged into one. See picture above. The script then becomes:
ODBC CONNECT TO<Database 1>;
SQL SELECT * FROM Customers;
ODBC CONNECT TO<Database 2>;
Concatenate SQL SELECT * FROM Customers;
But if you instead have a situation where the tables contain different entities, and a selection of an entity in one of the tables should imply one or several entities in the other table, then you should usually link the tables.
The script then becomes:
ODBC CONNECT TO<Database 1>;
SQL SELECT * FROM Customers;
ODBC CONNECT TO<Database 2>;
SQL SELECT * FROM Orders;
Linking tables is from a logical-mathematical perspective identical to an outer join, but QlikView keeps the two tables separate. Keeping them separate has the advantage that calculations are made in the “right table”.
Not all BI tools can perform an outer join as easily as this. I recently read an article about how to join an “Opportunities” table with a “Leads” table using a competing tool, and it was all but simple. But with QlikView, it is straightforward: Just make sure that the linking key field is named the same in both tables (and that no other fields are) and it will work right away.
The ability of QlikView to load data from any number of sources and merge it any way you want is one of the major strengths of QlikView and its script. Use it.
Thanks HIC. Quick question about concatenation of the QVD files – Let’s say, we have two QVD files called {Full, Subset} with {10, 5} fields respectively. And note that “Subset” QVD file contains same field names as “Full” QVD file. However, it’s just subset of the data. Now, I’d expect to see optimized load after concatenation but QlikView doesn’t perform optimized load. Please can you throw some light on this? May be, new blog post on optimized load…
If you concatenate two QVD files, QlikView needs to use the symbols from the first loaded QVD also for the second QVD, so I think that the second QVD always will be loaded un-optimized. (see also Blog on symbol tables)
I've used this approach but inside a FOR NEXT LOOP. On each iteration of the loop the script connect to a different database but extract a set of tables with same structure. At the end, these tables are consolidated on a single one with concatenation.
It was my understanding that you could have only one CONNECT string per Qlikview document?
So it is possible to CONNECT to more than one Database in the Qlikview Script? Is there a limit to the number of CONNECT to different Databases in the same Qlikview Script?