Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have a general query which I wanted to throw out there, because I suspect many people have a similar issue.
The data which we can collect into QlikView models will usually have various coded data. These codes will often have descriptions we can use in our QV presentation, but not always. Sometimes they won't, or we wish to handle the grouping and/or descriptions differently.
There are ways to deal with this - an inline table or Excel spreadsheet for example. But these solutions are not ideal, and not easily supportable, so they get out of date over time.
I am interested to know how other people deal with this situation?
Regards
Hi Richard,
this is a very open question, and the answer is, as usual, "it depends".
When people have incomplete or "dirty" data coming in from your various sources, they usually supplement the data with some sort of a "cleansing" process and a set of mapping tables to assist in the process of cleansing. The approach typically depends on the kind of tools that are radically available to the person or to the group:
- Inline load is too inflexible, and therefore I would only recommend it for very static data that is not likely to change (For example, 1=Male, 2=Female)
- If the users are business analysts and their main working tool is Excel, then excel is typically used for the supplemental tables. The obvious drawback is that the information needs to be maintained and becomes stale over time. It is also easy to break the structure of the spreadsheet and cause a failure of your load script.
- When the users are IT Developers and they have access to databases like SQL Server and Oracle, they may opt to develop a set of "cleansing" tables on the database. This structure is a bit more robust, but still needs to be maintained separately from the main data source.
- Finally, when the developers have control over the data source itself, they can add new attributes to the "master tables" just for the sake of data cleansing. This is ultimately the best scenario, because the extra data fields can be maintained at the same time as the main data record is being added or modified. However, this solution is not always available because in most cases, we as QlikView developers have very little or no control over the original data source.
I think I covered most of the common uses. Any other thoughts, folks?
Oleg Troyansky
www.masterssummit.com
Hi Oleg
Thanks for a nice summary of options. It is encouraging that others face the same issue, and there are no obviously ideal solutions.
Unless anyone else knows otherwise ?
Regards