Data Governance & Data Management with QlikView & QlikView Expressor

    Cultivating a culture that emphasizes consistency and reusability is vital when introducing successful data governance practices. A common problem with many decision support systems is the amount of variation, redundancy and overlap that exists within the data models and business logic used across multiple analytical applications. These problems can delay critical decisions and disrupt IT operations while users struggle to verify the truth in the data. Having data is one thing; having “good data” is another. As the volume of data increases, it is important to have tools that monitor the environment and help create a structured, consolidated data management layer containing reusable and consistent definitions. This in turn gives developers and business users assurance that the data they are using, whether to develop applications or make decisions, is “good data”. It also expedites the process of creating new applications and eliminates much of the guesswork in maintaining applications as business requirements evolve over time.

     

    Data Governance

     

    The term Data Governance can be ambiguous because its definition is still emerging. It can be defined simply as the exercise of authority over data-related matters. It ensures that important information assets are formally managed throughout the enterprise and can be trusted to support effective decisions.

     

    Some of the goals of applying Data Governance practices include:

     

    • Increasing consistency
    • Reducing redundancy
    • Improving regulatory compliance
    • Improving security
    • Introducing best practices and repeatable processes
    • Encouraging reuse
    • Conforming column definitions across all applications

     

    As with many business intelligence solutions, Data Governance generally needs some sort of metadata repository or data dictionary to work from in order to answer critical deployment questions. Once in place, Data Governance will influence the actions and conduct of the people who implement and follow these practices.

     

    Metadata Management

     

    The term Metadata is also ambiguous and has an evolving definition. It can and always will be defined differently by those who work with it. In the context of QlikView applications or Business Discovery, however, metadata can be defined simply as data about data. Within a QlikView ecosystem, there are two types of “data” that can be described:

     

    • Source Data - data that is used to make business decisions, such as organizational data
    • QlikView Deployment Data - data about the structural elements that make up a QlikView deployment

     

    Metadata's overall purpose is to increase the value of data by providing additional context. When managed effectively, it can be created once, centralized and reused in a self-service manner across multiple applications. It can also be used to answer questions about data lineage and impact analysis for the data or the applications it describes. In turn, it ensures consistency and understanding of data across the entire deployment for both IT and QlikView users. When applied correctly, it can improve the overall effectiveness and efficiency of a QlikView deployment.
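
    To make the lineage and impact analysis questions concrete, here is a minimal Python sketch. It assumes the deployment metadata has already been collected into a simple "feeds into" mapping; the source, QVD and QVW names are hypothetical, and this is only an illustration of the idea, not how any QlikView product stores or queries its metadata.

```python
from collections import deque

# Hypothetical lineage metadata: each key feeds the items listed as its value.
# None of these names come from a real deployment.
LINEAGE = {
    "crm.customers": ["Customers.qvd"],
    "erp.orders": ["Orders.qvd"],
    "Customers.qvd": ["Sales.qvw", "Churn.qvw"],
    "Orders.qvd": ["Sales.qvw"],
}


def downstream(item, lineage):
    """Impact analysis: everything that directly or indirectly depends on item."""
    seen = set()
    queue = deque([item])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


def upstream(item, lineage):
    """Data lineage: everything that item is directly or indirectly built from."""
    # Invert the mapping, then reuse the same breadth-first walk.
    reverse = {}
    for parent, children in lineage.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    return downstream(item, reverse)


if __name__ == "__main__":
    print("Affected by a change to crm.customers:", downstream("crm.customers", LINEAGE))
    print("Sales.qvw is built from:", upstream("Sales.qvw", LINEAGE))
```

    With the sample mapping, a change to crm.customers affects Customers.qvd and both applications that load it, while Sales.qvw traces back to both QVD files and both source tables.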

     

    Whether it describes data used to make business decisions or data about a QlikView deployment, metadata helps bridge the gap between the way users work with data and how computer applications process it.

     

     

    1. What 2 products are used to introduce Data Governance and Metadata Management to a QlikView deployment?

     

    a) The first product is the QlikView Governance Dashboard (QVGD). This is a free product available on QlikMarket which contains a QlikView dashboard (.QVW file) and a run-time processing engine. Its overall function is to retroactively scan one or more QlikView deployments, create a QlikView associative data model and present various KPIs and metrics about them. It is intended largely for IT and other technical staff who need visibility and insight to answer the question "What is going on in my QlikView deployment?". The value of the QVGD is that it allows them to act on their findings, for example by instituting data governance practices in their QlikView environment, and in turn to measure its overall effectiveness and efficiency.

     

    Some examples of the questions answered include:

     

    • Which QVD/QVX files and fields are or are not being used?
    • How many QlikView applications exist in my deployment?
    • What data is or is not being used, and by which QV apps?
    • Which expressions/labels are being used the most (recurring / overlapping)?
    • Which sheet objects are being used, and how many of each?
    • What sources of data are being accessed?
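
    The questions above boil down to set and counting operations over an inventory of the deployment. The Python sketch below illustrates that under stated assumptions: the inventory dictionaries, file names, field names and expressions are all hypothetical stand-ins for whatever a scan would collect, and the code is not how the QVGD itself is implemented.

```python
from collections import Counter

# Hypothetical scan results; a real inventory would come from the deployment itself.
QVD_FIELDS = {
    "Customers.qvd": {"CustomerID", "Region", "Segment"},
    "Orders.qvd": {"OrderID", "CustomerID", "Amount"},
    "Legacy.qvd": {"OldCode"},
}
APPS = {
    "Sales.qvw": {
        "qvds": {"Customers.qvd", "Orders.qvd"},
        "fields": {"CustomerID", "Region", "Amount"},
        "expressions": ["Sum(Amount)", "Count(DISTINCT CustomerID)"],
    },
    "Churn.qvw": {
        "qvds": {"Customers.qvd"},
        "fields": {"CustomerID", "Segment"},
        "expressions": ["Count(DISTINCT CustomerID)"],
    },
}


def unused_qvds(qvd_fields, apps):
    """QVD files that no application loads."""
    used = set().union(*(app["qvds"] for app in apps.values()))
    return sorted(set(qvd_fields) - used)


def unused_fields(qvd_fields, apps):
    """Fields stored in some QVD but referenced by no application."""
    available = set().union(*qvd_fields.values())
    used = set().union(*(app["fields"] for app in apps.values()))
    return sorted(available - used)


def recurring_expressions(apps):
    """Expressions that appear in more than one application, with the app count."""
    counts = Counter()
    for app in apps.values():
        counts.update(set(app["expressions"]))  # count each app at most once
    return {expr: n for expr, n in counts.items() if n > 1}


if __name__ == "__main__":
    print("Unused QVD files:", unused_qvds(QVD_FIELDS, APPS))
    print("Unused fields:", unused_fields(QVD_FIELDS, APPS))
    print("Recurring expressions:", recurring_expressions(APPS))
```

    In this sample, Legacy.qvd is never loaded, two fields are stored but never referenced, and one expression recurs in both applications, which makes it a candidate for a shared definition.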

     

    Please refer to the QVGD product landing page on our web site for more information.

     


     

    b) The second product is QlikView Expressor Desktop / Server, which comprises four components: a design environment (QVE Desktop); a Repository for version control and team development; a server-side Engine, so that created content can be deployed and executed on a server (QV Server / Publisher); and the QlikView Expressor Connector.

     


     

    There are 3 license options for QlikView Expressor:

     

    • A free Desktop edition (interactive execution only)
    • Standard (8-core processing limit, repository, engine)
    • Enterprise (unlimited cores, repository, engine)

     

    QlikView Expressor Desktop is used to prepare and manage data for QlikView applications. Its primary function is to create a Dataflow that visually provisions data for QlikView (accessing, conforming and cleansing it, and so on). There are components to access, cleanse and transform data, and to control its flow and output to QlikView and other target systems. Along the way, QlikView Expressor defines and captures the source, target and business rule metadata, which can be reused in other projects and across multiple QlikView applications. It can help reduce QlikView scripting in certain cases and offers a repeatable way of defining metadata-driven QlikView applications. It provides an easy-to-use interface that most QlikView developers will feel comfortable with.

     

    The Repository provides storage and version control for the design-time model components used to create the Dataflow (connections, schemas, business rules, templates, etc.).

     

    The Server (an engine component known as etask.exe) simply executes what has been created, on the QV Server / Publisher machines.

     

    Figure: QlikView Expressor Desktop and a Dataflow with data output to QlikView

    Figure: QlikView Expressor Desktop Rules Editor - defining a parameterized, reusable business rule

     

    2. What are some uses of QlikView Expressor within QlikView?

     

    In summary, both the QlikView Governance Dashboard and QlikView Expressor enable discovery and understanding of a QlikView deployment and its data by applying data governance, increasing reuse and facilitating the creation of metadata-driven QlikView applications across the entire QlikView environment.

     

    When creating QlikView applications, there are a few ways one can prepare data for QlikView.

     

    • One can access the data directly in the QlikView application (.QVW) via QlikView's connectors to databases, files and web services, then use SQL and LOAD script functionality to further transform the data needed for the application.

     

    • QVWs can also be used just to prepare the data with LOAD scripts, without the layout and chart objects. Connectors, SQL and LOAD scripts are used to access, conform and cleanse the data and to create a QlikView data file known as a .QVD file (the QlikView Data layer). Other QlikView applications can use that QVD file if needed. These processes can be scheduled and refreshed as needed using QlikView Publisher (the Distribution Service and its task manager).

     

    Because QlikView is so user-friendly and addictive to use, anyone can rapidly create content to answer business questions. As QlikView deployments expand throughout an organization, however, multiple versions of the same rules, metrics and column definitions may come to exist, or may be defined differently across similar applications. This can lead to differing conclusions, reducing confidence in the data and therefore delaying decisions. The QlikView Governance Dashboard can help identify these areas of concern, and QlikView Expressor can help provide a way to manage reusable and consistent data for those QlikView applications as the environment continues to grow.
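
    As a rough illustration of how diverging definitions can be spotted, the Python sketch below groups metric definitions by label and reports any label that maps to more than one expression. The application names, labels and expressions are hypothetical, and the whitespace-stripping normalization is a deliberately crude assumption; it is not a description of how the Governance Dashboard compares expressions.

```python
from collections import defaultdict

# Hypothetical (application, label, expression) triples gathered from several apps.
DEFINITIONS = [
    ("Sales.qvw", "Revenue", "Sum(Amount)"),
    ("Finance.qvw", "Revenue", "Sum(Amount) - Sum(Discount)"),
    ("Regional.qvw", "Revenue", "Sum( Amount )"),
    ("Sales.qvw", "Customers", "Count(DISTINCT CustomerID)"),
    ("Churn.qvw", "Customers", "Count(DISTINCT CustomerID)"),
]


def normalize(expression):
    """Strip all whitespace so purely cosmetic differences are not flagged.

    Crude on purpose: it would also collapse whitespace inside quoted strings.
    """
    return "".join(expression.split())


def conflicting_labels(definitions):
    """Return labels defined by more than one distinct expression, with the apps using each."""
    by_label = defaultdict(lambda: defaultdict(list))
    for app, label, expression in definitions:
        by_label[label][normalize(expression)].append(app)
    return {
        label: {expr: sorted(apps) for expr, apps in variants.items()}
        for label, variants in by_label.items()
        if len(variants) > 1
    }


if __name__ == "__main__":
    for label, variants in conflicting_labels(DEFINITIONS).items():
        print(f"'{label}' has {len(variants)} different definitions:")
        for expr, apps in variants.items():
            print(f"  {expr}  <- {', '.join(apps)}")
```

    Here "Revenue" is flagged because one application subtracts discounts and the others do not, while "Customers" is defined identically everywhere and is not reported.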

     

    3. What data sources / targets can QlikView Expressor read / write?

     

    QlikView Expressor can read and write a variety of data using Read and Write Operators. For data sources where an operator does not exist, Read Custom and Write Custom operators can be used along with the Datascript syntax.

     

    Sources and targets include:

     

     

    • Common RDBMSs (included drivers: MySQL, SQL Server, PostgreSQL, DB2, Oracle, Sybase, Informix, Netezza)
    • Files (Flat / Delimited, Excel, QlikView, Fixed)
    • Apache Hive
    • Cloudera Impala
    • MongoDB (via Datascript API)
    • Salesforce
    • Teradata
    • Generic ODBC (requires DSN configuration)
    • QVX Connector - any QlikView connector that has been built using the QVX specification

     

    4. How do you connect QlikView to QlikView Expressor?

     

    QlikView Expressor can read and write QlikView QVD files, so the QVD output it creates can be used as a data source within QlikView application design like any other QlikView data file. If you output to QVX with QlikView Expressor, you have the option of using the QlikView Expressor Connector (QVEC), which allows you to source data directly from the QVE Dataflow without having to explicitly reference the .QVX file from a LOAD script. The QVEC gives you access to something similar to a traditional metadata repository: "Deployment Packages" defined within QVE projects can be accessed, and they expose all the Dataflows that will be used to provision data for the QlikView application. The QlikView Expressor Connector works only with Dataflows that output QVX.

     

    5. Where can QlikView Expressor fit?

     


    QlikView Expressor (QVE) provides data governance and data management within a QlikView environment, giving visibility into QlikView deployments and confidence in their data. Its strengths enable the creation of a single conformed data management layer that can be used to drive QlikView applications. QlikView Expressor has also been used as an ETL (Extract, Transform, Load) / data integration tool to supplement other data preparation needs, such as the creation of various data stores. This is common in settings where other ETL tools are not available. QVE can help consolidate multiple data sources, augment data and create a data store, mart or warehouse to be accessed by QlikView and other applications. Other benefits of QlikView Expressor include its ability to graphically prepare and control the flow of data while storing, sharing and reusing various components of the development process.

     

     

    Demonstration of QlikView Expressor


     

    QlikView Integration with QlikView Expressor - YouTube

     

     

    This video playlist will also help position its capabilities and additional features:

     

    http://www.youtube.com/watch?v=hcs6SmuJzIY&list=PLW1uf5CQ_gSrDrrHbzbe2YL9JVginMBHK