Data Transformation is the process of modifying the existing data to a new data format. It can also involve filtering out or adding some specific values to the existing data set. QlikView can carry out data transformation after reading it to its memory and using many built in functions.
Original data have been rarely in a state such that it can be used directly in QlikView without any modification. QlikView is powerful enough to allow modification of data when the data sources are flat files such as Excel or text files.
Flat files are common data sources for QlikView documents. Since we can’t always control how the data is formatted and arranged in the flat files, we have plenty of options when it comes to transforming this data into something we can use in our QlikView application. Most of these transformations are executed in the File Wizard: Transform dialog (also referred to as the Transformation Wizard).
This chapter describes the modification of data through the various QlikView transformation functions that can be used with flat files such as Excel, CSV, TXT, XML, or HTML.
Flat files are usually not perfectly formatted for use in QlikView; they may contain junk rows, columns, or fields; have blank fields; have repeating columns requiring unwrapping; require rows and columns to be rotated or transposed; or have table types considered to be of cross table or generic types. Fortunately, when connecting to a flat data file, the Transformation Wizard allows you to manipulate the data before loading it into your QlikView document.
This will allow you, as a developer, to finely process the source data and turn it into a clean design, while at the same time keeping an efficient script.
The Transformation Wizard can be accessed from the Script Editor, when you are connected to a flat file such as a table file. Once you have selected the file, click on the Enable Transformation Step button to start the Transformation Wizard.
Let’s start using the Transformation Wizard by opening an .xls file and removing some of the garbage (unnecessary) rows and columns from the data. Download the Hospital_General_Information_Chap3 .xls spreadsheet (this data is adapted from multiple healthcare and hospital spread sheets publicly available at http:// www.data.gov 😞
The Fill function in QlikView is used to fill values from existing fields into a new field.It allows you to fill in field data from data cells adjoining the target field. In this example, note row 18 (Oroville Hospital – Clinic) has a value TBD for the provider ID column. Assume that we know from the business requirements that all fields marked TBD are for hospitals that are child facilities of the main hospital facilities. Let’s remove any fields marked TBD and replace the value with the value of the field directly above it (in this case, the provider ID of the parent facility).
The Excel file we are using for this example has repeating columns (1 to 4 and 5 to 8). We want to transform this data by moving all the columns from 5 to 8 and appending them to their corresponding columns from 1 to 4. We will accomplish this by using the Unwrap feature of the Transformation Wizard:
Column Manipulation is a type of Data Transformation, in which a new column is populated with values from an existing column which meets certain criteria. The criteria can be an expression which is created as part of the Data Transformation step.Now that we’ve unwrapped the columns, we can further manipulate columns if we want to. Select the Column tab in the Transformation Wizard. In this tab, you can move or copy data from one column to an existing column or a new column. In this example, we will copy the provider ID column (column 1) to a new column we will create using this wizard (column 5). Duplication of columns within QlikView may be useful in building expressions later when designing sheet objects. This may also be done in the script itself by loading the same column twice and renaming the second column with an alias.
When you click on Finish in the Transformation Wizard, the script code that is generated appears in the Script Editor.Note that it’s easier to let the Transformation Wizard generate this code, and if anything needs to be tweaked a bit, such as changing a row position or column label name, it can be done in the script itself.
The Rotating table in QlikView is similar to the column and the row transpose feature in Microsoft Excel but with some additional options. We can transpose columns in multiple directions and they give different result.It can be useful in data modeling to rotate a table in any direction or transpose (swap) rows and columns. Let’s try it using the rotate function of the Transformation Wizard. This is very similar to the crosstable function of the Transformation Wizard, but doesn’t allow for aggregation of column data. From the Script Editor, click on the Table Files button and, in the Open Local Files window, navigate to your copy of the file US_Oil_Imports_Chap3. xls Excel file (this file is adapted from the publicly available file at http:// www.data.gov ), and click on Next. The following screenshot illustrates the data fields of this .xls file:
Cross tables are a common table type that creates problems in aggregation and analysis when trying to use them in the QlikView data model. The US_Oil_Imports_Chap3.xls spreadsheet worked within the preceding rotation exercise as a cross table, because the dates are stored as dimensions and you can’t properly aggregate all month data in the present format. We can convert this table into a normal table with the crosstable function of the Transformation Wizard as follows:
4. Click on Finish, and the code is inserted into the script pane of Script Editor. Note in the code the prefix line of the Load statement, named CrossTable.Imports:CrossTable( Date, Barels, 2)Load * FromC:\ QlikView_Scripting\US_Oil_Imports_chap3.xls(biff, embedded labels, header is 1 lines, table is [Selected_Imports$]);
Generic tables are very common tables; also called standard tables. These tables have less organization and have many different dimensions and measure units. Each dimension or unit may have its own field. In this example, we will review a scripting solution to create separate tables of each dimension, and then combine the tables into one fact table:
Defect Table:LOAD * INLINE [Defect, Group, Priority, Stage1, Reports, High, Queued2, Data, Low, Reported3, Data, High, Closed4, Object, Medium, Rejected5, Security, High, Reported6, System, Medium, Rejected7, Security, High, Closed8, Object, Low, Queued9, Data, Medium, Closed10, Requirements, Medium, Verified];
Workflow:GENERIC LOAD Defect, ‘WorkStep_’ & Stage, ‘x’ RESIDENT DefectTable;
In Relational database, you must have come across dimensions which has a different level of granularity. When one field/level is correlated or divided into its several other subfields/sublevel for detailed analysis of data, then together all these fields constitute a hierarchy.Hierarchies are an important part of all business intelligence solutions, used to describe dimensions that naturally contain different levels of granularity. Some are simple and intuitive whereas others are complex and demand a lot of thinking to be modeled correctly. QlikView has its own way of dealing with hierarchies using the hierarchy keyword.The hierarchy function of QlikView is powerful for creating parent-child relationships between two or more values. A quick example of a hierarchy may be: the Solar System is the parent of the child Sun and Planet, and Planet is the parent of the child Earth. With the hierarchy functions set, QlikView can easily use this data to create tree-view tables.Let’s consider another example, using a hospital surgery department structure:
All the above features are explained in-detail in QLIKVIEW TRAINING.