Solved: Re: Set Analysis Syntax - Qlik Community

cbaqir · ‎2015-01-29

My data set shows transactional records that may result in multiple rows for the same workflow step and this it repeats the workflow duration.

REQUEST_ID , WORKFLOW_STEP_NAME , WORKFLOW_DURATION, TRX_DATE

01, StepA, 66, 12/31/2014

02, StepA, 4, 11/30/2014

03, StepA, 8, 7/31/2014

04, StepA, 2, 9/25/2014

05, StepA, .123, 8/5/2014

06, StepA, 1, 11/12/2014

];

I need to show the Avg Workflow Duration but don't want to include the duplicate rows:

i.e.

REQUEST_ID , WORKFLOW_STEP_NAME , WORKFLOW_DURATION, TRX_DATE

01, StepA, 66, 10/31/2014

01, StepA, 66, 11/30/2014

01, StepA, 66, 12/31/2014

That being said, it's possible that a request repeats a workflow step - for example, we would see two rows with the same workflow step name but with different values for workflow duration.

REQUEST_ID , WORKFLOW_STEP_NAME , WORKFLOW_DURATION, TRX_DATE

01, StepA, 5, 8/30/2014

01, StepA, 66, 10/31/2014

01, StepA, 66, 11/30/2014

01, StepA, 66, 12/31/2014

];

Neither of these is exactly right when compared with the raw data.

QV is showing an average of ~24 days but if you exclude two of the entries for Req 01 for 66 days, it should be ~12 days.

Any suggestions? TIA!

manojkvrajan · ‎2015-01-29

Cassandra,

I have attached the sample qvw with both script and expression changes. It may meet your need. Let me know if it helps.

View solution in original post

manojkvrajan · ‎2015-01-29

Cassandra - I have just loaded the distinct rows and dropped the Trx_Date without any change to UI. It kind of meets your requirements. However, unsure if this is what you are looking for? Attached is the sample QVW with changes.

cbaqir · ‎2015-01-29

The problem is that we need the transactional rows in the data set in order to meet another requirement about showing a snapshot status count by month of how many unique requests are Open, Closed etc...

While I appreciate the attempt, simply not loading the data I have will not meet all of my needs for the dashboard.

sunny_talwar · ‎2015-01-29

On the bar chart, have you tried using the word DISTINCT: avg({$} DISTINCT WORKFLOW_DURATION)

Does this solve your purpose?

Best,

S

cbaqir · ‎2015-01-29

I did try that and it wasn't right either. I believe that in that case, it would only look at DISTINCT WORKFLOW_DURATION values which isn't necessarily accurate when you have a large data set.

sunny_talwar · ‎2015-01-29

What makes it DISTINCT? The combination of REQUEST_ID, WORKFLOW_STEP_NAME, WORKFLOWDURATION, TRX_DATE all of them? Or just the first three?

Best,

S

cbaqir · ‎2015-01-29

If you look at the sample data I loaded, I'm not really sure how to tell the difference between when a workflow step was genuinely completed more than once and so the affiliated durations are combined or when we had to add a transactional row in order to get the status count for the end of the month. Does that make sense?

sunny_talwar · ‎2015-01-29

In your sample file, what needs to be the exact Avg you want it to show?

Best,

S

Report Inappropriate Content · ‎2015-01-29

SET ANALYSIS

makes a dashboard lot more meaningful and action oriented. For example, if you just provide number of units sold by a product line in a month, it is good information, but it is not-actionable. If you add comparison against same month last year, last month or average of relevant product lines in this month, you have added context to the number. The business user can take more meaningful actions out of this report / dashboard.

QlikView has feature called SET ANALYSIS that provides us a way to add this context. Set analysis predefines the SET OF DATA that our charts / tables use. So, using a Set Expression, we can tell our object (chart / table) to display values corresponding to various sets of data (e.g. a pre-defined time-period, geographic region, product lines etc.). All of the examples, I mentioned above as part of adding context can be accomplished using Set Analysis in Qlikview.

Most of the QlikView Professionals think that SET ANALYSIS is a complex feature. Through this post, I am trying to change their conviction towards it.

What is SET ANALYSIS ?

Set Analysis can be understood by a simple analogy of how Qlikview works. We make selections on certain variables and the changes reflect in the entire application. This happens because through our selection, we have created a set of data which we want to use. In a similar fashion, using Set Analysis feature, we can pre-define the data to be displayed in our charts.

Some features and characteristics for Set analysis are:

It is used to create different selection compared to the current application selections
Must be used in aggregation function (Sum, Count….).
Expression always begins and ends with curly brackets { }

Example dataset:

SET ANALYSIS syntax broken down into three components:

Identifiers:

Identifier	Description
0	:Represents an empty set, no records
1	:Represents the set of all the records in the application
$	:Represents the records of the current selection
$1	:Represents the previous selection
Bookrmark01	:Represents the set of all records against bookmark ID or the bookmark name

Examples:-

In below example, Current year selection is 2012 and previous selection was 2013.

Operators :

It works on set identifiers

Operator	Operator Name	Description
+	Union	Returns a set of records that belongs to union of sets.
-	Exclusion	Returns records that belong to the first but not the second
*	Intersection	Returns records that belong to both of the set identifiers.
/	Symmetric Difference	Returns a set that belongs to either, but not both of the set identifiers.

Examples:-

In below example, I have created a bookmark “BOOKMARK_1” for company selection A, B and C.

Modifiers:

Modifiers are always in angle brackets <>.
It consists multiple fields and all fields have selection criteria.
Condition of fields within modifiers bypass the current selection criteria.

Dollar Sign Expansion:

If we want to compare current year sale with previous year, previous year sales should reflect values in relation to current selection of year. For example if current selection of year is 2012, previous year should be 2011 and for current selection of year 2013, previous year is 2012.

“=Sum ({$<Year = {$ (=Max (Year)-1)} >} Sale) “

Above expression always returns sale for previous year. Here $ sign (Font color red) is used to evaluate the value for previous year. $ sign is used to evaluate expression and to use variables in set modifiers. If we have variable that holds last year value (vLASTYEAR) then expression can be written as:

“=Sum ({$vLASTYEAR)} >} Sale) “

Indirect SET ANALYSIS: Function P() and E()

Let us take a scenario, where we want to show current sales of the companies who had sales last year.

Expression should be similar like:

=sum({$<Year={$(=Max(Year))},Company_Name={Companies who had sales last year}> } Sale)

First we have to identify companies who had sales last year. To fix this problem, we will use function P() that is used to identify values within a field and function E() that exclude values within a field.

Finally, we have expression:

=sum({<Year={$(=Max(Year))},Company_Name=P({<Year={$(=Max(Year)-1)}>}Company_Name)>}Sale)

This post was an example where we have brought out methods to use SET ANALYSIS in Qlikview. Have you used this feature before? If yes, did you find it useful? Do you have more nifty tricks to make Set Analysis more interesting? If not, do you think this article will enable you to use Set Analysis in your next dashboard?

Do let me know your thoughts on using this feature in QlikView.

manojkvrajan · ‎2015-01-29

Cassandra,

I have attached the sample qvw with both script and expression changes. It may meet your need. Let me know if it helps.