Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How do joins work in QlikView?

Hi there,

can somebody help me with a very basic question? I'm using QlikView 10.00.9061.7 SR3 on Windows 7 32-bit. Here is my app:

flight1:

LOAD * INLINE [

   day1, id1, capacity

    mon, 1, 500

    mon, 2, 150

    mon, 3, 500

];

flight2:

LOAD * INLINE [

   day2, id2, aircraft

    mon, 1, 737

    mon, 2, 320

    mon, 3, 737

];

aircraft:

load * inline [

aircraft, seats

  737, 500

  320, 150

];

sum(capacity) yields "mon = 1150" which is correct.

sum(seats) yields "mon = 650" which is not. It looks like QlikView is really doing a sum(DISTINCT seats) behind the scene. What gives? How am I supposed to get the correct result? Is a join in QlikView not a true SQL join?

Thanks for your help,

Chris

8 Replies
its_anandrjs

Hi,

I suggest do some changes and join this two tables like below

flight1:

LOAD * INLINE [

   day1, id1, capacity

    mon, 1, 500

    mon, 2, 150

    mon, 3, 500

];

  

flight2:

left join(flight1)

load id2 as id1,

     day2 as day1;

LOAD * INLINE [

   day2, id2, aircraft

    mon, 1, 737

    mon, 2, 320

    mon, 3, 737

];

And let me know if i am not correct

Hop this helps

Regards

Anand

Not applicable
Author

Hi Anand,

thanks for the answer. In fact tables flight1 and flight2 are independent, they are only examples of two different ways of computing the same measure, which should yield the same results but do not. I do not understand how QlikView obtains the result it does for flight2, which looks to be unlike what most classical BI tool would do.

Thanks,

Chris

its_anandrjs

Hi,

Ok i got it see the attached sample i make changes and load data with qualify command and join it with aircraft table might be it is your answer.

By Coding

qualify *;

unqualify capacity,aircraft;

flight1:

LOAD * INLINE [

   day1, id1, capacity

    mon, 1, 500

    mon, 2, 150

    mon, 3, 500

];

flight2:

Join(flight1)

LOAD * INLINE [

   day2, id2, aircraft

    mon, 1, 737

    mon, 2, 320

    mon, 3, 737

];

unqualify *;

aircraft:

load * inline [

aircraft, seats

  737, 500

  320, 150

];

Let me know if not works, see solution sheet.

HTH

Regards

Anand

Not applicable
Author

From your example, doing a sum of seats based on day 1 would return 650. Doing a sum of seats on date 2 should return 1150 since qlikview would recognize the link from aircraft in flight2 table to the aircraft in aircraft able.

It behave somewhat like crystal reports where if in crystal you load 3 tables, 1 header and 2 details, but you only use the info from the header in the report, summing a field from header would return correctly. As soon as a field it used from details the sum of the field from the header would be duplicated for each detail, returning the wrong results.

A better example is if you do a sum of seats for date1 it would only sees each plane once.

List box with dimension only date1 returns:

Monday

Adding the expression to this of seats would only find:

  737, 500

  320, 150

List box of only date2 returns:

Monday

but with the link between the tables sees that there are 2 737 aircrafts finding:

  737, 500

  320, 150

  737, 500

Let me know if I am not clear.

Edit:

as Anand put it, if the dates from Flight1 and flight2 are the refering to the same date, connection them by renaming them would create a link the flows from all tables, so both sums would then be correct using only 1 date field

Not applicable
Author

Hi Marc, Anand,

first let me apologize as it appears that my original message was indeed confusing. So I will try to reformulate the issue more clearly.

The first model, very simplified model behaves as expected:

flight:
LOAD * INLINE [
   day, id, capacity
    mon, 1, 500
    mon, 2, 150
    mon, 3, 500
];

The corresponding SQL model (using postgresql) is as follows

drop table if exists flight cascade;

create table flight (
  day integer,
  id integer,
  capacity integer,
  constraint flight_pk primary key (day, id)
);

insert into flight values (1, 1, 500);
insert into flight values (1, 2, 150);
insert into flight values (1, 3, 500);
commit;

select day, sum(capacity)
from flight
group by day;

and returns the correct answer, i.e., sum = 1150.

However the real model is slightly more complex and involves a join:

flight:
LOAD * INLINE [
   day, id, aircraft
    mon, 1, 737
    mon, 2, 320
    mon, 3, 737
];

aircraft:
load * inline [
 aircraft, seats
  737, 500
  320, 150
];

The corresponding SQL model is as follows:

drop table flight cascade;

create table flight (
  day integer,
  id integer,
  aircraft integer,
  constraint flight_pk primary key (day, id)
);

insert into flight values (1, 1, 737);
insert into flight values (1, 2, 320);
insert into flight values (1, 3, 737);

drop table aircraft cascade;

create table aircraft (
  id integer primary key,
  seats integer
);

insert into aircraft values (737, 500);
insert into aircraft values (320, 150);

commit;

select day, sum(seats), sum(distinct seats)
from flight
inner join aircraft on (flight.aircraft = aircraft.id)  
group by day;

sum(seats) = 1150
sum(distinct seats) = 650

What bothers me is that with model #2 QlikView returns value 650, i.e., sum(distinct seats) although in my opinion it should return 1150, i.e., sum(seats) like any other BI tool does, since I did not ask for a DISTINCT sum.

So my questions are the following:

  1. Why is QlikView behaving this way?
  2. How should I design model #2 to get the correct answer?

Thanks a lot for your help!

Chris

Not applicable
Author

Flight 1 is not linked to anything and that is where capacity is.  The default behaviour is a full outer join. 

You have to relate the flights back to aircraft for each table.  I suggest a concatenate load to get all the flight facts into a single table. 

linkExample.jpg

Not applicable
Author

I am still using personal edition so I can not open your sample file.

What are you using to get the sum(seats)?

As in what dimensions, and what is the expression you use.

With the full outer join behavior as wms said, if you load only 1 table into an object, it does not care about the links between the tables.

so a list box using an expression:

sum(seats)

would only return 650 since all it sees is the 2 aircrafts. You need to incorporate the flight2 table in the object as the dimension example of using date2, and add sum(seats) as an expression. The link between the tables should see that there are 2 737 flights, therefor making the result 1150.

another way to see how it is working is to create a straight chart, add all fields except seats and capacity as dimensions, then add sum(seats) as 1 expression, and sum(capacity) as the second dimension and look at the results.

Not applicable
Author

Another way of looking at the problem is, using model #2, to compute the following measures:

sum(seats)

sum(total <day> seats)

using "day" and "id" as dimensions, which gives the following (incorrect in my opinion) result:

Clipboard Image.png

Can anybody here explain me what's going on here?

Chris