Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I'm having an issue with Joins.
When Qlik joins two tables, it seems like it doesn't ignore null values. If a value is null in both tables, it will consider them as different and generate two separate rows. Coming from a SQL world, I find this counter-intuitive and even counter-productive. In the attached example and code below, I would expect only one row for ID 2, with Name = Brian, Status = Null(), Colour = Blue.
Is there a best way to fix it?
tmpNames:
Load
*
Inline
[
ID, Name
1, Andy
2, Brian
3, Carl
4, David
];
Left Join (tmpNames)
Load
*
Inline
[
ID, Status
1, 1
3, 0
];
// tmpNames Result:
// ID, Name, Status
// 1, Andy, 1
// 2, Brian, Null()
// 3, Carl, 0
// 4, David, Null()
tmpColours:
NoConcatenate
Load
*
Inline
[
ID, Colour
1, Amber
2, Blue
3, Cyan
];
Left Join (tmpColours)
Load
*
Inline
[
ID, Status
1, 1
3, 1
];
// tmpColours result:
// ID, Colour, Status
// 1, Amber, 1
// 2, Blue, Null()
// 3, Cyan, 1
Output:
NoConcatenate Load * Resident tmpNames;
Join (Output) Load * Resident tmpColours;
Drop Table tmpNames, tmpColours;
One option is to use NullAsValue
NULLASVALUE Status;
tmpNames:
LOAD * INLINE [
ID, Name
1, Andy
2, Brian
3, Carl
4, David
];
Left Join (tmpNames)
LOAD * INLINE [
ID, Status
1, 1
3, 0
];
tmpColours:
NoConcatenate
LOAD * INLINE [
ID, Colour
1, Amber
2, Blue
3, Cyan
];
Left Join (tmpColours)
LOAD * INLINE [
ID, Status
1, 1
3, 1
];
Output:
NoConcatenate
LOAD *
Resident tmpNames;
Join (Output)
LOAD *
Resident tmpColours;
DROP Table tmpNames, tmpColours;
NULLASNULL Status;
Hi there
In Qlik, JOIN is the same as FULL OUTER JOIN. What you want to do is INNER JOIN at the end to only keep the values which have entries in both tables.
Hope this helps.
Regards,
Mauritz
Hi Mauritz,
I'm aware it's a full join. What I'm saying is that it doesn't behave as such. A SQL full join would treat null values as equal. Qlik doesn't and creates duplicate lines.
Kind regards,
Fabio
Join on Null values is an issue in Qlik... you can try this
tmpNames:
LOAD * INLINE [
ID, Name
1, Andy
2, Brian
3, Carl
4, David
];
Left Join (tmpNames)
LOAD * INLINE [
ID, Status
1, 1
3, 0
];
tmpColours:
NoConcatenate
LOAD * INLINE [
ID, Colour
1, Amber
2, Blue
3, Cyan
];
Left Join (tmpColours)
LOAD * INLINE [
ID, Status
1, 1
3, 1
];
Output:
NoConcatenate
LOAD *,
ID & Status as Key
Resident tmpNames;
Join (Output)
LOAD ID & Status as Key,
Colour
Resident tmpColours;
DROP Table tmpNames, tmpColours;
Thanks Sunny!
I hoped there would be a cleaner way... This is quite annoying when joining on several fields. 😕
Do you know the reason for this design choice? I'm also surprised that this behaviour is not documented by Qlik.
Hi Fabio
Sorry, I misunderstood your question. We normally make key fields for joins, but that doesn't answer your question 🙂.
Let's see what the rest of the guys say.
Regards,
Mauritz
I am not sure why it is done this way, but may be @hic can comment on this.
One option is to use NullAsValue
NULLASVALUE Status;
tmpNames:
LOAD * INLINE [
ID, Name
1, Andy
2, Brian
3, Carl
4, David
];
Left Join (tmpNames)
LOAD * INLINE [
ID, Status
1, 1
3, 0
];
tmpColours:
NoConcatenate
LOAD * INLINE [
ID, Colour
1, Amber
2, Blue
3, Cyan
];
Left Join (tmpColours)
LOAD * INLINE [
ID, Status
1, 1
3, 1
];
Output:
NoConcatenate
LOAD *
Resident tmpNames;
Join (Output)
LOAD *
Resident tmpColours;
DROP Table tmpNames, tmpColours;
NULLASNULL Status;
THIS is a satisfactory solution!! Thanks a lot 🙂
To my knowledge, NULL values in SQL outer joins never link to other NULL values.
See for example http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc32300.1570/html/sqlug/X78...
So I think that the Qlik engine is doing the correct thing.
HIC