Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi there,
I'm pretty new to Qlik and have encountered a situation where a fact table needs to link to a definition three times. My fact (OR_Data) contains columns that hold IDs to link to my definition (Physician). My solution is to load and rename the Physician table three times. Here's my load script for a testing scenario:
LOAD
Booking_Surgeon_ID,
Anesthesiologist_ID,
Operating_Surgeon_ID,
Procedure
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is OR_Data);
Booking_Surgeon:
LOAD
Physician_ID as Booking_Surgeon_ID,
Physician_Code as Booking_Surgeon_Code,
Physician_Name as Booking_Surgeon_Name
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is Physician);
Anesthesiologist:
LOAD
Physician_ID as Anesthesiologist_ID,
Physician_Code as Anesthesiologist_Code,
Physician_Name as Anesthesiologist_Name
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is Physician);
Operating_Surgeon:
LOAD
Physician_ID as Operating_Surgeon_ID,
Physician_Code as Operating_Surgeon_Code,
Physician_Name as Operating_Surgeon_Name
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is Physician);
This works fine, but what I'd like is for the three physician tables to only hold entries that exist in their respective columns in the fact table. The physician table has hundreds of entries, but the OR_Data table might only have 4 distinct physician IDs in the Booking_Surgeon_ID column. I'd only want 4 entries in the Booking_Surgeon table. I've read about the WHERE EXISTS keyword, but can't seem to figure out what to use in the exists expression. Any hints?
Also - is this the best approach in this scenario? I have a feeling I'll be coming up against this sort of situation often.
Thanks!!
Fact:
LOAD
Booking_Surgeon_ID,
Anesthesiologist_ID,
Operating_Surgeon_ID,
Procedure
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is OR_Data);
Booking_Surgeon:
LOAD
Physician_ID as Booking_Surgeon_ID,
Physician_Code as Booking_Surgeon_Code,
Physician_Name as Booking_Surgeon_Name
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is Physician)
WHERE exists(Booking_Surgeon_ID, Physician_ID);
Anesthesiologist:
LOAD
Physician_ID as Anesthesiologist_ID,
Physician_Code as Anesthesiologist_Code,
Physician_Name as Anesthesiologist_Name
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is Physician)
WHERE exists(Anesthesiologist_ID, Physician_ID);
Operating_Surgeon:
LOAD
Physician_ID as Operating_Surgeon_ID,
Physician_Code as Operating_Surgeon_Code,
Physician_Name as Operating_Surgeon_Name
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is Physician)
WHERE exists(Operating_Surgeon_ID, Physician_ID);
Fact:
LOAD
Booking_Surgeon_ID,
Anesthesiologist_ID,
Operating_Surgeon_ID,
Procedure
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is OR_Data);
Booking_Surgeon:
LOAD
Physician_ID as Booking_Surgeon_ID,
Physician_Code as Booking_Surgeon_Code,
Physician_Name as Booking_Surgeon_Name
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is Physician)
WHERE exists(Booking_Surgeon_ID, Physician_ID);
Anesthesiologist:
LOAD
Physician_ID as Anesthesiologist_ID,
Physician_Code as Anesthesiologist_Code,
Physician_Name as Anesthesiologist_Name
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is Physician)
WHERE exists(Anesthesiologist_ID, Physician_ID);
Operating_Surgeon:
LOAD
Physician_ID as Operating_Surgeon_ID,
Physician_Code as Operating_Surgeon_Code,
Physician_Name as Operating_Surgeon_Name
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is Physician)
WHERE exists(Operating_Surgeon_ID, Physician_ID);
You could use mapping tables.
Physician_Code_Map:
MAPPING LOAD
Physician_ID,
Physician_Code
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is Physician);
Physician_Name_Map:
MAPPING LOAD
Physician_ID,
Physician_Name
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is Physician);
Fact:
LOAD
Booking_Surgeon_ID,
Anesthesiologist_ID,
Operating_Surgeon_ID,
applymap('Physician_Code_Map', Booking_Surgeon_ID) as Booking_Surgeon_Code,
applymap('Physician_Code_Map', Anesthesiologist_ID) as Anesthesiologist_Code,
applymap('Physician_Code_Map', Operating_Surgeon_ID) as Operating_Surgeon_Code,
applymap('Physician_Name_Map', Booking_Surgeon_ID) as Booking_Surgeon_Name,
applymap('Physician_Name_Map', Anesthesiologist_ID) as Anesthesiologist_Name,
applymap('Physician_Name_Map', Operating_Surgeon_ID) as Operating_Surgeon_Name,
Procedure
FROM [lib://AttachedFiles/PhysicianTest.xlsx]
(ooxml, embedded labels, table is OR_Data);
This blog has more info on ApplyMap Don't join - use Applymap instead
You could exclude the ID fields if these are not necessary in your fact table, so you just have the 3 Code and 3 Name fields.
Dang, this is so simple now that I understand how it works. Thanks for your help Nicole!
Thanks for your reply Colin. This is an useful approach as well. For me I'll actually be using a few more columns for each physician and the fact actually has something like 30 columns, so linking by ID will work better for me I think. I really like the idea of applymap though, thanks for introducing it to me!