Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
valpassos
Creator III
Creator III

Modeling all the different combination of dimension fields using composite key

Hi community!,

I'm trying to come up with a uniform table/dimension of therapeutic areas around an organization. Like this:

THERAPEUTIC AREA:
Immunology
Neuroscience
Vaccines
Oncology

These therapeutic areas are in disparate data sources and have disparate relationships with other entities (doctor, hospital,...).
I don't have a unique data source with all those areas, so what I'm doing is a concatenation of all the different data sources where the therapeutic areas appear (with all the transformation operations it needs). Like this:

 

Therapeutic Area:
LOAD 
Therapeutic Area
,...
,...
FROM datasource1;

CONCATENATE([Therapeutic Area])
LOAD 
Therapeutic Area,
,...
,...
FROM datasource2;

 

Now, associated with these areas are the diseases/patologies. These, too, are being concatenated in the same table of the areas, like a father-child hierarchy:

Therapeutic Area      Disease
Immunology             Rheumatoid arthritis
Immunology             Ankylosing spondylitis
Immunology             Osteoarthritis
Neuroscience           Parkinson’s Disease
Neuroscience           Multiple Sclerosis
Vaccines                     Smallpox
Oncology                   Rare

The problem I'm facing is: not all the areas appear in ALL the data sources, and the same happens with the diseases. I need to have the combination (composite key) of every TA-disease pair, including the combination where the disease doesn't occur, in order to map relationships with all the other entities in my model that are only associated with a therapeutic area, not a disease. Like:

<NULL>|<NULL>
Immunology|<NULL>
Immunology|Rheumatoid Arthritis
Immunology|Ankylosing spondylitis
Immunology|Osteoarthritis
Neuroscience|<NULL>
Neuroscience|Parkinson’s Disease
Neuroscience|Multiple Sclerosis


and so on...

Imagine that the key Immunology|<NULL> doesn't appear in any data source, only Immunology|Rheumatoid Arthritis, but an entity (like the doctor) only requires Immunology|<NULL> (without diseases)... How can I model this relationship?

I sense I will have to apply the concept of Generic Keys, but I don't know exactly how to apply!
Any direction?

Thanks!

Lisa

Labels (3)
5 Replies
valpassos
Creator III
Creator III
Author

Anyone? 😃

valpassos
Creator III
Creator III
Author

@hic, so sorry to insist this way, but I've read that you are the expert in these matters. Can you give your input regarding the use of Generic Keys for this case? Would greatly appreciate.

Thanks!

hic
Former Employee
Former Employee

From your description it sounds as if Generic Keys could be a solution. Or perhaps a Hierarchy Load...

Could you post some sample files?

HIC

valpassos
Creator III
Creator III
Author

Hi @hic!

I'm attaching an excel file with random data of the different data sources I'm trying to integrate. The data sources are numbered. The 4 therapeutic areas in case are Immuno, Neuro, Vaccines and Onco.

For the dimension Therapeutic Area, I'm concatenating all the data sources from 1 to 3 to have dimension values as complete as possible. Actually, Disease should be the name for this dimension since it is more granular (one Therapeutic Area can have multiple Diseases associated, but one Disease is associated with one Therapeutic Area only).

At the end of this concatenation, I'm creating the composite key as follows:

[Therapeutic Area]&'|'&Disease as %ID_Disease

 

Now consider these 2 situations:
1) datasource3 goes to therapeutic area level only (no diseases) and does not have 'Onco' therapeutic area associated with it. Because of that, when I integrate this data source with the rest, I will only come up with these %ID_Disease:

Immuno|
Neuro|
Vaccines|

How do I come up with the Onco| key?

2) I have the datasource4 which gives me doctor-related information (his opinions on a subject). He's related to one therapeutic area only (his specialty). How can I map the relationship Doctor-Therapeutic Area? How do I create the %ID_Disease for the doctor? I have a dimension Doctor with some attributes and don't know if Therapeutic Area should be another one of its attributes or if I should collapse it in the fact table.

This is my model (for now):

datamodel_therapeuticArea.PNG

 

Thanks!

 

Lisa

valpassos
Creator III
Creator III
Author

Everybody feel free to jump in 😄