I have the following Schema design. For simplicity of this example, I am just using Patient ID as the key to illustrate my question.
In my fact table, I have some facts that relate to patients and I want to know which patients had a Liver or Heart biopsy performed. These dimension tables relate as a whole to the patient and not necessarily to any individual fact row.
However, in my operating system, you have to actually enter information in the Liver and Heart sections for it to record data in the database's tables. So, for example, if the Heart doesn't need information put into the system, people will simply skip entering in info.
So you can see patient 2 is not even in my liver dimension table, and patient 1 is not in the heart.
Fact Table
Key | Patient ID (FK) | Fact | Another Fact |
---|---|---|---|
A | 1 | 100 | 500 |
B | 2 | 200 | 600 |
C | 3 | 300 | 700 |
D | 4 | 400 | 800 |
I have the following Schema design. For simplicity of this example, I am just using Patient ID as the key to illustrate my question.
In my fact table, I have some facts that relate to patients and I want to know which patients had a Liver or Heart biopsy performed. These dimension tables relate as a whole to the patient and not necessarily to any individual fact row.
However, in my operating system, you have to actually enter information in the Liver and Heart sections for it to record data in the database's tables. So, for example, if the Heart doesn't need information put into the system, people will simply skip entering in info.
So you can see patient 2 is not even in my liver dimension table, and patient 1 is not in the heart.
Fact Table
Key | Patient ID (FK) | Fact | Another Fact |
---|---|---|---|
A | 1 | 100 | 500 |
B | 2 | 200 | 600 |
C | 3 | 300 | 700 |
D | 4 | 400 | 800 |
Liver Data Dimension
Patient ID | Biopsy | Columns 2 | Columns 3 |
---|---|---|---|
1 | Yes | data…. | data…. |
3 | No | data…. | data…. |
4 | Yes | data…. | data…. |
Heart Data Dimension
Patient ID | Biopsy | Columns 2 | Columns 3 |
---|---|---|---|
2 | Yes | data…. | data…. |
3 | Yes | data…. | data…. |
4 | No | data…. | data…. |
I have a few questions
Can 2 or more dimension tables use the same fact table FK? Can I use Patient ID to link both dimension tables to the fact table
In a Star Schema, would it be better to simply join the heart/liver dimension tables together into one? I understand that this is totally my decision, but I am curious if there is a rule or best practice where if the same key can link 2 different dimension table together, they really should be a single dimension table
If I do keep them separate and I want to pull data on how many patients had a Liver Biopsy done, I would get 2 Yes, 1 No and 1 Blank (Patient 2). How would I fix this without putting every patient in the Liver Dimension table and having every row except Biopsy be NULL?
Share Improve this question edited Feb 22 at 22:03 keithwalsh 8813 silver badges19 bronze badges asked Feb 21 at 18:00 A. RomainA. Romain 291 bronze badge 9- Please do not use images, use table markdown – Dale K Commented Feb 21 at 20:02
- Your schema design feels wrong, to me; what’s the declared grain of your fact table? I would assume it would be biopsy per patient, so you’d have an FK to your patient dimension (which contains only information about the patient, not any biopsy information) and an FK to a biopsy dimension – NickW Commented Feb 22 at 22:33
- 1 I doubt that the patient has only one test. Rather, there will be a series of analyses of the same type on different dates. It is also not worth combining different dimensions into one table. This usually results in non-normalized data. Like text column "Ok. Pv 23. Ph=6.7 Hemoglobin norm")) – ValNik Commented Feb 22 at 22:49
- 2 As described, your two "dimension" tables are actually fact tables. There's a patient, that's a dimension, but the heart and liver tables contain facts about something associated with patients. As you've (reasonably) hidden much of the real detail, exactly how you "should" structure it is hard to guess. I suggest Kimball as good reading, the following covers some aspects of having multiple fact tables ; kimballgroup/2003/04/… – MatBailie Commented Feb 22 at 22:58
- Also, note that the application forms don't need to have a 1:1 relationship with your tables. And that you can have additional fact tables / cubes, along side your application tables. – MatBailie Commented Feb 22 at 23:14
1 Answer
Reset to default 0- Yes multiple dimension tables can use the same fact table FK.
- Yes you can use
PatientID
to link both dimension tables to the fact table. However, it's more usual to have different surrogate keys in the fact table than directly usingPatientID
, but it's still possible. - If the Liver and Data dimension tables share the same grain and similar attributes, then it would make sense to combine them. But there could be valid reasons to keep them seperate also (e.g. different update frequencies).
- If you keep them separate then a
Left Join
+CASE
can handle missing rows.