I'm doing a BI school project in which I had to collect and clean data to create a multidimensional model. I selected yearly crime data by districts of my country and enriched it with demographics and economics, resulting in three sources. Each source has the data for each year of the sample (2017-2022) for each district, e.g this hierarchy:
- 2017
- District 1
- District 2
- 2018
- District 1
- District 2
Sources relate using the district's zip code as key. Here they are:
Crime data
- Year
- State
- Province
- District (and zip code)
- Crime category
- Amount (for that category, for that district, for that year)
Demographic data
- Year
- District zip code
- Total population
- Average age
- Average home size
- Other similar indicators...
Economics data
- Year
- District zip code
- Mean income
- Other similar indicators...
I was trying to design the multidimensional model considering the facts as the crime-related data (number of crimes for the category and some other computed KPIs), but I'm having some trouble with how to fit in the demographic and economic indicators (which would be used to enrich the crime data analysis). Here are my thougths:
- I can't include demographic and economic indicators in the fact table, as its granularity is defined by the crime categories, so, for instance, I would have repeated population data for each of the crime categories for a district
- I can't include them neither on the district dimension, as the values vary for each year. Same happens for the time dimension (which I doubt if it would make sense at all, since I only have year-level granularity), because the values are linked to the municipality
- I could create an 'indicators' dimension, but it would have a PK composed by two FKs (year and municipality, which are the PKs of other two dimensions), but what I think I've learned of multidimensional models, that doesn't fits a Star or Snowflake schema
- I could have a fact table with its specific district-level granularity for the demographic and economic indicators, but it would still be linked to the original fact table, and I'm not completely sure that two fact tables should be connected
Am I taking the wrong approach here?
Thank you!