最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

database - Multidimensional model with interrelated dimensions? - Stack Overflow

programmeradmin5浏览0评论

I'm doing a BI school project in which I had to collect and clean data to create a multidimensional model. I selected yearly crime data by districts of my country and enriched it with demographics and economics, resulting in three sources. Each source has the data for each year of the sample (2017-2022) for each district, e.g this hierarchy:

  • 2017
    • District 1
    • District 2
  • 2018
    • District 1
    • District 2

Sources relate using the district's zip code as key. Here they are:

  1. Crime data

    • Year
    • State
    • Province
    • District (and zip code)
    • Crime category
    • Amount (for that category, for that district, for that year)
  2. Demographic data

    • Year
    • District zip code
    • Total population
    • Average age
    • Average home size
    • Other similar indicators...
  3. Economics data

    • Year
    • District zip code
    • Mean income
    • Other similar indicators...

I was trying to design the multidimensional model considering the facts as the crime-related data (number of crimes for the category and some other computed KPIs), but I'm having some trouble with how to fit in the demographic and economic indicators (which would be used to enrich the crime data analysis). Here are my thougths:

  • I can't include demographic and economic indicators in the fact table, as its granularity is defined by the crime categories, so, for instance, I would have repeated population data for each of the crime categories for a district
  • I can't include them neither on the district dimension, as the values vary for each year. Same happens for the time dimension (which I doubt if it would make sense at all, since I only have year-level granularity), because the values are linked to the municipality
  • I could create an 'indicators' dimension, but it would have a PK composed by two FKs (year and municipality, which are the PKs of other two dimensions), but what I think I've learned of multidimensional models, that doesn't fits a Star or Snowflake schema
  • I could have a fact table with its specific district-level granularity for the demographic and economic indicators, but it would still be linked to the original fact table, and I'm not completely sure that two fact tables should be connected

Am I taking the wrong approach here?

Thank you!

发布评论

评论列表(0)

  1. 暂无评论