Some scenarios for M:M relationships in a data warehouse model
Most OLAP servers and ROLAP systems have a means to deal with M:M data structures now, but there are some caveats about this that you will need to pay attention to. If you do implement M:M relationships you will need to keep an eye on your reporting layer and what tools you want to support.
Scenario 1: M:M dimension onto a fact table
An example of this might be multiple drivers on a motor policy. If you add or remove a driver, the policy adjustment transaction may have a relationship to a list of drivers that changes with the adjustment.
Option 1 - M:M driver-fact bridge table
This will have quite a large volume of data, as it has drivers x transactions rows for a given policy. SSAS can consume this data structure directly, but it is slower to query through a ROLAP tool.
If your M:M relationship is based on entities that are specific to the fact row (e.g. drivers on a car) this might be appropriate for a ROLAP tool as well, providing your ROLAP tool supports M:M relationships (e.g. using contexts in Business Objects).
Option 2 - Dummy 'combinations' dimension table
If you are mapping a list of common codes to a fact table (i.e. the linked entities are not peculiar to the fact row) then you can take another approach that will reduce the data volumes. An example of this type of scenario is ICD codes on an inpatient visit. Each inpatient visit will have one or more ICD diagnoses and/or procedures listed against it. The ICD codes are global.
In this case, you can make up a distinct list of the combinations of codes on each case. Make a dimension table with one row for each distinct combination, and have a link table between the combinations and the reference tables for the ICD codes themselves.
The fact table can have a dimension key to the 'combinations' dimension, and the dimension row has a list of references to actual ICD codes. Most ROLAP tools can consume this data structure. If your tool will only work with an actual M:M relationship then you can create a view that emulates the M:M relationship between the fact and the coding reference table. This would be the preferred approach with SSAS.
Advantages of option 1:
- Appropriately indexed, queries based on selecting fact table rows with a certain relationship through the M:M table can be reasonably efficient.
- Slightly simpler conceptual model
Advantages of option 2:
- Data storage is more compact
- You can emulate a straight 1:M relationship by presenting the combinations in a human-readable format as a code on the 'combinations' dimension. This might be more useful on dumber reporting tools that lack support for M:M relationships.
Scenario 2: M:M relationship between dimensions:
Harder to think of a use case, but one could envisage something out of healthcare with ICD codes again. On a cost analysis system, the inpatient visit may become a dimension, and will have M:M relationships between the visit (or consultant-episode in NHS-speak) and the codings.
In this case you can set up the M:M relationships, and possibly codify a human-readable rendering of them on the base dimension. The relationships can be done through straight M:M link tables or through a bridging 'combinations' table as before. This data structure can be queried correctly through Business Objects or better quality ROLAP tools.
Off the top of my head, I can't see SSAS being able to consume this without taking the relationship right down to the fact table, so you would need to present a view of the M:M relationship between the coding and the fact table rows to use SSAS with this data.