1

this year one of our collage courses was related to creating and managing data warehouses. There is one thing that I really struggle to understand and I would appreciate any kind of advice that you have.

My question would be: How does the original relations from relational database map to fact and dimension tables of the data warehouse.

In my example I have a simple cinema Database with this structure (has other columns but are not relevant)

ticket
-------
id - primary 
price
something...

invoice

id - primary ticketId - forieng key orderId - foriegn key totalPrice


My DWH looks like this (it has more columns but they are not important, it's a star schema)

FactInvoice
----------
id - prim
ticketKey - foreign
dateKey - foreign
orderKey -foreign
currency
totalOrderPrice
totalPrice

DimTicket

id - prim seatNumber isUsed

DimOrder

id - prim orderType

DimDate

id - prim day month year nameOfDay

My problem with this is, what are ticketKey, dateKey and orderKey, are they just references to the autogenerated primary key in dimensions or are/should they be relations from the original relational database ?

For example if I had this in relational database

TICKET - ID:12 price:5 
INVOICE - ID:234, ticketId: 12, orderId:123, totalPrice:20

Should the keys in data warehouse represent ID 12 for ticket so that the relation from INVOICE to TICKET exists or should they be independent of the relational model.

Also if they are just references to the autogenerated field in the DW how can I add a reference so I have a correct relation between single invoice and single ticket.

We are using Visual Studio SSIS extension to create data flow, and for now I have queries that select these information but I don't know if I should import the ID's from those tables.

Sorry if this question has a obvious answer but the whole concept of fact/dimension is pretty new for me and I don't know if I'm understanding it correctly. Thanks In Advance

stiMULAnt
  • 11
  • 1

0 Answers0