this year one of our collage courses was related to creating and managing data warehouses. There is one thing that I really struggle to understand and I would appreciate any kind of advice that you have.
My question would be: How does the original relations from relational database map to fact and dimension tables of the data warehouse.
In my example I have a simple cinema Database with this structure (has other columns but are not relevant)
ticket
-------
id - primary
price
something...
invoice
id - primary
ticketId - forieng key
orderId - foriegn key
totalPrice
My DWH looks like this (it has more columns but they are not important, it's a star schema)
FactInvoice
----------
id - prim
ticketKey - foreign
dateKey - foreign
orderKey -foreign
currency
totalOrderPrice
totalPrice
DimTicket
id - prim
seatNumber
isUsed
DimOrder
id - prim
orderType
DimDate
id - prim
day
month
year
nameOfDay
My problem with this is, what are ticketKey, dateKey and orderKey, are they just references to the autogenerated primary key in dimensions or are/should they be relations from the original relational database ?
For example if I had this in relational database
TICKET - ID:12 price:5
INVOICE - ID:234, ticketId: 12, orderId:123, totalPrice:20
Should the keys in data warehouse represent ID 12 for ticket so that the relation from INVOICE to TICKET exists or should they be independent of the relational model.
Also if they are just references to the autogenerated field in the DW how can I add a reference so I have a correct relation between single invoice and single ticket.
We are using Visual Studio SSIS extension to create data flow, and for now I have queries that select these information but I don't know if I should import the ID's from those tables.
Sorry if this question has a obvious answer but the whole concept of fact/dimension is pretty new for me and I don't know if I'm understanding it correctly. Thanks In Advance