I have a SQL Server 2014 table that looks like the following:
OrderId int not null IDENTITY --this is the primary key column
OrderDate datetime2 not null
CustomerId int not null
Description nvarchar(255) null
Some folks on my team have suggested that the clustered index should be on OrderId, but I think that the CustomerId + OrderId would be a better choice for the following reasons:
- Almost all queries will be looking
WHERE CustomerId = @param, notOrderId CustomerIdis a foreign key to theCustomertable, so having a clustered index withCustomerIdshould speed up joins- While
CustomerIdisn't unique, having the additionalOrderIdcolumn specified in the index will ensure uniqueness (We can use theUNIQUEkeyword when creating the clustered index on those 2 columns, to avoid the overhead of not having uniqueness) - Once data is inserted, the
CustomerIdandOrderIdnever change, so these rows wouldn't be moving around after initial write. - Data access happens via an ORM that requests all columns by default, so when a query based on
CustomerIdcomes in, the clustered index will be able to provide all columns without any additional work.
Does the CustomerId and OrderId approach sound like the best option given the above? Or, is OrderId on its own better, since it's a single column that's guaranteeing uniqueness by itself?
Currently, the table has a clustered index on OrderId, and a nonclustered index on CustomerId, but it's not covering, so since we're using an ORM and all columns are requested, it's extra work to retrieve them. So with this post, I'm trying to consider improving performance with a better CI.
The activity on our DB is about 85% reads and 15% writes.