15

Can I get some thoughts on the entity design?

Let's say I have an entity called Book. Let's say I create a specific instance of that book. It has a title A and an author B. Since there can be multiple books, should the amount be included in the entity OR should I instead create a separate entry for each of the same book in my database ?

So in each case, if the amount of books were 30, database would look respectively:

id = 1, title = A, author = B, amount = 30;

vs

id = 1, title = A, author = B; 
id = 2, title = A, author = B;
...
id = 30, title = A, author = B;

Which method (if any) is considered a good practice? Since in this case there may be multiple identical books and I may want to update them, having one entry in database which would include the amount seems easier to update than n amount of db entries which only differ by id (every other data for that specific book being identical).

I am developing using Java and Spring Boot if that matters.

EDIT: all the answers I have received were very helpful to me. Too bad I can't accept all of them as an answer so I will have to go with the seniority!

Christophe
  • 81,699

4 Answers4

99

This is not a question of "good practice", but a question of the requirements of the system. For example:

  • Let's say your system is for a library. If the library has several instances of the same book, each copy will have an individual library id and individual attributes like its age, who borrowed it at what date, which condition the copy has, and maybe some more.

  • Let's say your system is for an online book shop, mainly selling new books. Hundreds of the same copy of the book are sold daily, and the copies have all the same price and are exchangeable. Then the system will probably not give each book an own identity, and it is not important to keep track of each individual item. So storing an amount for each book will probably way more useful within the system.

Often people run into the common misconception that there are "best practices" which can replace a thorough requirements analysis - do yourself a favor and don't fall into that trap. Data models are not living in "thin air", one needs context to make the right modeling decisions.

Doc Brown
  • 218,378
21

It depends on the identity of your items:

  • In a book store, books having the same title and same author all look identical and interchangeable. You would have one instance, holding the total quantity.
  • In a second-hand book store, books having the same title and the same author do not look identical: each has a different origin, is more or less used state, may have distinctive marks. Not only could you distinguish each of them, but moreover, you may even want to have an individualized price or you want to track where each of it is in the shop. You would a separate instance for each distinct individual.
  • In an art book store, you could have fine art books printed in limited editions, with a serial number on each item. There are two common strategies for this: one is to have multiple instances, each bearing the serial number of the individual book. However, in large scale industrial or commercial applications, the strategy is often to manage one singe instance of the book, and have a separate associated entity with distinct instances for the serial numbers. The business process then foresee an extra step to ask for serial numbers when ever you reduce or increase the quantity.
  • You may have different sets of books sharing the same title and the same authors, but with different editions. Or you have several sets of books purchased at different times: the older ones are dusty and passed color due to exposure to light. You may then want to manage different batches of the same book. You would have a book entity associated with batch entities (lots): one book instance could be associated with many batch instances with the quantity maintained at the batch level. This is not really common for books, but for pharmaceuticals and chemical products it's always like this (batches have in this case also expiration dates).
  • You could also for practical reasons decide to manage arbitrary sets of books. For example if you have a huge warehouse dominated by robots, with boxes of the same book on different shelves. You could then either manage stock keeping units (i.e. manage individual boxes of books referring to the same book instance), or manage each box as if it were a batch of books.

Several scenarios above can be combined. And there are other variants as well. In a typical ERP you'd have a product (book) with only the attributes of the product (title, author, ISBN, publisher, year), and then you would have entities for managing the copies of a book in warehouses (e.g. book B in quantity x in warehouse Y of subsidiary Z), and probably as well entities for managing the product in the stores (e.g. book B in quantity x at store/point of sale W of the subsidiary Z at a selling price s).

So in the end, it's not about the best approach, but the most suitable one. The fundamental question is what is the identifiable items that you are interested in or that you need to track individually.

Glorfindel
  • 3,167
Christophe
  • 81,699
1

It depends on what are you going to do with it.

If every entity needs to be distinguished from the others or needs to have its own history, then every entity should be represented by a separate record in the database, and the amount should not be its attribute. Examples: You are renting cars, or books, or you are selling tickets for flights and every ticket has its specific seat number.

If entities don't need to be distinguished and have no own history, then there is no sense to represent every entity by a separate record, a single record for all entities of this type will be sufficient, and the amount will be one of attributes of such entity. Examples: You are selling books, selling museum entry tickets.

-3

You have a million copies of the latest Harry Potter book. Does it seem at all reasonable to have a million rows in the database?

You’d have one table with book ids, authors and titles (which is probably not enough to uniquely identify a book). And the you have a second table where each row contains an id for the row, and a book Id and an amount of books.

Why not one table with author, title and amount? Because you are stuck when you have a use of a book that doesn’t involve amounts.

gnasher729
  • 49,096