Let's start by explaining the entity that I'm thinking of breaking it into three tables and how the UI form is organized, and data entry is happening. There is a Person who has many properties, and these can be grouped into three parts (according to the UI form).
- The first part is the Person's basic info, things like his name, date of birth, location, etc. It has 21 fields.
- The second part is the Person's medical condition. This part has 5 fields and all of them could be null.
- The third part relates to the Person's financial situation. This part has 20 fields and many of them could be null.
The UI form is divided into these three main sections (there are other sections, like document upload, but these will have their own separate tables)
Should I keep all these data in one table (Person) which will contain about 46 columns, or do I make three tables each for each section? Am I going to run into performance issues down the line because the Person table has 46 columns? I'm inclined towards having one table for simplicity and I don't see a good reason to have three tables.
Let's talk data access patterns.
The Person table will be the most accessed table in my app in terms of reads. The user will be able to see a paginated list of persons, and this query will select only about 3 or 4 fields.
The user will also be able to filter persons based on about 10 fields (most of them from basic info and some from the person's medical condition). Also, this query will have to do some joins with another tables whose data should not be in the Person table (like Photos).
For showing a single person, the query will select about 10 fields, 9 of them is from the person's basic info and 1 from his financials. It will also retrieve other data from different tables. The listing of persons and of a single person will be the two most used queries in my app.
Another query that will be used many times will be listing all the data of a single person, the 46 columns along with other columns from different tables. There will also be the occasional updates to the Person table but mostly it's reads.
I am using Postgres.
Part of the medical condition fields is "diseases" and these can be multiple. I just have an array of "diseases" as field for the medical condition part. But essentially, a Person can have only one "medical condition" so one array of "diseases" for each Person. A person will have one-to-one with each one of these 46 fields.
For instance, if I have M1, M2 fields related to medical condition and Person can only have one value of M1 and of M2, why can't I just stick M1 and M2 in Person table? What are the disadvantages? Same goes for the financial situation part.
The medical condition section in the UI form has different fields of different things all related to the person's medical situation, let's call them M1 and M2 (in reality it's more than 2). I thought that a different table (person_medical_condition) might house these M1 and M2 columns and references the Person table but why would I do that given that there is one-to-one mapping from the Person to these fields. A person can't have multiple M1s and M2s.
I could just add these M fields to the Person table even when some of them could be null. The same case goes for the finance part. I have different fields related to the Person's finances that could be added to the Person table given they also have one-to-one mapping. For instance, a Person's monthly income. He can't have multiple monthly incomes.