I've developed a shop management software using Postgresql and JAVA. I would like to know which of the solutions below is the best one for my specific situation:
- Only one database with huge tables (maybe partitioned);
- Lots of independent databases distributed into several servers;
Details of my specific situation:
- There are 8,000 shops that use the software.
- Each shop has its own database that has no relationship with the other shops databases.
- There are no customization to the shops: the database structure is always the same for all of them.
- Today each shop has the software and database installed locally. But I want to change that, transferring the databases to online servers so they could be accessed from anywhere by the users (through the software).
- Each database has 50 tables. The 7 bigger ones have 300,000 records each and have relationship among them. The other tables are small, with less than 5,000 records each.
- On average the size o the database folder is about 200 MB in disc.
- Each database can have 2 or 3 persons connecting to it during the day. But, on average, it will have only 1 person using the software per shop.
Pros and Cons of each solution:
According to lots of similar cases that I've read, I know that there are pros and cons for each solution.
Solution 1
Only one database with huge tables (maybe partitioned)
Pros:
- Easy to maintain;
- Easy to recover from a server crash;
- Easy to make backups;
Cons:
- There would be tables with 2.5 billion records (8k shops * 300k records)
- Slow queries for huge tables, even using partitions;
- Any problem with the database would affect all shops.
- Expensive hosting servers. A server with a great configuration is more expensive than several servers with medium configuration.
Doubts:
- What would be the minimum configuration needed to run this solution? If possible, a configuration plan from Digital Ocean (https://www.digitalocean.com/pricing)
- I could use Data Warehouse to separate old data, but the bigger tables would still big for recent data, with 250 million records. Would it help make it the right solution? If so what would be the minimum server configuration?
- Would "list partitioning" based on
shop_idhelp sufficiently? What would be the drawbacks? Is it OK to have 7 tables with 8,000 partitions each?
Solution 2
Lots of independent databases distributed into several servers
Pros:
- Bigger tables with only 300 thousands records;
- Very fast queries due the small tables;
- A problem with a database would affect only one shop;
- Cheap hosting servers. Several regular servers cost less than a server with a great configuration
Cons:
- Difficult to maintain;
- Difficult to make backups;
- Difficult to recover from a server crash. Lots of databases in the same server could get corrupted in the same time due a server crash. It would be hard to do with it.
- Need to develop extra code to deal with all databases at the same time. For example: script to backup all databases, script to add a new column at a specific table on all databases, code to validate users and point them to the right server and database.
Doubts:
- Considering that I will use the $20 servers of Digital Ocean, which have 2 core processor and 2 GB memory, how much databases could I have for each server?
- My main concern about this solution is the possibility of lots of databases get corrupted at the same time due a server crash. In case of a server crash what would be the percentage of the databases that would be corrupted and, then, need a recovery?
Conclusion
I know the pros and cons of each solution. But I am still not able to decide what is the best solution to my specific case. Also, I have a lot of doubts like what server configuration to choose, how many databases can I have in a single server, how often databases get corrupted after a system crash, how really difficult it is to maintain lots of databases and how big a table can be without becoming slow.
I know that there are a plenty of this subject in the internet. I've read a lot and I'm still not sure which solution to choose. I would not spend so much time writing this if I weren't really in need of help.
Please clarify my doubts and help me choose the best solution.