I read about microservices and it seems illogical to me to create a separate DB per service just to achieve isolation. I can achieve the same using only web services and a single database. Why do we even need it? The thing that separate database is beyond discussion. Or am I plain wrong? Can you guide me on this?
6 Answers
Why do we even need it?
You don't.
Creating a separate database for each service helps to enforce domain boundaries, but it's only one approach. There's nothing stopping you from having all your services share the same database.
As long as your services behave and don't do unexpected things to data owned by other services, you'll be fine.
I don't know what you read, but you should be aware that there are many differing opinions on microservices architecture. Here's a good blog post on the topic.
I’ve seen folks refer to this idea in part, trivially, as “each microservice should own and control its own database and no two services should share a database.” The idea is sound: don’t share a single database across services because then you run into conflicts like competing read/write patterns, data-model conflicts, coordination challenges, etc.
But a single database does afford us a lot of safeties and conveniences: ACID transactions, single place to look, well understood (kinda?), one place to manage, etc.
The journey to microservices is just that: a journey. It will be different for each company. There are no hard and fast rules, only tradeoffs.
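To make the "ACID transactions" convenience mentioned above concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `stock` and `orders` tables and the `place_order` function are hypothetical names made up for illustration; the point is only that, with one database, two writes either both happen or both roll back.

```python
import sqlite3

# Hypothetical tables for illustration: a single shared database holds both
# the order data and the inventory data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER CHECK (qty >= 0))"
)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)")
conn.execute("INSERT INTO stock VALUES ('widget', 5)")
conn.commit()


def place_order(item: str, qty: int) -> bool:
    """Record an order and decrement stock in one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty))
            conn.execute("UPDATE stock SET qty = qty - ? WHERE item = ?", (qty, item))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected negative stock; the order row inserted
        # just before it was rolled back as well.
        return False


ok = place_order("widget", 3)
rejected = place_order("widget", 10)
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
stock_left = conn.execute("SELECT qty FROM stock WHERE item = 'widget'").fetchone()[0]
```

With one database per service, the same guarantee would need coordination between services, which is part of the tradeoff the quote describes.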
Why do we even need it?
The enormous benefit of microservices (and, more broadly, SOA) is the high level of abstraction of the internals: not only the implementation, but also the technologies being used. For instance, if a system is developed in the form of five microservices by five teams, one team can decide to move to a completely different technological stack (for instance from the Microsoft stack to LAMP) without even asking the other teams for their opinion.
Look at Amazon AWS or Twilio. Do you know if their services are implemented in Java or Ruby? Do they use Oracle or PostgreSQL or Cassandra or MongoDB? How many machines do they use? Do you even care about that; in other words, are those technological choices affecting the way you use those services?... And more importantly, if they move to a different database, would you have to change your client application accordingly?
Now, what happens if two services use the same database? Here is a small sample of the issues that may arise:
- The team developing service 1 wants to move from SQL Server 2012 to SQL Server 2016. However, team 2 relies on a deprecated feature which was removed in SQL Server 2016.
- Service 1 is a huge success. Hosting the database on two machines (master and failover) is not an option any longer. But scaling the cluster to multiple machines requires strategies such as sharding. Meanwhile, team 2 is happy with the current scale, and sees no reason to move to anything else.
- Service 1 should move to UTF-8 as its default encoding. Service 2, however, is happy using Code Page 1252 Windows Latin 1.
- Service 1 decides to add a user with a specific name. However, this user already exists, created a few months ago by the second team.
- Service 1 needs a lot of different features. Service 2 is a highly critical component and needs to keep database features to a minimum to reduce the risk of attacks.
- Service 1 requires 15 TB of disk space; the speed is not important, so ordinary hard disks are perfectly fine. Service 2 requires 50 GB at most, but needs to access it as fast as possible, meaning the data should be stored on an SSD.
- ...
Every little choice affects everyone. Every decision needs to be taken collaboratively, by people from every team. Compromises have to be made. Compare that to the complete freedom to do whatever you want in the context of SOA.
it's too [...] unmanageable.
Then you're doing it wrong. I suppose you're deploying manually.
This is not how things should be done. You need to automate the deployment of the virtual machines (or Docker containers) which run the database. Once you have automated that, deploying two servers or twenty servers or two thousand servers is not very different.
The magic thing about isolated databases is that they are extremely manageable. Have you tried managing a huge database used by dozens of teams? It's a nightmare. Every team has specific requests, and as soon as you touch something, it affects someone. With a database paired with an app, the scope becomes very narrow, meaning that there are far fewer things to think about.
While a huge database requires specialized system administrators, a database used by only one team can essentially be managed by that team (DevOps is also about that), freeing up the system administrators' time.
it's too costly
Define cost.
Licensing costs depend on the database. In the era of cloud computing, I'm pretty sure all major players have redesigned their licensing to accommodate the case where, instead of one huge database, there are lots of small ones. If not, you may consider moving to a different database. There are a lot of open source ones, by the way.
If you're talking about processing power, both virtual machines and containers are CPU-friendly, and I wouldn't be so sure that one huge database will consume less CPU than a lot of small ones doing the same job.
If your issue is memory, then virtual machines are not a good choice for you. Containers are. You'll be able to spawn as many as you want, knowing that they won't consume more RAM than needed. While the total memory consumption will be higher for lots of small databases than for a single large one, I suppose that the difference won't be too significant. YMMV.
As Dan Wilson answers, you don't really need it. Microservices are the new hot thing, and like all new hot things, people use them in a lot of places even when they don't provide much value.
Microservices allow you to independently deploy and scale things at a "micro" level. That granularity provides a bunch of technical benefits and even more non-technical benefits, since it allows you to better separate development teams, release as needed rather than in one big release, try out new technologies or processes in isolation, etc. Having a shared DB kills a lot of that because of the dependency on the DB. If you can't deploy your service without worrying about other services' data, you've lost.
The thing that separate database is beyond discussion. Or am I plain wrong?
That said, you're also plain wrong.
When you're working in the cloud, databases are cheap. Free usually! Sure, the server costs money, but we're not talking about an individual server per microservice (at least, not at first). A single server with a bunch of (logical) databases is just fine as long as you're diligent about avoiding cross-database queries (which introduce dependencies that harm "independently deployable and scalable"). Hell, cross DB queries are impossible in some cloud database services like Azure SQL. You don't even need to be diligent there...
And I've even seen microservices where they shared a database, but each service got its own schema. Again, you need to be diligent about avoiding queries that cross data boundaries.
Lots of places aren't that diligent. They have entry level devs, or people who don't value the microservice approach, or have poor team leads, or have timeline pressure that causes people to take shortcuts.
Having a separate database is the cleanest way to enforce that decoupling that allows service independence, but it's not the only way. And it's not that expensive - especially when you compare it to the time/salary spent trying to enforce data boundaries in a shared database.
In addition to all of the answers, from my understanding, using a single database for most of your microservices is practical. It sidesteps almost all of the database-per-microservice dilemmas (e.g., resource draining from message passing, transactions and ACID rules, eventual consistency instead of strong consistency, data redundancy, and maintenance) except:
- Heavily used tables become a failure point. One or more tables of a module that carry heavy traffic can be a failure point for the entire database.
- Some data may need to be kept in a DBMS other than the main DBMS (e.g., Elasticsearch for text data or Neo4j for graphs).
- Some teams need their autonomy over development, deployment, and even architectural design.
Hence, there is practically no need to provision a separate database for each and every service, but consider separating the ones
- that can be easily detached,
- or that are expected to be used heavily,
- or need to be stored in a different DBMS,
- or when a customer needs his/her data to be kept in his/her own place but you don’t want to hand over the whole database structure.
As a rule of thumb, use schema (or such) to logically group each table/collection by its service name so that they can be distinguished, learned, and separated more easily later.
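As a rough illustration of that rule of thumb, the sketch below groups tables by service name inside one database connection. It uses Python's built-in `sqlite3` module, and because SQLite has no `CREATE SCHEMA`, `ATTACH DATABASE` stands in for per-service schemas here; on a server DBMS you would use its real schema feature instead. The service names, table names, and the `tables_owned_by` helper are all hypothetical.

```python
import sqlite3

# One connection standing in for the shared database server.
conn = sqlite3.connect(":memory:")

# Hypothetical service names. ATTACH gives each one its own namespace,
# used here as a stand-in for a schema per service.
for service in ("orders", "billing"):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {service}")

# Every table is created under the name of the service that owns it.
conn.execute("CREATE TABLE orders.purchase (id INTEGER PRIMARY KEY, total REAL)")
conn.execute(
    "CREATE TABLE billing.invoice (id INTEGER PRIMARY KEY, purchase_id INTEGER)"
)


def tables_owned_by(service: str) -> list:
    """List the tables that live in one service's namespace.

    The service name is interpolated into the SQL, so it must come from
    trusted code, never from user input.
    """
    rows = conn.execute(
        f"SELECT name FROM {service}.sqlite_master "
        "WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


orders_tables = tables_owned_by("orders")
billing_tables = tables_owned_by("billing")
```

Because ownership is visible in every qualified table name, a query that reaches into another service's namespace is easy to spot in review, and moving one namespace to its own database later is a smaller step.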
Depends on what you consider “expensive”.
A database does not necessarily have to be an expensive commercial database server (think Oracle), nor does it necessarily have to be a resource-hungry affair. Depending on your requirements, you can use an SQLite database or even the file system as persistent data storage.
All those services could also share a single database instance/server and only have isolated schemas per service.
The key argument here is that the service needs to own and control its data. How it achieves that is a matter of choice and technical details.
The best way a service can own and control its data is by having its own “personal” database. This allows for complete freedom in the choice of technology and in data schema evolution. The only way any other service can access data owned by a service is by asking that service for it. This way, if the internal data representation needs to change, it can be changed easily and no other services will break.
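The ownership idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: `CustomerService` and `InvoiceService` are hypothetical names, the in-process method call stands in for whatever network API a real service would expose, and an in-memory SQLite database stands in for the service's "personal" database.

```python
import sqlite3
from typing import Optional


class CustomerService:
    """Hypothetical service that owns customer data; its storage is private."""

    def __init__(self) -> None:
        # Private database: no other service gets a handle to it.
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE customer (id INTEGER PRIMARY KEY, full_name TEXT)"
        )

    def add_customer(self, name: str) -> int:
        with self._db:  # commit on success, roll back on error
            cur = self._db.execute(
                "INSERT INTO customer (full_name) VALUES (?)", (name,)
            )
        return cur.lastrowid

    def get_customer(self, customer_id: int) -> Optional[dict]:
        """Public contract: returns {'id', 'name'} or None, whatever the
        internal column names happen to be."""
        row = self._db.execute(
            "SELECT id, full_name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        if row is None:
            return None
        return {"id": row[0], "name": row[1]}


class InvoiceService:
    """Hypothetical consumer: it asks CustomerService, never its tables."""

    def __init__(self, customers: CustomerService) -> None:
        self._customers = customers

    def invoice_header(self, customer_id: int) -> str:
        customer = self._customers.get_customer(customer_id)
        if customer is None:
            raise LookupError(f"unknown customer {customer_id}")
        return f"Invoice for {customer['name']}"


customers = CustomerService()
ada_id = customers.add_customer("Ada")
header = InvoiceService(customers).invoice_header(ada_id)
```

If the `full_name` column were later renamed or split, only `get_customer` would need to change; `InvoiceService` depends on the returned dictionary, not on the table layout.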
So, to recap. It is not necessarily expensive to have a database per service nor is it necessary. It is simply a decision that you need to make at some point when developing microservices. Each of the choices has its implications and limitations. Study those and make your own choice.
Separating databases for each service can be hard to achieve, especially when the relationships between data are complex. I've seen teams having to work in several services to get one simple request done. I would start with SOA and split a service out only if its data is also isolated: common-sense design instead of doing it because it's fashionable.