2

I have a client who's insisting in a SaaS application to have one table for all users, and then for each user to have a separate database on which we perform CRUD operations based on an authentication token that contains permission only for that database. This will be done with python using psycopg2, from what I've read is that PostgreSQL has a max_connections option and this would mean that at one point in future we're going to reach a bottleneck and will no longer be able to serve our application for an x simultaneous number of clients. What other arguments can I provide to make my client change his mind and avoid such mistake that might cost him too much as he's also had some amount of app development experience?

Arka-cell
  • 123
  • 3

2 Answers2

3

His primary reason is performance, instead of having one big database, he'll separate it by having small databases.

Well he's mostly wrong about using it as a design decision for performance. I'm sure he thinks his performance issues were due to the amount of data in one place and splitting it up this way reduces the amount, but indexes already split up the data in an exponentially more performant way.

There's only a few potential performance improvements one may gain by doing this. One is by reducing lock contention on the tables. But lock contention shouldn't normally be much of an issue with a properly architected database with well designed queries. And can be helped with proper isolation level usage.

The other potential performance improvement is more specific statistics to the data can be generated. One client's data may have a lot more rows in a given table, and another client may have very little rows in that table. In the same table, the statistics would be based on the aggregate of both clients' data, resulting in what may not be optimal for one or both clients when certain queries are ran against those tables. Separate databases (and ergo separate tables) for each client allows more targeted statistics about the data in those tables, which can help improve performance in certain cases.


Now there are reasons in a SaaS application to model the database as a separate database per client. It makes some management operations easier such as backup restores at the individual client level. It also gives more flexibility with customizing the database catered towards the client, even just at a performance level with indexing that may make more sense for one client's set of data that varies greatly from the norm. And there's a better assurance of security towards the clients that only they can access their data (with it not being intermingled with other clients' data in the same database).

So it's not a poor design choice to do this. But it also makes management for you as the developer more difficult for things such as deploying schema changes, for example.


from what I've read is that PostgreSQL has a max_connections option and this would mean that at one point in future we're going to reach a bottleneck and will no longer be able to serve our application for an x simultaneous number of clients.

This is a limitation on the server, and will be true regardless if you use a single database or separate databases for you SaaS clients. You're still going to have X number of clients using your application to connect to the database(s) either way. It's actually an argument that can be used in favor of using a separate database per client, so that they can be easily migrated to other servers as needed, to help balance out the connection contention on the current server.


FWIW, when I worked for a SaaS company in my past, we followed the separate database per client model. It worked well for us. We had about 1,000 databases / clients across a couple of servers. This wasn't done directly for performance reasons, rather for improved management, security, and flexibility reasons, as previously mentioned. We did a lot of analytical querying on the data though, so custom indexing and query tuning was needed for certain clients' databases.

J.D.
  • 40,776
  • 12
  • 62
  • 141
1

All databases have a max_connections options.

I already replied to this question here and I then wrote blog post about that.

The rule of thumb should be: are you serving customers or are you serving users?

  • Customers: You are selling a SaaS, your customers are actual companies with an office --> go single tenant
  • Users: your users can create an account with a @gmail.com e-mail, they come and go -- go multi tenant
Francesco Mantovani
  • 1,695
  • 14
  • 28