Document Database versus Relational Database : how to choose?

Question

I'm a SQL guy, but I know there is Not Only SQL databases - document-database mostly. As with most technologies there are pro and cons for each technology.

I've read some articles, but they were too theorical. What I would like is two real cases :

when a switch from relational- to document-database gave an improvement
when a switch from document- to relational-database gave an improvement

Improvement being any thing that makes better programs - less developpement time, scalabilty, performance, anything that is programming related. There is a caveat for 2. : stories like " falling back to relational database because everybody knows SQL" is not good

Jonas · Accepted Answer · 2011-07-08T10:51:17.353

The main reason for choosing a NoSQL database the last years have been Availability. For companies like Amazon, Google and Facebook an hour of downtime or so isn't acceptable. To achieve high availability you need to reduce single-point-of-failure, that means you need to use a distributed system with multiple computers in case a computer crashes, the service is still available.

Traditional Relatione databases isn't very good in a distributed multi-master setup. That's why NoSQL has been so popular lately. So if you need high availability you may choose a NoSQL database like Riak, Cassandra, HBase, S3 or BigTable.

There is a good blog post about Amazon's Dynamo that is a good introduction to distributed NoSQL databases.

Now, the NoSQL term is very broad so there are many NoSQL databases that aren't distributed. But they solve other problems. E.g. Neo4j - a graph database are good on a type of queries that traditional RDBMS aren't optimized for. Or as in your case a document database, where you don't have to change the schema if you want to add some fields for some documents. In other words a document database is good when most posts (documents) has different fields so a relational table with predefined columns isn't usable.

However, most of the NoSQL databases are not as flexible as traditional RDBMS databases are, so it's a good choice to use a traditional RDBMS database until it can't solve your problems anymore.

Falcon · Answer 2 · 2011-07-08T10:42:41.480

I have a simple approach to determine the database that best fits the data.

I just ask myself: Assuming I'd have no database, would I rather save the most and the important data as document or would I store them in a spreadsheet.

When the answer is "Spreadsheet", this is a clear sign that a relational model and a traditional RDBMS best suits the tasks most of the times. If the data are really simple, like only key value pairs or simple tables and referential integrity is not a topic, then a NoSQL database is probably best suited for the task and might boost performance quite a lot!

Also, when you cannot find a common structure at all, a NoSQL database is best suited for the task.

When the data are more document-like, e.g. hierarchically structured textual data without clear relations, then I immediatelly think of an XML-Database, which easily lets you store hierarchical structured documents. Sometimes it's best to use a document-management software, though.

So, in order to give a concrete and simple answer to both of your questions: It depends on the data.

when a switch from relational- to document-database gave an improvement

When you need to persist hierarchically structured textual data, an Xml-Database can be a big improvement in terms of maintainability and probably also scalability.

when a switch from document- to relational-database gave an improvement

Well, for example when the data are mostly in table-like form with clear relations and you need to guarantee integrity.

score 10 · Answer 3 · answered Jul 08 '11 at 10:34

We had to give up on the relational model because the data we were getting had no simple, obvious, fixed, static schema.

The users -- and the user stories -- did not have a fixed, static schema.

We tried to impose a fixed, static, RDBMS schema, but it was a mistake.

Each 3rd party data delivery (from customers and from vendors) was similar, but not identical. We tried mapping it to a fixed relational schema, but the variability was too great. We either had to add fields with every file (several each week) or we had to step away from the fixed, static relational schema.

If we viewed each record as a "document" with a common subset of elements and a unique (as well as ill-defined) collection of additional data elements, we were much, much happier.

The ill-defined collection of data elements is what the users actually needed for their use cases.

The fixed, static schema of the relational model did not fit our use cases.

Document Database versus Relational Database : how to choose?

3 Answers3

Linked