Questions tagged [sharding]

Sharding is the practice of logically dividing or partitioning data, usually using a specific key (referred to as a shard key), and then placing that data on separate hosts (subsequently known as shards). This is generally done to scale horizontally (more hosts) as opposed to vertically (more powerful hosts) and can provide significant cost benefits, particularly with the now ubiquitous availability of cloud computing.

Sharding is also known as partitioning, specifically horizontal partitioning. A key, or combination of keys, is picked and subsequently used to create logical groupings of data. The data represented by these groupings can then be spread across many hosts to enable horizontal scaling across smaller (generally less expensive) hosts rather than using progressively more powerful (and expensive) hosts for the database as utilization increases.

Generally, one or more groupings will reside on a particular host (or set of hosts), which is then referred to as a shard.

When accessing the data (or adding new data), there must be a mechanism whereby the application, user, DBA, etc. can determine where data currently resides or should be placed based on the sharding scheme. This may be implemented in many ways: lookup tables, range-based partitioning, consistent hashing schemes, hash maps, and various others.
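As a rough illustration of three of the routing mechanisms named above (lookup table, range-based partitioning, consistent hashing), here is a minimal Python sketch. The shard names, the `upper_bounds` layout, the choice of MD5 as the hash, and the `points_per_shard` parameter are all assumptions made for the example; they are not taken from any particular database product.

```python
import bisect
import hashlib

# Hypothetical shard names, used only for illustration.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def lookup_shard(key, directory):
    """Lookup table: an explicit key -> shard mapping kept in a directory."""
    return directory[key]


def range_shard(key, upper_bounds, shards):
    """Range-based: upper_bounds[i] is the exclusive upper bound of shards[i].

    The last shard takes every key at or above the final bound, so
    len(shards) must equal len(upper_bounds) + 1.
    """
    return shards[bisect.bisect_right(upper_bounds, key)]


def _hash(value):
    """Stable integer hash (the built-in hash() is salted per process for strings)."""
    return int(hashlib.md5(str(value).encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Consistent hashing: each shard owns several points on a ring, and a key
    belongs to the first point at or after its own hash, wrapping around."""

    def __init__(self, shards, points_per_shard=64):
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(points_per_shard)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, key):
        index = bisect.bisect_left(self._points, _hash(key)) % len(self._ring)
        return self._ring[index][1]
```

For example, `range_shard(1500, [1000, 2000], SHARDS)` routes to `"shard-b"`, while `ConsistentHashRing(SHARDS).shard_for("customer-42")` always picks the same shard for the same key. The appeal of the ring is that removing a shard only relocates the keys that lived on it, whereas a plain `hash(key) % number_of_shards` scheme would reshuffle most keys.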

309 questions
15
votes
2 answers

Effectively handle 10-100 millions row table of unrelated data

What are the common approaches to boost read/write performance of a table with up to 100 million rows? The table has a column SEGMENT_ID INT NOT NULL, where each segment has about 100,000-1,000,000 rows. Writes - all rows for SEGMENT_ID are inserted at…
VB_
  • 407
  • 1
  • 4
  • 10
15
votes
3 answers

Should a multi tenant system with SQL Server 2016, Shard or have Tenant isolation via separate database per tenant?

Given the use case: Tenant data should not cross talk, one tenant does not need another tenant's data. Each tenant could potentially have large historical data volume. SQL Server is hosted in AWS EC2 instance. Each tenant is geographically…
D.S.
  • 365
  • 4
  • 10
12
votes
2 answers

MongoDB: co-locate the mongos process on application servers

I would like to ask a question about a best practice described in this document: http://info.mongodb.com/rs/mongodb/images/MongoDB-Performance-Best-Practices.pdf Use multiple query routers. Use multiple mongos processes spread across multiple…
tenshi
  • 173
  • 1
  • 9
11
votes
5 answers

MongoDB --- Failed global initialization: Failed to open "/var/log/mongodb/mongod-config.log"

I'm trying to set up the config servers for mongodb sharding. I created a specific config file that is set to log to /var/log/mongodb/mongod-config.log. When I run mongod --config , I get this error: `F CONTROL [main] Failed…
rapidDev
  • 151
  • 1
  • 3
  • 15
10
votes
11 answers

mongodb could not find host matching read preference { mode: \"primary\" } for set?

I am deploying MongoDB sharding. I have deployed a replica set on three machines: dev41:27017,dev42:27017,dev193:27017 and configsvr on three machines: dev41:27019,dev42:27019,dev193:27019 and also a mongos on machine: dev41:28000. At last I try to add…
roger
  • 213
  • 1
  • 2
  • 6
9
votes
1 answer

mongodb shard chunk migration 500GB takes 13 days - Is this slow or normal?

I have a MongoDB sharded cluster; the shard key is hashed. It has 2 shard replica sets. Each replica set has 2 machines. I did an experiment by adding another 2 shard replica sets, and it started to rebalance. However, after a while I found out that chunk…
rendybjunior
  • 259
  • 2
  • 10
9
votes
2 answers

verify that mongos server is connected to config servers

I've been writing a backup script for sharded replica-sets and it's almost done. Except I can't seem to get it to successfully start the balancer back up after everything's all said and done. Here's the command I'm trying to use to start the balancer…
Alexej Magura
  • 233
  • 2
  • 7
7
votes
3 answers

Scaling out SQL Server and syncing data across multiple machines

I don't have expertise in architecting databases, and I've been teaching myself new stuff every day. I'd like to make an Internet-scale application using SQL Server as the data store. I haven't found any good information online with regards to…
Mark13426
  • 579
  • 1
  • 7
  • 11
7
votes
1 answer

Horizontally scaling SQL Server, distributing the database with sharding

I wanted to know if there is any way to distribute a SQL Server (I'm using 2012 version) database across multiple nodes. I'm trying to compare READ queries performance between SQL Server and MongoDB. The distribution is all set with MongoDB with…
7
votes
2 answers

Mongos not returning existing data in shards

So we have a configuration of 2 shards (each is a replica set) and one sharded collection. There are around 10 million records in each shard. We are using MongoDB version 2.6.4. We have sharded with a hash_id shard key which we generated specifically for…
Ivan Longin
  • 181
  • 2
6
votes
1 answer

Remove a shard in MongoDB

I have a MongoDB cluster, I accidentally hosed a shard, I don't have a backup. It's pretty easy to recover the data, but after issuing the removeShard command, it says it's draining. However, the shard is unreachable, and will forever be unreachable…
Kevin
  • 61
  • 1
5
votes
2 answers

Chunk Size is shown as 1 KB in mongo db logs even it is set to 300 MB

I am running a mongo cluster. The chunk size is set to 300 MB, but since this morning the logs show a chunk size of 1024 bytes. I checked in current op and there it is also showing chunks of 1024 bytes. I have checked with mongos and on all config…
viren
  • 511
  • 2
  • 9
  • 29
5
votes
1 answer

Mariadb sharding?

I'm trying to implement MariaDB sharding for learning purposes. I have looked into database sharding in general and found that the MariaDB Spider storage engine supports sharding. I do not want the Spider storage engine and would like to test the TokuDB storage engine due…
Viraj
  • 333
  • 1
  • 6
  • 18
5
votes
1 answer

MongoDB presplitting chunks for compound shard key

In my mongodb setup I have a compound shard key {"region" : 1, "foo" : 1, "bar" : 1} and I know the values region can be and that each region should be on one chunk. Therefore I'd like to pre-split based on the region key only. The sharding status…
Nils
  • 153
  • 5
5
votes
1 answer

Sharding Postgres Database

I have a Postgres database that has grown to the size where it is no longer feasible to store everything on a single database node. There is a Customer table in my schema where each row represents a (surprise!) customer. Every other table in my…
CadentOrange
  • 783
  • 1
  • 8
  • 10