
I have to move a MongoDB collection (size: 1 TB+) from one sharded MongoDB cluster to another (5 shards on each cluster). The application that uses the collection will be offline during the move, so inconsistent data is not a concern, but I need to minimize the downtime.

I've tested this with mongodump and mongorestore, but it takes ages to finish.

Please share your experience with this kind of scenario.

Note: The cluster has only one collection, so I am open to a cluster-level sync as well.

andy_l

1 Answer


As per the MongoDB documentation, "Create Chunks in a Sharded Cluster":

Pre-splitting the chunks of an empty sharded collection can help throughput if you want to ingest a large volume of data into a cluster that is unbalanced, or where the ingestion of data will lead to data imbalance, such as with monotonically increasing or decreasing shard keys.

EXAMPLE

To create chunks for documents in the myapp.users collection using the email field as the shard key, use the following operation in the mongo shell:

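for ( var x=97; x<97+26; x++ ){          // first letter: 'a' (97) through 'z' (122)
    for ( var y=97; y<97+26; y+=6 ) {    // second letter: 'a', 'g', 'm', 's', 'y'
        var prefix = String.fromCharCode(x) + String.fromCharCode(y);
        // Split the chunk that contains this prefix at { email: prefix }.
        db.adminCommand( { split: "myapp.users", middle: { email : prefix } } );
    }
}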

This assumes a collection size of 100 million documents.
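
To see which split points the loop above produces before running it against a cluster, the same iteration can be run as plain JavaScript with no database connection. This is only a sketch: the helper name `splitPrefixes` is made up for illustration, and the chunk count in the comment assumes the empty collection starts as a single chunk.

```javascript
// Hypothetical helper: collects the two-letter prefixes that the
// pre-split loop would pass as the "middle" value of each split.
function splitPrefixes() {
  const prefixes = [];
  for (let x = 97; x < 97 + 26; x++) {        // first letter: 'a'..'z'
    for (let y = 97; y < 97 + 26; y += 6) {   // second letter: 'a', 'g', 'm', 's', 'y'
      prefixes.push(String.fromCharCode(x) + String.fromCharCode(y));
    }
  }
  return prefixes;
}

const prefixes = splitPrefixes();
// 26 first letters x 5 second letters = 130 split points,
// i.e. 131 chunks if the collection starts as one chunk (assumption).
console.log(prefixes.length);
console.log(prefixes.slice(0, 6).join(" "));  // aa ag am as ay ba
```

For a different shard key, the split points would need to follow the actual distribution of key values in the data being restored; evenly spaced letter prefixes only fit keys that are spread roughly evenly across the alphabet.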

Mani