I recently had to assign more disk space to my MongoDB 2.4.8 instance. This instance continually receives transactions, makes some updates to them, and then deletes them after 3 months. I would therefore expect disk usage to be relatively constant. The documents have a relatively uniform size of about 5 KB.

db.stats()
{
"db" : "mydb",
"collections" : 16,
"objects" : 4.71578e+006,
"avgObjSize" : 5368.2594088278856000,
"dataSize" : 25315551828.0000000000000000,
"storageSize" : 111230508336.0000000000000000,
"numExtents" : 128,
"indexes" : 41,
"indexSize" : 1398799136.0000000000000000,
"fileSize" : 122280738816.0000000000000000,
"nsSizeMB" : 16,
"dataFileVersion" : {
    "major" : 4,
    "minor" : 5
},
"ok" : 1.0000000000000000
}

I understand that disk usage will be larger than data size due to preallocation and fragmentation, but I cannot see any reasonable explanation for a ratio of nearly 5 to 1 (fileSize to dataSize) other than a large historical delete or a bug.
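For reference, the ratio can be worked out directly from the figures above. This is only a minimal sketch: the function name `overheadRatios` is made up for illustration, and the numbers are copied by hand from the `db.stats()` output rather than read from a live instance.

```javascript
// Byte counts copied from the db.stats() output above.
var stats = {
  dataSize: 25315551828,
  storageSize: 111230508336,
  fileSize: 122280738816
};

// Hypothetical helper: relates allocated space to live document data.
function overheadRatios(s) {
  return {
    // Space allocated to collection extents vs. actual document bytes.
    storageToData: s.storageSize / s.dataSize,
    // Size of the data files on disk vs. actual document bytes.
    fileToData: s.fileSize / s.dataSize,
    // Bytes inside extents that hold no live document data.
    unusedInExtents: s.storageSize - s.dataSize
  };
}

var ratios = overheadRatios(stats);
```

With these figures, `storageToData` comes out at roughly 4.4 and `fileToData` at roughly 4.8, which is where the "nearly 5 to 1" comes from. In the mongo shell the same function could be called as `overheadRatios(db.stats())`.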

Is MongoDB unable to reuse space properly so that we must schedule manual repair-jobs on otherwise completely stable systems, or do I have another problem somewhere?

1 Answer

Based on the comments I have received, the following actions seem to address my concerns:

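  • Migrate existing collections to power-of-2 record allocation (usePowerOf2Sizes), so that space freed by deletes is easier to reuse.
  • Run repair or compact periodically to keep the free list short, so that the free-list search does not time out and fall back to allocating new disk space.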
  • Only capped collections should be considered "100% maintenance-free".
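As a rough sketch of the first two actions: on 2.4 they seem to map onto the `collMod` command with the `usePowerOf2Sizes` option and the `compact` command. The helper name `maintenanceCommands` and the collection name `transactions` are made up for illustration; only the command documents themselves are MongoDB's.

```javascript
// Hypothetical helper: builds the command documents for one collection.
function maintenanceCommands(collectionName) {
  return [
    // Switch record allocation to power-of-2 sizes. This only affects
    // records allocated from now on, not records already on disk.
    { collMod: collectionName, usePowerOf2Sizes: true },
    // Rewrite and defragment the collection's data and rebuild its indexes.
    { compact: collectionName }
  ];
}

// "transactions" is an assumed collection name.
var commands = maintenanceCommands("transactions");

// In the mongo shell, each document would be passed to db.runCommand():
//   commands.forEach(function (cmd) { printjson(db.runCommand(cmd)); });
```

Note that `compact` blocks other operations on the database while it runs, so it belongs in a maintenance window, and it does not shrink the data files themselves. Returning disk space to the operating system takes `db.repairDatabase()`, which needs enough free disk space for a copy of the data.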