I recently had to assign more disk space to my MongoDB 2.4.8 instance. This instance continually receives transactions, makes some updates to them, and then deletes them after 3 months. I would therefore expect disk usage to stay relatively constant. The documents have a relatively uniform size of about 5 KB.
db.stats()
{
    "db" : "mydb",
    "collections" : 16,
    "objects" : 4.71578e+006,
    "avgObjSize" : 5368.2594088278856000,
    "dataSize" : 25315551828.0000000000000000,
    "storageSize" : 111230508336.0000000000000000,
    "numExtents" : 128,
    "indexes" : 41,
    "indexSize" : 1398799136.0000000000000000,
    "fileSize" : 122280738816.0000000000000000,
    "nsSizeMB" : 16,
    "dataFileVersion" : {
        "major" : 4,
        "minor" : 5
    },
    "ok" : 1.0000000000000000
}
I understand that disk usage will be larger than the data size due to preallocation and fragmentation, but I cannot see any reasonable explanation for a ratio of nearly 5 to 1 (fileSize is about 4.8 times dataSize, and storageSize about 4.4 times) other than a large historical delete or a bug.
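To make the ratio concrete, here is a minimal sketch of the arithmetic in plain JavaScript. The numbers are copied from the db.stats() output above; the `summarize` helper and its field names are my own, purely for illustration, and are not part of any MongoDB API.

```javascript
// Values copied from the db.stats() output above (all sizes in bytes).
const stats = {
  dataSize: 25315551828,
  storageSize: 111230508336,
  indexSize: 1398799136,
  fileSize: 122280738816,
};

const GIB = 1024 ** 3;

// Hypothetical helper: converts sizes to GiB and derives the overhead ratios.
function summarize(s) {
  return {
    dataGiB: s.dataSize / GIB,
    storageGiB: s.storageSize / GIB,
    fileGiB: s.fileSize / GIB,
    // Space allocated to collections relative to the documents they hold.
    storageToData: s.storageSize / s.dataSize,
    // Space taken by the data files on disk relative to the documents.
    fileToData: s.fileSize / s.dataSize,
    // Allocated to collections but not currently holding document data.
    unusedInExtentsGiB: (s.storageSize - s.dataSize) / GIB,
  };
}

const summary = summarize(stats);
for (const [name, value] of Object.entries(summary)) {
  console.log(`${name}: ${value.toFixed(2)}`);
}
```

In other words, roughly 24 GiB of documents occupy about 104 GiB of allocated collection storage and about 114 GiB of data files.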
Is MongoDB unable to reuse freed space properly, so that we must schedule manual repair jobs on otherwise completely stable systems, or do I have another problem somewhere?