
I have been through the other questions/answers regarding inode usage, mounting issues and others but none of those questions seem to apply...

df -h

/dev/sdd1 931G 100G 785G 12% /media/teradisk

df -ih

/dev/sdd1 59M 12M 47M 21% /media/teradisk

Basically, I have a 1 TB EXT4-formatted drive and am writing around 12 million (12,201,106) files into one directory. I can't find any documentation on a files-per-directory limit for EXT4, but the filesystem reports no space left.

Oddly, I can still create new files on the drive and in the target folder, but during a large cp/rsync, the calls to mkstemp and rename fail with "No space left on device":

rsync: mkstemp "/media/teradisk/files/f.xml.No79k5" failed: No space left on device (28)

rsync: rename "/media/teradisk/files/f.xml.No79k5" -> "files/f.xml": No space left on device (28)

I know storing this many files in one directory isn't advised for plenty of reasons, but I'd rather not split them up unless I have to.

Inode and space usage for tmpfs, the device, and everything else look fine. Any ideas as to the cause?

6 Answers


It seems you are hitting the directory size limit. A directory is itself a special kind of file that contains the names (plus inode numbers and some other metadata) of all the files in it, and it cannot grow larger than 2 GB.

Anyway, it's not a good idea to have more than a few thousand files in one directory: lookups by file name become very slow, and you'll have a lot of problems with standard tools like ls, rm and others.
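If splitting the directory does become unavoidable, a common workaround is to shard it into subdirectories keyed on a filename prefix. A minimal sketch (the shard_dir helper and the shard_ naming scheme are illustrative, not a standard tool):

```shell
# shard_dir DIR: move every regular file directly under DIR into a
# subdirectory named after the first two characters of its filename.
# Illustrative helper only; assumes filenames contain no newlines.
shard_dir() {
    src=$1
    find "$src" -maxdepth 1 -type f | while IFS= read -r f; do
        b=$(basename "$f")
        d="$src/shard_$(printf '%s' "$b" | cut -c1-2)"
        mkdir -p "$d" && mv -- "$f" "$d/"
    done
}

# e.g.: shard_dir /media/teradisk/files
```

With two-character prefixes this spreads the entries over up to a few hundred subdirectories, keeping each one comfortably below the limits discussed here.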

Update:

a-ha!

http://old.nabble.com/re:The-maximum-number-of-files-under-a-folder-td16033098.html

On Mar 13, 2008 13:23 -0400, Theodore Ts'o wrote:

There is no limit to the number of files in a folder, except for the fact that the directory itself can't be bigger than 2GB, and the number of inodes that the entire filesystem has available to it. Of course, if you don't have directory indexing turned on, you may not like the performance of doing directory lookups, but that's a different story.

There is also a limit in the current ext3 htree code to be only 2 levels deep. Along with the 2GB limit you hit problems around 15M files, depending on the length of the filenames.
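A quick way to see how close you are to the 2 GB cap is to look at the size of the directory file itself (paths here are the ones from the question):

```shell
# A directory's "size" is the size of its own entry table, which is
# what the 2 GB limit applies to:
ls -ld /media/teradisk/files
stat -c '%s bytes' /media/teradisk/files   # same figure, in exact bytes
```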

rvs

Another source of ENOSPC errors on an ext4 filesystem can be hash collisions in the directory index hash algorithm.

This blog post goes into more detail:

ext4 uses half_md4 as its default hashing mechanism. If I interpret my Google results correctly, this uses the MD4 hash algorithm but truncates it to 32 bits. This is a classic example of the birthday paradox: a 32-bit hash means there are 4,294,967,296 different hash values available, so assuming a uniform distribution of hash values, it is highly unlikely to hit one specific hash. But the probability of two filenames producing identical hashes, given enough filenames, is much, much higher. Using the formula from Wikipedia, with about 50K files we get a probability of about 25% that a newly added file collides with an existing hash. That is a huge probability of failure. If on the other hand we take a 64-bit hash function, the probability becomes much smaller, about 0.00000000007%.
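The 25% figure checks out against the usual birthday-paradox approximation P ≈ 1 − exp(−n(n−1)/2d), with n names and d = 2^32 possible hash values (numbers as in the quote):

```shell
# Birthday-paradox approximation: probability of at least one collision
# among n = 50000 names hashed into d = 2^32 buckets.
awk 'BEGIN {
    n = 50000; d = 2^32
    p = 1 - exp(-n * (n - 1) / (2 * d))
    printf "P(collision) with %d names: %.1f%%\n", n, p * 100   # roughly 25%
}'
```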

Changing the directory hash algorithm to tea should resolve the problems.
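Roughly, that change might look like the following. This is a sketch, not a tested recipe: the def_hash_version field name is taken from debugfs's set_super_value command and should be verified against your e2fsprogs version, and you should back up the data first, since existing directory indexes were built with the old hash:

```shell
# Check which hash the filesystem currently uses (requires root);
# look for a line like "Default directory hash: half_md4":
dumpe2fs -h /dev/sdd1 | grep -i hash

# Switch the default hash to tea, then rebuild the directory
# indexes so they match the new algorithm:
debugfs -w -R 'ssv def_hash_version tea' /dev/sdd1
e2fsck -fD /dev/sdd1
```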


The XFS filesystem would be a more supportable (long-term) solution for what you're trying to do now. Large file-count directories are not a problem for XFS. Of course, fixing this at the application level would also be helpful...
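For reference, the switch itself is straightforward, though it reformats the device and destroys the existing data, so copy everything off first (device and mount point below are the asker's):

```shell
# XFS allocates inodes dynamically, so there is no fixed inode table to
# exhaust, and its directories scale to tens of millions of entries.
mkfs.xfs -f /dev/sdd1
mount /dev/sdd1 /media/teradisk
xfs_info /media/teradisk    # sanity-check the resulting geometry
```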

ewwhite

Is ext4 absolutely necessary for you? These days, XFS should handle a situation like this without a hitch.


I had this problem. My solution was:

mkfs.ext4 -i 1024 -b 1024 /dev/blah
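For anyone puzzled by the flags: this addresses inode exhaustion rather than the directory-size limit. Annotated (the device name here is just the asker's; reformatting destroys all data on it):

```shell
# -b 1024  use 1 KiB blocks (the smallest ext4 supports)
# -i 1024  reserve one inode per 1024 bytes of space, i.e. one inode
#          per block -- the densest inode table mkfs.ext4 allows
mkfs.ext4 -i 1024 -b 1024 /dev/sdd1
df -ih /media/teradisk    # after remounting, verify the larger inode count
```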


It seems you're running out of inodes. Show the output of df -iht ext4.

I have also had a problem removing a directory containing ~1 million files on EXT4 (Linux kernel 3.0, IIRC). What kernel version are you running?

Finally, I'd suggest using Reiser3: it has no format-time inode limits, and in the aforementioned case it seemed to solve the problem as well.

UPD.: For those wondering whether Reiser3 is still supported:

cd linux-stable/fs/reiserfs && git log --pretty='format:%aD %s' . | head -n20

Tue, 10 Jan 2012 15:11:11 -0800 reiserfs: don't lock root inode searching
Tue, 10 Jan 2012 15:11:09 -0800 reiserfs: don't lock journal_init()
Tue, 10 Jan 2012 15:11:07 -0800 reiserfs: delay reiserfs lock until journal initialization
Tue, 10 Jan 2012 15:11:05 -0800 reiserfs: delete comments referring to the BKL
Mon, 9 Jan 2012 12:51:21 -0800 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Wed, 21 Dec 2011 21:18:43 +0100 reiserfs: Force inode evictions before umount to avoid crash
Wed, 21 Dec 2011 17:35:34 +0100 reiserfs: Fix quota mount option parsing
Wed, 21 Dec 2011 20:17:10 +0100 reiserfs: Properly display mount options in /proc/mounts
Wed, 7 Dec 2011 18:16:57 -0500 vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
Tue, 26 Jul 2011 02:50:53 -0400 reiserfs: propagate umode_t
Tue, 26 Jul 2011 01:52:52 -0400 switch ->mknod() to umode_t
Tue, 26 Jul 2011 01:42:34 -0400 switch ->create() to umode_t
Tue, 26 Jul 2011 01:41:39 -0400 switch vfs_mkdir() and ->mkdir() to umode_t
Mon, 12 Dec 2011 15:51:45 -0500 vfs: fix the stupidity with i_dentry in inode destructors
Fri, 9 Dec 2011 08:06:57 -0500 vfs: mnt_drop_write_file()
Wed, 23 Nov 2011 11:57:51 -0500 switch a bunch of places to mnt_want_write_file()
Fri, 28 Oct 2011 14:13:29 +0200 filesystems: add set_nlink()
Fri, 28 Oct 2011 14:13:28 +0200 filesystems: add missing nlink wrappers
Tue, 25 Oct 2011 12:11:02 +0200 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Thu, 15 Sep 2011 15:08:05 +0200 Merge branch 'master' into for-next
poige