
Possible Duplicate:
Doing an rm -rf on a massive directory tree takes hours

I'm running a simulation program on a computing cluster (Scientific Linux) that generates hundreds of thousands of atomic coordinate files. But I'm having a problem deleting the files because rm -rf never completes and neither does

find . -name '*' | xargs rm

Isn't there a way to just unlink this directory from the directory tree? The storage unit is used by hundreds of other people, so reformatting is not an option.

Thanks

Nick

4 Answers


Method 1: assuming those files are meant to be created and just need to be removed after use.

If possible, have all those files, and only those files, created on a standalone partition or disk. When it is time to delete them, unmount the partition and reformat it; creating a fresh EXT4 (not EXT2) file system only takes a few seconds.

Make sure you are not saving information/reports/etc. in the same location.

You can then mount a new partition or a new disk at the original location, either directly or with the -o bind option.
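
A minimal sketch of that cycle (the device /dev/sdb1 and mount point /data/scratch are hypothetical; adjust to your layout):

umount /data/scratch            # nothing may hold files open under it
mkfs.ext4 -q /dev/sdb1          # recreating the file system takes seconds
mount /dev/sdb1 /data/scratch   # remount at the original location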

Method 2

Thinking a bit outside the box: instead of individual files, put all that data into a database table, then drop the whole table after use.
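
As a rough illustration, assuming SQLite and a hypothetical table layout, cleanup becomes a single statement instead of hundreds of thousands of unlink() calls:

sqlite3 sim.db "CREATE TABLE coords (step INTEGER, atoms BLOB);"
# ...the simulation inserts one row per snapshot instead of one file...
sqlite3 sim.db "DROP TABLE coords;"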

John Siu

I typically use something like:

find ./directoryname -type f -name '*file-pattern*' -exec rm {} +

It is also possible to use find's -delete action, which avoids spawning rm processes entirely:

find ./directoryname -type f -name '*file-pattern*' -delete

Is the generation of these files a problem/bug? Is there anything at the application level that can help?

ewwhite

My guess is you're running across a strange file type that's blocking rm from completing. Try something like:

find . \( -type d -o -type f \) -print0 | xargs -0 rm -rf --

Just unlinking the directory would be perfectly possible if you didn't mind not getting the free space back, and all the files reappearing in /lost+found at the next fsck.

Removing the files isn't the time-consuming bit; it's all the file system maintenance code that tidies up behind the scenes, and that takes an extra-long time for millions of small files. It takes even longer if they are in a flat, wide structure instead of a deep, thin one (i.e. many files in few directories rather than many files in many nested directories). As you've noticed, in some cases it can take longer to do this than simply to recreate the file system.

If this were my issue, I'd make a custom partition to keep those files in, and in addition, I'd probably use tmpfs, which is better-designed for the storage of temporary files anyway, and will cut down the file system re-creation time.
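
A minimal example of such a mount (the size limit and mount point are hypothetical; tmpfs lives in RAM, and its contents vanish at unmount or reboot):

mount -t tmpfs -o size=8G tmpfs /data/scratch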

MadHatter