184

I have a SCSI disk in a server (hardware RAID 1), 32G, ext3 filesystem. df tells me that the disk is 100% full. If I delete 1G this is correctly shown.

However, if I run a du -h -x / then du tells me that only 12G are used (I use -x because of some Samba mounts).

So my question is not about subtle differences between the du and df commands but about how I can find out what causes this huge difference?

I rebooted the machine for an fsck that went without errors. Should I run badblocks? lsof shows me no open deleted files, lost+found is empty and there is no obvious warn/err/fail statement in the messages file.

Feel free to ask for further details of the setup.

initall
  • 2,415

18 Answers

174

Just stumbled on this page when trying to track down an issue on a local server.

In my case the df -h and du -sh mismatched by about 50% of the hard disk size.

This was caused by apache (httpd) keeping large log files open that had already been deleted from disk, so their space could not be released.

This was tracked down by running lsof | grep "/var" | grep deleted where /var was the partition I needed to clean up.

The output showed lines like this:
httpd 32617 nobody 106w REG 9,4 1835222944 688166 /var/log/apache/awstats_log (deleted)

The situation was then resolved by restarting apache (service httpd restart), which released the handles on the deleted files and cleared up 2 GB of disk space.
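
As an aside, lsof can also list deleted-but-still-open files directly via +L1 (open files whose link count is zero), which avoids grepping the full output; the /var path and the httpd restart below simply mirror this answer:

lsof +L1 /var                # open files on /var that have already been unlinked
service httpd restart        # restarting the owning service releases the handles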

DDS
  • 145
KHobbits
  • 1,741
134

Check for files located under mount points. Frequently, if you mount a directory (say a sambafs) onto a filesystem that already had files or directories under it, you lose the ability to see those files, but they're still consuming space on the underlying disk. I've had file copies made in single-user mode dump files into directories that I couldn't see except in single-user mode (due to other filesystems being mounted on top of them).
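
A quick way to see which mount points might be hiding files underneath them is to list the current mounts first; this is only a sketch, and the filesystem types are assumptions based on the Samba mounts mentioned in the question:

findmnt                      # list every mounted filesystem and where it is attached
findmnt -t cifs,nfs          # or restrict to network mounts that may sit on top of local directories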

OldTroll
  • 1,736
111

I agree with OldTroll's answer as the most probable cause for your "missing" space.

On Linux you can easily remount the whole root partition (or any other partition, for that matter) to another place in your filesystem, say /mnt; just issue a

mount -o bind / /mnt

then you can do a

du -h /mnt

and see what is using up your space.
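
Putting the whole procedure together might look like the sketch below; /mnt is just the example mount point used above:

mount -o bind / /mnt         # expose the root filesystem without anything mounted on top of it
du -shx /mnt/*               # per-directory totals of what is really stored on the disk
umount /mnt                  # remove the bind mount when done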

Marcel G
  • 2,459
36

In my case this had to do with large deleted files. It was fairly painful to solve before I found this page, which set me on the correct path.

I finally solved the problem by using lsof | grep deleted, which showed me which program was holding two very large log files (totalling 5GB of my available 8GB root partition).
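
If you want a rough total of how much space deleted-but-open files are still pinning down, something like the following sketch can help; it assumes the usual lsof column layout, where the seventh field is the file size in bytes:

lsof -nP | grep '(deleted)' | awk '{sum += $7} END {printf "%.1f GiB held by deleted files\n", sum/1024/1024/1024}'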

user
  • 4,505
Adrian
  • 361
33

See what df -i says. It could be that you are out of inodes, which can happen if there is a large number of small files in that filesystem, using up all the available inodes without consuming all the available space.
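
A sketch of checking this, and of locating the directories with the most files if inodes really are exhausted (the /var starting point is just an example):

df -i                        # IUse% at 100% means the inode table is full even if space remains
find /var -xdev -type f | cut -d/ -f1-3 | sort | uniq -c | sort -rn | head   # directories (two levels deep) holding the most files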

HBruijn
  • 84,206
eirescot
  • 594
11

For me, I needed to run sudo du, as there was a large number of Docker files under /var/lib/docker that a non-sudo user doesn't have permission to read.
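
A minimal sketch of the comparison; /var/lib/docker matches the path above, and docker system df is only an optional cross-check if the Docker CLI is available:

du -shx /var/lib/docker        # may silently undercount: unreadable directories are skipped
sudo du -shx /var/lib/docker   # run as root so everything is counted
docker system df               # optional: Docker's own summary of image/container/volume usage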

Job Evers
  • 261
10

Files that are open by a program do not actually go away (stop consuming disk space) when you delete them; they go away when the program closes them. A program might have a huge temporary file that you (and du) can't see. If it's a zombie program, you might need to reboot to clear those files.
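
If restarting or killing the program isn't an option, a deleted-but-open file can sometimes be reclaimed by truncating it through /proc; the PID and descriptor numbers below are purely illustrative:

ls -l /proc/32617/fd | grep deleted   # find the descriptor number pointing at the deleted file
: > /proc/32617/fd/106                # truncate it in place, freeing the space immediately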

Paul Tomblin
  • 5,285
7

This is the easiest method I have found to date to find large files!

Here is an example if your root mount (/) is full:

cd / (so you are in root)

ls | xargs du -hs

Example Output:

 9.4M   bin
 63M    boot
 4.0K   cgroup
 680K   dev
 31M    etc
 6.3G   home
 313M   lib
 32M    lib64
 16K    lost+found
 61G    media
 4.0K   mnt
 113M   opt
 du: cannot access `proc/6102/task/6102/fd/4': No such file or directory
 0  proc
 19M    root
 840K   run
 19M    sbin
 4.0K   selinux
 4.0K   srv
 25G    store
 26M    tmp

Then you would notice that store is large, so do a cd /store

and run again

ls | xargs du -hs

Example output: 
 109M   backup
 358M   fnb
 4.0G   iso
 8.0K   ks
 16K    lost+found
 47M    root
 11M    scripts
 79M    tmp
 21G    vms

In this case the vms directory is the space hog.
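
A hedged variation that copes with filenames containing spaces and sorts the output so the heaviest directories come last (sort -h needs GNU coreutils):

du -xhs /* 2>/dev/null | sort -h     # -x keeps du on one filesystem per argument; stderr is hidden to silence the /proc noise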

Riaan
  • 431
6

Try this to see if a dead/hung process is still holding files open while writing to the disk: lsof | grep "/mnt"

Then try killing off any PIDs which are stuck (especially look for lines ending in "(deleted)").
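
A sketch of narrowing that down to the PIDs actually holding deleted files under the mount point; /mnt is just the example path from this answer, so double-check each process before killing anything:

lsof /mnt | grep '(deleted)'                                # the offending entries, with their PIDs in column 2
lsof /mnt | grep '(deleted)' | awk '{print $2}' | sort -u   # unique PIDs, for review before any kill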

Phirsk
  • 61
3

One more possibility to consider: you are almost guaranteed to see a big discrepancy if you are using Docker and you run df/du inside a container that is using volume mounts. For a directory mounted to a volume on the Docker host, df will report the host's totals. This is obvious if you think about it, but when you get a report of a "runaway container filling the disk!", make sure you verify the container's filespace consumption with something like du -hs <dir>.
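
A quick way to check this from the host side, assuming a hypothetical container name and volume path:

docker exec mycontainer df -h /data    # reports the host filesystem backing the volume
docker exec mycontainer du -shx /data  # what the container's own files actually occupy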

3

So I had this problem on CentOS 7 as well, and found a solution after trying a bunch of things like bleachbit and cleaning /usr and /var, even though they only showed about 7G each. The root partition was still showing 50G of 50G used but only 9G of file usage. I ran a live Ubuntu CD, unmounted the offending 50G partition, opened a terminal and ran xfs_check and xfs_repair on the partition. I then remounted the partition and my lost+found directory had expanded to 40G. I sorted lost+found by size and found a 38G text log file for Steam that eventually just repeated an mp3 error. Removing the large file freed the space, and now my disk usage agrees with my root partition size. I would still like to know how to keep the Steam log from growing so big again.
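
A hedged sketch of that repair sequence; the device name is a placeholder, and on recent xfsprogs the old xfs_check has been replaced by xfs_repair -n for the read-only check:

umount /dev/sdb1            # the filesystem must not be mounted
xfs_repair -n /dev/sdb1     # dry run: report problems without changing anything
xfs_repair /dev/sdb1        # actual repair; orphaned files are linked into lost+found
mount /dev/sdb1 /mnt
du -shx /mnt/lost+found     # see how much the repair recovered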

2

A similar thing happened to us in production: disk usage went to 98%. We did the following investigation:

a) df -i to check the inode usage; inode usage was 6%, so it was not a case of many small files

b) Mounting root and checking for hidden files. Could not find any extra files; du results were the same as before the mount.

c) Finally, we checked the nginx logs. nginx was configured to write to disk, but a developer had deleted the log file directly, causing nginx to keep writing to the deleted file. Since /var/log/nginx/access.log had been removed with rm, it was not visible to du, but it was still being written by nginx and hence was still held open.
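
In that situation nginx can be told to reopen its log files without a full restart, which releases the handle on the deleted file; this is a sketch assuming the stock pid-file location:

touch /var/log/nginx/access.log           # recreate the log file nginx should write to
kill -USR1 "$(cat /var/run/nginx.pid)"    # ask nginx to reopen its logs (equivalent to 'nginx -s reopen')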

darxtrix
  • 123
1

I had the same problem mentioned in this topic, but on a VPS. I tested everything described here without success. The solution was to contact our VPS provider's support, who performed a quota recalculation and corrected the space difference between df -h and du -sh /.

ldxd
  • 13
1

I ran into this problem on a FreeBSD box today. The issue was an artifact of vi (not vim; I'm not sure whether vim would create this problem). The file was consuming space but hadn't been fully written to disk.

You can check that with:

$ fstat -f /path/to/mount/point |sort -nk8 |tail

This looks at all open files and sorts (numerically via -n) by the 8th column (key, -k8), showing the last ten items.

In my case, the final (largest) entry looked like this:

bob      vi         12345    4 /var      97267 -rwx------  1569454080 rw

This meant process (PID) 12345 was consuming 1.46G (the eighth column divided by 1024³) of disk even though du didn't notice it. vi is horrible at viewing extremely large files; even 100MB is large for it. 1.5G (or however large that file actually was) is ridiculous.

The solution was to sudo kill -HUP 12345 (if that didn't work, I'd sudo kill 12345 and if that also fails, the dreaded kill -9 would come into play).

Avoid text editors on large files. Sample workarounds for quick skimming:

Assuming reasonable line lengths:

  • { head -n1000 big.log; tail -n1000 big.log; } |vim -R -
  • wc -l big.log |awk -v n=2000 'NR==FNR{L=$1;next}FNR%int(L/n)==1' - big.log |vim -R -

Assuming unreasonably large line(s):

  • { head -c8000 big.log; tail -c8000 big.log; } |vim -R -

These use vim -R in place of view because vim is nearly always better ... when it's installed. Feel free to pipe them into view or vi -R instead.

If you're opening such a large file to actually edit it, consider sed or awk or some other programmatic approach.

Adam Katz
  • 1,082
1

Check whether your server has the OSSEC agent installed, or whether some other process is using deleted log files. In my case, a while ago, it was the OSSEC agent.

1

In my case lsof did not help. I was able to track this down because I had mounted disk images using losetup as loop devices. Even after unmounting these devices and deleting the corresponding images, there were processes that maintained some sort of indirect reference to the disk images.

So in short: sudo ps -ef | grep loop, then sudo losetup -d /dev/loopX. This is not a direct answer to why du and df disagree, but it has come up often enough for me that I was finally able to figure out the reason, which was different from any answer I could find.
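
A sketch of checking for lingering loop devices; losetup -a lists every configured loop device together with its backing file, and the device name below is a placeholder:

losetup -a                    # loop devices and the (possibly deleted) files behind them
sudo losetup -d /dev/loop0    # detach the stale device so the backing file's space can be released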

ekeyser
  • 175
0

If the mounted disk is a shared folder on a Windows machine, then it seems that df will show the size and disk use of the entire Windows disk, but du will show only the part of the disk that you have access to (and that is mounted). So in this case the problem must be fixed on the Windows machine.

Sverre
  • 783
-4

Check /lost+found. I had a system (CentOS 7) where some of the files in /lost+found ate up all the space.

Michael Hampton
  • 252,907