I was using df -h to print out human readable disk usage. I would like to figure out what is taking up so much space. For instance, is there a way to pipe this command so that it prints out files that are larger than 1GB in size? Other ideas?
Thanks
You may want to try the ncdu utility found at: http://dev.yorhel.nl/ncdu
It will quickly sum the contents of a filesystem or directory tree and print the results, sorted by size. It's a really nice way to drill down interactively and see what's consuming drive space.
Additionally, it can be faster than hand-rolled du combinations.
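For example, to scan a single directory tree without crossing onto other mounted filesystems, you can pass the -x flag (/data below is just the filesystem from the sample output):
ncdu -x /data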
The typical output looks like:
ncdu 1.7 ~ Use the arrow keys to navigate, press ? for help
--- /data ----------------------------------------------------------------------------------------------------------
  163.3GiB [##########] /docimages
   84.4GiB [#####     ] /data
   82.0GiB [#####     ] /sldata
   56.2GiB [###       ] /prt
   40.1GiB [##        ] /slisam
   30.8GiB [#         ] /isam
   18.3GiB [#         ] /mail
   10.2GiB [          ] /export
    3.9GiB [          ] /edi
    1.7GiB [          ] /io
    1.2GiB [          ] /dmt
  896.7MiB [          ] /src
  821.5MiB [          ] /upload
  691.1MiB [          ] /client
  686.8MiB [          ] /cocoon
  542.5MiB [          ] /hist
  358.1MiB [          ] /savsrc
  228.9MiB [          ] /help
  108.1MiB [          ] /savbin
  101.2MiB [          ] /dm
   40.7MiB [          ] /download
I use this one a lot.
du -kscx *
It can take a while to run, but it'll tell you where the disk space is being used.
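A small variation of mine, not part of the original command: piping through sort orders the entries so the biggest (and the grand total from -c) come last:
du -kscx * | sort -n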
I myself use
du -c --max-depth=4 /dir | sort -n
This returns the amount of space used by a directory and its subdirectories, up to 4 levels deep; sort -n puts the largest last.
New versions of sort can handle "human-readable" sizes, so one can use the much more readable
du -hc --max-depth=4 /dir | sort -h
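And to directly answer the original question about files larger than 1GB: find can list them without summing directories. This is my addition and assumes GNU find (for -size +1G and -printf):
find /dir -xdev -type f -size +1G -printf '%s\t%p\n' | sort -nr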
As I often have to determine what is taking up so much space, I wrote this little script to search for the big consumers on a specific device (without arguments, it browses the current directory, looking for directory entries larger than 256MB):
#!/bin/bash

# Convert a byte count into a human-readable string (e.g. 1536 -> 1.50K).
# If a second argument is given, store the result in that variable
# instead of printing it.
humansize() {
    local _c=$1 _i=0 _a=(b K M G T P)
    while [ ${#_c} -gt 3 ] ;do
        ((_i++))
        _c=$((_c>>10))
    done
    _c=$(( ( $1*1000 ) >> ( 10*_i ) ))
    printf ${2+-v} $2 "%.2f%s" ${_c:0:${#_c}-3}.${_c:${#_c}-3} ${_a[_i]}
}

export device=$(stat -c %d "${1:-.}")
export minsize=${2:-$((256*1024**2))}

# Recursively print entries bigger than $minsize, staying on one device.
rdu() {
    local _dir="$1" _spc="$2" _crt _siz _str
    while read -r _crt;do
        if [ $(stat -c %d "$_crt") -eq $device ];then
            _siz=($(du -xbs "$_crt"))
            if [ $_siz -gt $minsize ];then
                humansize $_siz _str
                printf "%s%12s%14s_%s\n" "$_spc" "$_str" \\ "${_crt##*/}"
                [ -d "$_crt" ] && rdu "$_crt" "  $_spc"
            fi
        fi
    done < <(
        find "$_dir" -mindepth 1 -maxdepth 1 -print
    )
}

rdu "${1:-.}"
Sample of use:
./rdu.sh /usr 100000000
       1.53G             \_lib
       143.52M             \_i386-linux-gnu
       348.16M             \_x86_64-linux-gnu
       107.80M             \_jvm
         100.20M             \_java-6-openjdk-amd64
           100.17M             \_jre
              99.65M             \_lib
       306.63M             \_libreoffice
         271.75M             \_program
       107.98M             \_chromium
      99.57M             \_lib32
     452.47M             \_bin
       2.50G             \_share
       139.63M             \_texlive
         129.74M             \_texmf-dist
       478.36M             \_locale
       124.49M             \_icons
       878.09M             \_doc
         364.02M             \_texlive-latex-extra-doc
           359.36M             \_latex
A quick check:
du -bs /usr/share/texlive/texmf-dist
136045774 /usr/share/texlive/texmf-dist
echo 136045774/1024^2 | bc -l
129.74336051940917968750
Note: using -b instead of -k tells du to summarize actual bytes used, not the space effectively reserved (in blocks of 512 bytes). To work with block sizes instead, change the du -xbs line to du -xks, drop the b from _a=(b K M G T P), and divide the argument size by 1024.
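A quick illustration of the difference, as a sketch (the block count depends on your filesystem; 4K blocks assumed here):
echo -n x > tiny
du -bs tiny   # apparent size: 1 byte
du -ks tiny   # allocated space: one 4K block, reported as 4 (1K units)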
There is a modified version (which I will keep for myself) using block sizes by default, but accepting -b as the first argument for byte calculation. After some more work on it, here is a newer version, a lot quicker and with output sorted in descending size order:
#!/bin/bash

# With -b, work in bytes (du -b); otherwise work in 1K blocks (du -k).
if [ "$1" == "-b" ] ;then
    shift
    export units=(b K M G T P)
    export duargs="-xbs"
    export minsize=${2:-$((256*1024**2))}
else
    export units=(K M G T P)
    export duargs="-xks"
    export minsize=${2:-$((256*1024))}
fi

humansize() {
    local _c=$1 _i=0
    while [ ${#_c} -gt 3 ] ;do
        ((_i++))
        _c=$((_c>>10))
    done
    _c=$(( ( $1*1000 ) >> ( 10*_i ) ))
    printf ${2+-v} $2 "%.2f%s" ${_c:0:${#_c}-3}.${_c:${#_c}-3} ${units[_i]}
}

export device=$(stat -c %d "${1:-.}")

rdu() {
    local _dir="$1" _spc="$2" _crt _siz _str
    while read -r _siz _crt;do
        if [ $_siz -gt $minsize ];then
            humansize $_siz _str
            printf "%s%12s%14s_%s\n" "$_spc" "$_str" \\ "${_crt##*/}"
            [ -d "$_crt" ] &&
                [ $(stat -c %d "$_crt") -eq $device ] &&
                rdu "$_crt" "  $_spc"
        fi
    done < <(
        # Measure all children in a single du call, then sort by size,
        # biggest first.
        find "$_dir" -mindepth 1 -maxdepth 1 -xdev \
            \( -type f -o -type d \) -printf "%D;%p\n" |
            sed -ne "s/^${device};//p" |
            tr \\n \\0 |
            xargs -0 du $duargs |
            sort -nr
    )
}

rdu "${1:-.}"
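With this version, the earlier sample run can be reproduced either with the default block-size threshold or, using -b, with the same byte threshold as before:
./rdu.sh /usr
./rdu.sh -b /usr 100000000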
To display the top 20 biggest directories and files under the current folder, use the following one-liner:
du -ah . | sort -rh | head -20
or:
du -a . | sort -rn | head -20
For the top-20 biggest files in the current directory (recursively):
ls -1Rs | sed -e "s/^ *//" | grep "^[0-9]" | sort -nr | head -n20
or with human readable sizes:
ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20
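A variation of mine: GNU find can print the size and the full path directly, which avoids the ls/sed/grep chain and keeps full paths in the output (assumes GNU find's -printf):
find . -type f -printf '%s\t%p\n' | sort -nr | head -n20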
For the second command to work properly on OSX/BSD (whose sort doesn't have -h), you need to install sort from coreutils. Then add its bin folder to your PATH.
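For example, with Homebrew (my suggestion; paths assume a standard Homebrew install):
brew install coreutils
# GNU tools get a "g" prefix (gsort, gdu, ...); to use the unprefixed
# names, put the gnubin directory first in your PATH:
export PATH="$(brew --prefix coreutils)/libexec/gnubin:$PATH"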
So these aliases are useful to have in your rc files, for whenever you need them:
alias big='du -ah . | sort -rh | head -20'
alias big-files='ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20'