What's the root cause of "A start job is running for Create Volatile Files and Directories"?

Question

After rebooting a server (Debian 9.5, 64-bit), boot got stuck at "A start job is running for Create Volatile Files and Directories". I solved it by following "boot-stuck-at-a-start-job-is-running-for-create-volatile-files-and-directories".

I can't figure out the root cause of this issue. I have searched through many questions, but they don't address the root cause; they only offer various solutions that don't fit my case.

We have not reached the limit on files or (sub)directories, and dir_nlink is set for ext4.

# sudo tune2fs -l /dev/debian-vg/root | grep dir_nlink
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent
 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum

And more than 50% of the inode and disk capacity is still available.

The original /tmp directory held only small files and directories, with a total disk usage of only about 1G.

Some info:

$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.9.0-7-amd64 root=/dev/mapper/debian--vg-root ro net.ifnames=0 biosdevname=0 console0=tty0 console=ttyS0,115200n8 quiet
$ mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=4077900k,nr_inodes=1019475,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=817924k,mode=755)
/dev/mapper/debian--vg-root on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=36,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=9039)
mqueue on /dev/mqueue type mqueue (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=817920k,mode=700,uid=1000,gid=1000)
$ lsblk
NAME                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda                 254:0    0 1000G  0 disk 
└─vda1              254:1    0 1000G  0 part 
  └─debian--vg-root 253:0    0    3T  0 lvm  /
vdb                 254:16   0    4T  0 disk 
vdc                 254:32   0    2T  0 disk 
└─debian--vg-root   253:0    0    3T  0 lvm  /
$ blkid
/dev/vda1: UUID="ijfyeQ-*" TYPE="LVM2_member" PARTUUID="d6"
/dev/mapper/debian--vg-root: UUID="2d2294a9-" TYPE="ext4"
/dev/vdc: UUID="PXrGC9-*" TYPE="LVM2_member"
$ sudo find /tmp/ | wc -l
28905144

chutz · Accepted Answer · 2022-04-26T01:46:22.747

As you are showing with your sudo find /tmp/ | wc -l command, you indeed have close to 30 million entries in /tmp. You could start with a fresh /tmp directory as pointed out in other answers, and you probably should, but as you have guessed, unless you get to the bottom of this, you'll end up in the same situation.
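Starting with a fresh /tmp could look roughly like the sketch below. It is only an illustration: the function name fresh_dir is made up here, it assumes GNU coreutils, and it assumes nothing is using the directory at the time (for example, when run from a rescue or single-user shell). The old directory is moved aside rather than deleted, so its contents stay available for investigation.

```shell
# Hypothetical helper: move a directory aside and recreate it empty.
# Prints the path the old directory was moved to.
fresh_dir() {
    dir="$1"
    old="${dir}.old.$(date +%s)"     # timestamped name for the old copy
    mv -- "$dir" "$old" &&
        mkdir -m 1777 -- "$dir" &&   # world-writable with sticky bit, as /tmp needs
        printf '%s\n' "$old"
}
```

Usage as root would be along the lines of fresh_dir /tmp. Removing the old copy later can take a long time with this many entries, so it is best done once the machine is back up.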

Unfortunately there could be all kinds of reasons for this issue. For example, one issue I have personally experienced is atd going crazy and starting to create empty files in /tmp in a crazy loop (talking thousands per second or something to that extent). I am not saying this is your case as at is not a popular tool these days, but you'll have to look at the filenames in /tmp and try to guess where they came from based on their names, and maybe timestamps.

Try sudo find /tmp -ls | more and look for any clues. It will hopefully be obvious.
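With close to 30 million entries, paging through the listing by hand may take a while. A minimal sketch of one way to summarize the names instead, assuming GNU find (for -printf) and file names without embedded newlines; the function names summarize_names and summarize_owners are made up for illustration:

```shell
# Collapse each file name to a rough pattern and count how often it occurs.
summarize_names() {
    dir="${1:-/tmp}"
    # %f prints only the base name; -xdev stays on one filesystem.
    find "$dir" -xdev -mindepth 1 -printf '%f\n' |
        sed -E 's/[0-9]+/N/g' |      # collapse digit runs: job123 -> jobN
        awk '{ count[$0]++ } END { for (p in count) print count[p], p }' |
        sort -rn | head -n 20
}

# Count entries per owner, which may point at the responsible service.
summarize_owners() {
    dir="${1:-/tmp}"
    find "$dir" -xdev -mindepth 1 -printf '%u\n' |
        awk '{ count[$0]++ } END { for (u in count) print count[u], u }' |
        sort -rn
}
```

Counting in awk keeps memory proportional to the number of distinct patterns rather than the number of files, which avoids sorting all the entries. Both functions need to run in a root shell to see every file.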

score 0 · Answer 2 · answered Apr 27 '22 at 06:28

There are at least two causes of your situation:

1, The result of find /tmp/ | wc -l, 28905144, shows that you have a huge number of files in the /tmp directory. Evidently, /tmp was not being cleared out normally at boot or at shutdown.
2, The / filesystem was set to a large size, with its capacity reaching 3T. With more space, addressing on an HDD (I am guessing it is not an SSD) gets slower.

Advice:

1, Check whether the files under the /tmp directory are being created normally or not, and you will figure out the reason.
2, Keep the / filesystem no larger than 2T, if possible, or use high-performance media such as an SSD (NVMe).
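The boot message in the title is the description of systemd-tmpfiles-setup.service, which applies the rules found in the tmpfiles.d directories. To check how /tmp is handled at boot on a given machine, a sketch like the following lists the rules that name a path. The function name tmpfiles_rules_for is hypothetical, and which rule (if any) applies to /tmp on this server is something to verify, not something assumed here.

```shell
# Hypothetical helper: print tmpfiles.d rules whose Path field matches exactly.
# Rule lines have the form: Type Path Mode User Group Age Argument
tmpfiles_rules_for() {
    path="${1:-/tmp}"
    [ "$#" -gt 0 ] && shift
    # Default to the standard tmpfiles.d locations if no directories are given.
    [ "$#" -gt 0 ] || set -- /etc/tmpfiles.d /run/tmpfiles.d /usr/lib/tmpfiles.d
    for d in "$@"; do
        [ -d "$d" ] || continue
        awk -v p="$path" '$1 !~ /^#/ && $2 == p { print FILENAME ": " $0 }' \
            "$d"/*.conf 2>/dev/null
    done
}
```

Running tmpfiles_rules_for /tmp shows which configuration file governs the directory. The output of journalctl -b -u systemd-tmpfiles-setup.service may also show how long the job took on the current boot.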
