13

Basically I need to be able to scan the process tree and find processes that match a certain name and started running more than a week ago. Once I have them, I need to kill them. The system still sees all of these processes as running; they just aren't using any system time. They'll usually sit in this state forever, too.

Ideally I'd like something similar to find, but for processes.

The system is Debian Linux, and this will be scripted and run by cron, so I've no real issues with something large but understandable.

Ryaner

7 Answers

12

You can do this with a combination of ps, awk and kill:

ps -eo pid,etime,comm

This gives you three-column output: the process PID, the elapsed time since the process started, and the command name without arguments. The elapsed time looks like one of these:

mm:ss
hh:mm:ss
d-hh:mm:ss
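If you want a threshold other than whole days, the three formats above can be reduced to plain seconds. The following is only a sketch: the function name etime_to_seconds is made up for illustration, and it assumes the value has exactly one of the three shapes listed.

```shell
# Hypothetical helper: convert a ps "etime" value
# (mm:ss, hh:mm:ss or d-hh:mm:ss) to seconds.
etime_to_seconds() {
    awk -v t="$1" 'BEGIN {
        days = 0
        # Split off the optional "d-" prefix.
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        # The remainder is either hh:mm:ss or mm:ss.
        n = split(t, p, ":")
        if (n == 3) secs = p[1] * 3600 + p[2] * 60 + p[3]
        else        secs = p[1] * 60 + p[2]
        print days * 86400 + secs
    }'
}
```

For example, etime_to_seconds 7-00:00:01 prints 604801, one second more than a week.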

Since you want processes that have been running for more than a week, you would look for lines matching that third pattern. You can use awk to filter the processes by running time and by command name, like this:

ps -eo pid,etime,comm | awk '$2 ~ /-/ && $2+0 >= 7 && $3 ~ /mycommand/ { print $1 }'

which prints the PIDs of all commands matching 'mycommand' that have been running for 7 days or more. Here $2+0 evaluates to the leading day count; a pattern such as /^7-/ would match only processes that are exactly 7 days old. kill does not read PIDs from standard input, so pass the list through xargs, and you're done:

ps -eo pid,etime,comm | awk '$2 ~ /-/ && $2+0 >= 7 && $3 ~ /mycommand/ { print $1 }' | xargs --no-run-if-empty kill -9
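Before wiring this into cron it may be worth checking which PIDs would be selected without killing anything. Here is a rough sketch of the same filter as a reusable function; the name filter_old and its two parameters are assumptions for illustration, not part of ps or awk.

```shell
# Hypothetical helper: reads "pid etime comm" lines on stdin and prints
# the PIDs whose command matches the pattern ($2) and whose elapsed time
# is at least $1 days. Lines without a "d-" prefix (including the ps
# header) are skipped because split() does not return 2 for them.
filter_old() {
    awk -v min="$1" -v pat="$2" '
        $3 ~ pat && split($2, d, "-") == 2 && d[1] + 0 >= min + 0 { print $1 }'
}
```

A dry run would then be ps -eo pid,etime,comm | filter_old 7 mycommand, and the output can be piped to xargs --no-run-if-empty kill once the list looks right.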
6

killall --quiet --older-than 1w process_name

billyw
1

If you have a Python/Perl/Ruby script you want to kill, killall won't help you: it only looks at the process name ("python" or "perl"), so it can't match the name of a specific script.

To kill all processes older than X seconds where any part of the full running command matches a string, you can run this as root:

MAX_SECONDS=43200
PROGRAM_THAT_NEEDS_TO_DIE=bad-python-script.py
ps -eo pid,etimes \
    | grep -v PID \
    | awk -v max="$MAX_SECONDS" '$2 > max { print $1 }' \
    | xargs --no-run-if-empty ps -f --pid \
    | grep -F -- "$PROGRAM_THAT_NEEDS_TO_DIE" \
    | awk '{ print $2 }' \
    | xargs --no-run-if-empty kill -9

Use ps to get a list of all processes (-e) and only output the pid and the elapsed number of seconds (-o pid,etimes).

grep -v PID to remove the header line.

Use awk to only select lines where the elapsed seconds are greater than 43200s (12 hours), and to strip out just the first column with the PIDs.

Pass the list of PIDs back to ps to get the full process listing.

Use grep to find the lines that contain the name of the script that you want to kill.

Use awk again to pull out the PID of the script.

If there are any processes found, kill them.
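The steps above can also be collapsed into a single awk pass over ps -eo pid,etimes,args, which avoids the second ps call. This is a sketch under assumptions: the function name filter_by_age_and_args is invented here, and it does a plain substring match on the full command line rather than a regular-expression match.

```shell
# Hypothetical helper: reads "pid etimes args..." lines on stdin and
# prints the PIDs that are older than $1 seconds and whose full command
# line contains the string $2. The ps header is skipped because its
# second column is not numeric.
filter_by_age_and_args() {
    awk -v max="$1" -v needle="$2" '
        $2 ~ /^[0-9]+$/ && $2 + 0 > max + 0 {
            args = $0
            # Drop the pid and etimes columns, keeping the command line.
            sub(/^[ \t]*[0-9]+[ \t]+[0-9]+[ \t]+/, "", args)
            if (index(args, needle) > 0) print $1
        }'
}
```

It could be used as ps -eo pid,etimes,args | filter_by_age_and_args 43200 bad-python-script.py | xargs --no-run-if-empty kill. The awk process itself carries the search string in its arguments, but it is only a fraction of a second old, so the age test excludes it.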

Earl Ruby
1

All the info you need can be grabbed from ps -ef. See the "STIME" column. Combine that with grep to sort out the processes you need. At that point, you can use cut to grab the pid of all the matching processes and pass those to kill.
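One wrinkle with STIME is that its format changes with the age of the process, which makes it awkward to compare in a script. A possible workaround, sketched here under the assumption that GNU date is available (as it is on Debian), is to ask ps for the full start timestamp with -o lstart and let date convert it. The function name age_from_lstart is made up for illustration.

```shell
# Hypothetical helper: prints the number of seconds between a start
# timestamp in the format printed by `ps -o lstart` ($1) and a reference
# epoch time ($2, defaults to now). Assumes GNU date for the -d option.
age_from_lstart() {
    start=$(date -d "$1" +%s) || return 1
    now=${2:-$(date +%s)}
    echo $(( now - start ))
}
```

For a single process this could be called as age_from_lstart "$(ps -o lstart= -p "$pid")", where $pid is whichever PID you are inspecting, and the result compared against 604800 for one week.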

Please let me know if you'd like more details on how to do this.

EEAA
0

If you're root, restrict the match to numeric directory names to get rid of the non-process entries (/proc/fs, /proc/stat, ...):

find /proc -maxdepth 1 -regex '/proc/[0-9]*' -type d -mtime +2 -exec basename {} \;
0

Nobody mentioned ps-watcher here. I think you might be able to compare $start_time using the elapsed2secs function, but I'm not entirely sure. Here's my first thought:

[myproc]
occurs = every
trigger = elapsed2secs('$start_time') > 7*DAYS
action = <<EOT
  echo "$command has been running more than 7 days" | /bin/mail user\@host
  kill -TERM $pid
EOT

I have no idea if that works, but it should be a good starting point.

-1

When a process starts up, the kernel exposes a directory for it in the /proc filesystem. You can use the find command to get directories older than 7 days and kill the processes as follows:

find /proc -maxdepth 1 -user myuser -type d -mtime +7 -exec basename {} \; | xargs --no-run-if-empty kill -9
dogbane