How I find and kill rogue programs

As useful and important as Firefox is to my ability to interact with the outside world, create on this website, manage my money, and much more, it can sometimes be a pain.

The problem

Firefox can suck up huge amounts of CPU time and memory, so I sometimes want to restart it. Sometimes I’ve exited Firefox and later tried to restart it, only to receive a message stating that Firefox is already running because one or more of its processes are still alive.

At first I would try to search through the data stream from ps, an under-appreciated tool to list running processes. But modern Linux computers with desktops have hundreds, if not thousands, of running programs. I used grep to narrow down the choices and that seemed to work fairly well. But not always. The two examples in Figure 1 illustrate this point. The first command shows only the top-level PID. The second shows all PIDs belonging to Firefox, but not in a form that can be easily parsed in order to ensure that all processes belonging to Firefox are killed.

dboth@david:~$ ps -e | grep fire
  11778 ?        00:19:43 firefox

dboth@david:~$ ps -ef | grep fire
dboth      11778    3326  2 Nov19 ?        00:19:41 /usr/lib64/firefox/firefox
dboth      11907   11778  0 Nov19 ?        00:00:00 /usr/lib64/firefox/firefox -contentproc -parentBuildID 20241112154828 -prefsLen 44517 -prefMapSize 275107 -appDir /usr/lib64/firefox/browser {d8f8607a-d71e-45f7-bca3-483270c15b2a} 11778 true 1 socket
dboth      11928   11778  0 Nov19 ?        00:00:31 /usr/lib64/firefox/firefox -contentproc -isForBrowser -prefsLen 44620 -prefMapSize 275107 -jsInitLen 234660 -parentBuildID 20241112154828 -greomni /usr/lib64/firefox/omni.ja -appomni /usr/lib64/firefox/browser/omni.ja -appDir /usr/lib64/firefox/browser {36011970-5a4b-46a6-8c16-b20eda118e22} 11778 true 2 tab
dboth      12044   11778  4 Nov19 ?        00:29:47 /usr/lib64/firefox/firefox -contentproc -isForBrowser -prefsLen 50990 -prefMapSize 275107 -jsInitLen 234660 -parentBuildID 20241112154828 -greomni /usr/lib64/firefox/omni.ja -appomni /usr/lib64/firefox/browser/omni.ja -appDir /usr/lib64/firefox/browser {3e94c144-4195-4307-843a-599c77fc4c6f} 11778 true 3 tab
dboth      12095   11778  0 Nov19 ?        00:01:20 /usr/lib64/firefox/firefox -contentproc -parentBuildID 20241112154828 -sandboxingKind 0 -prefsLen 51016 -prefMapSize 275107 -appDir /usr/lib64/firefox/browser {c74129b1-b237-4367-b4d5-ef60861fb68c} 11778 true 4 utility
<SNIP>
dboth    1199696 1199407  0 08:18 pts/11   00:00:00 grep --color=auto fire
dboth@david:~$ 

Figure 1: The information from the ps command isn’t always easy to parse by humans or computers.

Of course, Firefox isn’t the only application that can go rogue. And sometimes it’s not about rogue programs so much as it is about the dozen or so terminal sessions I leave open on different desktops. So how do we fix this?

The solution

Regardless of the application causing the problem, there are two interesting and simple tools we can use to suss out those processes and kill them. All we need is a name, full or partial.

As its name implies, the pgrep command works like a combination of ps and grep. And pkill adds the ability to kill the processes it finds. Let’s start by searching for Firefox. Note that these commands look for exactly what you use as the search argument, and the match is case sensitive unless you add the -i option. So searching for “Firefox” finds nothing while “firefox” finds the browser.

$ pgrep firefox
11778

That’s not very helpful as it only gives us the Firefox parent PID and none of the children. The reason for this is that, by default, pgrep only searches the process name, which the child processes don’t necessarily carry. We can use the -f option which searches the full command line for the search argument. We can also add the -c option to count the number of processes that match the search argument.

$ pgrep -f firefox
11778
11907
11928
12044
12095
12104
12403
36821
1020768
3254170
3356264
3356268
3356393
$ pgrep -fc firefox
13
$
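
The difference between matching on the process name and matching on the full command line can be reproduced safely with a throwaway process. This is a minimal sketch, not taken from the session above: the background sleep and its unusual duration are assumptions, chosen only so that the pattern matches nothing else on the system. It also uses the -a option, which lists the full command line next to each PID.

```shell
# Start a harmless background process with an unusual argument.
# The value is kept in a variable so that the text of this script
# never contains the exact string we later search for with -f.
duration=31337
sleep "$duration" &
child=$!
sleep 0.2    # give the child a moment to start

# Name-only match: the number is an argument, not part of the name "sleep".
# pgrep exits non-zero when nothing matches, hence the "|| true".
by_name=$(pgrep -c "$duration" || true)

# Full command line match with -f finds the process.
by_cmdline=$(pgrep -f -c "sleep $duration" || true)

# -a prints the full command line next to each PID.
pgrep -a -f "sleep $duration"

echo "name-only matches: $by_name, command-line matches: $by_cmdline"
kill "$child"
```

Keeping the search string out of the script text matters with -f, because a shell whose own command line contains the pattern would otherwise be counted as a match too.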

Now that we have all the PIDs that belong to Firefox, we can ensure that we kill them all using pkill.

$ pkill -f firefox

By default, the pkill utility sends signal 15 (SIGTERM) to each identified process. The kill -l command lists all the signals that can be sent. Three of these signals are commonly used to kill a process.

  • SIGTERM (15): Signal 15, SIGTERM, is the default signal sent by kill and pkill, and by top and the other monitors when the k key is pressed. It is a polite request: a program with a signal handler can intercept it, clean up things like open files, and then terminate itself in a controlled and nice manner. A program with no handler for SIGTERM, which includes most scripts, is simply terminated by the default action. SIGTERM may also be the least effective of the three because a program can catch or ignore it and keep running.
  • SIGKILL (9): Signal 9, SIGKILL, provides a means of killing even the most recalcitrant programs, including scripts and other programs that have no signal handlers. This signal cannot be intercepted, blocked, or ignored by the program code, so the program gets no chance to clean up; open files can be left in an inconsistent state and child processes can be left running. If you want to kill a process and you don’t care about being nice, this is the signal you want.
  • SIGINT (2): Signal 2, SIGINT, can be used when SIGTERM does not work and you want the program to die a little more nicely than it would with SIGKILL. It is the signal the terminal sends to the foreground program, particularly a script, when you press the Ctrl-C key combination. Like SIGTERM, it can be caught or ignored by the program.
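
How a signal handler changes the outcome can be sketched in a few lines of shell. This is an illustration under stated assumptions, not part of the Firefox example: the worker loop, the marker file, and the "cleaned-up" text are all hypothetical. The worker installs a handler for SIGTERM with trap. SIGTERM lets the handler run, while SIGKILL ends the worker before it can do anything.

```shell
# Hypothetical worker: on SIGTERM it writes a marker file and exits cleanly.
worker='trap "echo cleaned-up > \"$1\"; exit 0" TERM; while :; do sleep 0.1; done'
marker=$(mktemp)

# Case 1: SIGTERM is delivered to the handler, which cleans up.
bash -c "$worker" _ "$marker" &
pid=$!
sleep 0.5                          # let the worker install its handler
kill -TERM "$pid"
term_status=0
wait "$pid" || term_status=$?      # exit status chosen by the handler
term_result=$(cat "$marker")

# Case 2: SIGKILL cannot be caught, so the handler never runs.
: > "$marker"
bash -c "$worker" _ "$marker" &
pid=$!
sleep 0.5
kill -KILL "$pid"
kill_status=0
wait "$pid" || kill_status=$?      # 128 + 9, reported by the waiting shell
kill_result=$(cat "$marker")

echo "SIGTERM: status=$term_status marker='$term_result'"
echo "SIGKILL: status=$kill_status marker='$kill_result'"
rm -f "$marker"
```

The empty marker after SIGKILL is the practical cost of that signal: whatever cleanup the program would have done is skipped.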

Signal 15 (SIGTERM) requests the program to shut itself down, to terminate itself. Sometimes one or more processes may be ignoring this signal, which is also sent during a normal shutdown of the program. We can send a more imperative signal, 9 (SIGKILL), which causes Linux to simply kill the program even if it is not responding to any other signals.

$ pkill -9 -f firefox

Or you could use the long form for the signal option.

$ pkill --signal 9 -f firefox

This should kill any recalcitrant process. I’ve never had an instance where it hasn’t, but I always check with pgrep to ensure that’s the case.
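
The habit of asking politely, checking with pgrep, and only then escalating can be wrapped in a small shell function. This is a sketch under assumptions: the name stop_all and the grace period are hypothetical, and because -f matches whole command lines, the pattern must be specific enough not to match unrelated processes, including the shell that calls the function.

```shell
# stop_all PATTERN [GRACE_SECONDS]
# Hypothetical helper: send SIGTERM, wait up to GRACE_SECONDS (default 5),
# then send SIGKILL to anything still matching.
# Returns 0 when no matching process remains.
stop_all() {
    pattern=$1
    tries=$(( ${2:-5} * 10 ))

    # pkill exits non-zero when it signalled nothing; then we are done.
    pkill -f -- "$pattern" || return 0

    # Poll with pgrep until the processes are gone or the grace period ends.
    while [ "$tries" -gt 0 ]; do
        pgrep -f -- "$pattern" > /dev/null || return 0
        sleep 0.1
        tries=$(( tries - 1 ))
    done

    # Still there: escalate to SIGKILL and check one last time.
    pkill -9 -f -- "$pattern"
    sleep 0.2
    ! pgrep -f -- "$pattern" > /dev/null
}
```

With this in place, a call such as stop_all firefox 10 would give Firefox ten seconds to shut down on its own before signal 9 is used.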

To experiment with this, open a terminal session as a non-root user, create a file in /tmp named cpuHog, and make it executable with the permissions rwxr-xr-x (755). Add the following content to the file.

#!/bin/bash 
# This little program is a cpu hog 
X=0;while [ 1 ];do echo $X;X=$((X+1));done
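
One way to create the file and set its permissions from the command line is with a here-document and chmod. This is just a sketch; any editor works equally well.

```shell
# Write the three lines of the script to /tmp/cpuHog.
# Quoting 'EOF' keeps $X and $((X+1)) from being expanded while writing.
cat > /tmp/cpuHog <<'EOF'
#!/bin/bash
# This little program is a cpu hog
X=0;while [ 1 ];do echo $X;X=$((X+1));done
EOF

# 755 is rwxr-xr-x: the owner can write, everyone can read and execute.
chmod 755 /tmp/cpuHog
ls -l /tmp/cpuHog
```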

Open another terminal session in a different window, position the two windows side by side so you can watch the results, and run top in the new session. Then run the cpuHog program in the first session with the following command.

$ /tmp/cpuHog

This program simply counts up by one and prints the current value of X to STDOUT. And it sucks up CPU cycles. Observe the effect this has on system performance in top: the cpuHog process should show very high CPU usage right away, and the load averages should start to increase over time. If you want, you can open additional terminal sessions and start the cpuHog program in them so that you have multiple instances running, but that’s not necessary for this little experiment.

Determine the PID of the cpuHog program you want to kill. The PID for my process is 265140 but the PID on your host will be different.

dboth@testvm1:~$ pgrep -f cpuHog
265140
dboth@testvm1:~$ 

Now kill the cpuHog process and verify that it has been killed.

dboth@testvm1:~$ pkill cpuHog
dboth@testvm1:~$ pgrep -f cpuHog
dboth@testvm1:~$ 

If the process had not been terminated, it could be killed using signal 9.

Here’s what this looks like in the terminal session in which cpuHog was running.

<SNIP>
59995161
59995162
Terminated
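
The word Terminated is the shell reporting that the script ended on SIGTERM, the signal pkill sends when no other is specified. The same thing can be checked from a script through the exit status, which the shell sets to 128 plus the signal number. The following is a sketch under assumptions: the script name demoHog31337 and the temporary directory are hypothetical stand-ins for /tmp/cpuHog, and -x restricts the match to that exact process name.

```shell
# Create a throwaway CPU hog with a unique name (stand-in for cpuHog).
dir=$(mktemp -d)
hog="$dir/demoHog31337"
printf '%s\n' '#!/bin/bash' 'while :; do :; done' > "$hog"
chmod 755 "$hog"

"$hog" &
pid=$!
sleep 0.5                              # let the script start

found=$(pgrep -x demoHog31337)         # PID found by exact process name
pkill -x demoHog31337                  # no signal given, so SIGTERM (15)

status=0
wait "$pid" || status=$?               # 128 + 15 for a death by SIGTERM
signal=$(kill -l "$status")            # translate the status back to a name
remaining=$(pgrep -c -x demoHog31337 || true)

echo "pid=$found status=$status signal=$signal remaining=$remaining"
rm -rf "$dir"
```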

And so much more…

When you read the man page for pgrep or pkill, you’ll see that they’re both the same program and simply act a bit differently depending upon which name is used to invoke them. pgrep lists the matching processes while pkill sends them a signal. With the --signal option, pkill can send any signal to the target processes, not just the ones that kill.

There are a number of interesting options available to pgrep and pkill. For example, the -e (echo) option of pkill displays the name and PID of each killed process. You can also specify UIDs and GIDs to ensure that only processes with a specific user or group ownership are found, or killed. Other options allow searching for the oldest or newest processes matching the search argument, and one option allows searching for processes older than a specified time such as 160 seconds. You can invert the meaning of a search or search by cgroup (Control Group) to help narrow the search.
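
A few of these options can be tried on harmless processes. The sketch below is an illustration under assumptions: the two sleep processes and their marker value are hypothetical, chosen so that the -f pattern matches nothing else. It uses -u to restrict the match to one user, -o and -n to pick the oldest and newest match, and -e to have pkill report what it killed.

```shell
# Two harmless processes started one second apart.
# The marker lives in a variable so this script's own text never matches.
marker=31340
sleep "$marker" &
first=$!
sleep 1
sleep "$marker" &
second=$!
sleep 0.2

me=$(id -u)

# -u limits the search to processes owned by this user ID.
oldest=$(pgrep -u "$me" -o -f "sleep $marker")   # -o: oldest match only
newest=$(pgrep -u "$me" -n -f "sleep $marker")   # -n: newest match only
count=$(pgrep -u "$me" -c -f "sleep $marker")    # -c: number of matches

# -e makes pkill echo each process it signals.
killed=$(pkill -e -u "$me" -f "sleep $marker")

echo "oldest=$oldest newest=$newest count=$count"
echo "$killed"
```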

I use these tools frequently, not just to kill processes, but to explore the processes that are running. I also use tools like top and htop. htop has a nice feature that shows the process tree so you can see a process and all its sub-processes. But that’s another article.
