More on Signals in Bash
You log in to a shell and launch a long-running script, then quit the shell/terminal, but you are not sure whether the script (and its child processes) will continue running? You try to catch and handle signals in the script, but it doesn't work as you expected? Below are some tests to help you understand these things.
    #!/usr/local/bin/bash

    for each in "SIGHUP" "SIGINT" "SIGTERM"; do
      # shellcheck disable=SC2064
      trap "echo Received $each && exit 1" $each
    done

    python3 -c '
    import datetime, time
    while True:
        print("Python loop - %s" % datetime.datetime.now(), flush=True)
        time.sleep(2)
    '

    # Lowercase true is the portable spelling; True only resolves on
    # case-insensitive filesystems such as the macOS default.
    while true; do
      echo "Script loop - $(date -u)"; sleep 2
    done

    echo "end of the script"
The above is the script (test.sh) I use for the tests. It does the following:
- sets up traps to print the signal name before exiting the script if one of the three signals is received.
- spawns a Python process that loops infinitely. This mimics a long-running external command invoked in the script.
- runs an infinite shell loop.
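The trap line in the script uses double quotes (hence the SC2064 suppression), so $each is expanded when the trap is installed and every handler carries its own signal name. Below is a minimal sketch of that quoting behaviour. It is not part of test.sh: USR1 and USR2 are stand-ins chosen here so that the demo can safely signal itself, and the handlers record the signal name instead of exiting.

```shell
#!/usr/bin/env bash
# Sketch only: USR1/USR2 stand in for the article's SIGHUP/SIGINT/SIGTERM.
received=""
for each in USR1 USR2; do
  # Double quotes: $each is expanded now, when the trap is installed.
  # The escaped \$received is expanded later, when the trap fires.
  trap "received=\"\$received $each\"" "$each"
done

kill -s USR1 "$$"   # the trap body runs once the kill builtin has returned
kill -s USR2 "$$"

echo "received:$received"
```

With single quotes instead, $each would only be expanded when the trap fires, by which time the loop has finished and the variable holds its last value.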
Now, let's start the tests. In each test, I'll run the test script (test.sh) and then monitor (tail) the output file to check whether the processes are terminated.
Firstly, the simplest case, run the script in the foreground:
./test.sh >test.out 2>&1
In an interactive session, pressing Ctrl-c delivers SIGINT to all foreground processes, and closing the terminal delivers SIGHUP to them. Therefore, in these two cases, both the parent process (test.sh) and the child process (python3) will get the signal and exit:
Press Ctrl-c

    Python loop - 2020-02-11 14:51:04.795263
    ...
    Python loop - 2020-02-11 14:51:12.806468
    Traceback (most recent call last):
      File "<string>", line 5, in <module>
    KeyboardInterrupt     <=== from the child process: python3
    Received SIGINT       <=== from the parent process: test.sh
Close the terminal
    Python loop - 2020-02-11 14:52:24.618521
    Python loop - 2020-02-11 14:52:26.620370
    Python loop - 2020-02-11 14:52:28.622768
    Hangup: 1             <=== from the child
    Received SIGHUP       <=== from the parent
Secondly, run the script in the background:
./test.sh >test.out 2>&1 &
Obviously, the processes will continue running upon Ctrl-c because the interactive shell will not send SIGINT to background processes. But, when the terminal is closed, they will still receive SIGHUP and abort.
- Closing the terminal means closing the terminal window directly. Running exit to quit the shell (so that the terminal emulator closes the window automatically) does not count. In this case, the shell quits on its own and does not receive SIGHUP. Hence it won't send SIGHUP to the script and the child processes of the script.
- You may not observe this behaviour with some terminal emulators. My guess is that some terminal emulators manage to communicate with the shell session and close it gracefully when a terminal window is closed. Therefore, the shell (as well as the script and its child processes) does not receive SIGHUP. According to my tests on OS X, SIGHUP is sent when a terminal is closed in tmux or iTerm2, but that is not the case with the built-in Terminal app.
Run the script using nohup
As the name suggests, nohup makes the spawned process ignore SIGHUP. It does not make the process ignore SIGINT, though. As a result:
- If nohup is run in the foreground (nohup ./test.sh), Ctrl-c interrupts the script. However, closing the terminal will not interrupt the script.
- In comparison, if nohup is run in the background (nohup ./test.sh &), neither Ctrl-c nor closing the terminal will interrupt the script.
NOTE If a process registers its own SIGHUP handler, nohup will not overwrite the handler to ignore SIGHUP.
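The effect of nohup on SIGHUP can be checked without closing any terminal. The following is a rough sketch, not one of the article's tests: a throwaway child shell sends SIGHUP to itself and then tries to print a message, once without and once with nohup. The message text and variable names are made up for the illustration.

```shell
#!/usr/bin/env bash
# Without nohup the default action normally applies: the child shell is
# killed by its own SIGHUP and never reaches the echo.
# 2>/dev/null silences the child's stderr.
plain_output=$(bash -c 'kill -s HUP "$$"; echo "still alive"' 2>/dev/null)

# With nohup the child starts with SIGHUP ignored, so the kill has no effect.
# stderr is discarded to hide nohup's own informational messages.
nohup_output=$(nohup bash -c 'kill -s HUP "$$"; echo "still alive"' 2>/dev/null)

echo "without nohup: '$plain_output'"
echo "with nohup:    '$nohup_output'"
```

The first result assumes the calling shell itself is not already ignoring SIGHUP; if it is (for example because it was started under nohup), both children survive.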
What if we edit test.sh to execute the Python process in the background (i.e. python3 -c '...' &)?
In this case, the Python process will run in the background regardless of whether the parent process (i.e. the script) is in the background or not. Therefore, we'd expect the following:
- Ctrl-c does not affect the Python process but may interrupt the parent script.
Below is the output of ./test.sh >test.out 2>&1. Initially both the parent process and the child process were printing timestamps. Once I pressed Ctrl-c, the parent process (the loop in the script) was interrupted but the Python process continued.
    Python loop - 2020-02-11 23:12:02.196811
    Script loop - Tue Feb 11 12:12:04 UTC 2020
    Python loop - 2020-02-11 23:12:04.198417
    Script loop - Tue Feb 11 12:12:06 UTC 2020
    Python loop - 2020-02-11 23:12:06.201232
    Received SIGINT
    Python loop - 2020-02-11 23:12:08.201730
    Python loop - 2020-02-11 23:12:10.206854
    Python loop - 2020-02-11 23:12:12.209047
On the other hand, closing the terminal still delivers SIGHUP to both the parent and the child process and interrupts them, regardless of whether the script is executed in the foreground or not.
    Script loop - Tue Feb 11 12:15:10 UTC 2020
    Python loop - 2020-02-11 23:15:10.597609
    Python loop - 2020-02-11 23:15:12.600922
    Script loop - Tue Feb 11 12:15:12 UTC 2020
    Python loop - 2020-02-11 23:15:14.601010
    Script loop - Tue Feb 11 12:15:14 UTC 2020
    Hangup: 1
    Received SIGHUP
What if we remove the while loop from the shell script? Does it change the test results at all when compared with the previous test?
Below is the modified script:
    #!/usr/local/bin/bash

    for each in "SIGHUP" "SIGINT" "SIGTERM"; do
      # shellcheck disable=SC2064
      trap "echo Received $each && exit 1" $each
    done

    python3 -c '
    import datetime, time
    while True:
        print("Python loop - %s" % datetime.datetime.now(), flush=True)
        time.sleep(2)
    ' &

    echo "end of the script"
Launch the modified script in the foreground and then check the processes. From the output of ps below, we can see:
- The parent process (the script) has finished (it is not in the output).
- The child process has been adopted by the init process (its PPID is 1).
    $ ./test.sh >test.out 2>&1
    $ ps -eo "pid,ppid,pgid,jobc,command" | egrep -i '(test|python)' | sed 's/ \/.*\// /'
    55317     1 55306 0 Python -c \012import datetime, time\012while True:\012 print("Python loop - %s" % datetime.datetime.now(), flush=True)\012 time.sleep(2)\012
    ...
In comparison, when both the script and the python process keep running, the current shell is the parent of the script and the script is the parent of the spawned process:
    $ echo $$
    53064
    $ ps -eo "pid,ppid,pgid,jobc,command" | egrep -i '(test|python)' | sed 's/ \/.*\// /'
    53488 53064 53488 1 test.sh
    53494 53488 53488 1 Python -c \012import datetime, time\012while True:\012 print("Python loop - %s" % datetime.datetime.now(), flush=True)\012 time.sleep(2)\012
    ...
This is actually a very important difference. Many processes register their own SIGHUP handlers, and that defeats the purpose of nohup: upon receiving SIGHUP, these processes will not ignore the signal but will call the registered signal handlers instead. In this situation, the common practice to keep these processes running after the corresponding terminal is closed is for the parent script to spawn the long-running process in the background and then quit. As shown above, this makes the spawned process a child of the init process, meaning the current session will not dispatch SIGINT, SIGHUP etc. to it.
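The spawn-and-quit pattern described above can be sketched as follows. This is an illustration under assumptions rather than the article's own test: the launcher is an inline bash -c command, sleep 30 stands in for the real long-running process, and the variable names are invented here.

```shell
#!/usr/bin/env bash
# Hypothetical launcher: start a long-running command in the background,
# print "<launcher pid> <child pid>", and exit immediately.
pids=$(bash -c 'sleep 30 >/dev/null 2>&1 & echo "$$ $!"')
launcher_pid=${pids% *}
child_pid=${pids#* }

# The launcher has exited by now, so the child has been re-parented.
child_ppid=$(ps -o ppid= -p "$child_pid" | tr -d ' ')
echo "launcher=$launcher_pid child=$child_pid child_ppid=$child_ppid"

kill "$child_pid"   # clean up the stand-in process
```

The reported parent is typically 1, as in the ps output above, although on systems that use a sub-reaper process it may be a different PID. Either way it is no longer the launcher.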
Finally, let's have a closer look at trap. As you probably already know, trap catches the specified signals and runs the corresponding command(s).
But if you run the original script and then send SIGINT to the script using kill -SIGINT pid_of_script, you'll notice the script does not respond to the signal.
Below is how I ran the test:
    $ ./test.sh >test.out 2>&1 &    # <=== run the script
    75519
    $ ps -eo "pid,ppid,pgid,jobc,command" | egrep -i '(test|python)' | sed 's/ \/.*\// /'
    75519 56180 75519 1 test.sh
    75525 75519 75519 1 Python -c \012import datetime, time\012while True:\012 print("Python loop - %s" % datetime.datetime.now(), flush=True)\012 time.sleep(2)\012
    ...
    $ kill -SIGINT 75519    # <=== send SIGINT to the parent process
    $ kill -SIGINT 75525    # <=== send SIGINT to the child
    + Exit 1 ./test.sh > test.out 2>&1
    $
test.out indicates the script didn't respond to the signal until I sent SIGINT to the child (python) process. Once the child aborted because of SIGINT, the parent ran the trap.
    Python loop - 2020-02-13 16:09:59.676853
    Python loop - 2020-02-13 16:10:01.677777
    Python loop - 2020-02-13 16:10:03.679282
    Python loop - 2020-02-13 16:10:05.680427
    Traceback (most recent call last):
      File "<string>", line 5, in <module>
    KeyboardInterrupt
    Received SIGINT
So, the parent did receive the signal but would not run the trap until the child process completed? Exactly! In fact, if you review the test.out of the previous tests, you'll find the child process always quit before the parent. Why? This is clearly documented in the Bash Manual:
If Bash is waiting for a command to complete and receives a signal for which a trap has been set, the trap will not be executed until the command completes.
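The deferral described in this quote can be observed with timestamps. In the sketch below (an illustration, with USR1 and the sleep durations chosen arbitrarily), a helper subshell signals the main shell about one second into a three-second foreground sleep, and the trap records when it actually runs.

```shell
#!/usr/bin/env bash
# Record the moment the trap body actually runs.
trap 'trap_time=$(date +%s)' USR1

start=$(date +%s)
( sleep 1; kill -s USR1 "$$" ) &   # helper: signal this shell after ~1s
sleep 3                            # foreground command; the signal arrives mid-way

echo "trap ran $(( trap_time - start ))s after start"
trap - USR1                        # restore the default disposition
```

Although the signal is delivered after about one second, the reported delay should be at least three seconds, because the trap has to wait for the foreground sleep to complete.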
The Bash Manual also says:
When Bash is waiting for an asynchronous command via the wait builtin, the reception of a signal for which a trap has been set will cause the wait builtin to return immediately with an exit status greater than 128, immediately after which the trap is executed.
How would wait change the behaviour of the parent process and the child process? I'll leave that for you to figure out.
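A possible starting point for that experiment is sketched below. It is not test.sh itself: sleep 5 stands in for the long-running child, USR1 for the signal, and the variable names are invented for the illustration. The child runs in the background and the shell blocks in the wait builtin instead of on a foreground command.

```shell
#!/usr/bin/env bash
trap 'trapped=yes' USR1

sleep 5 &                          # stand-in for the long-running child
child_pid=$!
( sleep 1; kill -s USR1 "$$" ) &   # helper: signal this shell after ~1s

start=$(date +%s)
wait "$child_pid"                  # returns as soon as the trapped signal arrives
wait_status=$?
elapsed=$(( $(date +%s) - start ))

echo "wait returned $wait_status after ${elapsed}s, trapped=${trapped:-no}"
kill "$child_pid" 2>/dev/null      # the child itself was never signalled
```

Here wait should return after roughly one second with a status greater than 128, while the child keeps running until the script kills it explicitly.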
Why does kill -SIGINT pid_of_parent not work the same as pressing Ctrl-c?
Now we understand why the parent process didn't exit immediately when we sent it a SIGINT via kill -SIGINT pid. But why were we able to interrupt both the parent and the child by pressing Ctrl-c in the first test?
The reason is that Ctrl-c sends SIGINT not to a single process but to a process group. Therefore, when we press Ctrl-c, both the child and the parent receive the signal, causing the child and then the parent to exit in turn. To achieve the same using kill, send the signal to the process group instead using kill -SIGINT -- -pgid. In the following example, both the script and the python process belong to process group 77210, therefore kill -SIGINT -- -77210 did the trick.
    $ ./test.sh >test.out 2>&1 &
    77210
    $ ps -eo "pid,ppid,pgid,jobc,command" | egrep -i '(test|python)' | sed 's/ \/.*\// /'
    77210 56180 77210 1 test.sh
    77216 77210 77210 1 Python -c \012import datetime, time\012while True:\012 print("Python loop - %s" % datetime.datetime.now(), flush=True)\012 time.sleep(2)\012
    ..
    $ kill -SIGINT -- -77210
    + Exit 1 ./test.sh > test.out 2>&1
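The same group-wide delivery can be reproduced from a script. The sketch below rests on a few assumptions that are not in the article: it enables job control with set -m so that the background job gets its own process group, uses an inline bash -c with two sleep 30 children as stand-ins, relies on pgrep -g to list group members, and sends SIGTERM (an arbitrary choice for the demo) instead of SIGINT.

```shell
#!/usr/bin/env bash
set -m                                   # job control: each background job gets its own process group

bash -c 'sleep 30 & sleep 30 & wait' &   # a group leader with two children (stand-ins)
leader_pid=$!
sleep 1                                  # give the children time to start

pgid=$(ps -o pgid= -p "$leader_pid" | tr -d ' ')
members=$(pgrep -g "$pgid" | wc -l | tr -d ' ')
echo "pgid=$pgid members=$members"

# Safety guard: only signal the group if it really is the job's own group.
if [ "$pgid" = "$leader_pid" ]; then
  kill -s TERM -- "-$pgid"               # negative PID: every process in the group
fi

wait "$leader_pid"
leader_status=$?
echo "leader exit status: $leader_status"
set +m
```

A single kill with a negative PID reaches the leader and both children at once, which is what pressing Ctrl-c does for the foreground process group.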
For in-depth discussions on processes and signals in the UNIX/Linux environment, please refer to:
- Advanced Programming in the UNIX Environment, Third Edition: Chapter 8. Process Control, 9. Process Relationships, and 10. Signals.
- Bash Guide for Beginners - Chapter 12. Catching signals