Some of my scripts kick off a number of children in the background and then wait for their completion. Currently I invoke wait in a loop, once for each PID:

    for pid in "${!PIDs[@]}"
    do
        if wait "$pid"
        then
            log "${PIDs[$pid]} completed successfully"
        else
            log "WARNING: ${PIDs[$pid]} failed"
            (( errors += 1 ))   # arithmetic increment; a plain errors+=1 appends the string "1" unless errors is declared -i
        fi
    done

This works and lets me analyze and process failures, but the results are processed in the order in which the PIDs are listed, not in the order in which the processes actually complete. That is, the 5th process may finish first, but its exit code will not be processed until the first four are done.
As far as I know, sh provides two modes for wait:
- Bare wait will wait for all backgrounded jobs to finish, but it will always "succeed", losing the exit codes of the backgrounded processes.
- wait PID will wait for the specified process. This provides the exit code, but can only wait for that one process.
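The difference between the two modes can be checked with a short script. This is only a minimal sketch: the children, their sleep times and their exit statuses (0, 3 and 5) are made up for illustration.

```bash
#!/usr/bin/env bash
# Illustrative only: the children and their exit statuses are made up.

( sleep 0.1; exit 0 ) &
ok_pid=$!
( sleep 0.1; exit 3 ) &
bad_pid=$!

# wait PID returns the exit status of that particular child.
wait "$ok_pid";  ok_status=$?
wait "$bad_pid"; bad_status=$?
echo "wait PID: $ok_status and $bad_status"

# Bare wait returns 0 even though this child exits with status 5.
( exit 5 ) &
wait
bare_status=$?
echo "bare wait: $bare_status"
```

The per-PID calls should report 0 and 3, while the bare call reports 0 and the status 5 is lost.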
But maybe bash has improved on this compared to the old sh? Is there a way to ask bash's wait to return when any of the backgrounded processes completes, and have it provide both the finished PID and its exit code?
The underlying C functions, waitpid and friends, can do this if you pass a PID of -1. I tried doing that with bash and got an error...
3 Answers
If you are restricted to a version of Bash that does not support wait -n, then one possible way to do what you want is to poll with the jobs builtin to detect when background processes have completed. This Shellcheck-clean code (lightly tested with Bash 4.2) demonstrates the idea:

    unwaited_pids=( "${!pids[@]}" )
    declare -A is_running
    while (( ${#unwaited_pids[*]} > 0 )); do
        sleep 0.1
        jobs_output=$(jobs -rp)
        is_running=()
        while IFS= read -r p; do
            [[ -n $p ]] && is_running[$p]=Y
        done <<<"$jobs_output"
        new_unwaited_pids=()
        for pid in "${unwaited_pids[@]}"; do
            if [[ -n ${is_running[$pid]-} ]]; then
                new_unwaited_pids+=( "$pid" )
            elif wait "$pid"; then
                log "${pids[$pid]} completed successfully"
            else
                log "WARNING: ${pids[$pid]} failed"
                (( errors+=1 ))
            fi
        done
        unwaited_pids=( "${new_unwaited_pids[@]}" )
    done

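To try the polling idea in isolation, a small harness along the following lines may help. It is a sketch, not the answer's exact code: the job names, durations and exit statuses are invented, the completed array is a hypothetical addition that records the order in which results were collected, and the membership check uses string matching instead of a second associative array.

```bash
#!/usr/bin/env bash
# Illustrative harness: job names, durations and exit statuses are made up.

declare -A pids=()       # PID -> job name
declare -a completed=()  # "name:status", in the order results were collected

( sleep 0.7; exit 0 ) &
pids[$!]=slow
( sleep 0.2; exit 4 ) &
pids[$!]=quick-failure

while (( ${#pids[@]} > 0 )); do
    sleep 0.1
    # PIDs of jobs still running, flattened to " pid pid " for matching.
    running=$(jobs -rp)
    running=${running//$'\n'/ }
    running=" $running "
    for pid in "${!pids[@]}"; do
        [[ $running == *" $pid "* ]] && continue   # still running, check later
        wait "$pid"                                # already finished, so this returns at once
        status=$?
        completed+=( "${pids[$pid]}:$status" )
        echo "${pids[$pid]} (PID $pid) exited with status $status"
        unset "pids[$pid]"
    done
done
```

With these durations, quick-failure should be reported before slow, even though slow was started first. The price of polling is a delay of up to one sleep interval before a finished job is noticed.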
You can adapt the following script to your needs (the -p option of wait requires Bash 5.1 or later):

    while test -n "${PIDs[*]}" && { wait -n -p pid "${!PIDs[@]}"; status=$?; }; do
        echo "Command [${PIDs[$pid]}] with pid:$pid exited with status:$status"
        unset "PIDs[$pid]" # Remove $pid
    done

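A self-contained sketch of the wait -n -p loop, assuming Bash 5.1 or later. The start_job helper, the job names, the durations and the finished array are all hypothetical and exist only to make the example runnable; the guard on an empty pid is an extra precaution that the loop above does not have.

```bash
#!/usr/bin/env bash
# Requires Bash 5.1+ for `wait -p`. Job names and durations are made up.

declare -A PIDs=()       # PID -> job name
declare -a finished=()   # "name:status", in completion order

start_job() {            # usage: start_job NAME SECONDS EXIT_STATUS (hypothetical helper)
    ( sleep "$2"; exit "$3" ) &
    PIDs[$!]=$1
}

start_job slow    0.6 0
start_job failing 0.3 7
start_job fast    0.1 0

while (( ${#PIDs[@]} > 0 )); do
    # Block until any one of the listed PIDs terminates; -p stores that PID in $pid.
    wait -n -p pid "${!PIDs[@]}"
    status=$?
    # If wait could not report a PID (e.g. unsupported option), stop instead of looping forever.
    [[ -n ${pid-} ]] || break
    finished+=( "${PIDs[$pid]}:$status" )
    echo "${PIDs[$pid]} (PID $pid) exited with status $status"
    unset "PIDs[$pid]"
done
```

With the made-up durations above, the jobs should be reported in the order fast, failing, slow, with status 7 for the failing one. Counting the remaining entries with ${#PIDs[@]} also keeps the loop going when a job name happens to be an empty string.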
For Bash 4.2, try this:

    while (( ${#PIDs[@]} > 0 )); do
        sleep .1 # Sleep .1 second
        for pid in "${!PIDs[@]}"
        do
            # /proc is Linux-specific
            [ -d "/proc/$pid" ] && continue # $pid still running, check next
            if wait "$pid"
            then
                echo "${PIDs[$pid]} completed successfully"
            else
                echo "WARNING: ${PIDs[$pid]} failed"
                (( errors += 1 )) # arithmetic increment; errors+=1 would append the string "1"
            fi
            unset "PIDs[$pid]" # Remove $pid
        done
    done

This might take a few steps. As a test, here is a script, complete with one process killed to prove that it catches error codes:

    $: cat tst
    #! /usr/bin/env bash
    for x in 1 3 5 7 9; do sleep $x & done
    declare -A rc=()
    pids=($(jobs -pr))
    while (( ${#pids[@]} ))
    do  for k in "${!pids[@]}"
        do  p=${pids[$k]}
            if ! ps -p "$p" >/dev/null; then   # process is gone: collect its status
                wait "$p"; rc[$p]=$?; unset "pids[$k]"
                date +"%F %T PID $p: rc ${rc[$p]}"
            fi
        done
        ((skip++)) || kill ${pids[3]}   # first pass only: kill the fourth job (sleep 7)
        sleep 1
    done

    $: ./tst
    2025-02-07 14:55:19 PID 3036: rc 0
    2025-02-07 14:55:19 PID 3039: rc 143
    2025-02-07 14:55:21 PID 3037: rc 0
    2025-02-07 14:55:23 PID 3038: rc 0
    2025-02-07 14:55:28 PID 3040: rc 0

The -n and -p options may be what you are looking for. – pjh, commented Feb 7 at 1:25