Observations on software: from "State of the Nation" to nitty-gritty technical details.

2010/03/28

Elegant parallel processes in bash

Wanting to make better use of my multi-CPU GNU/Linux system, I decided to put together a script that would spawn parallel tasks according to a simple template provided on the command line.

The bash shell has been a scripting staple for decades. No other scripting system (no, not even Perl) gives the developer more control over processes, descriptors and pipes as an innate part of the language. The bash shell (its name was originally a pun: the Bourne Again Shell) inherits much of this facility from its predecessor 'sh' (the Bourne Shell). It seemed natural to use the process control facility of bash to build a script that dispatched processes in parallel.

Hunting around, I found that several people have done something similar, but looking closely at the scripts on the internet I realized that all of them were fundamentally flawed: each polled its dispatched subprocesses to determine whether they had completed. The most egregious of these actually used the bash 'wait' command with no arguments, thus requiring all subprocesses to complete before the next batch was dispatched. Unix is a resource sharing system, and processes ideally should avoid using any resource unnecessarily. Polling by periodically waking up and seeing if the world has changed is inelegant.
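
To make that batch-and-wait flaw concrete, here is a minimal sketch of the approach (my own illustration, not any particular script I found): a full batch is dispatched, then the loop blocks until every member of the batch has exited, leaving CPUs idle whenever jobs finish unevenly.
maximum=4 # batch size
count=0
while read line ; do
    eval "$line" & # dispatch a job into the current batch
    (( ++count ))
    if (( count == maximum )) ; then
        wait # blocks until EVERY job in the batch exits
        count=0 # only now may the next batch begin
    fi
done
wait
exit 0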

I'm a "make" fan, I wanted something that works just like "make -j 9" but was easier to use such that spawning a multiplicity of commands from a single template should be trivial. Hence I wrote parallel.sh, a short (200 lines, including 50 lines of usage) script that elegantly solves the problem of multiplexing subprocesses. The parallel.sh script blocks (it does NOT poll) waiting until one of its subprocesses completes; at which point it wakes up long enough to harvest the child's result code, read in the arguments for the next pending task, and fork. It does all the right things that a good UNiXy program should do: using only those resources (i.e., CPU, memory, disk) that it must to get the job done and then get out of the way.

What it does:

The parallel.sh script reads one or more lines off STDIN to compose the optional arguments for a command whose template is specified on the parallel.sh command line, and then forks as many subtasks as it can up to a user-imposed limit (e.g., -j 4). As each subtask completes, the script reaps the child and starts a new one until EOF is reached on STDIN, at which point it waits for all remaining child processes to complete and exits.
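
For example (a hypothetical invocation; gzip is just a stand-in for any command that takes a filename as its final argument), compressing log files four at a time would look like:
% find . -name '*.log' | parallel.sh -j 4 gzip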

How to do it wrong:

The easiest (and worst) approach is simply to spawn each child process shotgun style using the '&' background operator:
while read line ; do eval "$line" & done
No job control, no resource management; simply shoot the whole wad in one go. This will bring your system to its knees if more than 3-5 processes per CPU are spawned. Everything slows down. The system is constantly context switching and flushing cache lines, and if each process is disk intensive you'll quickly bring your entire system to a crawl. The security concerns of eval'ing arbitrary input should bother you too (though this article is not about security). Don't do this.

After a child process is successfully started in the background using '&', its process ID is accessible to the parent shell via the '$!' variable. What most people do is spawn the subprocess, store the PID of the spawned process, and then periodically wake up and see if the PID is still running. There are several ways of checking that a child process is alive; here are two:
% [[ -d /proc/$pid ]] && echo "child $pid is alive and well"
% kill -0 $pid 2> /dev/null && echo "child $pid is alive and well"
As an example of the prototypical (but inelegant) approach to running parallel processes in a shell script, I give you a naive script that reads commands off STDIN and dispatches up to a "maximum" number of jobs at a time. This is a more practical approach to resource usage than the first example.
maximum=4
declare -a cpid # array of child PIDs
while read line ; do
    until (( ${#cpid[*]} < maximum )) ; do # block while at capacity
        for (( i = 0; i < maximum; ++i )); do
            if [[ ! -d /proc/${cpid[$i]} ]] ; then # or kill -0
                wait ${cpid[$i]} # harvest child's exit status
                unset cpid[$i]
                break 2
            fi
        done
        sleep 1 # poll again in a second
    done
    eval "$line" &
    cpid=( "${cpid[@]}" $! ) # repack; cpid[${#cpid[*]}]=$! does not work once the array is sparse
done
wait
exit 0
Phew! This probably works, but the logic is fairly complicated. Several things about this code bother me. The 'until' test is redundant after the first iteration. The polling of the subprocesses and the sleep are inelegant. This code (like the first example above) blurs the line between instruction and data. Finally, it is really quite complex and does not leverage the strengths of bash.

How to do it right:

The following code is the crux of the parallel.sh script.
# set up a named pipe (FIFO)
myfifo=$( mktemp --dry-run ) # generate a unique name without creating the file
mkfifo --mode=0700 "$myfifo"
exec 3<>"$myfifo" # hook pipe up to descriptor 3 for read/write
rm -f "$myfifo" # pipe remains intact until descriptor 3 is closed
maximum=4 # cpus
running=0
while read line ; do
    while (( running >= maximum )) ; do
        if read -u 3 cpid ; then # block until some child reports its PID
            wait $cpid # harvest that child's exit status
            (( --running ))
        fi
    done
    ( # child process
        "$@" "$line" # run the command template with this line appended
        echo $BASHPID 1>&3 # report our PID back to the parent ($BASHPID requires bash 4)
    ) &
    (( ++running ))
done
wait
exit 0
My parallel.sh script actually does a bit more than this, due to the other features it supports: template commands (i.e., commands that accept multiple arguments), blocking the output into a log file, option processing (e.g., -j #cpus), and several other items.
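
As an illustration of just the option processing (a minimal sketch using getopts; the actual parallel.sh source may differ), the -j limit can be handled like this:
maximum=4 # default number of simultaneous jobs
while getopts 'j:' opt ; do
    case $opt in
        j) maximum=$OPTARG ;;
        *) echo "usage: $0 [-j jobs] command ..." 1>&2 ; exit 1 ;;
    esac
done
shift $(( OPTIND - 1 )) # "$@" is now the command template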

So what did I do here? I'm using a named pipe. The pipe is created in your temp directory and then removed, so only this script has access to the FIFO. All pipes require a reader and a writer. In this case the reader is the parent process, which blocks until a writer sends something through the pipe. The writer is the child process, which holds the pipe in escrow until the underlying command in the subprocess completes, at which point the PID of the child is written to the FIFO and the child exits. Upon EOF on STDIN the script waits for all children to complete and then exits.

Note also how the command itself is formulated. No longer is the entire command read off of each line from STDIN. Rather, the command is entered on the command line itself (under direct control of the developer, rather than trusting a data feed) and each line from STDIN is appended to the end of the command (if you really want the behavior of the first two examples, simply run "parallel.sh eval").
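
For instance (hypothetical invocations), the two pipelines below do the same work, but only the second keeps the command itself out of the data stream:
% printf '%s\n' 'sleep 2' 'sleep 1' | parallel.sh eval
% printf '%s\n' 2 1 | parallel.sh sleep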

Cut and paste the above code into parallel.sh, add "#!/bin/bash" to the top, chmod +x parallel.sh, and you now have a basic parallel processing command for your multi-CPU system. You can now run xargs-style command sequences in parallel, as this version of parallel.sh has similar operating semantics to xargs.

The only additional major feature that I have in my actual parallel.sh script is the ability to templatize specific arguments in the command line, such that each dispatched command may reference multiple lines as individual arguments. For example, I'm able to process images in parallel using ImageMagick and parallel.sh thus:
% /bin/ls my_camera/*.jpg | sed -e 'p;s/camera/website/' |
> parallel.sh -j 8 mogrify -resize 800x600 -write '$2' '$1'
The above command processes every JPEG file in the directory "my_camera", down-sampling each to 800x600 and writing the result into the directory "my_website". At most 8 processes will be allowed to run simultaneously.

UPDATE: I've uploaded the complete source code of parallel.sh to ActiveState Code.

1 comment:

  1. parallel.sh has several problems. Output runs together:

    (echo foss.org.my; echo debian.org; echo freenetproject.org) | parallel.sh traceroute

    Composed commands are not supported:

    seq 1 5 | parallel.sh sleep {} ';' echo {}

    GNU Parallel http://www.gnu.org/software/parallel/ does not suffer from this.
