Revisiting Shell Concurrency (this time in ZSH)
I’ve been thinking about concurrency in the command shell again. This was prompted by my ongoing transition from Bash to ZSH. I’ve decided to re-implement my conc and xconc functions in a slightly different way.
Intro
Let’s say you have 50 data files to compress. Here’s one way:
% xz *.dat
<< completed in 2 minutes, 4 seconds >>
%
That’s not bad, but it doesn’t really take advantage of all those fancy cores we have in our computers these days. Let’s run multiple xz jobs as-parallel-as-possible, so that each file gets its own xz process.
% for f in *.dat; do
for> xz $f &
for> done; wait
# job monitoring info excluded here for blog readability
<< completed in 1 minute, 37 seconds >>
%
Not too bad. It actually took less time (78% of the single-process baseline), but in the process it really bogged down the system. If there were more than 50 files, it would have been much, much worse, or even impossible.
concblock
What if we could easily limit the concurrency to a reasonable level? This is where concblock comes in. In this context it blocks the for loop, preventing it from creating any more xz processes until some have finished. Here’s the previous example repeated, this time including concblock:
% for f in *.dat; do
for> xz $f &
for> concblock
for> done; wait
# job monitoring info excluded here for blog readability
<< completed in 1 minute, 7 seconds >>
%
An improvement! It took much less time (54% of the single-process baseline) and the system load stayed low. The whole idea here is to use the shell’s own job control infrastructure to make it easy to parallelize common shell tasks. I do a reasonable amount of on-the-fly for and while loops on the command line, so this can really help in those situations; see the sketch below.
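For instance, an ad-hoc while loop fed by find can be throttled the same way (a sketch; the /data path is just a placeholder):

% find /data -name '*.dat' | while read -r f; do
while> xz $f &
while> concblock
while> done; wait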
Here’s my code for concblock:
if ! is_function concblock; then
    export CONC_MAX=${CONC_MAX:-2}
    zmodload zsh/parameter
    zmodload zsh/zselect
    function concblock () {
        CONC_MAX=${CONC_MAX:-2}
        # Block until there is an open slot in the job table
        if [[ ${#jobstates} -ge $CONC_MAX ]]; then
            while true; do
                # zselect's -t timeout is in hundredths of a
                # second, so this polls every 0.2 seconds
                zselect -t 20
                if [[ ${#jobstates} -lt $CONC_MAX ]]; then
                    break
                fi
            done
        fi
        return 0
    }
fi
One notable thing about concblock is that it does not use the shell’s built-in wait function. Instead it uses two very powerful zsh modules: parameter, which gives me information about job states, and zselect, which I’m using as a high-resolution built-in sleep command. With the combination of these two modules I can wait for individual subprocesses to exit, and I don’t have to call out to external programs like sleep. In an ideal world, there would be a way to wait for specific signals, like SIGCHLD, or the built-in wait function would give me the option to wait for any child process to exit; unfortunately, neither currently seems to be an option.
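You can poke at both modules interactively to get a feel for what concblock is watching (a sketch; the count depends on what’s running):

% zmodload zsh/parameter zsh/zselect
% sleep 30 &
% echo ${#jobstates}   # one entry per job in the shell's job table
1
% zselect -t 50        # with no fds given, this just sleeps 0.5 seconds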
The shell global variable $CONC_MAX is a setting that determines how many processes should be allowed to run in parallel. I’m defaulting to two because I have a dual-core machine. Eventually I’ll adjust the $CONC_MAX default to automatically match the number of cores on the machine.
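A sketch of what that default might look like: try the OS X sysctl first, then the Linux getconf spelling, then fall back to 2.

# Probe for the machine's core count
ncores=$(sysctl -n hw.ncpu 2>/dev/null ||
         getconf _NPROCESSORS_ONLN 2>/dev/null ||
         echo 2)
export CONC_MAX=${CONC_MAX:-$ncores}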
xconc
In our baseline example, one process gets created and it is given all the filenames as arguments. On the command line, that would look something like this:
% xz 1.dat 2.dat 3.dat 4.dat 5.dat 6.dat 7.dat 8.dat 9.dat 10.dat \
11.dat 12.dat 13.dat 14.dat 15.dat 16.dat 17.dat 18.dat 19.dat \
20.dat 21.dat 22.dat 23.dat 24.dat 25.dat 26.dat 27.dat 28.dat \
29.dat 30.dat 31.dat 32.dat 33.dat 34.dat 35.dat 36.dat 37.dat \
38.dat 39.dat 40.dat 41.dat 42.dat 43.dat 44.dat 45.dat 46.dat \
47.dat 48.dat 49.dat 50.dat
One nice aspect of this approach is that the overhead of switching from one input file to another is very low. The xz program is designed to loop through these arguments very efficiently. In our concblock example, each process gets only one argument, and the OS has to set up and tear down a process each time we want to compress a new file; this can have non-trivial overhead. On the command line, that would look something like this:
% xz 1.dat &
% xz 2.dat &
% wait
% xz 3.dat &
# ...yadda yadda..
% wait
% xz 49.dat &
% xz 50.dat &
% wait
It would be nice to get a compromise between these two approaches. For example, we could use four processes and give each process about 12 arguments. xconc is designed to do just this; here it is applied to our example compression task:
% xconc xz *.dat
<< completed in 1 minute, 2 seconds >>
%
Another improvement! It is faster still (50% of the single-process baseline) and keeps the load on the system even lower. In the background, xconc does something like this:
% xz 1.dat 10.dat 11.dat 12.dat 13.dat 14.dat 15.dat 16.dat 17.dat \
18.dat 19.dat 2.dat 20.dat &
% xz 21.dat 22.dat 23.dat 24.dat 25.dat 26.dat 27.dat 28.dat 29.dat \
3.dat 30.dat 31.dat 32.dat &
% xz 33.dat 34.dat 35.dat 36.dat 37.dat 38.dat 39.dat 4.dat 40.dat \
41.dat 42.dat 43.dat &
% xz 44.dat 45.dat 46.dat 47.dat 48.dat 49.dat 5.dat 50.dat 6.dat \
7.dat 8.dat 9.dat &
Here’s my code for xconc:
if ! is_function xconc; then
    export CONC_MAX=${CONC_MAX:-2}
    function xconc() {
        local command_string arg_count group_size group_count remainder
        command_string=$1
        shift
        arg_count=${#@}
        # Nothing to do (also avoids a modulo-by-zero below)
        (( arg_count == 0 )) && return 0
        if [[ $arg_count -lt $CONC_MAX ]]; then
            group_size=1
            group_count=$arg_count
        else
            group_size=$(( arg_count / CONC_MAX ))
            group_count=$CONC_MAX
        fi
        # Arguments left over after the integer division; these get
        # spread one apiece over the first few groups
        remainder=$(( arg_count % (group_count * group_size) ))
        (
            local length=$group_size
            local offset=1
            local i
            for (( i=0; i < group_count; i++ )); do
                if [[ $remainder -gt 0 ]]; then
                    length=$(( group_size + 1 ))
                    remainder=$(( remainder - 1 ))
                else
                    length=$group_size
                fi
                # ${=...} word-splits the command string so that
                # multi-word commands like 'xz --extreme' work
                "${=command_string}" "${@:$offset:$length}" &
                offset=$(( offset + length ))
                concblock
            done; wait
        )
        return 0
    }
fi
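A quick way to see how the arguments get grouped is to substitute a harmless command like echo. With the default CONC_MAX of 2 and five arguments, something like this comes back (the two lines may arrive in either order):

% xconc echo a b c d e
a b c
d e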
In Progress, Kinda
concblock and xconc are still works in progress, but in the case of xconc, there are already more versatile tools that can perform these tasks (e.g. GNU Parallel). The following commands are essentially equivalent:
% parallel -j 2 -X xz --extreme ::: *.dat
<< completed in 1 minute, 5 seconds >>
%
And:
% xconc 'xz --extreme' *.dat
<< completed in 1 minute, 4 seconds >>
%
GNU Parallel has many, many more useful features, including the ability to behave like xargs and take arguments from standard input.
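For example, the same job can be fed from standard input instead of the ::: argument list:

% print -l *.dat | parallel -j 2 -X xz --extreme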
Hear It & Hear It & Hear It
I had a bit of geek fun making recordings of concurrent text-to-speech processes while I was working on concblock and xconc. I frequently find that clean solutions to difficult problems involve the OS X say command; in this case it gave me a nice audible way to debug.
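The flavor of it was something like this (a sketch along those lines, not the actual script):

% for i in {1..8}; do
for> say "starting job number $i" &
for> concblock
for> done; wait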
While I was making the script to generate the more horrible of the aforementioned recordings, I needed a simple command-line tool that implements Python’s random.randrange. It took almost no time to make, but I still put it on GitHub; this is what it looks like:
% randrange -h
usage: randrange [-h] [-c COUNT] low high

Print a random value using the specified parameters

positional arguments:
  low                   the lower boundary for the allowable range
  high                  the upper boundary for the allowable range

optional arguments:
  -h, --help            show this help message and exit
  -c COUNT, --count COUNT
                        the number of random values to print
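For example (the printed values are random, so yours will differ):

% randrange -c 3 1 100
42
7
89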
I also made a similar tool for printing the name of a randomly-selected OS X text-to-speech voice. It is also on GitHub and it looks like this:
% randomvoice -h
usage: randomvoice [-h] [-e] [-n]

Print the randomly-chosen name of an available text-to-speech voice

optional arguments:
  -h, --help            show this help message and exit
  -e, --exclude-novelty-voices
                        exclude the novelty voice set
  -n, --only-novelty-voices
                        only choose from the novelty voice set
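Putting the two tools together with say makes randomized speech a one-liner; a sketch (say’s -v and -r flags select the voice and the speaking rate):

% say -v "$(randomvoice)" -r "$(randrange 90 300)" "hello there"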