Revisiting Shell Concurrency (this time in ZSH)
I’ve been thinking about concurrency in the command shell again. This was prompted by my ongoing transition from Bash to ZSH. I’ve decided to re-implement my conc and xconc functions in a slightly different way.
Intro
Let’s say you have 50 data files to compress. Here’s one way:
% xz *.dat
<< completed in 2 minutes, 4 seconds >>
%
That’s not bad, but it doesn’t really take advantage of all those fancy cores we have in our computers these days. Let’s run multiple xz jobs as-parallel-as-possible, so that each file gets its own xz process.
% for f in *.dat; do
for> xz $f &
for> done; wait
# job monitoring info excluded here for blog readability
<< completed in 1 minute, 37 seconds >>
%
Not too bad. It actually took less time (78% of the single-process baseline), but in the process it really bogged down the system. If there were more than 50 files, it would have been much, much worse, or even impossible.
concblock
What if we could easily limit the concurrency to a reasonable level? This is where concblock comes in. In this context it blocks the for loop, preventing it from creating any more xz processes until some have finished. Here’s the previous example repeated, this time including concblock:
% for f in *.dat; do
for> xz $f &
for> concblock
for> done; wait
# job monitoring info excluded here for blog readability
<< completed in 1 minute, 7 seconds >>
%
An improvement! It took much less time (54% of the single-process baseline) and the system load stayed low. The whole idea here is to use the shell’s own job control infrastructure to make it easy to parallelize common shell tasks. I do a reasonable amount of on-the-fly for and while loops on the command line, so this can really help in those situations; see the sketch below.
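For instance, an ad-hoc while loop fed by find can be throttled the same way (a sketch; the /data path is just a placeholder):

% find /data -name '*.dat' | while read -r f; do
while> xz $f &
while> concblock
while> done; wait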
Here’s my code for concblock:
if ! is_function concblock; then
    export CONC_MAX=${CONC_MAX:-2}
    zmodload zsh/parameter
    zmodload zsh/zselect
    function concblock () {
        CONC_MAX=${CONC_MAX:-2}
        # Block until there is an open slot in the job table
        if [[ ${#jobstates} -ge $CONC_MAX ]]; then
            while true; do
                # zselect's -t timeout is in hundredths of a
                # second, so this polls every 0.2 seconds
                zselect -t 20
                if [[ ${#jobstates} -lt $CONC_MAX ]]; then
                    break
                fi
            done
        fi
        return 0
    }
fi
One notable thing about concblock is that it does not use the shell’s built-in wait function. Instead it uses two very powerful zsh modules: parameter, which gives me information about job states, and zselect, which I’m using as a high-resolution built-in sleep command. With the combination of these two modules I can wait for individual subprocesses to exit, and I don’t have to call out to external programs like sleep. In an ideal world, there would be a way to wait for specific signals, like SIGCHLD, or the built-in wait function would give me the option to wait for any child process to exit; unfortunately, neither currently seems to be an option.
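You can poke at both modules interactively to get a feel for what concblock is watching (a sketch; the count depends on what’s running):

% zmodload zsh/parameter zsh/zselect
% sleep 30 &
% echo ${#jobstates}   # one entry per job in the shell's job table
1
% zselect -t 50        # with no fds given, this just sleeps 0.5 seconds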
The shell global variable $CONC_MAX is a setting that determines how many processes should be allowed to run in parallel. I’m defaulting to two because I have a dual-core machine. Eventually I’ll adjust the $CONC_MAX default to automatically match the number of cores on the machine.
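A sketch of what that default might look like: try the OS X sysctl first, then the Linux getconf spelling, then fall back to 2.

# Probe for the machine's core count
ncores=$(sysctl -n hw.ncpu 2>/dev/null ||
         getconf _NPROCESSORS_ONLN 2>/dev/null ||
         echo 2)
export CONC_MAX=${CONC_MAX:-$ncores}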
xconc
In our baseline example, one process gets created and it is given all the filenames as arguments. On the command line, that would look something like this:
% xz 1.dat 2.dat 3.dat 4.dat 5.dat 6.dat 7.dat 8.dat 9.dat 10.dat \
11.dat 12.dat 13.dat 14.dat 15.dat 16.dat 17.dat 18.dat 19.dat \
20.dat 21.dat 22.dat 23.dat 24.dat 25.dat 26.dat 27.dat 28.dat \
29.dat 30.dat 31.dat 32.dat 33.dat 34.dat 35.dat 36.dat 37.dat \
38.dat 39.dat 40.dat 41.dat 42.dat 43.dat 44.dat 45.dat 46.dat \
47.dat 48.dat 49.dat 50.dat
One nice aspect of this approach is that the overhead of switching from one input file to another is very low. The xz program is designed to loop through these arguments very efficiently. In our concblock example, each process gets only one argument, and the OS has to set up and tear down a process each time we want to compress a new file; this can have non-trivial overhead. On the command line, that would look something like this:
% xz 1.dat &
% xz 2.dat &
% wait
% xz 3.dat &
# ...yadda yadda..
% wait
% xz 49.dat &
% xz 50.dat &
% wait
It would be nice to get a compromise between these two approaches. For example, we could use four processes and give each process about 12 arguments. xconc is designed to do just this; here it is applied to our example compression task:
% xconc xz *.dat
<< completed in 1 minute, 2 seconds >>
%
Another improvement! It is faster still (50% of the single-process baseline) and keeps the load on the system even lower. In the background, xconc does something like this:
% xz 1.dat 10.dat 11.dat 12.dat 13.dat 14.dat 15.dat 16.dat 17.dat \
18.dat 19.dat 2.dat 20.dat &
% xz 21.dat 22.dat 23.dat 24.dat 25.dat 26.dat 27.dat 28.dat 29.dat \
3.dat 30.dat 31.dat 32.dat &
% xz 33.dat 34.dat 35.dat 36.dat 37.dat 38.dat 39.dat 4.dat 40.dat \
41.dat 42.dat 43.dat &
% xz 44.dat 45.dat 46.dat 47.dat 48.dat 49.dat 5.dat 50.dat 6.dat \
7.dat 8.dat 9.dat &
Here’s my code for xconc:
if ! is_function xconc; then
    export CONC_MAX=${CONC_MAX:-2}
    function xconc() {
        local command_string arg_count group_size group_count remainder
        command_string=$1
        shift
        arg_count=${#@}
        # Nothing to do (also avoids a modulo-by-zero below)
        (( arg_count == 0 )) && return 0
        if [[ $arg_count -lt $CONC_MAX ]]; then
            group_size=1
            group_count=$arg_count
        else
            group_size=$(( arg_count / CONC_MAX ))
            group_count=$CONC_MAX
        fi
        # Arguments left over after the integer division; these get
        # spread one apiece over the first few groups
        remainder=$(( arg_count % (group_count * group_size) ))
        (
            local length=$group_size
            local offset=1
            local i
            for (( i=0; i < group_count; i++ )); do
                if [[ $remainder -gt 0 ]]; then
                    length=$(( group_size + 1 ))
                    remainder=$(( remainder - 1 ))
                else
                    length=$group_size
                fi
                # ${=...} word-splits the command string so that
                # multi-word commands like 'xz --extreme' work
                "${=command_string}" "${@:$offset:$length}" &
                offset=$(( offset + length ))
                concblock
            done; wait
        )
        return 0
    }
fi
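A quick way to see how the arguments get grouped is to substitute a harmless command like echo. With the default CONC_MAX of 2 and five arguments, something like this comes back (the two lines may arrive in either order):

% xconc echo a b c d e
a b c
d e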
In Progress, Kinda
concblock and xconc are still works in progress, but in the case of xconc, there are already more versatile tools that can perform these tasks (e.g. GNU Parallel). The following commands are essentially equivalent:
% parallel -j 2 -X xz --extreme ::: *.dat
<< completed in 1 minute, 5 seconds >>
%
And:
% xconc 'xz --extreme' *.dat
<< completed in 1 minute, 4 seconds >>
%
GNU Parallel has many, many more useful features, including the ability to behave like xargs and take arguments from standard input.
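For example, the same job can be fed from standard input instead of the ::: argument list:

% print -l *.dat | parallel -j 2 -X xz --extreme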
Hear It & Hear It & Hear It
I had a bit of geek fun making recordings of concurrent text-to-speech processes while I was working on concblock and xconc. I frequently find that clean solutions to difficult problems involve the OS X say command; in this case it gave me a nice audible way to debug.
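The flavor of it was something like this (a sketch along those lines, not the actual script):

% for i in {1..8}; do
for> say "starting job number $i" &
for> concblock
for> done; wait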
While I was making the script to generate the more horrible of the aforementioned recordings, I needed a simple command-line tool that implements Python’s random.randrange. It took almost no time to make, but I still put it on GitHub; this is what it looks like:
% randrange -h
usage: randrange [-h] [-c COUNT] low high

Print a random value using the specified parameters

positional arguments:
  low                   the lower boundary for the allowable range
  high                  the upper boundary for the allowable range

optional arguments:
  -h, --help            show this help message and exit
  -c COUNT, --count COUNT
                        the number of random values to print
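For example (the printed values are random, so yours will differ):

% randrange -c 3 1 100
42
7
89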
I also made a similar tool for printing the name of a randomly-selected OS X text-to-speech voice. It is also on GitHub and it looks like this:
% randomvoice -h
usage: randomvoice [-h] [-e] [-n]

Print the randomly-chosen name of an available text-to-speech voice

optional arguments:
  -h, --help            show this help message and exit
  -e, --exclude-novelty-voices
                        exclude the novelty voice set
  -n, --only-novelty-voices
                        only choose from the novelty voice set
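Putting the two tools together with say makes randomized speech a one-liner; a sketch (say’s -v and -r flags select the voice and the speaking rate):

% say -v "$(randomvoice)" -r "$(randrange 90 300)" "hello there"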