bash coproc: the future (2009) is here
======================================

Hi, I'm clacke.

It's not every day that you discover and get excited for a brand new feature in bash, but last week I did!

If you've been following Dave's excellent series of extended universe ultimate super bash tips, you already know about command substitution and process substitution. Or maybe you like reading man pages. Either way, in case you forgot or never knew, here's a quick recap:

Command substitution is the dollar-parenthesis syntax, or the older backquote syntax. It's when you call a command or function in a subshell, wait for it to finish, then insert its output on the command line. Here's an absolutely ridiculous toy example #1:

Example #1:
[source,bash]
----
$ echo $(echo hacker public radio)
hacker public radio
$ $(echo echo hacker public radio)  # It can even supply the command itself, not just parameters. Note the word splitting.
hacker public radio
$ "$(echo echo hacker public radio)"  # Counteract word splitting by putting the command substitution in quotes.
bash: echo hacker public radio: command not found
$ `echo echo hacker public radio`  # Old-style command substitution
hacker public radio
----
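
A slightly more everyday use is capturing output in a variable. Here's a minimal sketch of that, which also shows the word splitting from above in numbers. The variable names (`greeting`, `unquoted_count`, `quoted_count`) are made up for illustration:

[source,bash]
----
#!/usr/bin/env bash
# Capture a command's output in a variable (trailing newlines are stripped).
greeting=$(echo hacker public radio)

# Unquoted: the result is split into separate words.
set -- $greeting
unquoted_count=$#

# Quoted: the result stays a single word.
set -- "$greeting"
quoted_count=$#

echo "unquoted: $unquoted_count words, quoted: $quoted_count word"
----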

More on this in Dave's http://hackerpublicradio.org/eps.php?id=1903[hpr1903: Some further Bash tips].

That was command substitution. Then there's process substitution. That's when you have a greater-than or less-than sign followed by a parenthesis. This calls a command or function in a subshell too, but instead of returning the output, it returns a file descriptor to the subshell's stdin (if you use greater-than), or its stdout (if you use less-than). The process runs in the background, rather than us waiting for it. "File descriptor" here means a file path to a pipe. Ridiculous toy example #2:

Example #2:
[source,bash]
-----------
$ echo <(echo hacker public radio)
/dev/fd/63
$ cat <(echo hacker public radio)
hacker public radio
-----------

You can also combine process substitution with redirection. Ridiculous toy example #3:

Example #3:
[source,bash]
----
$ echo hacker public radio > >(sed -e 's/$/!/')  # You need the space between the greater-thans here!
hacker public radio!
----
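
One place where process substitution tends to earn its keep is feeding a `while read` loop. With a plain pipe the loop would normally run in a subshell and its variables would be lost afterwards; with process substitution the loop runs in the current shell. A small sketch, with `count` and `word` as made-up names:

[source,bash]
----
#!/usr/bin/env bash
count=0

# The loop reads from the process substitution but runs in this shell,
# so changes to $count are still visible after the loop.
while read -r word; do
    count=$((count + 1))
done < <(printf '%s\n' hacker public radio)

echo "read $count lines"
----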

More on this in Dave's http://hackerpublicradio.org/eps.php?id=2045[hpr2045: Some other Bash tips].

Ok, great. That's the background to this episode. So we can send data to processes, and we can receive data from processes. But if we send data to a process with process substitution, we can't receive its output. It goes straight to our stdout, and there's no super convenient and portable way to change that (See https://libranet.de/display/0b6b25a8-135c-83e5-1f4b-82a136800329[my Fediverse post on this], and I owe you a show). No way, that is, unless you live in the _future_ and have access to bash 4!

A coprocess in bash is a subshell running in the background, for which you get two file descriptors: one connected to its stdin and one connected to its stdout.

The two file descriptors will be put in a bash array. To learn more about arrays, check out Dave's series within the bash series, a whopping five-part quadrology including http://hackerpublicradio.org/eps.php?id=2709[hpr2709], http://hackerpublicradio.org/eps.php?id=2719[hpr2719], http://hackerpublicradio.org/eps.php?id=2729[hpr2729], http://hackerpublicradio.org/eps.php?id=2739[hpr2739] and https://hackerpublicradio.org/eps.php?id=2756[hpr2756].

Back to coprocesses. You create a coprocess using the new `coproc` keyword. This is so new, having been introduced only in 2009, that the ecosystem is still catching up. Asciidoc's bash syntax highlighting doesn't yet support it (I'm filing issues for pygments and GNU source-highlight)!

There are two ways to call `coproc`. The first way is to give `coproc` a _simple command_. That means just a normal single command. This will put the file descriptor numbers in an array called COPROC, capital letters: element 0 is the descriptor we read the coprocess's stdout from, and element 1 is the one we use to write to its stdin. Utterly trivial example #4 that does nothing and then exits:

Example #4:
[source,bash]
----
$ coproc :; declare -p COPROC
[1] 25155
declare -a COPROC=([0]="63" [1]="60")
[1]+  Done                    coproc COPROC :
----
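
To see the two descriptors actually carry data, here is a rough sketch that uses `cat` as the simple command and runs as a script rather than at the prompt. It assumes bash 4.1 or later for the `{COPROC[1]}>&-` closing syntax, and `reply` is just a made-up variable name:

[source,bash]
----
#!/usr/bin/env bash
# Start cat as an unnamed coprocess; bash fills in the COPROC array.
coproc cat

# COPROC[1] leads to the coprocess's stdin, COPROC[0] comes from its stdout.
echo "hacker public radio" >&"${COPROC[1]}"
read -r reply <&"${COPROC[0]}"
echo "got back: $reply"

# Close our end of its stdin so cat sees end of file and exits.
exec {COPROC[1]}>&-
wait
----

`cat` is convenient here because it passes data along as soon as it arrives, so a single `read` gets the line straight back.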

The other way is to give `coproc` a https://www.gnu.org/software/bash/manual/bash.html#Command-Grouping[Command Grouping], surrounding a series of statements with either curly brackets or parentheses. It will create a subshell either way, so it doesn't make any difference which style you choose. Example #5:

Example #5:
[source,bash]
----
$ coproc HPR (:); declare -p HPR
[1] 25469
declare -a HPR=([0]="63" [1]="60")
[1]+  Done                    coproc HPR ( : )
----
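
A command grouping gets more interesting when it holds a loop. The following is only a sketch: `SHOUT`, `line` and `reply` are made-up names, and the `${line^^}` uppercase expansion needs bash 4. It is written as a script, with curly brackets this time:

[source,bash]
----
#!/usr/bin/env bash
# A named coprocess whose body is a group of shell commands.
coproc SHOUT {
    while read -r line; do
        echo "${line^^}!"
    done
}

echo "hacker public radio" >&"${SHOUT[1]}"
read -r reply <&"${SHOUT[0]}"
echo "$reply"

# End of input: the read in the loop fails and the coprocess exits.
exec {SHOUT[1]}>&-
wait
----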

Ok, now we want a slightly less contrived example, that at least does _something_. Let's use `grep`. I got stuck on this for a while, because I didn't realize that when `grep` writes to a pipe rather than a terminal, it buffers its output and won't emit a thing until that buffer is full or it reaches end of file, so it just waited and didn't output anything. The solution is to add the parameter `--line-buffered`, which makes it less performant, but more practical to deal with in our case. Example #6:

Example #6:
[source,bash]
----
$ coproc GREP (grep --line-buffered pub); printf '%s\n' hacker public radio >&${GREP[1]}; cat <&${GREP[0]}
[1] 25627
public
^C
$ kill %1
[1]+  Terminated              coproc GREP ( grep --color=auto --line-buffered pub )
----

Now we see the word `public`, and then everything stops. Why? Well, the `GREP` process keeps waiting for input, so it's still running. And `cat` is waiting for more output from `GREP`. So we'll have to Ctrl-C to kill `cat` and then `kill %1` to kill the `GREP` process.
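
If you don't know how many lines to expect, one workaround is to give up after a short silence instead of waiting forever. This is a sketch under the assumption that `grep` answers well within one second, so treat the timeout as a heuristic, not a guarantee. `matches` and `line` are made-up names:

[source,bash]
----
#!/usr/bin/env bash
coproc GREP (grep --line-buffered pub)
printf '%s\n' hacker public radio >&"${GREP[1]}"

# Keep reading until no new line shows up within one second.
matches=()
while read -r -t 1 line <&"${GREP[0]}"; do
    matches+=("$line")
done
echo "matched: ${matches[*]}"

# grep is still waiting for input, so stop it explicitly.
kill "$GREP_PID"
----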

But we know that `GREP` will only return one line, so we can just read that one line. And when we are done feeding it lines, we can close our side of its stdin, and it will notice this and exit gracefully. Example #7:

Example #7:
[source,bash]
----
$ coproc GREP (grep --line-buffered pub); printf '%s\n' hacker public radio >&${GREP[1]}; head -n1 <&${GREP[0]}; exec {GREP[1]}>&-
[1] 25706
public
[1]+  Done                    coproc GREP ( grep --color=auto --line-buffered pub )
----
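
When there can be more than one line of output, a variation is to send everything first, close the coprocess's stdin, and then read until end of file. The sketch below assumes bash 4.1 or later. It copies the read descriptor and the PID up front, because bash may unset `GREP` and `GREP_PID` once it notices that the coprocess has exited. `from_grep`, `grep_pid` and `matches` are made-up names:

[source,bash]
----
#!/usr/bin/env bash
coproc GREP (grep pub)

# Keep our own copies before the coprocess has any chance to exit.
exec {from_grep}<&"${GREP[0]}"
grep_pid=$GREP_PID

printf '%s\n' hacker public radio pub republic >&"${GREP[1]}"
exec {GREP[1]}>&-                  # end of input: grep flushes and exits

mapfile -t matches <&"$from_grep"  # read everything up to end of file
exec {from_grep}<&-
wait "$grep_pid"

printf 'match: %s\n' "${matches[@]}"
----

Since nothing is read until `grep` has seen end of file and flushed, `--line-buffered` isn't needed in this variant.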

There we go! Not the most brilliant example, but it shows all the relevant moving parts, and we covered a couple of caveats. Now go out and play with this, come back with an example of how this is actually useful in the real world, and submit a show to Hacker Public Radio.