Including verify-only implementations

@DemiMarie opened openpgp-wg/rfc4880bis#87 asking about a verify-only profile in the spec. There are definitely useful OpenPGP implementations that are verify-only (e.g. gpgv and sqv, though they come from suites that are more fully-capable). It would be good to be able to include a verify-only sop implementation in the test suite without worrying that it's going to cause a lot of noise in the results (no sense in reporting that gpgv cannot decrypt).

I can see a couple ways to try to do this:

  • make use of sop return codes -- if an implementation returns UNSUPPORTED_SUBCOMMAND in a test, mark that test as "unimplemented" rather than "success" or "fail" (maybe this would be appropriate for UNSUPPORTED_OPTION and UNSUPPORTED_SPECIAL_PREFIX too?). Maybe we hide those entries from the default view of the test suite results?
  • tag specific drivers with particular profile labels (e.g. in config.json), and tag tests with corresponding labels. For example, you could tag an implementation with verify-only, and some subset of the tests with verify-only (maybe just the consumer side of some of the cross-product tests?) and for a driver that is marked with a tag for a specific profile, only include it in zoo for the tests with the matching tag.
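For the second approach, the tagging might look something like this in config.json (this is just a sketch: the `tags` key and the driver paths here are hypothetical, and the real schema would need to be worked out):

```json
{
  "drivers": [
    { "path": "/usr/bin/sqop" },
    { "path": "glue/verify-only-sop", "tags": ["verify-only"] }
  ]
}
```

A driver with no `tags` key would be included in every test as today; a tagged driver would only join the zoo for tests carrying a matching tag.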

The first approach (return codes) looks elegant, but it also sounds like a lot of work, and i have my doubts about most sop implementations getting the return codes right (sopgpy definitely doesn't right now for UNSUPPORTED_SUBCOMMAND, though i could probably fix it). The bigger work would be modifying all the tests, i think.
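To make the first approach concrete, the harness-side mapping could be as small as this (a sketch only: the exit-code values are the ones i believe the sop spec assigns to these errors, and the status strings are made up for illustration):

```python
# Map a sop subprocess exit code to a three-valued test result.
# 37/69/71 are (i believe) the sop spec's codes for UNSUPPORTED_OPTION,
# UNSUPPORTED_SUBCOMMAND, and UNSUPPORTED_SPECIAL_PREFIX respectively.
UNSUPPORTED_EXITS = {37, 69, 71}

def classify(exit_code: int) -> str:
    """Turn a sop exit code into a test-suite result string."""
    if exit_code == 0:
        return "success"
    if exit_code in UNSUPPORTED_EXITS:
        # candidate for hiding in the default results view
        return "unimplemented"
    return "fail"
```

The hard part isn't this mapping; it's auditing every test (and every sop implementation) so that "unimplemented" only shows up when it's genuinely meant.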

The second approach seems like it might be a bit simpler -- though you'd still need to think through all the tests and decide which ones to tag -- but i haven't tried to do it either.

Either way, there's the additional work of figuring out how to update the test suite summary so that the verify-only implementations don't get lumped in with the others.

Maybe we just have a separate tag-specific leaderboard, where the score's denominator is the number of tests with the given tag? we could include all untagged implementations in that leaderboard as well, of course, just counting how they do on those particular tests.
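A tag-specific score along those lines might be computed like this (all names here are hypothetical; the point is just that the denominator is the tagged-test count, so verify-only and fully-capable drivers are compared on the same footing):

```python
def tag_score(results: dict[str, str], tagged_tests: set[str]) -> float:
    """Score a driver on the tests carrying a given tag.

    results maps test name -> "success"/"fail"/..., for whatever tests
    the driver ran; only the tagged tests count toward the score.
    """
    if not tagged_tests:
        return 0.0
    passed = sum(1 for t in tagged_tests if results.get(t) == "success")
    return passed / len(tagged_tests)

# e.g. a driver that passes 3 of the 4 tests tagged verify-only:
results = {"verify-basic": "success", "verify-csf": "success",
           "verify-binary": "success", "verify-text": "fail"}
print(tag_score(results, set(results)))  # 0.75
```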