JSON output for sq and also acceptance/integration tests for sq
(Related to sequoia!1116)

I would like to add JSON output to `sq`, and these are my thoughts on how to approach that. I would like feedback before I dig myself deep into a hole.

Currently, `sq` has the following subcommands:

~~~
encrypt      Encrypts a message
decrypt      Decrypts a message
sign         Signs messages or data files
verify       Verifies signed messages or detached signatures
key          Manages keys
keyring      Manages collections of keys or certs
certify      Certifies a User ID for a Certificate
autocrypt    Communicates certificates using Autocrypt
keyserver    Interacts with keyservers
wkd          Interacts with Web Key Directories
armor        Converts binary to ASCII
dearmor      Converts ASCII to binary
inspect      Inspects data, like file(1)
packet       Low-level packet manipulation
help         Prints this message or the help of the given subcommand(s)
~~~

The output from these commands is either nothing, some status or progress messages, specific file formats such as ASCII armored data, or semi-structured ad hoc plain text output.

The output of `sq` should be easy to use from programs (shell scripts, Python scripts, Ruby scripts, etc). Writing a custom Rust program that uses the Sequoia crate is not always feasible. Supporting script writers well seems like an important thing for Sequoia to do.

Thus, JSON output for `sq`. JSON is the obvious first choice for structured output, as it is well supported by all common scripting languages.

I propose to add an option `--output-format=json` to each subcommand that produces structured output. The option will also accept the value `default`, to produce the current output, which may or may not be textual. Other formats can be added later, if there's use for them.

For future-proofing, the JSON output will be versioned. For this, the output will always be a JSON object (i.e., dict, hashmap, set of key/value pairs), and there will always be the key `sequoia_json_version` that specifies the version of the JSON schema being used.
For this proposal, the version is a list containing the single integer 1 (i.e., `[1]`). It's a list of integers to allow, say, version 1.2 (list `[1,2]`) or 3.14.15 (`[3,14,15]`) later on, if that turns out to be useful.

I propose to add JSON output to at least `inspect` and `packet dump` to start with. Once we have JSON support for those, adding support for other commands should be straightforward. (For example, listing keys in a key ring.)

The output of `sq inspect` might look like this:

~~~json
{
  "sequoia_json_version": [1],
  "sq_operation": "inspect",
  "filename": "liw.key.pgp",
  "file_type": "transferable-secret-key",
  "user_ids": [
    "Lars Wirzenius <liw@liw.fi>"
  ],
  "main_key": {
    "fingerprint": "F8D3A8621B90C8589CE5B919627D12E85029C0A3",
    "algorithm": "RSA",
    "usage": ["encrypt", "sign"],
    "key_size": 4096,
    "secret_key_encrypted": false,
    "creation_time": "2021-08-03 11:53:16 UTC",
    "expiration_time": "2024-08-03 05:19:37 UTC (creation time + P1095DT62781S)",
    "key_flags": ["certification"]
  },
  "subkeys": [
    {
      "fingerprint": "D00EDB7017C9EAE3C6D438B7BFFD4B52F08C7AB1",
      "algorithm": "RSA",
      "usage": ["encrypt", "sign"],
      "key_size": 4096,
      "secret_key_encrypted": false,
      "creation_time": "2021-08-03 11:53:16 UTC",
      "expiration_time": "2024-08-03 05:19:37 UTC (creation time + P1095DT62781S)",
      "key_flags": ["sign"]
    }
  ]
}
~~~

Notes on the above example:

* I added a field for the `sq` subcommand being used. Not sure that's useful, but just in case.
* Timestamps are textual so that they are human-readable. Additional fields with the same timestamp as a Unix timestamp (seconds since 1970 started, in UTC) might also be good for scripting.

## Implementation thoughts

Currently `inspect` and `packet dump` produce output via `write!` calls interspersed in the code that examines the parsed input. This is an obvious way to implement the functionality, but makes supporting another output format painful.

I propose to change this to form a data structure that can be serialized to JSON using `serde_json`.
This would make JSON output trivial, and would make it quite hard to produce invalid JSON, such as accidentally leaving a trailing comma. It would also allow "pretty printed JSON" for free.

The current human-oriented textual output should remain. Keeping it would probably mean writing a custom serde serializer, but that is not very difficult.

Using serde would make it fairly easy to support other output formats as well, should that become interesting: YAML, TOML, or S-expressions, for example.

I can see two approaches to using serde: either implement serde serialization for existing types for keys, packets, etc, or introduce new, lightweight types just for output purposes. I'm not yet familiar enough with the Sequoia types and code structure to know which approach would be better.

Approach 1: Implement serialization for `sequoia_openpgp::packet::Key`, `sequoia_openpgp::packet::Signature`, and so on, and change `inspect` and `packet dump` to use those to produce output.

Approach 2: Create new types for the things that we want to output, and serialization for those. For example, a struct for the `inspect` output, and additional structs for the key output.
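
From the consuming side, a Python script might use the versioned output proposed above roughly as follows. This is only a sketch: the field names are taken from the example `sq inspect` output in this proposal, while `SUPPORTED_MAJOR`, `parse_inspect`, `fingerprints`, and `to_unix` are hypothetical names made up for illustration. Rejecting unknown major versions is an assumed policy, not something this proposal has decided.

```python
import json
from datetime import datetime, timezone

# Hypothetical: the only schema major version this script understands.
SUPPORTED_MAJOR = 1

# Trimmed copy of the example output shown earlier in this proposal.
# In real use the text would come from the proposed
# `sq inspect --output-format=json` invocation instead.
SAMPLE = """
{
  "sequoia_json_version": [1],
  "sq_operation": "inspect",
  "file_type": "transferable-secret-key",
  "main_key": {
    "fingerprint": "F8D3A8621B90C8589CE5B919627D12E85029C0A3",
    "creation_time": "2021-08-03 11:53:16 UTC"
  },
  "subkeys": [
    {
      "fingerprint": "D00EDB7017C9EAE3C6D438B7BFFD4B52F08C7AB1",
      "creation_time": "2021-08-03 11:53:16 UTC"
    }
  ]
}
"""


def parse_inspect(text):
    """Parse the JSON text and refuse schema versions we do not know."""
    doc = json.loads(text)
    version = doc.get("sequoia_json_version")
    # The version is a non-empty list of integers; only the first
    # element (the major version) is checked here.
    if not isinstance(version, list) or not version or version[0] != SUPPORTED_MAJOR:
        raise ValueError(f"unsupported JSON schema version: {version!r}")
    return doc


def fingerprints(doc):
    """Return the main key fingerprint followed by all subkey fingerprints."""
    subkeys = doc.get("subkeys", [])
    return [doc["main_key"]["fingerprint"]] + [k["fingerprint"] for k in subkeys]


def to_unix(stamp):
    """Convert a textual 'YYYY-MM-DD HH:MM:SS UTC' timestamp to Unix seconds."""
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S UTC")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


if __name__ == "__main__":
    doc = parse_inspect(SAMPLE)
    for fpr in fingerprints(doc):
        print(fpr)
    print(to_unix(doc["main_key"]["creation_time"]))
```

The `to_unix` helper is only needed because the example timestamps are textual; if the output also carried Unix timestamps, as suggested in the notes on the example, scripts could skip that parsing step.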