Add standardized test approach to evaluate stream output against expectations
Summary
This request covers the ability to write integration tests at the stream level for taps. Examples of tests a developer may want to create:
- Stream returns at least one record.
- All discovered stream schema keys are available in the returned records.
- All live record schema keys are recorded in the discovered stream.
- All primary keys in Stream A also exist in Stream B, Column X.
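To make the list above concrete, here is a hypothetical sketch of what such generic, per-stream assertions could look like. `records` stands in for one stream's parsed RECORD payloads and `schema` for its discovered JSON schema; all function and parameter names are illustrative, not part of the SDK.

```python
def assert_returns_records(records):
    # Stream returns at least one record.
    assert len(records) >= 1, "stream returned no records"

def assert_schema_keys_in_records(schema, records):
    # All discovered stream schema keys are available in the returned records.
    schema_keys = set(schema["properties"])
    for record in records:
        missing = schema_keys - set(record)
        assert not missing, f"record is missing schema keys: {missing}"

def assert_record_keys_in_schema(schema, records):
    # All live record schema keys are recorded in the discovered stream.
    schema_keys = set(schema["properties"])
    for record in records:
        extra = set(record) - schema_keys
        assert not extra, f"record has undeclared keys: {extra}"

def assert_keys_exist_in_other_stream(stream_a_records, stream_b_records, pk, column):
    # All primary keys in Stream A also exist in Stream B, Column X.
    a_keys = {r[pk] for r in stream_a_records}
    b_values = {r[column] for r in stream_b_records}
    missing = a_keys - b_values
    assert not missing, f"keys absent from other stream: {missing}"
```

Each assertion only needs the parsed sync output, so the same functions could be applied to every stream a tap discovers.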
Proposed benefits
An endorsed approach to testing streams will allow developers to easily implement test-driven development practices as well as increase the quality of taps overall.
Proposal details
I recently added some testing to tap-slack that might be worth refining/abstracting for the SDK. The approach was this:
- In a Pytest fixture, perform a full tap sync with the sample config.
- Read stdout and parse the records into an array. Then group the records by TYPE and STREAM.
- Create a generic set of tests that can be applied on a stream basis: at least one record returned, catalog schema keys are in the record schema and vice versa.
- Apply the generic tests for each stream, passing in the parsed full sync results.
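The parsing step above can be sketched as follows, assuming the Singer message format (a JSON object per line with `type`, `stream`, and `schema`/`record` fields). The actual subprocess invocation of the tap is omitted; `run_full_sync` in the comment is a hypothetical helper.

```python
import json
from collections import defaultdict

def group_sync_output(stdout_lines):
    """Group Singer messages from a full sync by type and stream name.

    Returns (schemas, records): the discovered schema per stream, and the
    list of record payloads emitted for each stream.
    """
    schemas = {}
    records = defaultdict(list)
    for line in stdout_lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message["type"] == "SCHEMA":
            schemas[message["stream"]] = message["schema"]
        elif message["type"] == "RECORD":
            records[message["stream"]].append(message["record"])
        # STATE and other message types are ignored for these tests.
    return schemas, dict(records)

# In a test module, this could back a session-scoped Pytest fixture, e.g.:
#
# @pytest.fixture(scope="session")
# def sync_output():
#     stdout = run_full_sync(SAMPLE_CONFIG)  # hypothetical helper
#     return group_sync_output(stdout.splitlines())
```

Running the sync once in a session-scoped fixture keeps the expensive part to a single invocation, while every generic per-stream test consumes the same grouped output.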
This approach allowed me to catch several schema mismatches and a few critical issues related to state partitioning keys.
Best reasons not to build
I don't think adding a feature like this would negatively affect existing taps, since developers could adopt the tests "a la carte". However, there is a risk in shipping a test suite that is prone to long runtimes or flaky failures. For example, the approach outlined above works for very small data volumes but would not scale to taps with large ones, so finding ways to control execution time is especially important.