# Testing things
Let's be honest, this is a bug which will never be closed: there can never be enough tests. However, we could definitely improve the current testing system, as well as test many more things.
## Testing infrastructure
The test system is OK. It does the job, but there are a couple of improvements we could make:
- Allow running multiple test files at once: I don't really want a full test runner, but it would be nice to have a way of running multiple test files at a time, merging their results together. Any ideas on a way to achieve this?
- Code coverage: Some basic, per-line code coverage really shouldn't be too hard to add, though we'd need to keep it as efficient as possible - enabled via a flag or something (see the sketch after this list). Note: we could technically get per-branch coverage if we instrumented at the instruction level, but that isn't really possible without slowing everything down a lot.
- Code coverage of macros and other compile-time-only code. More generally, can we profile macros?
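Since Urn compiles to Lua, one cheap way to prototype per-line coverage is a Lua debug hook wrapped around the test run. This is a minimal sketch only: it counts lines of the *generated* Lua, and mapping those counts back to Urn source would need the compiler's own position information.

```lua
-- Sketch: per-line coverage of the generated Lua via a debug hook.
-- Counts are keyed by (file, line) of the compiled output; translating
-- them back to Urn source positions is left to the compiler's mappings.
local counts = {}

debug.sethook(function(_, line)
  local info = debug.getinfo(2, "S") -- the function that triggered the hook
  local file = info and info.short_src or "?"
  counts[file] = counts[file] or {}
  counts[file][line] = (counts[file][line] or 0) + 1
end, "l") -- "l": fire the hook on every executed line

-- ... run the test suite here ...

debug.sethook() -- detach the hook before reporting
for file, lines in pairs(counts) do
  for line, hits in pairs(lines) do
    print(("%s:%d hit %d time(s)"):format(file, line, hits))
  end
end
```

The hook fires on every executed line, so it is exactly the sort of thing that should sit behind an opt-in flag.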
## Property testing/QuickCheck
The current QuickCheck system is really neat, and we should be using it more. However, there are some improvements which could be made:
- Ability to register custom types: Enough said. I don't know if we want to go the `setf!` route (look up a `<name>/generate!` method), or have a central symbol-to-function lookup.
- Shrink failing inputs: We really should have a way to shrink inputs which break things: for instance, lists get elements removed, numbers shrink towards 0, etc. (see the sketch after this list).
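To make the central-lookup option and shrinking concrete, here is a sketch in Lua (the compile target) rather than Urn itself; `register`, `shrinks` and `shrink` are illustrative names, not part of the actual QuickCheck system.

```lua
-- Illustrative only: a central name -> generator registry, plus a
-- greedy shrinking loop. None of these names are Urn's real API.
local generators = {}

local function register(name, gen)
  generators[name] = gen
end

register("number", function() return math.random(-100, 100) end)

-- Enumerate "smaller" candidates for a value.
local function shrinks(value)
  local out = {}
  if type(value) == "number" and value ~= 0 then
    out[#out + 1] = 0 -- numbers shrink towards 0
    out[#out + 1] = value >= 0 and math.floor(value / 2) or math.ceil(value / 2)
  elseif type(value) == "table" then
    for i = 1, #value do -- lists shrink by dropping one element
      local copy = {}
      for j = 1, #value do
        if j ~= i then copy[#copy + 1] = value[j] end
      end
      out[#out + 1] = copy
    end
  end
  return out
end

-- Keep replacing the input with a smaller candidate that still fails.
local function shrink(input, fails)
  local improved = true
  while improved do
    improved = false
    for _, candidate in ipairs(shrinks(input)) do
      if fails(candidate) then
        input, improved = candidate, true
        break
      end
    end
  end
  return input
end
```

For example, `shrink({1, 5, 9}, function(l) return #l >= 2 end)` reduces the failing list to a two-element counterexample rather than reporting the original input.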
## Testing libraries
- Migrate most existing tests to use `affirm` instead of `assert`.
- Use QuickCheck in more places. Sometimes we could replace hard-coded constants with QuickCheck; other times it might be better to have both.
- Test the remaining libraries. Again, code coverage would be useful here to see what is and isn't tested. Pattern matching, argparse and `do` could really do with some tests, as could many of the built-in macros.
- Test all examples given in doc comments, asserting they compile and produce the expected output (see the sketch after this list).
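Testing doc examples mostly means scraping them out of the comments and diffing actual output against expected. The sketch below assumes, purely for illustration, that examples are written as a `> expression` line followed by the expected result on the next line; the real doc-comment format may well differ.

```lua
-- Sketch: extract "> expr" examples (and the expected result on the
-- following line) from a doc comment. The "> " convention here is an
-- assumed format, not necessarily what Urn's doc comments use.
local function extract_examples(docstring)
  local examples, pending = {}, nil
  for line in docstring:gmatch("[^\n]+") do
    local expr = line:match("^%s*>%s*(.+)$")
    if expr then
      pending = expr
    elseif pending then
      examples[#examples + 1] = {
        input = pending,
        expected = line:match("^%s*(.-)%s*$"),
      }
      pending = nil
    end
  end
  return examples
end

-- Each extracted example would then be compiled and run, asserting both
-- that compilation succeeds and that the printed output matches
-- `expected`; the compile/run step is deliberately left abstract here.
```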
## Testing the compiler
Apart from the lexer, basically nothing is tested. It would be good to also test the parser, but I feel the optimiser and code gen are the most important things: after all, they are the biggest source of bugs. A couple of things we could do:
- Create a collection of "tricky cases": I have a load of code which broke the optimiser/code-gen at some point. It would be good to upload that and ensure it never breaks them again. Similarly, we should try to re-create similar cases which may break other things.
- Property testing on the optimiser and codegen: it might be possible to create complex combinations of `cond`, `lambda`, `set!` and some basic functions in order to produce random code which can then be optimised and run (see the sketch after this list). Whilst we may not be able to validate its output, we should still be able to assert that it compiles correctly.
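A first cut of that generator could simply build random s-expressions as strings. Everything below is an assumption for illustration (the leaf set, the weights, the bracketed `cond` clauses), and the generated programs need not behave sensibly at runtime - the only property asserted is that they compile, ideally to the same behaviour with and without optimisation.

```lua
-- Illustrative random-program generator: builds small expressions out of
-- cond, lambda, set! and arithmetic. Generated code may be nonsense at
-- runtime; the property under test is just "this compiles correctly".
math.randomseed(os.time())

local function gen_expr(depth)
  if depth <= 0 or math.random() < 0.3 then
    -- Leaf: a small literal, or the one variable we ever bind.
    return math.random() < 0.5 and tostring(math.random(0, 9)) or "x"
  end
  local choice = math.random(4)
  if choice == 1 then
    return ("(cond [%s %s] [true %s])"):format(
      gen_expr(depth - 1), gen_expr(depth - 1), gen_expr(depth - 1))
  elseif choice == 2 then
    return ("((lambda (x) %s) %s)"):format(gen_expr(depth - 1), gen_expr(depth - 1))
  elseif choice == 3 then
    return ("(set! x %s)"):format(gen_expr(depth - 1))
  else
    return ("(+ %s %s)"):format(gen_expr(depth - 1), gen_expr(depth - 1))
  end
end

-- Wrap the expression so the free variable `x` is always bound, then
-- hand the string to the compiler with and without optimisations.
print(("((lambda (x) %s) 0)"):format(gen_expr(3)))
```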