Testing
We need tests to verify that the project works as intended.
Write the test cases with the pytest framework. Because this project is mostly a wrapper around pefile and various ML libraries, the tests should be end-to-end: define a fixed set of malware samples and use it to exercise the entire app.
The sample set should NOT be distributed with this repo. Instead, come up with a mechanism that lists the samples and how each one is to be used (sha256, train/test). It is up to the person running the tests to provide the appropriate samples, or to come up with a way to download them automatically. This will likely cause trouble with any CI/CD platform, which will not have the samples unless they are provisioned separately.
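One possible shape for that mechanism is a small manifest checked into the repo, with the samples themselves kept in a local directory. The sketch below is only an illustration: the `sha256,split` CSV layout and the helper names `load_manifest`, `sha256_of`, and `resolve_samples` are assumptions, not part of the project.

```python
import csv
import hashlib
import io
from pathlib import Path

VALID_SPLITS = {"train", "test"}


def load_manifest(text):
    """Parse manifest text with a 'sha256,split' header into {sha256: split}.

    The CSV layout is an assumption made for this sketch.
    """
    entries = {}
    for row in csv.DictReader(io.StringIO(text)):
        digest = row["sha256"].strip().lower()
        split = row["split"].strip().lower()
        # A SHA-256 hex digest is always 64 characters long.
        if len(digest) != 64 or split not in VALID_SPLITS:
            raise ValueError(f"bad manifest row: {row}")
        entries[digest] = split
    return entries


def sha256_of(path):
    """Hash a file in chunks so large samples are not read into memory at once."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def resolve_samples(manifest, sample_dir):
    """Match files in sample_dir against the manifest by content hash.

    Returns ({sha256: Path} for samples that were found,
             sorted list of sha256 values that are missing).
    """
    found = {}
    for path in Path(sample_dir).iterdir():
        if path.is_file():
            digest = sha256_of(path)
            if digest in manifest:
                found[digest] = path
    missing = sorted(set(manifest) - set(found))
    return found, missing
```

A pytest fixture could call `resolve_samples` once per session and use `pytest.skip` when `missing` is not empty, so that a machine without the samples (a CI runner, for instance) reports skipped tests instead of failures. Matching by content hash rather than by file name means the person providing the samples can name the files however they like.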
These tests should always return the same results unless:
- the models are tuned differently
- dimensions are added (the data extracted from each sample changes)
- the sample set changes
- something actually broke
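
The four cases above could be told apart by storing a baseline together with a fingerprint of everything that is allowed to change the results. This is a minimal sketch under several assumptions: `fingerprint` and `check_against_baseline` are hypothetical helpers, the results are JSON-serializable, and the models are trained with fixed random seeds so that repeated runs are deterministic.

```python
import hashlib
import json


def fingerprint(model_params, feature_names, sample_hashes):
    """Hash the inputs that legitimately change the expected results.

    Feature order is kept as given, since column order can affect a model.
    Sample hashes are sorted, since the sample set is treated as unordered.
    """
    payload = json.dumps(
        {
            "params": model_params,
            "features": list(feature_names),
            "samples": sorted(sample_hashes),
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def check_against_baseline(baseline, current_fingerprint, current_results):
    """Classify a test run against a stored baseline.

    baseline is a dict with "fingerprint" and "results" keys (assumed layout).
    """
    if baseline["fingerprint"] != current_fingerprint:
        # Tuning, dimensions, or the sample set changed: refresh the baseline.
        return "baseline-outdated"
    if baseline["results"] != current_results:
        # Same inputs, different outputs: something actually broke.
        return "regression"
    return "ok"
```

With this split, only the `"regression"` outcome needs to fail the test run; `"baseline-outdated"` signals that the stored results should be regenerated and reviewed. Exact equality is used here for simplicity, and a tolerance-based comparison may suit floating-point scores better.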