Automatic fuzzing harness generation for coverage-guided fuzzing
Problem to solve
This is an effort to lower the barrier to entry for the new Coverage-Guided Fuzz Testing features that are being added to GitLab.
Creating targeted and efficient fuzzing harnesses requires a degree of experience with fuzzing that few developers have. Although the process is not complicated, it still requires an understanding of the mechanics of fuzzing, which adds to the learning curve.
We should be able to automatically derive coverage-guided fuzzing harnesses from existing code (example code, unit tests) for projects that have codebases in supported languages.
Intended users
User experience goal
Users should have the benefits of fuzzing, out-of-the-box, with minimal effort on their part. Ideally this will only require the inclusion of a CI template.
Proposal
I think there is an iterative path that can achieve this:
1. Stand-alone Fuzzing Harness Generation Tool
A stand-alone tool that can generate fuzzing harnesses from existing code would be a good MVC. It would:
- Identify existing code that would work well with coverage-guided fuzzing
- Create fuzzing harness(es) for the identified code
This tool could be open-sourced and maintained.
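As a rough illustration of the two steps above, here is a minimal sketch for Python codebases. Everything in it is an assumption made for illustration, not a design decision from this proposal: the heuristic (public top-level functions that take exactly one parameter annotated as bytes or str), the names find_candidates and generate_harnesses, and the choice of Atheris as the fuzzing engine that the generated harness targets.

```python
import ast

# Harness source emitted for each candidate. It targets the Atheris engine,
# which is an assumption made for this sketch, not a choice from the proposal.
HARNESS_TEMPLATE = '''\
import sys

import atheris

with atheris.instrument_imports():
    import {module}


def TestOneInput(data: bytes) -> None:
    {module}.{function}({argument})


if __name__ == "__main__":
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
'''

# How to turn the fuzzer's raw bytes into the parameter type.
ARGUMENT_FOR_TYPE = {
    "bytes": "data",
    "str": 'data.decode("utf-8", errors="replace")',
}


def find_candidates(source: str) -> list[tuple[str, str]]:
    """Return (function name, parameter type) for public top-level functions
    that take exactly one parameter annotated as bytes or str."""
    candidates = []
    for node in ast.parse(source).body:
        if not isinstance(node, ast.FunctionDef) or node.name.startswith("_"):
            continue
        args = node.args
        has_extras = (
            args.posonlyargs or args.kwonlyargs or args.vararg or args.kwarg
        )
        if len(args.args) != 1 or has_extras:
            continue
        annotation = args.args[0].annotation
        if isinstance(annotation, ast.Name) and annotation.id in ARGUMENT_FOR_TYPE:
            candidates.append((node.name, annotation.id))
    return candidates


def generate_harnesses(module: str, source: str) -> dict[str, str]:
    """Map a harness file name to the generated harness source."""
    return {
        f"fuzz_{module}_{name}.py": HARNESS_TEMPLATE.format(
            module=module, function=name, argument=ARGUMENT_FOR_TYPE[kind]
        )
        for name, kind in find_candidates(source)
    }
```

A real tool would need richer signals than type annotations, such as the unit tests and example code mentioned earlier, but the shape stays the same: one pass that selects targets, and one pass that renders a harness per target.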
2. Automatic Fuzzing Harness Generation as part of GitLab
Once the stand-alone tool exists and is stable, the next step is to integrate it into GitLab. The first integration could be to:
- Have a manual job included in a GitLab CI template that:
  - automatically generates the fuzzing harnesses
  - creates a new MR to the current project that:
    - adds the fuzzing harnesses
    - adds new jobs to .gitlab-ci.yml
Having the results of the stand-alone tool added to the project via a merge request will give the user the opportunity to edit the generated harnesses.
3. Full automation
Full automation would be the end result. As a job in an includable CI template, it would:
- Identify code in the codebase that would work with coverage-guided fuzzing
- Generate the fuzzing harness(es) and save them as build artifacts
- Generate a new strategy: depend child pipeline that has jobs for the generated fuzzing harness(es)
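The last step could look something like the sketch below, which renders a child pipeline with one job per generated harness. The function name, the job names, the artifact file name fuzz-child-pipeline.yml, the generating job name generate_fuzzing_harnesses, and the ./run-fuzz-target command are all hypothetical placeholders; only the CI keywords themselves (stage, script, trigger, include, artifact, job, strategy: depend) are existing GitLab CI syntax.

```python
import json
from pathlib import PurePosixPath


def render_child_pipeline(harness_paths: list[str]) -> str:
    """Render child-pipeline YAML with one fuzz job per generated harness."""
    lines = ["stages:", "  - fuzz", ""]
    for path in sorted(harness_paths):
        # Job name is derived from the harness file name (hypothetical scheme).
        job_name = PurePosixPath(path).stem
        # Hypothetical runner command; substitute the real fuzz invocation.
        command = json.dumps("./run-fuzz-target " + path)
        lines += [
            f"{job_name}:",
            "  stage: fuzz",
            "  script:",
            f"    - {command}",
            "",
        ]
    return "\n".join(lines)


# Parent-pipeline job that runs the generated file as a child pipeline.
# With strategy: depend, the trigger job mirrors the child pipeline's status.
PARENT_TRIGGER_JOB = """\
run_generated_fuzzing:
  stage: fuzz
  trigger:
    include:
      - artifact: fuzz-child-pipeline.yml
        job: generate_fuzzing_harnesses
    strategy: depend
"""
```

The generating job would write the output of render_child_pipeline to the artifact file named in the trigger. If no harnesses are generated, the rendered file contains no jobs, so the generating job would probably need to handle that case separately rather than trigger an empty pipeline.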
The potential for false positives here is high. Robust filtering options would be needed to allow the user to include/exclude areas of code from the fuzzing-harness-generation process.
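One simple form such filtering could take is a pair of include/exclude glob lists applied to file paths before harness generation. This is only a sketch: the function name and the pattern semantics are assumptions, and a real implementation might filter on functions or modules rather than paths.

```python
from fnmatch import fnmatchcase


def is_selected(path: str, include: list[str], exclude: list[str]) -> bool:
    """Return True if the path matches at least one include pattern (or the
    include list is empty) and matches no exclude pattern."""
    included = not include or any(fnmatchcase(path, p) for p in include)
    excluded = any(fnmatchcase(path, p) for p in exclude)
    return included and not excluded


# Example: fuzz everything under src/ except vendored code.
paths = ["src/parsers/json.py", "src/vendor/lib.py", "tests/test_json.py"]
selected = [p for p in paths if is_selected(p, ["src/*"], ["*/vendor/*"])]
```

Note that in fnmatch-style patterns, * also matches directory separators, so src/* covers nested directories. Exclude patterns take precedence over include patterns in this sketch.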
The benefit of this approach is that the fuzzing will evolve automatically as the project evolves.
Further details
Permissions and Security
Documentation
Availability & Testing
What does success look like, and how can we measure that?
What is the type of buyer?
What is the buyer persona for this feature? See https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/buyer-persona/
- CISO
- Director App-Dev
- VP App-Dev
In which enterprise tier should this feature go? See https://about.gitlab.com/handbook/product/pricing/#four-tiers
- Ultimate
Is this a cross-stage feature?
No, Secure only