Add typosquatting support for dependency scanning.
Problem to solve
Typosquatting is a technique in which a bad actor publishes a package/dependency whose name is nearly identical to that of an official package, differing only by a plausible typo.
In another variation, the bad actor chooses a package name that looks official in order to trick engineers.
This recently occurred in Python, when "jeIlyfish" (the first "l" is replaced by an uppercase "I") was published so that it would be installed by engineers who actually wanted the "jellyfish" library.
Similarly, "python3-dateutil" was published to trick engineers who actually wanted to include "dateutil". See https://www.zdnet.com/article/two-malicious-python-libraries-removed-from-pypi/ for more information.
The attack works because an engineer either adds a misspelled package name to their project, or searches the internet, finds what appears to be the correct library, and incorporates it into the project. Because the malicious dependency mirrors the behavior of the legitimate library, the engineer is unaware that the project now includes it.
This type of attack bypasses traditional dependency scanning, because a typosquatted library does not appear in any vulnerability database until it has been discovered and reported.
Intended users
- [Sasha (Software Developer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sasha-software-
- Sam (Security Analyst)
Proposal
GitLab should compute a risk score for libraries whose names do not match well-known library names, and use it to evaluate whether each one is a likely typosquat or simply a library with no known reported vulnerabilities.
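As a rough illustration of what such a risk score could look like, here is a minimal sketch in Python. It is not GitLab's or Gemnasium's implementation. The allowlist `KNOWN_PACKAGES`, the confusable-character map, the affix lists, and the 0.85 similarity threshold are all assumptions chosen for the example; a real scanner would load well-known names from the package registry or a curated database.

```python
from difflib import SequenceMatcher
from typing import Optional, Tuple

# Hypothetical allowlist of well-known package names (assumption).
KNOWN_PACKAGES = {"jellyfish", "dateutil", "requests", "numpy"}

# Visually confusable characters mapped to a canonical form (assumed, not exhaustive).
CONFUSABLES = str.maketrans({"I": "l", "1": "l", "0": "o"})

# Affixes that make a name look official (assumed lists, for illustration).
PREFIXES = ("python3-", "python-", "py-")
SUFFIXES = ("-python", "-py")

# Similarity at or above this value is treated as suspicious (assumed threshold).
THRESHOLD = 0.85


def normalize(name: str) -> str:
    """Collapse confusable characters, then lowercase and unify separators."""
    return name.translate(CONFUSABLES).lower().replace("_", "-")


def strip_affixes(name: str) -> str:
    """Remove one official-looking prefix or suffix, if present."""
    for prefix in PREFIXES:
        if name.startswith(prefix):
            return name[len(prefix):]
    for suffix in SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def risk_score(name: str) -> Tuple[float, Optional[str]]:
    """Return (score in [0, 1], the known package the name resembles, or None)."""
    if name in KNOWN_PACKAGES:
        return 0.0, None
    candidate = normalize(name)
    # A name that equals a known package after normalization is the strongest signal.
    for form in (candidate, strip_affixes(candidate)):
        if form in KNOWN_PACKAGES:
            return 1.0, form
    # Otherwise fall back to string similarity against every known name.
    best_name, best_ratio = None, 0.0
    for known in sorted(KNOWN_PACKAGES):
        ratio = SequenceMatcher(None, candidate, known).ratio()
        if ratio > best_ratio:
            best_name, best_ratio = known, ratio
    if best_ratio >= THRESHOLD:
        return best_ratio, best_name
    # Unknown and not similar to anything known: no typosquatting signal.
    return 0.0, None
```

Under these assumptions, `risk_score("jeIlyfish")` and `risk_score("python3-dateutil")` both report the maximum score against "jellyfish" and "dateutil" respectively, while an unrelated unknown name scores zero. A production score would likely also weigh signals such as package age and download counts, which this sketch ignores.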
Testing
A benign typosquatted library could be published to a local repository to test the detection ability.
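One possible way to pick names for such benign test packages is to generate typo variants of a well-known name. The sketch below is only an illustration: the function `typo_variants`, the homoglyph map, and the "python3-" prefix are hypothetical choices for this example, not part of any existing GitLab tooling.

```python
from typing import Set

# Assumed homoglyph substitutions for generating look-alike names.
HOMOGLYPHS = {"l": "I", "o": "0"}


def typo_variants(name: str) -> Set[str]:
    """Generate look-alike names for a package, for use as test fixtures."""
    variants = set()
    for i in range(len(name)):
        # Drop one character.
        variants.add(name[:i] + name[i + 1:])
        # Swap one character for a visually similar one.
        if name[i] in HOMOGLYPHS:
            variants.add(name[:i] + HOMOGLYPHS[name[i]] + name[i + 1:])
        # Transpose two adjacent characters.
        if i + 1 < len(name):
            variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    # Add an official-looking prefix.
    variants.add("python3-" + name)
    # Transposing identical neighbours reproduces the original name; exclude it.
    variants.discard(name)
    return variants
```

For example, `typo_variants("jellyfish")` includes "jeIlyfish", and `typo_variants("dateutil")` includes "python3-dateutil". Each generated name could be published as an empty package to the local repository and then run through the scanner.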
Links / references
- gitlab-org/security-products/gemnasium-db!734 (merged)
- gitlab-org/security-products/gemnasium-db!733 (merged)
- gitlab-org/security-products/gemnasium-db!969 (merged)