Problem validation for native code nav

Problem Statement

When a developer is performing a code review, they must thoroughly understand the proposed change to be able to determine if the merge request will have the intended outcome, if there are adverse consequences, and if quality standards have been achieved. Identifying defects (1) and ensuring maintainability (2) are the primary objectives, but these require deep understanding (3).

References

Identifying defects is the primary goal of code review - Alberto Bacchelli, Christian Bird. Expectations, Outcomes, and Challenges Of Modern Code Review
Maintainability comments account for 50% of comments, and 15% are defects - Jacek Czerwonka, Michaela Greiler, Jack Tilford. Code Reviews Do Not Find Bugs
High level of understanding is needed to detect defects - Alberto Bacchelli, Christian Bird. Expectations, Outcomes, and Challenges Of Modern Code Review

Further details

Reviewing code by reading a file top to bottom is insufficient, and reading only a diff fragment is even less so. This is because code isn't written to be read top to bottom, or page by page like a novel. Source code is terse and deduplicated. Functions and libraries allow reuse, and are linked together in a large and complex graph of dependencies. ~~Reading~~ Understanding code is about understanding these relationships.

Understanding the relationships between changed code and other changed code, or unchanged code, is difficult. Features like jump to definition or seeing a list of places that call a specific function are very helpful, and tools like ctags, which dates to BSD 3.0 (1980), show the long history and importance of such tools.

Tools to solve this problem have existed for 40 years. Developers rely on them and expect them to be available.
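As a rough illustration of what "jump to definition" and "list of callers" mean in practice, the sketch below builds a tiny index for a single Python module using the standard-library `ast` module. It is only a minimal sketch under stated assumptions: the sample source, the `build_index` helper, and the single-file scope are all hypothetical, and real tools such as ctags or LSIF indexers work across files and languages and handle far more cases.

```python
import ast
from collections import defaultdict

# Hypothetical sample module used only for illustration.
SOURCE = '''
def parse(text):
    return text.split()

def count_words(text):
    return len(parse(text))

def main():
    print(count_words("a b c"))
'''


def build_index(source):
    """Return (definitions, callers) for one module's source text.

    definitions maps a function name to the line where it is defined.
    callers maps a function name to the lines where it is called by name.
    """
    tree = ast.parse(source)
    definitions = {}
    callers = defaultdict(list)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            definitions[node.name] = node.lineno
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only plain-name calls are tracked; method calls such as
            # text.split() are skipped in this simplified sketch.
            callers[node.func.id].append(node.lineno)
    return definitions, dict(callers)


definitions, callers = build_index(SOURCE)
print("parse is defined on line", definitions["parse"])
print("parse is called on lines", callers["parse"])
```

A reviewer looking at a change to `parse` could use such an index to jump from the call in `count_words` to the definition, or to list every caller that the change might affect.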

Competitors

Local development tools:

ctags (1980) et al
OpenGrok: https://github.com/oracle/opengrok (used by Uber)
Microsoft Developer tools: Visual Studio et al
Language Server Protocol (used by Visual Studio Code, Atom)
JetBrains IDEs

Peers (online code review tools):

GitHub circa Jun 2019: https://github.blog/changelog/2019-06-11-jump-to-definition-in-public-repositories/ powered by https://github.com/github/semantic
Bitbucket Cloud plugin circa 2015: https://marketplace.atlassian.com/plugins/com.mohamicorp.stash.plugin.codenav/server/overview
Sourcegraph circa 2015: https://sourcegraph.com/ powered by LSP/LSIF, ⭐ 80k Chrome installs
Codota (Java only): https://www.codota.com/
Sourcehut: annotation strategy, could use LSIF or ctags https://drewdevault.com/2019/07/08/Announcing-annotations-for-sourcehut.html

Reach

10.0 = Impacts the vast majority (~80% or greater) of our users, prospects, or customers.

All teams that write software should practice code review, regardless of application type. Even developers who work alone benefit from self-review.

Impact

2.0 = High impact

Quality and efficiency of code review can be increased for the overwhelming majority of customers.

Merge requests are used by over 90% of instances with more than 20 users (source)
Code Review takes longer than 1 day for over 50% of merge requests on GitLab.com, and longer than 3 days for 27% of merge requests. (source: ibid)

Confidence

100% = High confidence

User feedback: consistent feedback from customers when asking about Code Review practices https://gitlab.com/gitlab-org/create-stage/issues/12610
Sourcegraph success: 80k Chrome extension users, paid customers include Uber, Yelp, Lyft, SoFi, Quantcast, and Convoy, estimated $3m ARR (Crunchbase)
Competition: GitHub is also building this feature

Effort

Proof of concept: 2 (based on 1 backend engineer and 1 frontend engineer for 1 month)
MVC: 5 (based on 1 frontend engineer for 1 month, and 2 backend engineers for 2 months)
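The Reach, Impact, Confidence, and Effort values above can be combined into a single priority score. The sketch below assumes the commonly used formula (reach × impact × confidence) / effort, with effort in person-months; the issue itself does not state a formula or a final score, so both the `rice_score` helper and the resulting numbers are illustrative only.

```python
def rice_score(reach, impact, confidence, effort):
    """Combine the four factors, assuming score = reach * impact * confidence / effort."""
    return reach * impact * confidence / effort


# Values taken from the sections above; confidence of 100% is written as 1.0.
REACH = 10.0
IMPACT = 2.0
CONFIDENCE = 1.0

poc_score = rice_score(REACH, IMPACT, CONFIDENCE, effort=2)  # proof of concept
mvc_score = rice_score(REACH, IMPACT, CONFIDENCE, effort=5)  # MVC

print("Proof of concept:", poc_score)
print("MVC:", mvc_score)
```

Under this assumed formula, the proof of concept scores 10.0 and the MVC scores 4.0, so the difference comes entirely from the effort estimate.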

Risks:

Scalability – compute and storage. Indexes can grow large, and can be hard to scale.
Default on – Runners and Auto DevOps are required
Performance – page performance of merge requests is critical to customers
Complexity – understanding/learning new domain

Edited Dec 07, 2020 by 🤖 GitLab Bot 🤖