[Experiment] Explain this Vulnerability
<!-- AI Project Proposal title format: 🤖 [AI Proposal] {`Need/outcome` } + {`Beneficiary`} + {`Job/Small Job`}
The title should be something that is easily understood that quickly communicates the intent of the project allowing team members to easily understand and recognize the expected work that will be done.
A proposal title should combine the beneficiary of the feature/UI, the job it will allow them to accomplish, and their expected outcome when the work is delivered. Well-defined statements are concise without sacrificing the substance of the proposal so that anyone can understand it at a glance. (e.g.🤖 {Reduce the effort} + {for security teams} + {when prioritizing business-critical risks in their assets}) -->
# Experiment/Prototype
## Demo
2023-04-19 - available on Slack at https://gitlab.slack.com/archives/C0530NHQ77A/p1681458742502489
(Uses experimental API)
2023-04-20 - Enabled on gitlab-org/gitlab - find a vulnerability through the [vulnerability report](https://gitlab.com/gitlab-org/gitlab/-/security/vulnerability_report/?scanner=GitLab.SAST&after=eyJzZXZlcml0eSI6ImNyaXRpY2FsIiwidnVsbmVyYWJpbGl0eV9pZCI6IjgxNzAyMjEzIn0).
## Problem to be solved
### User problem
_What user problem will this solve?_
GitLab surfaces vulnerabilities with relevant information; however, users often aren't sure where to start. It takes time to research and synthesize the information surfaced within the vulnerability record, and it can be difficult to figure out how to fix a given vulnerability.
#### Solution hypothesis
_Why do you believe this AI solution is a good way to solve this problem?_
Users want to quickly understand a vulnerability so that they know what next steps to take (e.g., what code change they need to make).
### Assumption
_What assumptions are you making about this problem and the solution?_
- The amount of information for a vulnerability can be either underwhelming or overwhelming.
- It is difficult to know where to start.
- Not all fixes are straightforward.
### [Personas](https://about.gitlab.com/handbook/product/personas/#list-of-user-personas)
_What personas have this problem, who is the intended user?_
- [Sasha (Software Developer)](https://about.gitlab.com/handbook/product/personas/#sasha-software-developer) can use this feature to better understand and potentially fix vulnerability findings before she tries to merge to the default branch.
- [Sam (Security Analyst)](https://about.gitlab.com/handbook/product/personas/#sam-security-analyst) uses this feature to quickly triage vulnerabilities and learn about specific vulnerabilities quickly.
## Proposal
<!-- Use this section to explain the proposed changes, including details around usage and business drivers. -->
### Success
_How will you measure whether this experiment is a success?_
* The length of time vulnerabilities sit with the detected status decreases.
* The number of suggestions over time increases.
* The number of suggestions answered that were helpful exceeds the number of suggestions that were not helpful.
* The number of suggestions that are helpful increases over time, so that the number of answers that weren't helpful becomes nominal.
* The cost to train the model is nominal.
# Feature release
### Main Job story
_What job to be done will this solve?_
<!-- What is the [Main Job story](https://about.gitlab.com/handbook/product/ux/jobs-to-be-done/#how-to-write-a-jtbd) that this proposal was derived from? (e.g. When I am on triage rotation, I want to address all the business-critical risks in my assets, So I can minimize the likelihood of my organization being compromised by a security breach.) -->
There are two specific JTBD:
1. When I am triaging vulnerabilities, I want to address business-critical risks, so I can ensure there is no unattended risk in my organization's assets.
2. When I am investigating a vulnerability, I want to determine its risk level, so that I can take the next appropriate action.
Note: all Secure and Govern JTBD can be found [here](https://about.gitlab.com/handbook/product/ux/stage-group-ux-strategy/sec/jtbd/#). The two JTBD are from the _Addressing detected business-critical vulnerabilities_ section.
## Proposal updates/additions
<!-- Use this section to explain any changes or updates to the original proposal, including details around usage, business drivers, and reasonings that drove the updates/additions. -->
### Problem validation
_What validation exists that customers have this problem?_
We have not yet validated this problem. We will be conducting live problem validation with this prototype at RSA in April 2023.
### Business objective
_What business objective will be achieved with this proposal?_
<!-- Objectives (from a business point of view) that will be achieved upon completion. (For instance, Increase engagement by making the experience efficient while reducing the chances of users overlooking high-priority items. -->
This will serve as the first prototype for the ~"section::sec" AI Integration. The goal is to have a demo of this prototype at [RSA](https://www.rsaconference.com/usa) (April 24th - 27th).
### Confidence
_Has this proposal been derived from research?_
<!-- How well do we understand the user's problem and their need? Refer to https://about.gitlab.com/handbook/product/ux/product-design/ux-roadmaps/#confidence to assess confidence -->
No
### Requirements
_What tasks or actions should the user be capable of performing with this feature?_
<!-- Requirements can be taken from existing features or design issues used to build this proposal. Any related issues should be linked with this issue in the Feature/solution issues section below. They are more granular validated needs, goals, and additional details that the proposal encompasses. -->
#### The user needs to be able to:
- For any SAST vulnerability, the user can click a button on the vulnerability record that auto-generates a ChatGPT prompt that:
- Explains a vulnerability
- Tells the user how they can resolve it
- Recommends what needs to be changed in the code
- _For this *first iteration* we are looking at displaying the recommendations in a drawer; the results will not persist._
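As an illustration of this first iteration, the sketch below shows one way such a prompt could be assembled from a vulnerability record. This is a hypothetical example: the field names (`name`, `description`, `file`, `line`, `identifiers`, `code_snippet`) and the function `build_explain_prompt` are illustrative assumptions, not the actual GitLab schema or implementation.

```python
# Hypothetical sketch: assembling an "explain this vulnerability" prompt
# from a SAST vulnerability record. Field names are illustrative only.

def build_explain_prompt(vulnerability: dict) -> str:
    """Build a single prompt asking the model to explain the finding,
    say how to resolve it, and recommend a concrete code change."""
    parts = [
        "Explain the following security vulnerability and how to fix it.",
        f"Name: {vulnerability['name']}",
        f"Description: {vulnerability['description']}",
        f"File: {vulnerability['file']} (line {vulnerability['line']})",
    ]
    if vulnerability.get("identifiers"):
        parts.append("Identifiers: " + ", ".join(vulnerability["identifiers"]))
    if vulnerability.get("code_snippet"):
        parts.append("Vulnerable code:\n" + vulnerability["code_snippet"])
    parts.append(
        "Respond with: 1) an explanation, 2) how to resolve it, "
        "3) the recommended code change."
    )
    return "\n\n".join(parts)
```

The three numbered instructions at the end mirror the three bullets above (explain, resolve, recommend a code change).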
#### Future iterations under consideration:
- Give input if the suggestion was helpful or not with a :thumbsup_tone1: or :thumbsdown_tone1:.
- Provide input as to how the answers could have been better.
- In addition to vulnerability records this feature is also accessible via:
- MR security widget
- Pipeline security tab on the MR
For future iterations, we will measure success by:
- The percentage of vulnerabilities with the detected status decreases, because users are able to fix vulnerability findings more easily before they are merged into the main branch.
## Timeline
### Experiment - `Complete by 6PM PST Tuesday, April 11th`
- DRI: @dbolkensteyn and @idawson
- Deliverables: Determine which parameters from a vulnerability, code, etc. yield the best, most informative results in ChatGPT prompts.
- Things to consider:
- Do we need to vary input for the prompt by scan type?
- What do we need to input from the vulnerability?
- What do we need to input for the code base/ MR/ diff?
- Ideally we want to give the user something that:
- Explains a vulnerability with more detail than what is already available in the vulnerability record
- Tells the user how they can resolve it
- Recommends what needs to be changed in the code
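One way to explore the "vary input by scan type" question above is to parameterize which vulnerability fields feed the prompt per scanner, so the experiment can compare which inputs yield the most informative answers. The sketch below is a hypothetical illustration: the scan-type keys and field lists are assumptions for the sake of the example, not the actual mapping used.

```python
# Hypothetical sketch: selecting prompt inputs per scan type.
# Scan types and field lists are illustrative assumptions.

PROMPT_FIELDS_BY_SCAN_TYPE = {
    # SAST findings point at source code, so include the snippet.
    "sast": ["name", "description", "file", "line", "code_snippet"],
    # Dependency scanning findings are about packages, not our code.
    "dependency_scanning": ["name", "description", "package", "version"],
    # DAST findings describe a running endpoint.
    "dast": ["name", "description", "url", "method"],
}

def select_prompt_inputs(scan_type: str, vulnerability: dict) -> dict:
    """Keep only the fields relevant to this scan type (and present)."""
    fields = PROMPT_FIELDS_BY_SCAN_TYPE.get(scan_type, ["name", "description"])
    return {f: vulnerability[f] for f in fields if f in vulnerability}
```

Keeping the field selection in data rather than code makes it cheap to A/B different input sets per scanner during the experiment.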
### Feature Delivery - `Complete by 6PM PST Wednesday, April 19th`
- DRI: @dftian, @mokhax
- Deliverables: Please see the [The user needs to be able to](https://gitlab.com/gitlab-org/gitlab/-/issues/405128#the-user-needs-to-be-able-to) section.
### Feature Delivery - `Complete by 6PM PST Friday, April 21st`
- DRI: @abellucci
- Deliverables: Documentation and marketing materials for RSA.
- https://gitlab.com/groups/gitlab-com/marketing/-/epics/3791+
- https://gitlab.com/gitlab-com/marketing/corporate_marketing/corporate-marketing/-/issues/7696+
- [Q1 FY24 Quarterly Announcement](https://docs.google.com/document/d/1goAYa6B87_GTB6PKEjhSe5fvCxgWW1LXwYDhkY-FRUg/edit#)
## Checklist
### Experiment
<details> <summary> Issue information </summary>
- [ ] Add information to the issue body about:
- [x] The user problem being solved
- [x] Your assumptions
- [x] Who it's for, list of personas impacted
- [x] Your proposal
- [ ] Add relevant designs to the Design Management area of the issue if available
- [x] Ensure this issue has the ~wg-ai-integration label to ensure visibility to various teams working on this
</details>
### Feature release
<details> <summary> Issue information </summary>
- [ ] Add information to the issue body about:
- [ ] Your proposal
- [ ] The Job Statement it's expected to satisfy
- [ ] Details about the user problem and provide any research or problem validation
- [ ] List the personas impacted by the proposal.
- [ ] Add all relevant solution validation issues to the Linked items section that shows this proposal will solve the customer problem, or details explaining why it's not possible to provide that validation.
- [ ] Add relevant designs to the Design Management area of the issue.
- [ ] You have adhered to our [Definition of Done](https://docs.gitlab.com/ee/development/contributing/merge_request_workflow.html#definition-of-done) standards
- [ ] Ensure this issue has the ~wg-ai-integration label to ensure visibility to various teams working on this
</details>
<details> <summary> Technical needs </summary>
- [ ] https://gitlab.com/gitlab-org/gitlab/-/issues/403859#note_1337519985
1. Work estimate and skills needs to build an ML viable feature.
- Building any ML feature can involve many personas, including a Data Scientist, NLP Engineer, ML Engineer, MLOps Engineer, ML Infrastructure Engineer, and a Full-stack Engineer to integrate the ML services with GitLab. Post-prototype, we would assess the skills needed to build a production-grade ML feature from the prototype.
2. Data Limitation
- We would like to validate upfront whether we have viable data for the feature, including whether we can use the ModelOps DataOps pipeline or need to create a custom one. We would want to understand the training, test, and feedback data in order to dial up accuracy, as well as the limitations of the data.
3. Model Limitation
- We would want to understand whether we can use an open-source pre-trained model, tune and customize it, or start a model from scratch. Further, we would assess, based on the ModelOps model-evaluation framework, which model is right for the use case.
4. Cost, Scalability, Reliability
- We would want to estimate the cost of hosting, serving, and inference of the model, and of the full end-to-end infrastructure, including monitoring and observability.
5. Legal and Ethical Framework
- We would want to align with the legal and ethical framework, like any other ModelOps feature, covering the nine principles of responsible ML and any legal support needed.
</details>
<details> <summary> Dependency needs </summary>
- [ ] https://gitlab.com/gitlab-org/gitlab/-/issues/403859#note_1337519985
</details>
<details> <summary> Legal needs </summary>
- [ ] TBD
</details>
## Additional resources
- If you'd like help with technical validation, or would like to discuss UX considerations for AI, mention the AI Assisted group using `@gitlab-org/modelops/applied-ml`.
- Read about our [AI Integration strategy](https://internal-handbook.gitlab.io/handbook/product/ai-strategy/ai-integration-effort/)
- Slack channels
- `#wg_ai_integration` - Slack channel for the working group and high-level alignment on getting AI ready for production (Development, Product, UX, Legal, etc.). Feel free to reach out and post progress here from the other channels.
- `#ai_integration_dev_lobby` - Channel for all implementation related topics and discussions of actual AI features (e.g. explain the code)
- `#ai_enablement_team` - Channel for the AI Enablement Team which is building the base for all features (experimentation API, Abstraction Layer, Embeddings, etc.)