Gather the right data, and transform it into something digestible by other sources (JSON file, etc.). That could be the input for the script that generates the markdown table.
In the first iteration, we can accept the publishing to be a manual process, as long as the artifact is automatically available, since we want to publish it as a GitLab docs page to start.
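As a rough illustration of that first step, here is a minimal sketch of a table generator, assuming the intermediate artifact is a JSON file of entries like `{"cwe_id": "CWE-89", "name": "SQL Injection", "analyzers": [...]}` (the file name and field names are placeholders, not a settled schema):

```python
#!/usr/bin/env python3
"""Render a markdown coverage table from the intermediate JSON artifact.

The file name and the shape of the JSON are assumptions for illustration only.
"""
import json
from pathlib import Path

ARTIFACT = Path("cwe-coverage.json")  # hypothetical artifact name


def render_table(entries: list[dict]) -> str:
    lines = [
        "| CWE | Name | Analyzers |",
        "| --- | ---- | --------- |",
    ]
    # Sort by the numeric part of the CWE ID so the table is stable between runs.
    for entry in sorted(entries, key=lambda e: int(e["cwe_id"].split("-")[1])):
        analyzers = ", ".join(entry.get("analyzers", []))
        lines.append(f"| {entry['cwe_id']} | {entry['name']} | {analyzers} |")
    return "\n".join(lines)


if __name__ == "__main__":
    entries = json.loads(ARTIFACT.read_text())
    print(render_table(entries))
```

The idea is that the JSON artifact is the contract; the table script stays trivial and can be rerun whenever the artifact changes.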
In the future, we could work on this with a designer/FE team member to design a more robust UI/UX around the same information (we can enrich that further if necessary). Then this can be published in docs or on a custom page/site or other medium.
While it is possible to check the simplest version of this answer by searching the sast-rules repo, @theoretick points out in Slack that CWEs are hierarchical, so we have to search up and down the chain to answer the question fully. For example, coverage of a specific child CWE (such as SQL injection) also bears on its broader parent categories (such as injection), and a question about a broad CWE can only be answered by also checking its children.
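To make the "up and down the chain" point concrete, here is a tiny illustrative sketch. The parent map is a hand-picked excerpt that, to my understanding, matches MITRE's Research Concepts view; the data structure itself is just an assumption for discussion:

```python
# Minimal sketch of "searching up and down the chain": given a parent map,
# walk ancestors so a rule tagged with a specific CWE also counts toward
# its more general parents. This is a tiny excerpt, not the full hierarchy.
PARENT = {
    "CWE-89": "CWE-943",   # SQL Injection -> Improper Neutralization in Data Query Logic
    "CWE-943": "CWE-74",   # -> Injection
    "CWE-74": "CWE-707",   # -> Improper Neutralization
}


def ancestors(cwe_id: str) -> list[str]:
    """Return the chain of parents from the given CWE upward."""
    chain = []
    current = PARENT.get(cwe_id)
    while current is not None:
        chain.append(current)
        current = PARENT.get(current)
    return chain


def covers(covered: set[str], query: str) -> bool:
    """True if the queried CWE is covered directly or via any of its descendants."""
    if query in covered:
        return True
    # A rule for a child CWE also gives (partial) coverage of its parents.
    return any(query in ancestors(c) for c in covered)


print(covers({"CWE-89"}, "CWE-74"))  # True: SQLi rules imply some Injection coverage
```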
In a perfect world, I believe we could scrape mitre.org & sast-rules routinely to build a JSON graph of CWEs we support. But how would we display it? Maybe we don't need to? Just make the JSON graph available for download?
I'll try to get a POC going for the graph production.
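For discussion, one possible shape for that downloadable JSON graph; every field name and rule ID below is an illustrative assumption, not the PoC's actual output:

```python
import json

# Hypothetical shape for the downloadable artifact: each node is a CWE with its
# parents/children and the rules (per analyzer) that map to it.
cwe_graph = {
    "CWE-89": {
        "name": "SQL Injection",
        "parents": ["CWE-943"],
        "children": [],
        "rules": {
            "sast-rules": ["python_sql_injection", "go_sql_injection"],  # illustrative rule IDs
            "advanced-sast": ["sql-injection"],
        },
    },
}

with open("cwe-graph.json", "w") as f:
    json.dump(cwe_graph, f, indent=2)
```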
I like the idea a lot! Is this also intended for the advanced SAST rules?
Btw, there are more data points to anchor, such as OWASP definitions, and in the advanced SAST rule set we also have "attack-type", which is internal documentation of the general attack type (SQLi, XXE, etc.).
Apart from that, we can also view this through the prism of language/framework, which can be useful.
@dabeles I assume the scraping can be adapted; my understanding is that Advanced SAST rules are somewhat similar to sast-rules in terms of syntax & structure.
I'll spin something up so that we can iterate on it.
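As a starting point, a hedged sketch of the rule-side extraction, assuming Semgrep-style YAML rules that carry a `metadata.cwe` entry; the exact keys and layout in sast-rules and the Advanced SAST rule set may differ:

```python
"""Sketch of extracting CWE references from a rules checkout.

Assumes Semgrep-style YAML rules with a `metadata.cwe` entry; treat this as a
starting point for discussion rather than a working scraper.
"""
from pathlib import Path

import yaml  # PyYAML


def extract_cwes(rules_dir: str) -> dict[str, set[str]]:
    """Map rule IDs to the CWE references found in their metadata."""
    coverage: dict[str, set[str]] = {}
    for path in Path(rules_dir).rglob("*.yml"):
        doc = yaml.safe_load(path.read_text())
        for rule in (doc or {}).get("rules", []):
            cwe = rule.get("metadata", {}).get("cwe")
            if not cwe:
                continue
            # The cwe value may be a single string ("CWE-89: SQL Injection") or a list.
            values = cwe if isinstance(cwe, list) else [cwe]
            coverage[rule["id"]] = {v.split(":")[0].strip() for v in values}
    return coverage


if __name__ == "__main__":
    print(extract_cwes("sast-rules"))
```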
If my understanding is correct, the request from @pedrickng in Update SAST listing of supported CWE IDs and St... (#469452 - closed) (a duplicate of this issue) was related to the handbook page; generating a table or list of CWEs we support (including OWASP information) and putting it directly or indirectly into the handbook should be enough to solve this issue. I think just slicing the CWEs from the CWE hierarchy and putting this information into a table is probably sufficient.
I believe for both the new detected secrets page and DAST's browser checks page, the pages are manually generated, unfortunately. Maybe @craigmsmith can confirm, but from the file history that looks to be the case, so it might be more challenging to keep these updated for SAST, which has a much larger list of rules.
Having a CWE graph from either Tal's great PoC or vulninfo would be ideal, but I think the next big step would be some method of automating the docs page generation. I can see two quick ways.
@theoretick yes, last time I checked the browser-based checks page is manually generated, although the manual part is fairly small. The process is documented here.
I've been wanting to do something similar for sast-rules for ages! It would be awesome to create something that runs as part of the sast-rules release.
@connorgilbert I think there are some wonderful ideas in the comments; would you like to define expectations so that we can start planning this change? It's aligned with our documentation OKR.
One thing to note explicitly, by the way—sometimes we say "handbook" when we mean "docs". This content should be published in product documentation, not the company handbook.
We must document the coverage of Advanced SAST rules. We should document the summarized coverage of sast-rules.
@rvider this looks to be more of a Static Analysis Core issue. I've put it in the %Backlog pending your scheduling. If there is anything we can do to help achieve the "should" clause, LMK.
First iteration - gathering the right data, and transforming it into something digestible by other sources (JSON file, etc.). That could be the input for the script that generates the markdown table. In the first iteration, we can accept the publishing to be a manual process, as long as the artifact is automatically available, since we want to publish it as a GitLab docs page to start.
Next iteration - we can work on this with a designer/FE team member to design a more robust UI/UX around the same information (we can enrich that further if necessary). Then this can be published in docs or on a custom page/site or other medium.