Optimize DAST reporting to reduce memory usage

Problem

DAST scans of target applications that make many HTTP requests are experiencing memory usage spikes. In at least one customer scan, this caused the Runner to run out of memory.

The most likely cause

While it is not guaranteed to be the cause, store.NavigationResult.LoadAll() is known to be extremely inefficient, and it is the only known candidate for such large memory spikes.

LoadAll loads the results of all user actions attempted by DAST during the scan, which can number in the thousands for a long scan. Each NavigationResult object contains the HTTP requests and responses recorded during the action. It is common for request and response bodies to be many megabytes in size. For example, if Chromium loads a 10 MB JavaScript file on every page, the 10 MB response body is recorded in every navigation result, and all of those copies are loaded into memory when LoadAll is called.

Customer

For an example of such a memory usage spike, see Memory Usage % at https://gitlab.com/gitlab-com/sec-sub-department/section-sec-request-for-help/-/issues/125#note_1794526803.

Proposal

Convert the offending method to stream results, or to return a specific set of results according to how it is used.

Implementation plan

  • Change printer.UniqueAuditedURLS to use store.NavigationResult.IterateHTTPMessages
  • Change services.SecurityReportFormatter to use store.NavigationResult.IterateHTTPMessages
  • Remove CrawlGraph.GetNavigationResults and store.NavigationResult.LoadAll()
  • Add a changelog entry