
Draft: Spike retrieving HTTP messages from the ZAP database to resolve memory issues

What does this MR do?

This MR is a spike to determine whether the HTTP request/response messages for each alert can be retrieved directly from the ZAP database, and whether doing so has any impact on the memory used by DAST. Only the fields required by DAST will be loaded from the database.

What are the relevant issue numbers?

gitlab-org/gitlab#223827 (closed)

Why this change is expected to use less memory

Currently, DAST uses the following process to get each message. All of the alerts are retrieved in one call, then for each alert a separate call is made to ZAP for the associated message.

sequenceDiagram
    DAST->>ZAP: GET alerts
    ZAP-->>DAST: alerts[]
    DAST->>ZAP: GET message/1
    ZAP-->>DAST: message(1)
    DAST->>ZAP: GET message/2
    ZAP-->>DAST: message(2)
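
In Python terms, the flow above amounts to roughly the following. This is a minimal sketch, not the actual DAST code: `FakeZap`, its methods, and the `messageId`, `requestBody` and `responseBody` keys are hypothetical stand-ins used only to count round trips.

```python
from dataclasses import dataclass


@dataclass
class FakeZap:
    """Hypothetical in-memory stand-in for the ZAP API; counts round trips."""
    alerts_data: list
    messages: dict
    calls: int = 0

    def alerts(self):
        self.calls += 1
        return list(self.alerts_data)

    def message(self, message_id):
        self.calls += 1
        return self.messages[message_id]


def load_alerts_with_messages(zap):
    alerts = zap.alerts()  # one call returns every alert
    for alert in alerts:
        # One further call per alert; each message carries its full bodies.
        alert["message"] = zap.message(alert["messageId"])
    return alerts  # all alerts and all messages are now in memory together
```

For N alerts this makes N + 1 calls, and nothing can be released until the whole list has been processed.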

This leads to the following spike in Python memory usage at the end of a scan:

Screenshot_2020-07-29_16.58.41

There are a few issues this MR attempts to solve:

  • Reduce the number of API calls back and forth between DAST and ZAP. It should be possible to retrieve messages in batches.
  • Each message returned from ZAP contains the entire request body and response body. DAST doesn't expose HTTP bodies, so loading them only makes the problem worse. Consider a single response body that is 100MB. It will be loaded from the database into ZAP memory and returned by the API. It will then be loaded into Python memory. All of that memory is wasted.
  • All of the alerts are held in memory at the same time in Python. This means that all of the associated HTTP messages (and their bodies) are held in memory at the same time.
  • (Not confirmed) When the Java process requests memory from the OS to expand the heap, it will not give the memory back to the OS until the process has finished. The same applies to the Python process. This means that garbage collection is of limited help; we need to ensure that we don't use too much memory in the first place.
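
The first three points can be sketched as a batched, column-limited query consumed through a generator. This is an illustration under stated assumptions rather than the code in this MR: `sqlite3` stands in for the ZAP database purely so the example runs anywhere, and the `http_message` table, its column names, and the choice of headers as "the fields required by DAST" are all hypothetical.

```python
import sqlite3


def iter_message_headers(conn, message_ids, batch_size=100):
    """Yield (id, request_header, response_header) rows, one batch per query."""
    for start in range(0, len(message_ids), batch_size):
        batch = message_ids[start:start + batch_size]
        placeholders = ",".join("?" for _ in batch)
        # The body columns are never selected, so they never reach Python.
        query = (
            "SELECT id, request_header, response_header "
            "FROM http_message "
            f"WHERE id IN ({placeholders}) ORDER BY id"
        )
        # Yielding row by row lets the caller process and discard each message.
        for row in conn.execute(query, batch):
            yield row


def build_demo_db():
    """Create a throwaway in-memory database using the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE http_message ("
        "id INTEGER PRIMARY KEY, request_header TEXT, request_body BLOB, "
        "response_header TEXT, response_body BLOB)"
    )
    big_body = b"x" * 1000  # stands in for a large body we do not want to load
    conn.executemany(
        "INSERT INTO http_message VALUES (?, ?, ?, ?, ?)",
        [
            (i, "GET / HTTP/1.1", big_body, "HTTP/1.1 200 OK", big_body)
            for i in (1, 2, 3)
        ],
    )
    return conn
```

With `batch_size=2` and three message ids, this issues two queries instead of three per-message calls, and at no point are bodies or the full result set held in Python memory.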

Does this MR meet the acceptance criteria?

Edited by Cameron Swords
