The ZAP spider used in DAST jobs can fail and leave the job running until it times out 1 hour later

Summary

When run on certain repositories, the owasp/zap2docker-stable image used in the DAST job never finishes: its spider process crashes, but ZAP doesn't detect it. The job eventually times out and doesn't yield any results.

Steps to reproduce

Clone git@gitlab.com:Flockademic/Flockademic.git and check out commit fe1d5423bc40fee8aa8a40d9d742a30d88f762a5.

Run the ZAP Docker image:

docker run -ti --rm -v $PWD:/zap/wrk owasp/zap2docker-weekly zap-baseline.py -J gl-dast-report.json -t https://flockademic.com/

To see the spider process failing, run:

docker run -ti --rm -v $PWD:/zap/wrk owasp/zap2docker-weekly zap-baseline.py -d -t https://flockademic.com/

Press Ctrl-C and an exception dump will be displayed.

Example Project

git@gitlab.com:Flockademic/Flockademic.git

What is the current bug behavior?

ZAP runs for an hour and times out. With the -d flag, it becomes apparent that ZAP runs a spider that gets stuck at 96%. Interrupting the process (Ctrl-C) reveals that an exception occurred, but ZAP doesn't detect it and waits for the spider endlessly.

What is the expected correct behavior?

ZAP runs successfully and yields results.

Relevant logs and/or screenshots

Exception reported when ZAP is interrupted with Ctrl-C:

27107 [ZAP-SpiderThreadPool-0-thread-1] FATAL hsqldb.db..ENGINE  - /home/zap/.ZAP_D/session/untitled1.data getFromFile failed 38005
org.hsqldb.HsqlException: Data cache size limit is reached: 10000
        at org.hsqldb.error.Error.error(Unknown Source)
        at org.hsqldb.error.Error.error(Unknown Source)
        at org.hsqldb.persist.Cache.put(Unknown Source)
        at org.hsqldb.persist.DataFileCache.getFromFile(Unknown Source)
        at org.hsqldb.persist.DataFileCache.get(Unknown Source)
        at org.hsqldb.persist.RowStoreAVLDisk.get(Unknown Source)
        at org.hsqldb.index.NodeAVLDisk.findNode(Unknown Source)
        at org.hsqldb.index.NodeAVLDisk.getRight(Unknown Source)
        at org.hsqldb.index.NodeAVLDisk.child(Unknown Source)
        at org.hsqldb.index.IndexAVL.insert(Unknown Source)
        at org.hsqldb.persist.RowStoreAVL.indexRow(Unknown Source)
        at org.hsqldb.persist.RowStoreAVLDisk.indexRow(Unknown Source)
        at org.hsqldb.TransactionManager2PL.addInsertAction(Unknown Source)
        at org.hsqldb.Session.addInsertAction(Unknown Source)
        at org.hsqldb.Table.insertSingleRow(Unknown Source)
        at org.hsqldb.StatementDML.insertSingleRow(Unknown Source)
        at org.hsqldb.StatementInsert.getResult(Unknown Source)
        at org.hsqldb.StatementDMQL.execute(Unknown Source)
        at org.hsqldb.Session.executeCompiledStatement(Unknown Source)
        at org.hsqldb.Session.execute(Unknown Source)
        at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(Unknown Source)
        at org.hsqldb.jdbc.JDBCPreparedStatement.executeUpdate(Unknown Source)
        at org.parosproxy.paros.db.paros.ParosTableHistory.write(ParosTableHistory.java:349)
        at org.parosproxy.paros.db.paros.ParosTableHistory.write(ParosTableHistory.java:294)
        at org.parosproxy.paros.model.HistoryReference.<init>(HistoryReference.java:359)
        at org.zaproxy.zap.extension.spider.SpiderThread.notifySpiderTaskResult(SpiderThread.java:488)
        at org.zaproxy.zap.spider.Spider.notifyListenersSpiderTaskResult(Spider.java:809)
        at org.zaproxy.zap.spider.SpiderTask.run(SpiderTask.java:241)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Possible fixes

Increasing FILES CACHE SIZE in db/zapdb.script from 10000 to 100000 (for example) fixes the issue.
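
A minimal sketch of that change, assuming the setting appears in the script as a line of the form `SET FILES CACHE SIZE 10000` (standard HSQLDB syntax; not verified against the file shipped in the image). The function name `bump_cache_size` and its arguments are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: rewrite the HSQLDB cache size setting in a .script file.
# Assumes the file contains a line like "SET FILES CACHE SIZE 10000".
bump_cache_size() {
    script_file="$1"
    new_size="$2"
    tmp_file="$(mktemp)" || return 1
    # Rewrite into a temporary file first, then copy the content back so the
    # original file keeps its ownership and permissions.
    sed "s/^SET FILES CACHE SIZE [0-9][0-9]*/SET FILES CACHE SIZE ${new_size}/" \
        "$script_file" > "$tmp_file" \
        && cat "$tmp_file" > "$script_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

It could be called as `bump_cache_size db/zapdb.script 100000`; lines that don't match the assumed pattern are left untouched.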

The DAST .gitlab-ci.yml snippet uses the owasp/zap2docker-stable Docker image directly. We could build an image of our own with the increased file cache size, since it's needed for some of our users' repositories.

Ideally, ZAP would also fail when its spider fails, and print an error message.
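
Until ZAP reports the spider failure itself, a job could at least fail sooner and say why. The following is only a sketch of that idea, not something ZAP or the DAST template provides: it relies on the `timeout` utility from GNU coreutils, and the function name `run_with_deadline` is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command under a deadline and report a timeout
# explicitly. Relies on the `timeout` utility from GNU coreutils.
run_with_deadline() {
    deadline_seconds="$1"
    shift
    timeout "$deadline_seconds" "$@"
    status=$?
    # GNU timeout exits with status 124 when the deadline expired
    if [ "$status" -eq 124 ]; then
        echo "error: command did not finish within ${deadline_seconds}s" >&2
    fi
    return "$status"
}
```

For example, `run_with_deadline 600 docker run --rm -v $PWD:/zap/wrk owasp/zap2docker-weekly zap-baseline.py -J gl-dast-report.json -t https://flockademic.com/` would give up after ten minutes instead of an hour. The 600-second value is an arbitrary assumption, and `-ti` is omitted here on the assumption that a CI job has no terminal attached.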

Edited Mar 02, 2018 by Fabien Catteau