WIP: Fix contractor test NDF
Test NDF Fix
Passing Pipeline
See the MR pipeline. Additionally, both tests were run locally 50 times in a loop and passed every time.
Original Error Message / Reason for NDF
Below are the original failure messages. The failure occurred almost 100% of the time locally, but less frequently online.
--- FAIL: TestIntegrationFormContract (2.34s)
host_integration_test.go:218: We already have a contract with host ed25519:de45e8ccac7c01a3393bc82221cd455d1b539fbd176d9e2fbf5d5b5bdd9dafd9
--- FAIL: TestContractPresenceLeak (2.33s)
host_integration_test.go:684: We already have a contract with host ed25519:8ee294ede325fc7472c2cf334bf9b8874d5ecff5f053ad604bbd8f025715c5eb
Description of Solution
Through testing I found that adding a sleep at the beginning of the test eliminated the error, which is consistent with the NDF occurring more often locally than online. Further investigation showed that both the NDF and the effect of the sleep were caused by the background execution of threadedContractMaintenance. Both of these tests manually call managedNewContract, which is also called by threadedContractMaintenance. Online, and with the sleep, threadedContractMaintenance runs twice before the contractor's allowance is set, so it does not try to form contracts. Locally, the test executes fast enough that the contractor's allowance is set before the second trigger of threadedContractMaintenance, which causes conflicts between the two calls to managedNewContract.
managedNewContract did not hold the contractor lock for the duration of the method; instead, it acquired and released the lock multiple times. This allowed the two calls to managedNewContract to conflict with each other and try to form a contract with the same host at the same time. Updating managedNewContract to acquire the lock at the beginning and release it with a defer statement resolved this conflict and the NDF.