Contract Tracking in the Hostdb
Overview
The hostdb should be aware of which hosts the contractor has made contracts with. This gives the hostdb important information about the true value of a host to the renter, and allows it to make more informed decisions when scoring hosts.
Ideally, the hostdb has as much information as possible about the host relationship. This should minimally include information like how much storage is actively being used with the host, and eventually will probably be expanded to include things like total bandwidth spending. We will probably want to make whatever data structure we pass between the two flexible so that we can add fields as we go.
I think the hostdb should not be concerned with individual contracts, and should not have to navigate things like renewals and refreshes. Instead, it should focus primarily on the aggregate statistics and aggregate history with a particular host.
List of Relationships in the Hostdb [Design Part 1]
The hostdb should have a map[hostPubKey]contractInfo that records every contract that the contractor has made with hosts. The contractInfo will initially only carry the contract utility and the amount of storage in the contract, but we should expect this info to grow over time. This map should be persisted in the hostdb, and it should only track contracts that have not yet expired.
Each host in the hostdb should get a new field indicating whether or not the contractor has a contract that is not expired with the host. If a contract with the host does exist, the contract can be found by looking at the contract map described above. All of the information is kept in that map, to avoid cluttering the hostdb with a bunch of fields that will be empty for the majority of hosts.
These values need to be updated somehow. If I recall correctly, the hostdb only persists data every 2 minutes to avoid hammering the disk, because it persists everything in json and the hostdb file can get pretty large. Any method we have for passing data to the hostdb needs to keep this in mind - the data may not get persisted right away. The best method for doing this may be to have the contractor just pass the hostdb a map that has all of the current contracts in it. The hostdb can then compare that map to its current information and make any updates as appropriate. This means that even if some of the updates fail to get persisted, the hostdb will snap back to being accurate the next time siad loads.
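The snapshot approach described above could be sketched roughly as follows. This is a minimal illustration, not the actual hostdb code: the hostDB struct, the UpdateContracts method, and the fields of contractInfo (GoodForUpload, GoodForRenew, StoredData) are all assumed names chosen for the example, and persistence is left out entirely.

```go
package main

import "fmt"

// hostPubKey is a hypothetical stand-in for the real host public key type.
type hostPubKey string

// contractInfo is an assumed layout: contract utility plus stored data.
// More fields (for example bandwidth spending) could be added later.
type contractInfo struct {
	GoodForUpload bool   // assumed utility flag
	GoodForRenew  bool   // assumed utility flag
	StoredData    uint64 // bytes currently stored with the host
}

// hostDB holds only the part of the hostdb relevant to this sketch.
type hostDB struct {
	contracts map[hostPubKey]contractInfo
}

// UpdateContracts reconciles the hostdb's map against a full snapshot of
// the contractor's unexpired contracts. Because the snapshot is complete,
// an update that is lost before it gets persisted is repaired by the next
// call.
func (hdb *hostDB) UpdateContracts(current map[hostPubKey]contractInfo) (added, removed, changed int) {
	// Drop hosts that no longer have a contract, refresh the ones that do.
	for pk, old := range hdb.contracts {
		info, ok := current[pk]
		if !ok {
			delete(hdb.contracts, pk)
			removed++
			continue
		}
		if info != old {
			hdb.contracts[pk] = info
			changed++
		}
	}
	// Add hosts that are new in the snapshot.
	for pk, info := range current {
		if _, ok := hdb.contracts[pk]; !ok {
			hdb.contracts[pk] = info
			added++
		}
	}
	return added, removed, changed
}

// HasContract reports whether the contractor has an unexpired contract
// with the host, which is what the new per-host field would expose.
func (hdb *hostDB) HasContract(pk hostPubKey) bool {
	_, ok := hdb.contracts[pk]
	return ok
}

func main() {
	hdb := &hostDB{contracts: map[hostPubKey]contractInfo{
		"hostA": {GoodForRenew: true, StoredData: 10},
	}}
	added, removed, changed := hdb.UpdateContracts(map[hostPubKey]contractInfo{
		"hostB": {GoodForUpload: true, GoodForRenew: true, StoredData: 25},
	})
	fmt.Println(added, removed, changed, hdb.HasContract("hostB"))
}
```

Passing the whole map keeps the interface between the contractor and the hostdb small, at the cost of comparing every entry on each call. With at most a few hundred contracts that cost should be negligible.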
Always scanning hosts that we have contracts with. [Design Part 2]
Once we have the contract map described above, it should be pretty easy to ensure that every scan always includes every host that we have an active contract with.
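As a rough sketch of what that could look like, the function below merges whatever hosts the scan would normally pick with the hosts from the contract map. The name scanTargets and both of its parameters are hypothetical; how the hostdb actually selects hosts for a scan is not covered here.

```go
package main

import (
	"fmt"
	"sort"
)

// hostPubKey is a hypothetical stand-in for the real host public key type.
type hostPubKey string

// scanTargets returns the hosts picked by the normal scan selection plus
// every host that has an active contract, without duplicates. The result
// is sorted only to make the output deterministic.
func scanTargets(selected []hostPubKey, contractHosts map[hostPubKey]struct{}) []hostPubKey {
	seen := make(map[hostPubKey]struct{}, len(selected)+len(contractHosts))
	var out []hostPubKey
	add := func(pk hostPubKey) {
		if _, ok := seen[pk]; ok {
			return
		}
		seen[pk] = struct{}{}
		out = append(out, pk)
	}
	// Contract hosts go in first so they are always part of the scan.
	for pk := range contractHosts {
		add(pk)
	}
	for _, pk := range selected {
		add(pk)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	selected := []hostPubKey{"hostB", "hostC"}
	contractHosts := map[hostPubKey]struct{}{"hostA": {}, "hostB": {}}
	fmt.Println(scanTargets(selected, contractHosts))
}
```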
Account for amount currently stored with a host during storage adjustments. [Design Part 3]
Currently, we look at how much storage is left on a host when deciding whether to penalize that host for having little storage remaining. If we are already storing data on the host, we should take the amount of data that we are storing on that host, quadruple it, and count the result towards the storage that the host has remaining.
This is because the primary purpose of the storage adjustments in the score is to ensure that we will not be spending contract fees and transaction fees on a host that can't be used for storage anyway. But if we are already storing data with that host, then spending more money on fees is fine, because those fees are going to be amortized by the data we already have stored on the host.
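The adjustment amounts to a single line of arithmetic. In the sketch below, effectiveRemainingStorage and storedDataMultiplier are illustrative names and not existing hostdb identifiers; the factor of 4 is the one proposed above.

```go
package main

import "fmt"

// storedDataMultiplier is the proposed weight for data we already store
// on the host.
const storedDataMultiplier = 4

// effectiveRemainingStorage returns the storage figure that the storage
// adjustment would use: the host's reported remaining storage plus four
// times the data we already have on that host. All values are in bytes.
func effectiveRemainingStorage(remaining, stored uint64) uint64 {
	return remaining + storedDataMultiplier*stored
}

func main() {
	const gb = 1000 * 1000 * 1000
	// A host reporting 50 GB free on which we already store 20 GB is
	// treated as if it had 50 + 4*20 = 130 GB remaining.
	fmt.Println(effectiveRemainingStorage(50*gb, 20*gb) / gb)
}
```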
Zero out host scores in hostdb [Design Part 4]
Once we are actively considering the amount of storage that we already have with a host, we can start to zero out the scores of hosts that do not have much storage remaining. I think that a good value would be 100 GB. If the total amount of data available on the host is less than 100 GB (including counting data we already have on the host at 4x), then the host should have its score multiplied by minFloat to ensure that the renter will never use that host.
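A minimal sketch of the cutoff follows. Several things here are assumptions: minFloat is modeled with math.SmallestNonzeroFloat64, which may not match how the hostdb defines it; 100 GB is taken as decimal gigabytes; and the function and constant names are made up for the example.

```go
package main

import (
	"fmt"
	"math"
)

const (
	// minUsableStorage is the proposed 100 GB cutoff in bytes, assuming
	// decimal gigabytes.
	minUsableStorage uint64 = 100 * 1000 * 1000 * 1000

	// storedDataMultiplier is the 4x weight for data already on the host.
	storedDataMultiplier = 4

	// minFloat is an assumed stand-in for the hostdb's minFloat value.
	minFloat = math.SmallestNonzeroFloat64
)

// applyStorageCutoff multiplies the score by minFloat when the host's
// effective remaining storage is below the cutoff, and returns the score
// unchanged otherwise. Storage values are in bytes.
func applyStorageCutoff(score float64, remaining, stored uint64) float64 {
	effective := remaining + storedDataMultiplier*stored
	if effective < minUsableStorage {
		return score * minFloat
	}
	return score
}

func main() {
	const gb uint64 = 1000 * 1000 * 1000
	fmt.Println(applyStorageCutoff(1000, 60*gb, 0))     // below the cutoff
	fmt.Println(applyStorageCutoff(1000, 60*gb, 10*gb)) // 60 + 4*10 = 100 GB, at the cutoff
}
```

With these definitions a host sitting exactly at 100 GB keeps its score, since the text above only penalizes hosts with less than 100 GB.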
Disregard Allowance 'Expected Storage' if the actual storage is higher [Design Part 5]
Currently, the allowance has the two values 'Expected storage' and 'Expected redundancy', which get multiplied together to determine how much data is expected to be stored on all of the contracts. If the amount of data that is actually being stored on the aggregate of all GoodForRenew contracts is larger than Allowance.ExpectedStorage * Allowance.ExpectedRedundancy, then the total data stored on all of the contracts should be used instead.
This will help to ensure that the hostdb has the best possible calibration, though it may contribute slightly to churn if the user is not doing a good job of setting an accurate ExpectedStorage.
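In other words, the hostdb would use the larger of the two figures. The sketch below assumes that ExpectedStorage is a byte count and ExpectedRedundancy is a float; the allowance struct is a trimmed-down stand-in for the real one, and expectedTotalStorage is a hypothetical helper name.

```go
package main

import "fmt"

// allowance is a reduced, assumed version of the renter's allowance that
// carries only the two fields used here.
type allowance struct {
	ExpectedStorage    uint64  // bytes of user data, before redundancy
	ExpectedRedundancy float64 // for example 3.0
}

// expectedTotalStorage returns the amount of data the hostdb should assume
// is spread over all contracts: ExpectedStorage * ExpectedRedundancy, or
// the data actually stored in GoodForRenew contracts if that is larger.
func expectedTotalStorage(a allowance, gfrStored uint64) uint64 {
	expected := uint64(float64(a.ExpectedStorage) * a.ExpectedRedundancy)
	if gfrStored > expected {
		return gfrStored
	}
	return expected
}

func main() {
	a := allowance{ExpectedStorage: 1000, ExpectedRedundancy: 3.0}
	fmt.Println(expectedTotalStorage(a, 2000)) // allowance estimate is larger
	fmt.Println(expectedTotalStorage(a, 5000)) // actual usage is larger
}
```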
This can probably be 5 merge requests, one per section above:
- Contractor updates hostdb with which contracts are in use
- Hostdb adds all hosts actively being used to scan
- Hostdb accounts for data stored on a host when considering the storage score adjustments
- Hostdb will zero out / minimize the score of any host that has <100GB data available
- Hostdb will use actual storage in GFR contracts if expected storage * expected redundancy is lower