Double Spend Proof -- Spec
The hash-ID for the double spend proof is a double sha256 over the entire serialized content of the proof, as defined next.
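A minimal sketch of that hash-ID computation, assuming the proof has already been serialized to bytes. The function name `dsproof_id` and the placeholder byte strings are made up for illustration; the real serialization layout is whatever the spec defines.

```python
import hashlib


def dsproof_id(serialized_proof: bytes) -> bytes:
    """Return the double SHA-256 of the complete serialized proof."""
    first_pass = hashlib.sha256(serialized_proof).digest()
    return hashlib.sha256(first_pass).digest()


# Placeholder payloads, NOT the real wire format: two different proofs
# that happen to reference the same (hypothetical) outpoint bytes.
outpoint = bytes(32) + (0).to_bytes(4, "little")
proof_a = outpoint + b"spender-one" + b"spender-two"
proof_b = outpoint + b"spender-one" + b"spender-three"

# Each proof gets its own 32-byte ID even though the outpoint is shared.
print(dsproof_id(proof_a).hex())
print(dsproof_id(proof_b).hex())
```

Because the ID covers every byte of the proof, changing any part of the content yields a different ID.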
Is there a need to use this as ID instead of just the outpoint? Do we envision ever needing to store or relay more than one proof per outpoint?
Yes, there is a need.
While there is no need for more than one proof per outpoint, an unlimited number of such proofs can exist. And we want to be able to tell which one it is before we download the proof.
As such, the hash-ID is not just an ID for the outpoint. It is an ID for the entire proof.
The ID is a hash of the entire content to protect against the contents being altered, much as is the case in most of the protocol (tx, block, etc.). For instance, you see a transaction being advertised by INV, and if after download the transaction doesn't validate, its txid is remembered and we don't download it again later. I have a patch for Flowee doing the same for proofs, so that a proof is only downloaded once, even if it's tossed as invalid.
If you were to use your suggestion of using an outpoint, the following attack is possible:
- Person intends to double spend outpoint O.
- Person sends an invalid (not even validating) proof with ID based on 'O'
- Node downloads, rejects and blacklists hash.
- Person double spends and is certain no node will learn about it.
If nodes do not blacklist proofs that were already validated and rejected, then sending INVs will cause a lot of bandwidth to be burned sending the same thing again and again.
Hope that's clear. Let me know if it's not.
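The attack above can be sketched with a toy node that remembers rejected IDs. Everything here is hypothetical scaffolding (`ToyNode`, `run_attack`, the `is_valid` flag standing in for real proof validation); it only illustrates how the choice of ID interacts with the blacklist.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def id_by_outpoint(outpoint: bytes, proof: bytes) -> bytes:
    # The suggested alternative: the ID depends on the outpoint only.
    return double_sha256(outpoint)


def id_by_proof(outpoint: bytes, proof: bytes) -> bytes:
    # The spec's choice: the ID covers the entire serialized proof.
    return double_sha256(proof)


class ToyNode:
    def __init__(self, make_id):
        self.make_id = make_id
        self.rejected_ids = set()         # IDs we refuse to download again
        self.known_double_spends = set()  # outpoints with an accepted proof

    def handle_announcement(self, outpoint, proof, is_valid):
        proof_id = self.make_id(outpoint, proof)
        if proof_id in self.rejected_ids:
            return "ignored"              # never downloaded again
        if not is_valid:                  # stand-in for real validation
            self.rejected_ids.add(proof_id)
            return "rejected"
        self.known_double_spends.add(outpoint)
        return "accepted"


def run_attack(make_id):
    node = ToyNode(make_id)
    outpoint = b"O"
    # Step 1: attacker sends a bogus proof for outpoint O.
    first = node.handle_announcement(outpoint, outpoint + b"garbage", False)
    # Step 2: the genuine proof for the real double spend arrives later.
    second = node.handle_announcement(outpoint, outpoint + b"real proof", True)
    return first, second


print(run_attack(id_by_outpoint))  # ('rejected', 'ignored')
print(run_attack(id_by_proof))     # ('rejected', 'accepted')
```

With the outpoint-based ID the genuine proof is never fetched, because its ID collides with the blacklisted bogus one. With the whole-proof hash the two are distinct.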
That attack seems to be mitigated by simply blacklisting the peer instead of the proof.
Possibly. I went with what Bitcoin already does in these situations.
Did you see a benefit to use the outpoint-hash as an ID?
It just seems like useless work to inspect both the whole-proof hash and the outpoint, the latter of which you'll have to inspect anyway. The former does not offer any additional defense against DoS - since it's trivial to "generate" an invalid proof with a different hash - and seems to be additional hashing and inspection for no benefit.
At the end of the day you'll likely need to ban the peer who forwards you an invalid proof anyway. Using the outpoint saves you a hash, makes redundancy detection slightly easier among honest peers, and simplifies things.
I think you have a misconception of what the hash is used for. It is not inspected or checked, any more than a block or a transaction has its blockhash or txid inspected. Instead, we just use it as an identifier that is generated on the fly should we need it.
The usage of the outpoint for ID would in the end have zero effect on how much checking we do.
And, yes, on an invalid proof the peer that forwarded it gets banned. On the other hand, a low-effort Sybil attack is avoided by remembering the hash (generated from the actual proof they sent us).
If it's used as an identifier, then in the case two or more different proofs exist for the same outpoint, both will need to be downloaded. Using outpoint as identifier, the second one can be not downloaded at all.
Someone asked me to follow up:
If it's used as an identifier, then in the case two or more different proofs exist for the same outpoint,
That use case means we are talking about triple spends, as the spec is designed such that a double spend can only lead to one unique proof.
both will need to be downloaded. Using outpoint as identifier, the second one can be not downloaded at all.
This is true. In case of a triple spend you could avoid downloading the second proof.
The problem, as I described in my previous answer, is that the Bitcoin design is based on a unique ID for a unique primitive. Going away from this is a huge undertaking and I personally don't see the benefit.
Yes, for every additional transaction spending a specific outpoint there exists a new proof, and a new INV. But as nodes will very quickly have that outpoint marked with a DSP, the additional ones will stop propagating and thus you won't see them clogging up the network. Just as triple spends don't clog up the network because of the first-seen principle, a double spend proof won't be propagated when that outpoint is already marked as having a DSP.
Bottom line: the p2p layer as we inherited it doesn't like your optimization, as it's different from what everything else does.
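The relay behaviour described above, where further proofs stop propagating once an outpoint is already marked, might look roughly like this. `RelayPolicy` and `should_relay` are illustrative names and not taken from any implementation; the outpoint is modelled as a plain `(txid, index)` tuple.

```python
class RelayPolicy:
    """Relay at most one double spend proof per outpoint (toy model)."""

    def __init__(self):
        self.outpoints_with_proof = set()

    def should_relay(self, outpoint) -> bool:
        # An outpoint that already carries a proof gains nothing from
        # another one, so later proofs for it are not announced to peers.
        if outpoint in self.outpoints_with_proof:
            return False
        self.outpoints_with_proof.add(outpoint)
        return True


policy = RelayPolicy()
outpoint = ("aa" * 32, 0)  # hypothetical (txid hex, output index)

print(policy.should_relay(outpoint))        # True: first proof for this outpoint
print(policy.should_relay(outpoint))        # False: triple-spend proof, not relayed
print(policy.should_relay(("bb" * 32, 1)))  # True: unrelated outpoint
```

Note that the outpoint is used here only to decide whether to relay; the proof itself is still identified by its own hash.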
I've read through this a couple of times and I think I get the overall concept, but the technical details are a bit too deep for me to comfortably provide meaningful feedback. I asked some clarifying questions on Twitter that might offer some insight into how others like me might look at this.
There are some minor things that might make sense to update, such as clarifying that this really isn't about proving someone spent exactly twice, but rather more than once, but that is mostly semantics.
Some of the properties of the protocol might not be immediately obvious to a layman or someone who is new to the underlying structures, so it might be valuable to explain in more depth how or where the important properties come from.
Thank you Jonathan. The main goal of the spec is to let other implementations support this by working from a specification first. The idea is that if the code and the spec differ, the spec is the one that wins. This is to counter the problem of "other chains" that declare a reference implementation to be the one that is right.
I will take a look at clarifying some of the points you highlighted; you made some good observations here and on Twitter.