• The hash-ID for the double spend proof is a double sha256 over the entire serialized content of the proof, as defined next.

    Is there a need to use this as ID instead of just the outpoint? Do we envision ever needing to store or relay more than one proof per outpoint?

  • Yes, there is a need.

    While there is no need for more than one proof per outpoint, there can be an infinite number of such proofs, and we want to be able to tell which one it is before we download it.

    As such, the hash-ID is not just an ID for the outpoint. It is an ID for the entire proof.

    The ID is a hash of the entire content to protect against altering the contents, much like is the case for most of the protocol (tx, block, etc). For instance, you see a transaction being advertised by INV, and if after download the transaction doesn't validate, its txid is remembered and we don't download it again later. I have a patch for Flowee doing the same for proofs, so that a proof is only downloaded once, even if it's tossed as invalid.
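
    A minimal sketch of those two ideas, hashing the whole serialized proof for its ID and remembering IDs that were already fetched, assuming the `uint256`/`CSHA256` types found in typical Bitcoin-derived codebases (the helper names are illustrative, not the actual Flowee code):

    ```cpp
    #include <set>
    #include <vector>
    #include <cstdint>
    #include <crypto/sha256.h> // CSHA256, as in Bitcoin-derived codebases
    #include <uint256.h>       // uint256

    // Illustrative: the ID is a double SHA256 over the whole serialized proof,
    // not just over the outpoint it refers to.
    uint256 ProofId(const std::vector<uint8_t> &serializedProof)
    {
        uint256 id;
        CSHA256().Write(serializedProof.data(), serializedProof.size())
                 .Finalize(id.begin());                       // first SHA256
        CSHA256().Write(id.begin(), 32).Finalize(id.begin()); // second SHA256
        return id;
    }

    // Remember every proof we already fetched, valid or not, so an INV for the
    // same hash is never downloaded twice.
    static std::set<uint256> seenProofIds;

    bool ShouldDownloadProof(const uint256 &proofId)
    {
        return seenProofIds.insert(proofId).second; // false if already known
    }
    ```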

    If you were to use your suggestion of using an outpoint, the following attack is possible:

    • A person intends to double-spend outpoint O.
    • The person sends an invalid (not even validating) proof with an ID based on 'O'.
    • A node downloads it, rejects it, and blacklists the hash.
    • The person double-spends and is certain no node will learn about it.

    If nodes do not blacklist already-validated-but-rejected proofs, then repeated INVs will cause a lot of bandwidth to be burned sending the same thing again and again.

    Hope that's clear. Let me know if it's not.

  • That attack seems to be mitigated by simply blacklisting the peer instead of the proof.

  • Possibly; I went with what Bitcoin already does in these situations.

    Did you see a benefit to using the outpoint-hash as an ID?

  • It just seems like useless work to inspect both the whole-proof hash and the outpoint, the latter of which you'll have to inspect anyway. The former does not offer any additional defense against DoS, since it's trivial to "generate" an invalid proof with a different hash, and seems to be additional hashing and inspection for no benefit.

    At the end of the day you'll likely need to ban the peer who forwarded you an invalid proof anyway. Using the outpoint saves you a hash, makes redundancy detection slightly easier among honest peers, and simplifies things.

  • I think you have a misconception of what the hash is used for. It is not inspected or checked, any more than a block or a transaction has its blockhash/txid inspected. Instead, we just use it as an identifier that is generated on the fly, should we need it.

    Using the outpoint as the ID would in the end have zero effect on how much checking we do.

    And, yes, on an invalid proof the peer that forwarded it gets banned. On the other hand, a low-effort Sybil attack is avoided by remembering the hash (generated from the actual proof they sent us).

  • If it's used as an identifier, then in the case where two or more different proofs exist for the same outpoint, both will need to be downloaded. Using the outpoint as identifier, the second one need not be downloaded at all.

  • Someone asked me to follow up:

    > If it's used as an identifier, then in the case two or more different proofs exist for the same outpoint,

    That use case means we are talking about triple spends, as the spec is designed such that a double spend can only lead to one unique proof.

    > both will need to be downloaded. Using outpoint as identifier, the second one can be not downloaded at all.

    This is true. In the case of a triple spend you could avoid downloading the second proof.

    The problem, as I described in my previous answer, is that the Bitcoin design is based on a unique ID for a unique primitive. Going away from this is a huge undertaking and I personally don't see the benefit.

    Yes, for every additional transaction spending a specific outpoint there exists a new proof, and a new INV. But as nodes will very quickly have that outpoint already marked with a DSP, the additional ones will stop propagating, and thus you won't see them clogging up the network. Just like triple spends don't clog up the network because of the first-seen principle, a double spend proof won't be propagated when that outpoint is already flagged as having a DSP.
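
    A rough sketch of that first-seen rule, using a simple per-outpoint map; the names `dspForOutpoint` and `MarkAndMaybeRelay` are illustrative, assuming the `COutPoint`/`uint256` types of Bitcoin-derived codebases:

    ```cpp
    #include <map>
    #include <primitives/transaction.h> // COutPoint
    #include <uint256.h>                // uint256

    // One proof per outpoint: the first-seen proof wins; later ones for the
    // same outpoint (i.e. triple spends and beyond) are not propagated.
    static std::map<COutPoint, uint256> dspForOutpoint;

    // Returns true if this proof is the first one for its outpoint and
    // should therefore be relayed to peers.
    bool MarkAndMaybeRelay(const COutPoint &outpoint, const uint256 &proofId)
    {
        return dspForOutpoint.try_emplace(outpoint, proofId).second;
    }
    ```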

    Bottom line: the p2p layer as we inherited it doesn't like your optimization, as it's different from what everything else does.

  • I've read through this a couple of times and I think I get the overall concept, but the technical details are a bit too deep for me to comfortably provide meaningful feedback. I asked some clarifying questions on Twitter that might give some insight into how others like me might look at this.

    There are some minor things that might make sense to update, such as clarifying that this really isn't about proving someone spent exactly twice, but rather more than once; but that is mostly semantics.

    Some of the properties of the protocol might not be immediately obvious to a layman or someone who is new to the underlying structures, so it might be valuable to explain in more depth how or where the important properties come from.

  • Thank you Jonathan. The main goal of the spec is to let other implementations support this from a specification first: the concept being that if there are differences in code, then the spec is the one that wins. This is to counter the problem of "other chains" that specify that there is a reference implementation which is right.

    I will take a look at clarifying some of the points you highlighted; you made some good observations here and on Twitter.

    1. What are we doing with the protocol version?

    2. Are we sending reject messages if the dsproof is rejected?

  • > What are we doing with the protocol version?

    The message uses the existing inv/getdata messages, which have been around forever. Only nodes that recognize the DSP inv type and explicitly request it will get the new message. Add to this that the majority of node implementations aim to have this implemented soon, and there should be no need for SPV clients to do work in order to find DSP-capable peers.

    As such, the protocol version doesn't really seem relevant to me.
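
    To illustrate the inv/getdata reuse, a simplified sketch of the receive side; the inv type constant and the helper functions here are placeholders for illustration, not names or values taken from the spec:

    ```cpp
    #include <cstdint>
    #include <uint256.h> // uint256, as in Bitcoin-derived codebases

    // Placeholder value; the spec defines the actual inv type for a DSP.
    static constexpr uint32_t MSG_DOUBLESPENDPROOF = 0x94a0;

    bool ShouldDownloadProof(const uint256 &proofId);                     // from the earlier sketch
    void SendGetData(int64_t peerId, uint32_t type, const uint256 &hash); // illustrative helper

    // Sketch of the receive side: an old node simply ignores the unknown inv
    // type; a DSP-aware node requests the proof at most once.
    void HandleInv(uint32_t type, const uint256 &hash, int64_t peerId)
    {
        if (type != MSG_DOUBLESPENDPROOF)
            return;                      // not a DSP announcement
        if (!ShouldDownloadProof(hash))
            return;                      // already fetched (or rejected) before
        SendGetData(peerId, type, hash);
    }
    ```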

    > Are we sending reject messages if the dsproof is rejected?

    A DSProof can be "rejected" on two grounds. One is that there is no transaction in the mempool with which to complete the data in order to verify it. My implementation waits some time and then tosses the DSP, giving one (out of 100) ban point to the peer.

    The second ground on which a DSProof can be rejected is that it does not validate. In that case I suggest simply disconnecting the peer, as they are violating the protocol.
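
    A rough sketch of those two paths; the type and helper names (`DspStatus`, `ParkOrphanProof`, `Disconnect`, and so on) are illustrative, not the actual Flowee code:

    ```cpp
    // Illustrative types and helpers; the real implementation differs.
    class DoubleSpendProof;
    class CNode { public: void Disconnect(); };

    enum class DspStatus { Valid, MissingTransaction, Invalid };

    DspStatus Validate(const DoubleSpendProof &proof);                           // assumed
    void RelayProof(const DoubleSpendProof &proof);                              // assumed
    void ParkOrphanProof(const DoubleSpendProof &proof, int banPointsOnTimeout); // assumed

    // Sketch of the handling described above for a newly received proof.
    void HandleDsProof(const DoubleSpendProof &proof, CNode &peer)
    {
        switch (Validate(proof)) {
        case DspStatus::Valid:
            RelayProof(proof);
            break;
        case DspStatus::MissingTransaction:
            // No mempool transaction to complete the data with; keep it for a
            // while and, if still unresolved, toss it and give 1/100 ban point.
            ParkOrphanProof(proof, /*banPointsOnTimeout=*/1);
            break;
        case DspStatus::Invalid:
            // The proof does not validate: protocol violation, disconnect.
            peer.Disconnect();
            break;
        }
    }
    ```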

    I don't see any other cases where a rejection message is useful at this time.

  • Feeding some review comments upstream from our downstream review in BCHN:

    I linked to our review comments so you can find exactly which content those comments relate to.

  • Thank you! I made all the relevant changes; it shows the age of this doc that I can remove the BIP62 note :)

  • Why do the spender fields (FirstSpender, DoubleSpender) have no sizes?

    Also, you may want to update this:

    > Last Revised: 2020-08-11

  • > Why do the spender fields (FirstSpender, DoubleSpender) have no sizes?

    What would you put there?

  • It's clearly variable (due at least to the push-data), so I would put something like "variable, see below" (since the tables don't have proper names).

    I checked the implementation provided to BCHN, and did not find any of the list-size fields in the data structure or its serialization.

    So right now I'm uncertain why there is a discrepancy here between the structure laid out in the spec above, and the implementation.

  • > I would put something like "variable, see below"

    That would mess up the table structure a bit, and since the lines between the two tables explain this, I'll skip this suggestion.

    Thanks for your review!

  • Ok, after reading the code it's clear now that pushData is a vector of byte vectors, and each vector is serialized with a preceding list-size giving its number of elements.

    The first one ("Number of push-data's") seems to always be one according to the current code, which is why there is only one push-data listed in the table, and it all results in two consecutive list-size items.

    I would suggest renaming the description "Number of items in the push-data list" to "Number of bytes in the push-data list" since the push-data itself is a vector of bytes, and there should be no question about the length of an item in such a vector - it is precisely 8 bits.
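
    For readers following along, a small sketch of that serialization layout, assuming Bitcoin-style compact-size length prefixes (the helper names are illustrative):

    ```cpp
    #include <cstdint>
    #include <vector>

    // Append a Bitcoin-style compact-size integer. Only the single-byte form
    // (lengths below 253) is shown; a full encoder also emits 3/5/9-byte forms.
    static void WriteCompactSize(std::vector<uint8_t> &out, uint64_t n)
    {
        out.push_back(static_cast<uint8_t>(n)); // illustrative shortcut
    }

    // pushData is a vector of byte vectors. Serializing it yields the two
    // consecutive list-size items discussed above: the outer count (currently
    // always 1), then the byte length of each push-data followed by its bytes.
    std::vector<uint8_t> SerializePushData(const std::vector<std::vector<uint8_t>> &pushData)
    {
        std::vector<uint8_t> out;
        WriteCompactSize(out, pushData.size());  // "Number of push-data's"
        for (const auto &item : pushData) {
            WriteCompactSize(out, item.size());  // length of this push-data
            out.insert(out.end(), item.begin(), item.end());
        }
        return out;
    }
    ```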

  • The question has been raised: what is the use case for multiple push-data's in one Spender?

    I.e., what do we need the list-size field "Number of push-data's" for? It always seems to contain a single signature, and the spec states:

    > The details required to validate one input are provided in the spender field

    There doesn't seem to be a reason to store multiple signatures in one Spender.

  • > I would suggest renaming the description...

    Sure, I can adjust it for someone with your eye.

    > what is the use case for multiple push-data's in one Spender?

    Future extensibility; multisig comes to mind. Maybe it will be removed in the next iteration. The goal is to get this on mainnet now so we can get the rest of the infrastructure to start using this imperfect solution. Running code beats perfect code.
