Removing Freeze Time from Host Picking
I don't think freeze time is something we want to integrate yet; it's the type of thing you'd add after there's an entire ecosystem and an easy way to bootstrap. For now it's just a barrier to entry and a headache to implement, so I'm scrapping it entirely.
Instead, you'll create a list of hosts following this naively chosen set of rules, which we'll update and refine over time:
- Find all hosts and put them into a large database. This database will mostly remain on disk and not in memory.
- Select hosts randomly from this database. Only one host is accepted per IP address, and generally speaking only one host is accepted per /24. Hosts within the same /24 or on the same IP address will be able to sybil attack each other, but they shouldn't be able to sybil attack the wider network. To get unfair global representation, an attacker would need a wide range of IP addresses, and they'd have to handle all uploads and downloads from those addresses (clients will refuse to talk to a host at a different address than the one advertised on the blockchain), which means their bandwidth costs go up unless their datacenter is distributed.
This does create a problem for multiple legitimate hosts on the same /24. We might be able to make a whitelist for universities or something (particularly RPI, since that's likely to be a hotspot initially). Eventually we might implement freezing or something similar specifically to let multiple computers from the same /24 be picked with higher weight. I'm sure the strategies will evolve, and I'm okay with that.
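The one-per-IP, one-per-/24 rule amounts to deduplicating announcements and grouping the survivors by subnet. A minimal Python sketch of that grouping (the function name and the `(ip, host_id)` pair format are illustrative, not actual project code):

```python
import ipaddress

def bucket_hosts(hosts):
    """Group announced hosts into buckets keyed by /24 subnet.

    `hosts` is a hypothetical iterable of (ip, host_id) pairs pulled
    from the announcement database; only the first entry per IP is kept.
    """
    seen_ips = set()
    buckets = {}
    for ip, host_id in hosts:
        if ip in seen_ips:
            continue  # only one host accepted per IP address
        seen_ips.add(ip)
        # strict=False masks off the host bits to get the /24 network
        subnet = ipaddress.ip_network(ip + "/24", strict=False)
        buckets.setdefault(subnet, []).append((ip, host_id))
    return buckets
```

Selecting at most one host per bucket then enforces the one-per-/24 rule automatically.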
- Each host that was randomly selected (perhaps up to 500 live hosts or something) will be pinged infrequently to get a sense of uptime, throughput, and capacity. For now, we'll just trust a host's claimed capacity and treat everyone the same beyond 1 TB of capacity (easy enough to hit, and not that big a deal if someone lies about having a whole 1 TB).
This will somewhat discourage hosts from offering more than 1 TB, or at least disadvantage them a bit, but that will help with decentralization in the early network. The cap will probably be raised, and again these rules will evolve. It would also be pretty simple to implement proof-of-capacity tests later.
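Treating everyone the same beyond 1 TB is just a clamp on the claimed capacity before it feeds into any weighting. A trivial sketch (the constant and function name are illustrative):

```python
TB = 1 << 40  # one terabyte, in bytes (binary definition)

def effective_capacity(claimed_bytes):
    """Trust the host's claim, but treat everyone the same past 1 TB."""
    return min(claimed_bytes, TB)
```

Raising the cap later is a one-line change, which fits the "evolving rules" approach.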
More briefly:
- all hosts announced on the blockchain will be put into one large collection
- this collection will be organized into buckets of hosts from the same /24, and buckets will be selected at random (up to 500), perhaps favoring geographically close buckets
- one host is selected from each bucket (also at random) and pinged. If it's offline, a new host will be picked from that bucket. (DoS attack vectors exist here; I'm not sure we can do much about that easily.)
You then have ~500 geographically diverse hosts to pick from. In the ultra-early network we might relax this from /24 buckets to just one bucket per IP address. Actually, for now that's how I'm planning to implement it.
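Putting the three summary steps together, the selection loop might look roughly like this (a Python sketch under stated assumptions: `buckets` maps a /24, or in the early network a single IP, to a list of hosts, and `ping` is a hypothetical callback that returns True if the host responds; none of these names come from the real codebase):

```python
import random

def select_hosts(buckets, ping, target=500):
    """Pick up to `target` hosts, one live host per bucket.

    Buckets are visited in random order; within each bucket, offline
    hosts are skipped and replaced by another host from the same bucket.
    """
    selected = []
    keys = list(buckets)
    random.shuffle(keys)          # random bucket selection, up to `target`
    for key in keys[:target]:
        candidates = list(buckets[key])
        random.shuffle(candidates)  # random host within the bucket
        for host in candidates:
            if ping(host):
                selected.append(host)
                break  # one live host per bucket
    return selected
```

The DoS concern from the summary shows up here: an attacker who can knock selected hosts offline forces repeated re-picks within a bucket, and this sketch does nothing to mitigate that.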