Draft: Voting benchmark
@asbjornu: Here's me kicking the tires on benchmarking how voting scales; it reveals some expected stuff, and some... not quite expected stuff.
Here's what the test does (rough sketch after the list):
- First of all, the test "collapses" the empty vote-set to verify that this is nice and fast initially.
- Then it casts 1000 votes for each entry (with the same score, but that shouldn't affect performance).
- Then it collapses after the first round of votes.
- Then it casts another 1000 votes for each entry.
- Finally, it collapses after the second round of votes.
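In code, the steps look roughly like this. This is just a sketch: I'm assuming Go here, the `castVote`/`collapseVotes` helpers (and their signatures), `numEntries`, and the driver choice are all made up for illustration:

```go
package main

import (
	"database/sql"

	_ "github.com/mattn/go-sqlite3" // driver choice is a guess; any SQLite driver works
)

const (
	numEntries    = 10   // placeholder entry count
	votesPerRound = 1000 // votes cast per entry in each round
)

// castVote and collapseVotes stand in for the real helpers;
// their names and signatures are invented for this sketch.
func castVote(db *sql.DB, entry, voter, score int) {}
func collapseVotes(db *sql.DB)                     {}

func runSteps(db *sql.DB) {
	collapseVotes(db) // step 1: collapse the empty vote-set (should be ~free)

	for voter := 0; voter < votesPerRound; voter++ { // step 2: first round of votes
		for entry := 0; entry < numEntries; entry++ {
			castVote(db, entry, voter, 5) // same score everywhere
		}
	}

	collapseVotes(db) // step 3: collapse after round one

	for voter := votesPerRound; voter < 2*votesPerRound; voter++ { // step 4: second round
		for entry := 0; entry < numEntries; entry++ {
			castVote(db, entry, voter, 5)
		}
	}

	collapseVotes(db) // step 5: collapse after round two (twice the votes of step 3)
}

func main() {
	db, err := sql.Open("sqlite3", ":memory:") // in-memory, like the benchmark
	if err != nil {
		panic(err)
	}
	defer db.Close()
	runSteps(db)
}
```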
Now, the expected stuff: step 1 is very, very cheap, step 5 is about twice as slow as step 3, and steps 2 and 4 perform basically the same. All good.
So scaling looks like we'd expect at the moment: both casting votes and collapsing them appear to be linear in the number of votes.
But what's surprising me is how fast this actually is ATM. With thousands of votes, collapsing all votes still takes fractions of a nanosecond on my machine. Granted, my machine here is kind of a beast (AMD Ryzen 9 5950X 16-Core), but with this kind of performance, I don't see how this tells the whole story of what happened at Evoke.
Also, having more voters shouldn't make things that much worse; sure, we'd get more cache thrashing etc. due to more independent tasks, but at the same time, vote collapsing should scale with the number of votes per user, not the global number of votes... right? Some quick experiments seem to somewhat confirm this, but it's not entirely clear. I should probably write a separate benchmark that tries to prove that... something like the sketch below.
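Roughly this, maybe (again assuming Go's testing package; `seedVotes` is another invented helper, and the stubs mirror the sketch above): hold votes-per-voter constant, grow only the voter count, and see whether the per-op collapse time stays flat or tracks the global vote count.

```go
package vote_test

import (
	"database/sql"
	"fmt"
	"testing"

	_ "github.com/mattn/go-sqlite3"
)

// Stubs standing in for the real helpers (same caveat as above: invented).
func seedVotes(db *sql.DB, voters, votesPerVoter int) {}
func collapseVotes(db *sql.DB)                        {}

func BenchmarkCollapseScaling(b *testing.B) {
	// If collapsing really scales per user, ns/op should stay roughly
	// flat as the voter count grows; if it scales globally, it won't.
	for _, voters := range []int{10, 100, 1000} {
		b.Run(fmt.Sprintf("voters=%d", voters), func(b *testing.B) {
			db, err := sql.Open("sqlite3", ":memory:")
			if err != nil {
				b.Fatal(err)
			}
			defer db.Close()

			seedVotes(db, voters, 10) // 10 votes per voter, constant across runs
			b.ResetTimer()            // exclude the seeding from the measurement

			for i := 0; i < b.N; i++ {
				collapseVotes(db)
			}
		})
	}
}
```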
Another big difference from production is that this uses an in-memory version of SQLite. Going to the filesystem would of course be an order of magnitude slower. But that should affect voting more than vote collapsing, and voting isn't the part that was shown to be slow. It's also not really clear to me how to test this using disk access... Perhaps you have an idea? The best I've come up with is the sketch below.
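The most obvious thing I can think of is parameterizing the DSN so the same benchmark can run against a temp file instead of `:memory:`. Same Go-flavored sketch as above (needs `path/filepath` on top of the earlier imports), and I'm not sure it's good enough:

```go
// openDB opens either an in-memory or a file-backed SQLite database.
func openDB(b *testing.B, onDisk bool) *sql.DB {
	dsn := ":memory:"
	if onDisk {
		// b.TempDir() is cleaned up after the benchmark. Caveat: on some
		// systems the temp dir lives on tmpfs, so "on disk" may still
		// never touch a real disk; a dedicated path would be needed then.
		dsn = filepath.Join(b.TempDir(), "votes.db")
	}
	db, err := sql.Open("sqlite3", dsn)
	if err != nil {
		b.Fatal(err)
	}
	return db
}
```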
Edit: I'm an idiot. We actually need to iterate the number of times the benchmark framework wants us to... I kinda thought this happened automatically, but it does not. The numbers no longer look surprisingly good; they look many orders of magnitude worse. Which I guess is good?!
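For the record, this is the shape of the bug (still assuming Go's testing package, reusing the invented helpers from above): without the `b.N` loop, the body ran once no matter how many iterations the framework asked for, and the framework divided that single run's time by `b.N`. That's exactly how you end up with bogus sub-nanosecond numbers.

```go
func BenchmarkCollapse(b *testing.B) {
	db := openDB(b, false)
	defer db.Close()
	seedVotes(db, 1000, 10)
	b.ResetTimer()

	// The fix: do the measured work b.N times instead of once.
	for i := 0; i < b.N; i++ {
		// Caveat: re-collapsing an already-collapsed vote-set is probably
		// cheaper than the first collapse; re-seeding between iterations
		// (with the timer stopped) would be more faithful.
		collapseVotes(db)
	}
}
```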