Partial and efficient replication

For Manyverse to exit beta, and become v1.0.0, we must solve initial sync efficiently.

Functional goal: reduce total time for initial sync
Functional goal: user can see replication progress per-feed, like a download manager
Qualitative goal: user can always interact with the UI with no perceivable input lag
Qualitative goal: user's device remains "cold" while syncing

JITDB #1104 (closed)
Profiling setup on Manyverse desktop
Rewrite ssb-replicate / ssb-ebt / ssb-friends / ssb-db
ssb-db2 defragging
Secure partial replication with index feeds
Secure partial replication with sliced replication
Respect CPU and Bandwidth while downloading
Prioritized replication based on hops distance
~~BIPF Native on hold~~
~~ssb-neon #1105 on hold~~

This is quite a big project that will touch the entire stack (from near-metal systems programming to Node.js, to frontend code and UI design). The following are loose ideas for a course of action, but the actual implementation should be data-driven and benchmark-driven. Not all of these ideas will necessarily be done; they are just candidates.

JITDB

Use https://github.com/arj03/jitdb as a replacement for Flume. Help with JITDB's development, polishing it until it's solid and complete. Adopt JITDB gradually, since it could first be used for some specific indexes, like (1) contacts, (2) abouts, (3) votes, (4) posts and threads.

Rust that speaks Node.js

We could migrate the entire SSB stack from Node.js to Rust, but: (1) that would be a huge project, (2) it's likely that only some parts (not all) of Node.js SSB are a bottleneck for initial sync, so rewriting everything isn't necessary, (3) it would require a lot of testing to make sure that all corner cases are covered.

Instead, we should profile the performance of initial sync, identify the bottlenecks, and investigate whether a Rust implementation of those bottlenecks would help the performance.

Sunrise Choir is an obvious starting point. We should identify how far SC has gotten, what's missing, and what was built so far that we might not need, etc.

Then, we should consider setting up Rust code so that it is used behind a Node.js interface, in order to interoperate with the rest of our Node.js stack. See Neon or WASM for this purpose. But we should also investigate whether Rust-to-Node.js communication becomes a bottleneck if these two runtimes turn out to be too chatty in our implementation.

We should aim to put all Rust code in production as soon as it's written. In other words, we should adopt Rust gradually, not swap suddenly. First we set up Rust-to-Node.js bindings, and then we add a small amount of Rust. We can gradually increase the percentage of Rust code in the stack, but it will be one stack, not two stacks that require a swap.

Profiling setup

We could use a bare-minimum Manyverse-on-desktop (Electron) setup that runs just the backend code, so we can use the Chrome dev tools to profile the backend.

Rewrite ssb-replicate / ssb-ebt / ssb-friends / ssb-db(?)

We should identify the parts of the Node.js SSB stack that consume the most CPU and consider a rewrite, in the same manner that we rewrote ssb-gossip as ssb-conn.

Maybe it should be TypeScript-based, or maybe have a Rust core. These modules should have more features that provide the user more fine-grained control, in particular, fine-grained progress status for each feed being replicated.
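As a rough illustration of what "fine-grained progress status for each feed" could mean, here is a minimal TypeScript sketch. All names (`FeedProgress`, `ProgressTracker`) are hypothetical and do not exist in ssb-replicate, ssb-ebt, ssb-friends or ssb-db. It also assumes that, for each feed, we know both the highest sequence number stored locally and the highest one known to exist remotely.

```typescript
// Hypothetical sketch: not an API of any existing SSB module.
interface FeedProgress {
  feed: string;
  received: number; // highest sequence number stored locally
  latest: number; // highest sequence number known to exist (assumed to be learned from peers)
}

class ProgressTracker {
  private readonly feeds = new Map<string, FeedProgress>();

  // Record the current state of one feed.
  update(feed: string, received: number, latest: number): void {
    this.feeds.set(feed, { feed, received, latest });
  }

  // Progress of one feed as a number in [0, 1], or undefined if the feed is unknown.
  get(feed: string): number | undefined {
    const p = this.feeds.get(feed);
    if (!p) return undefined;
    if (p.latest <= 0) return 1; // nothing known to fetch
    return Math.min(1, p.received / p.latest);
  }

  // Progress across all tracked feeds, weighted by message count.
  overall(): number {
    let received = 0;
    let latest = 0;
    for (const p of this.feeds.values()) {
      received += Math.min(p.received, p.latest);
      latest += p.latest;
    }
    return latest === 0 ? 1 : received / latest;
  }
}
```

A UI could poll `get(feed)` for each row and `overall()` for a global progress bar.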

Prioritized replication

Replication should be understood as a topic similar to "download management" (e.g. a Torrent client such as uTorrent), which allows the user to pick some downloads to be "high priority" and others to be "low priority". By default, we could pick low hop distances to be high priority, and high hop distances to be low priority.
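The default described above could be sketched roughly as follows, in TypeScript. The types, the `HIGH_PRIORITY_MAX_HOPS` threshold and the function names are all assumptions made for illustration; none of them come from ssb-friends or ssb-ebt.

```typescript
// Hypothetical sketch: names and the threshold are illustrative assumptions.
type Priority = "high" | "low";

interface ReplicationTask {
  feed: string;
  hops: number; // distance from the local user in the follow graph
  override?: Priority; // set when the user picks a priority manually
}

// Assumed default: feeds at most this many hops away are "high" priority.
const HIGH_PRIORITY_MAX_HOPS = 1;

function priorityOf(task: ReplicationTask): Priority {
  if (task.override) return task.override; // the user's choice wins
  return task.hops <= HIGH_PRIORITY_MAX_HOPS ? "high" : "low";
}

// Returns a new array: high priority first, then closer hops first.
function sortQueue(tasks: ReplicationTask[]): ReplicationTask[] {
  const rank = (t: ReplicationTask): number => (priorityOf(t) === "high" ? 0 : 1);
  return [...tasks].sort((a, b) => rank(a) - rank(b) || a.hops - b.hops);
}
```

Keeping the user override separate from the hops-based default means the default can change later without losing manual choices.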

Respect CPU and Bandwidth

Replication & indexing should have backpressure so that the CPU is not maxed out by these tasks. In the past I built pull-drain-gently for this purpose, but didn't know how/where to use it in the SSB stack.

We should treat replication (the act of network-fetching the log msgs for each account) separately from indexing (the act of building database indexes necessary for UI queries), but both could/should have backpressure so that they are not greedy with resources. E.g. replication should respect customizable bandwidth/speed limits, and indexing should respect CPU.
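For the bandwidth side, one generic way to enforce a customizable speed limit is a token bucket. The TypeScript sketch below is only an illustration of that technique under assumed names (`TokenBucket`, `tryConsume`); it is not how pull-drain-gently works and it is not part of any SSB module. The clock is passed in as a parameter so the behavior is easy to test.

```typescript
// Hypothetical sketch of a token-bucket bandwidth limit for replication.
class TokenBucket {
  private readonly bytesPerSecond: number;
  private readonly capacity: number;
  private tokens: number;
  private last: number;

  constructor(bytesPerSecond: number, capacity: number, now: number = Date.now()) {
    this.bytesPerSecond = bytesPerSecond;
    this.capacity = capacity;
    this.tokens = capacity; // start with a full bucket
    this.last = now;
  }

  // Returns true if `bytes` may be fetched now; false means the caller should wait.
  tryConsume(bytes: number, now: number = Date.now()): boolean {
    // Refill according to elapsed time, never beyond capacity.
    const elapsedSeconds = Math.max(0, now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.bytesPerSecond);
    this.last = now;
    if (bytes > this.tokens) return false;
    this.tokens -= bytes;
    return true;
  }
}
```

A replication loop would call `tryConsume(msgSize)` before requesting more messages and pause when it returns false. Indexing would need a different signal (CPU time rather than bytes), which this sketch does not cover.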

Client plugins

Recently, Christian Bundy has been experimenting with running all the SSB plugins on the client-side, with only (I guess?) ssb-db and flume on the server side. Coincidentally this also means that we could easily swap the implementation of the server side. It could be Go, or Rust. This could be a pathway for a gradual rewrite that is compatible with the current stack. The "client side" plugins don't actually need to be strictly on the (frontend) client, we could have the following setup: (1) Go or Rust flume server, (2) Node.js middleman client, (3) React Native frontend client.

UI design: "My community"

A new screen in Manyverse should list all accounts on the user's device, and their corresponding replication status. This could look a bit like a Torrent client when those feeds are being replicated & indexed.

Beyond replication status, the "My community" screen can also show hop distance, and the total storage that each account takes up on the user's local disk. The replication status itself could actually be hidden unless there is a replication task currently in progress.
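To make the row contents concrete, here is a small TypeScript sketch of what one "My community" row could carry and how the replication status could be hidden when nothing is in progress. The `CommunityRow` shape and the text output are purely illustrative assumptions, not a design decision.

```typescript
// Hypothetical view model for one row of the "My community" screen.
interface CommunityRow {
  feed: string;
  hops: number;
  storageBytes: number; // disk space used by this account's messages
  replication?: { received: number; latest: number }; // absent when idle
}

function formatBytes(n: number): string {
  const units = ["B", "KB", "MB", "GB"];
  let value = n;
  let i = 0;
  while (value >= 1024 && i < units.length - 1) {
    value /= 1024;
    i++;
  }
  return i === 0 ? `${value} B` : `${value.toFixed(1)} ${units[i]}`;
}

function renderRow(row: CommunityRow): string {
  const parts = [row.feed, `hops: ${row.hops}`, `storage: ${formatBytes(row.storageBytes)}`];
  const r = row.replication;
  // Show the status only while a replication task is actually in progress.
  if (r && r.latest > 0 && r.received < r.latest) {
    parts.push(`syncing: ${Math.round((r.received / r.latest) * 100)}%`);
  }
  return parts.join("  ");
}
```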

Help other apps

Solving initial sync in Manyverse might mean the creation of open source modules in the SSBC GitHub that are useful for many other apps. Whatever solutions we reach, we shouldn't make them Manyverse-specific, but continue using and improving the canonical SSB stack (Node.js).

Edited May 16, 2022 by staltz
