CoBox is built on top of dat, a modular peer-to-peer technology stack. You can find a good explanation of how it works in the guide 'How Dat Works'.
dat's core database is an append-only log called hypercore. This log can be easily replicated between devices or peers in real time, and its integrity is preserved - ensuring other peers haven't tampered with the log's data - using cryptographically signed entries and merkle trees. To achieve this, each hypercore holds its own cryptographic keypair: a public key and a signing secret_key. Only the holder of the secret_key, the 'writer', can make changes to the log, but all peers with a copy of the log, even without the keys, are able to 'read' it.
Let's move one step up the stack. On top of hypercore, the dat community have built a tool called hyperdrive, a file system abstraction that uses two hypercores - one for file content, one for metadata - to represent a fully functioning unix-style file system. It can be mounted as a folder on your computer. When combined with a system of replication and peer discovery, 'writers' of a hyperdrive archive can dynamically update files on the remote 'reader' computers. Changes are replicated to all connected peers in real time, without the need for an authenticated server to guarantee the integrity of the data.
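The two-feed design can be sketched in miniature: metadata entries point into a separate content log. The field names below are illustrative assumptions, not hyperdrive's actual encoding.

```javascript
// Sketch: a file system as two append-only logs, in the spirit of
// hyperdrive's content and metadata feeds.
const content = []   // raw file bytes, appended in blocks
const metadata = []  // file entries pointing into the content log

function writeFile (path, buf) {
  const offset = content.length
  content.push(buf)
  // each metadata entry records where a file's bytes live
  metadata.push({ path, offset, blocks: 1 })
}

function readFile (path) {
  // the latest metadata entry for a path wins, so files can be updated
  const entry = [...metadata].reverse().find(e => e.path === path)
  if (!entry) return null
  return Buffer.concat(content.slice(entry.offset, entry.offset + entry.blocks))
}

writeFile('/readme.txt', Buffer.from('hello'))
writeFile('/readme.txt', Buffer.from('hello, world'))
console.log(readFile('/readme.txt').toString()) // 'hello, world'
```

Because both logs are append-only, 'updating' a file means appending a newer metadata entry, never rewriting history.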
There are limitations to this setup. CoBox has taken steps to solve one of them: enabling multiple devices to participate as 'writers', not just 'readers', in a given file system.
With the default dat setup, if diverging changes are made using the same secret_key from different devices, the integrity of the log is broken and it is regarded as 'forked'. To prevent data corruption, it's imperative that the secret_key is only ever used on a single device. This is a significant usability problem - it makes hyperdrive unsuitable for collaborative applications involving multiple peers or devices with write access - the fabled 'multi-writer' setup.
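A small sketch shows why key reuse is fatal: two devices extending the same log independently produce conflicting entries at the same sequence number, and a replica can no longer tell which history is authoritative.

```javascript
// Sketch: why reusing a secret_key on two devices 'forks' a log.
// Both devices start from the same log, then each appends its own entry
// at the same sequence number.
function extend (base, entry) { return [...base, entry] }

function findFork (a, b) {
  // the first index where two copies of the 'same' log disagree
  const len = Math.min(a.length, b.length)
  for (let i = 0; i < len; i++) {
    if (a[i] !== b[i]) return i
  }
  return -1 // no conflict: one log is a prefix of the other
}

const base = ['entry0', 'entry1', 'entry2']
const deviceA = extend(base, 'entry3-from-laptop')
const deviceB = extend(base, 'entry3-from-phone')

console.log(findFork(deviceA, deviceB)) // 3 - the log has forked
console.log(findFork(base, deviceA))    // -1 - a clean extension
```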
An additional usability issue dat struggles with is having very many secret_keys scattered across your file system, one for each hypercore - which, in our case, is quite a lot! We've gotten around this by using libsodium's key derivation function: all your hypercore keys are derived deterministically from a single master key.
To resolve the multi-writer issue, CoBox has made use of innovations by the kappa-db community, in particular multifeed. multifeed is an aggregation tool for multiple hypercores - it binds together a set of hypercores under a single public key, or as we call it, an address. This public key does not directly correspond to a single hypercore; rather, it corresponds to a dynamic set. When peers meet at this address with their chosen networking tool, they first exchange the lists of hypercore public keys they hold, then proceed to update each log with the latest data, importing any new logs that may have appeared.
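The exchange can be sketched in miniature: each peer keeps a map of hypercore public key to log; on meeting, peers swap key lists, import unknown logs, and extend known ones to the longest copy. The names here are illustrative, not multifeed's real API.

```javascript
// Sketch: the multifeed key-exchange in miniature.
function createPeer () {
  return { feeds: new Map() } // hypercore public key -> log entries
}

function sync (a, b) {
  // a learns of b's feeds: import new logs, catch up on known ones
  for (const [key, log] of b.feeds) {
    if (!a.feeds.has(key)) a.feeds.set(key, [])
    if (log.length > a.feeds.get(key).length) a.feeds.set(key, [...log])
  }
  // and the same in the other direction
  for (const [key, log] of a.feeds) {
    if (!b.feeds.has(key) || b.feeds.get(key).length < log.length) {
      b.feeds.set(key, [...log])
    }
  }
}

const alice = createPeer()
const bob = createPeer()
alice.feeds.set('alice-key', ['hello'])
bob.feeds.set('bob-key', ['hi', 'again'])

sync(alice, bob)
console.log(alice.feeds.size, bob.feeds.size)  // 2 2
console.log(alice.feeds.get('bob-key').length) // 2
```

Note how each writer only ever appends to their own log, so the single-device rule for each secret_key is preserved while the set as a whole gains multiple writers.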
In a simple configuration, such as that implemented in the IRC-like chat app Cabal, each peer maps to a single hypercore instance. For CoBox, each peer maps to three hypercores, and we can add to this dynamically in the future if we need to. Two of these - the content and metadata feeds - are used by hyperdrive. The third is simply named log and, similar to Cabal's use, stores binary-encoded JSON messages which carry application-layer data.
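The log feed's encoding can be sketched as follows; the message shapes below are invented examples, not CoBox's actual schema.

```javascript
// Sketch: the third feed, 'log', carrying application-layer data as
// binary-encoded JSON.
const logFeed = []

function appendMessage (msg) {
  // encode to a Buffer before appending, as hypercore stores binary blocks
  logFeed.push(Buffer.from(JSON.stringify(msg)))
}

function decodeMessages () {
  return logFeed.map(buf => JSON.parse(buf.toString()))
}

appendMessage({ type: 'about', name: 'alice' })
appendMessage({ type: 'chat/message', text: 'hello cobox' })

console.log(decodeMessages()[1].text) // 'hello cobox'
```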
To serve peers with live updates from a set of continuously syncing hypercores, the kappa-db community built a dynamic indexing system called kappa-core. kappa-core can be used to build custom materialised views over the datasets contained within a multifeed's collection of hypercores. We've used kappa-core to collect together the metadata from each peer's personal hyperdrive and bind them together to act as a single multi-writer hyperdrive, called kappa-drive. To handle possible collisions in drive state, we've implemented a conflict resolution mechanism based on 'vector clocks'. We've also built a module called kappa-view-query, which lets us define a dynamic set of indexes for the message types in our hypercores, over which we can perform scoped queries and implement map/filter/reduce functions to process large datasets quickly and easily, helping to serve application-layer data.
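The vector-clock comparison at the heart of such conflict resolution can be sketched directly; this is the general technique, not kappa-drive's exact implementation.

```javascript
// Sketch: vector-clock comparison. Each update carries a clock mapping
// writer id -> number of that writer's entries seen. If one clock
// dominates the other, the updates are causally ordered; if neither
// dominates, they are concurrent and need a tie-break rule.
function compare (a, b) {
  let aAhead = false
  let bAhead = false
  for (const id of new Set([...Object.keys(a), ...Object.keys(b)])) {
    const av = a[id] || 0
    const bv = b[id] || 0
    if (av > bv) aAhead = true
    if (bv > av) bAhead = true
  }
  if (aAhead && bAhead) return 'concurrent' // a genuine conflict
  if (aAhead) return 'after'
  if (bAhead) return 'before'
  return 'equal'
}

// alice wrote twice, having seen one of bob's entries...
const clockA = { alice: 2, bob: 1 }
// ...while bob wrote again without seeing alice's second entry
const clockB = { alice: 1, bob: 2 }

console.log(compare(clockA, clockB))       // 'concurrent'
console.log(compare({ alice: 1 }, clockA)) // 'before'
```

Ordered updates can simply be applied in causal order; only the 'concurrent' case requires a deterministic tie-break so every peer resolves the conflict the same way.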
CoBox uses the hyperswarm distributed hash table to discover and connect peers at a given address.