Proposal: Upload Streaming
This is a first draft on how upload streaming will be implemented and how we can split it up into multiple MRs to simplify the review process.
The goal is for the API to have an endpoint with which a user can stream a file of arbitrary size to Sia, which then uploads the file chunk-by-chunk in real time without buffering any additional data on disk.
For that to work, the repair loop needs to be able to load chunks for repairs from arbitrary readers. We also need to make sure that streamed uploads have a higher priority than regular repairs. Maybe not within the workers themselves but at least within the upload loop.
We also need to extend the SiaFile to be able to grow in size.
The following is a suggestion on how to split up the work that needs to be done.
- Change the existing repair code to use Readers to repair files
- Add an AppendChunk method to the SiaFile to grow the .siafile by a chunk
- Add the actual upload streaming endpoint
- Handling for interrupted streams (optional)
How do we handle interrupted streams?
A stream could suddenly be interrupted for various reasons. That will almost always leave the file in a state where the last couple of chunks have a redundancy < MinRedundancy. At least for the initial release of the feature, we don't really need to handle that and could leave it to the user to delete the file and try again later.
A more sophisticated solution would be to have a repair endpoint. It takes a siapath, a file offset and a stream of data as arguments. The node would then skip the parts of the stream that belong to chunks which are already fully repaired, repair the bad chunks and append the rest of the data at the end of the file.
The nice thing about this is that it would also work for files which haven't been uploaded using the upload-streaming endpoint.