Updating user-facing documentation with storage use cases
Problem / Opportunity Statement
We offer several kinds of storage:
- Block devices (a.k.a. Cinder volumes, virtual hard drives)
- Images (a.k.a. Glance)
- Object storage (a.k.a. buckets)
  - Both fast (NVMe) and slow (spinning rust)
  - Exposed via RADOS, Swift, or S3 semantics
- Shared filesystems (a.k.a. Manila, CephFS)
They differ along several axes, for example:
- Ease of getting started (and how self-service that is)
- Performance (Throughput, IOPS)
- Capacity (for PB-scale storage allocations)
- Tolerance for many small files
- Ability to access from multiple processes/instances at the same time
- Support for locking (preventing multiple processes from overwriting the same location)
- Ease of sharing (and there are several kinds of sharing)
- How to implement access controls
- Ability to boot an instance from it
But our documentation doesn't really cover these differences, so it's unclear how community members should weigh the tradeoffs and make informed choices.
Resolution
We should create a user-facing document (or update https://docs.jetstream-cloud.org/general/storage/) that shows and explains which type of storage is best suited to which need.
Maybe a pro/con list covering Manila, volume-backed storage, and RADOS Gateway, with examples of users whose datasets best fit each use case.
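One possible way to keep such a comparison consistent is to treat it as a matrix of storage kinds against the axes listed above, and render it into the docs. The sketch below is only an illustration: the names `STORAGE_KINDS`, `AXES`, `empty_matrix`, and `render_markdown` are hypothetical, and every cell is deliberately left as "TBD" because the actual ratings still have to be decided by people who know the deployment.

```python
# Hypothetical sketch: a storage-kind x axis comparison matrix rendered as a
# Markdown table. Kinds and axes are copied from the lists in this issue;
# no ratings are filled in here (all cells render as "TBD").

STORAGE_KINDS = [
    "Block devices (Cinder volumes)",
    "Images (Glance)",
    "Object storage (buckets)",
    "Shared filesystems (Manila, CephFS)",
]

AXES = [
    "Ease of getting started",
    "Performance (throughput, IOPS)",
    "Capacity (PB-scale)",
    "Many small files",
    "Concurrent access",
    "Locking",
    "Ease of sharing",
    "Access controls",
    "Bootable",
]


def empty_matrix():
    """Return a matrix with one unrated (None) cell per kind and axis."""
    return {kind: {axis: None for axis in AXES} for kind in STORAGE_KINDS}


def render_markdown(matrix):
    """Render the matrix as a Markdown table; unrated cells show as 'TBD'."""
    header = "| Storage | " + " | ".join(AXES) + " |"
    separator = "|" + " --- |" * (len(AXES) + 1)
    rows = []
    for kind in STORAGE_KINDS:
        cells = [matrix[kind][axis] or "TBD" for axis in AXES]
        rows.append("| " + kind + " | " + " | ".join(cells) + " |")
    return "\n".join([header, separator] + rows)


if __name__ == "__main__":
    print(render_markdown(empty_matrix()))
```

A table like this would mostly serve the "new users" audience; the technical details for sophisticated users would still need prose alongside it.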
Multiple audiences for this information:
- New users just getting started (who won't want to be troubled with the technical implementation details and only want to know how to choose)
- Technically sophisticated users (who do want to understand the technical implementation details, e.g. that Manila uses CephFS with metadata servers)
Edited by Chris Martin