OpenCloud
Hybrid Meetup #55 took place on 2025-10-28 at 19:00 at Basislager Leipzig, and we had a great presentation by Klaas Freitag (CTO) and Dr. Jörn F. Dreyer (Principal Architect) from OpenCloud. A recording can be found here:
OpenCloud is a widely deployed cloud storage and collaboration platform built on a variation of a microservices architecture. It scales from homelab installations to large clusters with millions of users.
The presentation reflected on some architectural and deployment changes over the years. It was densely packed with engineering wisdom that extends beyond code and includes aspects like deployment, requirements and project constraints, backwards compatibility and scalability.
Background and origins
- Reva is a CERN storage interoperability layer and is also where the OpenCloud story started, many years ago:
Reva is an interoperability platform consisting of several daemons written in Go. It acts as bridge between high-level clients (mobile, web, desktop) and the underlying storage (CephFS, EOS, local filesystems). It exports well-known APIs, like WebDAV, to facilitate access from these devices. It also exports a high-performance gRPC API, codenamed CS3 APIs, to easily integrate with other systems. Reva is meant to be a high performant and customizable HTTP and gRPC server. – github.com/cs3org/reva/
EOS itself is an impressive storage system:
EOS instances at CERN store more than seven billion files and provide 780 petabytes of disk storage capacity using over 60k hard drives (as of June 2022), matching the exceptional performance of the LHC machine and experiments.
CERNBox acts as a file sync and service layer over EOS and is based on ownCloud (from which OpenCloud was forked).
CERNBox is a cloud storage and file synchronization service developed at CERN, built on the open-source software ownCloud and EOS. It enables users to securely store, access, and share files from any device. It offers 1TB of personal space (just login to cernbox.cern.ch) and 1-100TB for (justified) project space.
More background on CERNBox: Turning CephFS into a collaborative space with CERNBox (2025).
Highlights from the presentation
- not uncontroversial: you can get rid of a database at the core of your application (which was, in parts, a bottleneck) and move to a file based setup (and caching)
- moving from individual shares to the concept of spaces opened up a more maintainable way to handle users (and users that left)
- moving from individual microservices to a more monolithic microservice architecture has been beneficial; internally OpenCloud uses NATS for messaging (cf. the list of microservices in the docs, section "services")
- large scale deployments with predictable, but still spiky patterns inspired changes to the node communication setup
- while users report that OpenCloud feels fast, it is hard to attribute this solely to the move from PHP to Go
- the layer between a (distributed) filesystem or object store and the end user view is developed by an active community, which in parts is organized under the CS3 umbrella
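To make the first highlight (files plus caching instead of a central database) a bit more concrete, here is a minimal sketch of the general idea. It is not OpenCloud's implementation: the class name `FileMetadataStore`, the one-JSON-file-per-key layout and the mtime-based cache check are all assumptions chosen for illustration.

```python
import json
import os
import tempfile
from pathlib import Path


class FileMetadataStore:
    """Toy metadata store (hypothetical): one JSON file per key,
    fronted by an in-memory read cache."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self._cache = {}     # key -> (mtime_ns, value)
        self.disk_reads = 0  # counter, only here to make caching visible

    def _path(self, key):
        # Assumes keys are already safe to use as file names.
        return self.root / f"{key}.json"

    def put(self, key, value):
        # Write to a temporary file, then rename it into place, so that
        # readers never observe a partially written file.
        fd, tmp = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "w") as f:
            json.dump(value, f)
        os.replace(tmp, self._path(key))
        self._cache.pop(key, None)  # drop any stale cache entry

    def get(self, key):
        # Raises FileNotFoundError if the key was never written.
        path = self._path(key)
        mtime = path.stat().st_mtime_ns
        cached = self._cache.get(key)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # cache hit: file unchanged since last read
        with open(path) as f:
            value = json.load(f)
        self.disk_reads += 1
        self._cache[key] = (mtime, value)
        return value
```

In this sketch the filesystem is the source of truth and the cache only saves repeated parsing; a `stat` call per read remains. A real deployment on a distributed filesystem would need to think harder about invalidation across nodes, which this example deliberately ignores.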
Find out more about the project at:
Thanks again to Klaas and Jörn for the inspiring presentation!
References
Assorted references from the talk:
- lizardfs, forked from MooseFS
- SaunaFS
- JuiceFS
- Ceph filesystem
- GPFS (IBM)
- SeaFile
- PyDio
- CS3 APIs
- NFS (use noacl!)
- k6, designed for load testing
- gomicro, microservice framework
- DNS based routing in k8s
- Apache Tika, can be used as a document extractor
- Collabora, online document editing suite
- WebDAV specs (extension to the HTTP/1.1 protocol that allows clients to perform remote Web content authoring operations – RFC4918)
- Garage, an open-source distributed object storage service tailored for self-hosting
Join our meetup to get notified of upcoming events.
