CoreOS Launches Torus Project for Cloud-Native Storage


While there are already multiple distributed storage systems, Torus aims to be cloud-native and integrate closely with Kubernetes.

CoreOS on June 1 officially launched the Torus open-source distributed storage project for container deployment. Torus makes use of some existing CoreOS-led open-source efforts, including the etcd distributed key-value store that is also a core component of the Kubernetes container orchestration system.

Torus will join a landscape of open-source distributed storage systems that includes the Ceph project, which is widely used in OpenStack cloud deployments. Torus is intended to be simple, reliable, distributed storage for modern application containers, as well as an enabler for wider enterprise Kubernetes adoption, according to Wei Dang, head of product at CoreOS.

“Torus is designed from the outset for cloud-native environments, while existing distributed storage systems were not designed to support large-scale clusters of dynamically scheduled containers that require persistent storage,” Dang told eWEEK.

CoreOS Torus

In Dang’s view, existing storage solutions can often be difficult to set up, configure and operate while trying to fit them in with modern container cluster infrastructures. Most existing distributed storage systems were designed for small clusters of large machines, rather than large clusters of inexpensive, small machines, he said. In contrast, Torus is designed from the ground up to be cloud-native.

“Torus itself can be deployed in containers and managed using Kubernetes,” Dang said.

CoreOS is a strong supporter of and contributor to Kubernetes, and it packages Kubernetes as part of its commercial Tectonic offering. CoreOS today provides storage capabilities and works with multiple partners including ClusterHQ, the lead sponsor of the open-source Flocker data volume manager.

“Torus is in its early stages, and we are looking forward to working with the community and other vendors to ensure it focuses on being a simple composable component of the cloud-native stack,” Dang said.

On the commercial front, customers should have flexibility and choice over which storage solutions they deploy, according to Dang. He noted that organizations can still use CoreOS components with existing storage solutions offered by CoreOS partners and other vendors. The goal is for Torus to work either alongside existing solutions or as a stand-alone solution, depending on the customer’s environment and use cases.


Data Sharding

Torus makes use of some well-known distributed storage approaches, including sharding and replicating data blocks. With data sharding, stored data is split into small units that are distributed across multiple elements, or shards.
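As a rough illustration of the sharding idea, the sketch below splits a byte string into fixed-size blocks that could then be placed on different machines. It is not taken from the Torus code base: the function name `split_into_blocks` and the tiny 4-byte block size are assumptions chosen to keep the example readable.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Split data into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


# A toy "volume" and a deliberately tiny block size (both hypothetical).
volume = b"abcdefghij"
blocks = split_into_blocks(volume, 4)

# Joining the blocks in order reconstructs the original data.
assert b"".join(blocks) == volume
print(blocks)
```

A real system would use much larger blocks and would also store copies of each block on more than one machine, which is the replication half of the approach.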

“To automatically handle the placement of these blocks, it uses a consistent hash ring, an extensible approach that is referenced in Torus’ name,” Dang said. A torus is a geometric shape similar to a doughnut: a ring with a hole in the middle.
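For readers unfamiliar with the technique, here is a minimal sketch of a generic consistent hash ring. It is not Torus's implementation: the class `HashRing`, the node names, the use of SHA-256, the 64 virtual points per node and the default of two replicas are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Generic consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        self._points = []  # sorted list of (ring position, node name)
        for node in nodes:
            self.add_node(node, vnodes)

    def add_node(self, node, vnodes=64):
        # Each node owns several points so load spreads more evenly.
        for i in range(vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._points = [p for p in self._points if p[1] != node]

    def nodes_for(self, block_id, replicas=2):
        """Return the distinct nodes that should hold copies of a block."""
        if not self._points:
            raise ValueError("ring has no nodes")
        distinct = {node for _, node in self._points}
        wanted = min(replicas, len(distinct))
        # Start at the first point clockwise from the block's position.
        start = bisect.bisect_right(self._points, (_hash(block_id), ""))
        chosen = []
        for offset in range(len(self._points)):
            _, node = self._points[(start + offset) % len(self._points)]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == wanted:
                    break
        return chosen


ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.nodes_for("volume-1/block-0"))
```

The property that makes the ring attractive for storage is that removing a node only changes the placement of blocks that node was responsible for; blocks whose first owner was some other node keep that owner.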

To keep track of the metadata for the storage cluster, the availability of the volumes and the sharding algorithm, Torus uses etcd, which provides a reliable, production-tested, key-value store that enables distributed consensus.

As an added benefit, Dang said that since Torus uses etcd for distributed consensus, it also uses it for auto-discovery of nodes in a container cluster. As such, applications in Kubernetes can discover where storage lives simply by asking etcd, so if an application is rescheduled elsewhere in the cluster and moved from one node to another, it sees the same volume.

Scale Challenges

A key challenge for any distributed system is scale. In its initial iteration, Torus can scale to hundreds of individual nodes. Dang said the storage capacity depends on how large the disks on each node are: with 100 nodes and 10TB of capacity on each, for example, the cluster would offer a petabyte of total storage.
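The arithmetic behind that figure can be checked in a few lines. The sketch below uses decimal units (1TB = 10^12 bytes, 1PB = 10^15 bytes); the helper name `raw_capacity_bytes` and the replication factor of 2 are assumptions for illustration, not numbers given by CoreOS.

```python
TB = 10**12  # decimal terabyte, in bytes
PB = 10**15  # decimal petabyte, in bytes


def raw_capacity_bytes(node_count: int, bytes_per_node: int) -> int:
    """Total raw capacity of a cluster with identical nodes."""
    return node_count * bytes_per_node


total = raw_capacity_bytes(100, 10 * TB)
print(total / PB)  # 1.0, i.e. one petabyte of raw storage

# Hypothetical: if every block were stored twice, the usable
# capacity would be half of the raw figure.
replication_factor = 2
usable = total // replication_factor
print(usable / PB)  # 0.5
```

The one-petabyte figure is therefore a raw total; how much of it is usable depends on how many copies of each block the cluster keeps.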

Although Torus is a cloud-native storage technology, Dang said it is meant to run stand-alone, primarily using local disks.

“Distributed block storage is supported first, but the architecture is extensible to supporting other types of storage such as Amazon S3-style object storage as well,” he said.

As a new effort, there are a number of challenges that face Torus, and not all of those challenges are technical either.

“The big challenge is building a community interested in pushing forward the status quo of distributed storage,” Dang said. “Storage is a challenging problem, but by building a new storage system designed to be cloud-native, we can start to realize the larger goal of GIFEE (Google Infrastructure For Everyone Else).”

Originally published on eWeek