Gluster: an open-source NAS solution

On October 7th, 2011, Red Hat announced the acquisition of a company called Gluster, which developed a distributed NAS technology based on open standards. This technology is included in a software appliance that can be deployed to share files over the network.

Why is this an interesting move? Because NAS is ideal for storing unstructured data, and that is the fastest-growing area of the storage industry.

But what is structured data as opposed to unstructured? Structured data consists of entries that follow a strict definition, such as defined numbers (order numbers inside a company, for example) or character strings (such as customer IDs). SANs (Storage Area Networks), whether Fibre Channel or iSCSI, are generally a good solution to store this structured data, usually in a database. NAS, however, is the right solution for the incredible quantity of unstructured data that is produced every day from various sources (sensors, digital cameras, spreadsheets, presentations, etc.). According to an ESG report, the market for NAS will grow at a 72% compound annual growth rate (CAGR) from 2010 to 2015!

IT departments wanting to set up a network-attached storage environment have so far had two main options:
– a simple NFS (Network File System) server. This simple and cheap solution can be installed on any Linux or Unix server. However, it concentrates all file accesses on one single server. Replication, failover and disaster recovery are limited, customized and cumbersome processes, and the single server can become a performance bottleneck.
– dedicated appliances based on proprietary technology from EMC, NetApp, etc. Although they have very powerful features and are nicely integrated into enterprise environments, they are very expensive.

What most organizations have been asking for, though, is first to reduce the cost of storing data that grows at incredible rates and, second, to have the capability to “burst” and leverage cloud capabilities such as Amazon’s S3, while managing hybrid environments in an easy manner. Legacy solutions cannot cover these needs, and that is precisely why Gluster was developed.

I was fortunate enough to go to Mountain View, CA for a training with the Gluster people and discover their technology, which is now called Red Hat Storage.

The three strengths of Gluster are its scalability (a cluster can contain up to 64 nodes in supported configurations, and way beyond that), its manageability (you manage your public and private cloud storage blocks from one single point of view) and its reliability (high availability is built in and there is no centralized metadata server, hence no single point of failure).

The Gluster infrastructure is based on commodity hardware, i.e. x86_64 servers from HP, Dell or SuperMicro, with direct-attached storage (DAS), e.g. the disks that are shipped inside the server. The recommended configuration is to have the OS on two disks (RAID 1) and the data on the twelve remaining disks (RAID 6). This storage space is then made available inside the Gluster environment through the network. No need for an expensive array: just take the servers you already know and Gluster will transform them into storage boxes!

From an architectural point of view, it is very important to mention that, although the technology is called GlusterFS, Gluster is not yet another file system. Gluster leverages standard file systems (such as XFS in the software appliance supported by Red Hat) and provides mechanisms to access the data across multiple servers.

The Gluster architecture is based on four elements (the bricks, the nodes, the volumes and the clients) and looks like this:

[Picture courtesy of Red Hat]

– the node: the Gluster software is installed on a commodity hardware server running RHEL. This combination is called a storage node.
– the brick: the storage available to the OS, for example the RAID disks, is formatted with standard XFS (with extended attributes) and mounted on a certain mount point (see the sketch below). A brick equals a mount point.
– the volume: Gluster plays a sort of LVM role by managing bricks distributed across several nodes as one single mount point over the network.
– the clients: the computers that access the data. They can be standard Windows clients (via CIFS), NFS clients, or they can use a specific Gluster client that provides enhancements over NFS, in particular in terms of high availability.
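
To give a concrete idea of what preparing a brick looks like, here is a minimal sketch on a storage node. The device name /dev/sdb (the RAID 6 logical drive, for instance) and the mount point /brick1 are just examples; your hardware will differ. The inode size of 512 bytes follows the recommendation to leave room for Gluster’s extended attributes.

[root@node01]# mkfs.xfs -i size=512 /dev/sdb
[root@node01]# mkdir /brick1
[root@node01]# mount /dev/sdb /brick1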

Example 1 of a Gluster deployment

Let’s take an example: we have two servers, node01 and node02, running RHEL and Gluster. These two servers are identical and have, for the sake of simplicity, one drive on which we want to store data. This drive is formatted with XFS and mounted on, for instance, /brick1. This directory (and mount point) is identical on the two servers node01 and node02, to manage them more easily.

What happens next is that we create one Gluster volume, called volume01 (how creative!), from the brick available on each of the two servers. As I mentioned above, Gluster plays a sort of LVM role by creating one logical disk from the two distributed disks attached to node01 and node02.
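
For the curious, here is roughly what the corresponding commands look like on node01, assuming both bricks are mounted on /brick1 as described above (the exact output and options may vary with the Gluster version):

[root@node01]# gluster peer probe node02
[root@node01]# gluster volume create volume01 node01:/brick1 node02:/brick1
[root@node01]# gluster volume start volume01
[root@node01]# gluster volume info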

Concretely, it means that if I mount the volume over the network from another computer called client1 (for example via the Gluster client) on a local mount point such as /mnt/volume01, I would run the following command:

[root@client1]# mount -t glusterfs node01:/volume01 /mnt/volume01

and I would have access to the capacity of both drives via the network. From a client perspective, no matter what files I store or read, I would never notice that the underlying data is actually distributed across multiple nodes. Moreover, as an administrator of the servers, I could access the files via their mount point without even knowing that Gluster is running, because Gluster leverages the standard components of a Linux infrastructure.

Example 2 of a Gluster deployment

In this example, two business units (marketing and legal) need two different volumes, isolated from each other. We will have roughly the same configuration as before, but with two data disks per server. Each disk on a server is dedicated either to legal or to marketing. From these two disks, we then create two volumes, one called marketing and the other one legal, which are mounted by their respective clients.
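
A possible sketch of the commands, assuming each server has its two data disks mounted on /brick_marketing and /brick_legal (hypothetical names), could be:

[root@node01]# gluster volume create marketing node01:/brick_marketing node02:/brick_marketing
[root@node01]# gluster volume create legal node01:/brick_legal node02:/brick_legal
[root@node01]# gluster volume start marketing
[root@node01]# gluster volume start legal

The marketing clients then mount only the marketing volume, and the legal clients only the legal volume.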

How are the files stored?

In our first example, when the client wants to store a word processing file called “myfile.odt” at a specific location on the volume (for example /gluster/myplace), Gluster takes into account the complete path to the file (in our example /gluster/myplace/myfile.odt) and a mechanism called the elastic hashing algorithm (EHA), based on the Davies-Meyer algorithm, computes a hash that indicates on which node and on which disk the file will be stored. When the file must be retrieved, the client gives the path to the file, Gluster computes the hash again and is then able to find the file on the given node.

The interesting part of this EHA is that if you store, for example, 100 files on a two-node cluster like the one in our first example, the distribution will be quite even. After having saved the 100 files on the volume, and regardless of the complexity of their names, we will end up with roughly 50 files on node01 and roughly 50 files on node02. Why is that so powerful? Because instead of having one single server becoming a bottleneck, the cluster spreads the files across its nodes and ensures that the network bandwidth will not become an issue, which makes for a highly scalable solution.
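
You can easily observe this behavior yourself. Assuming volume01 is mounted on the client under /mnt/volume01 as in the first example, a quick and purely illustrative test looks like this:

[root@client1]# for i in $(seq 1 100); do touch /mnt/volume01/file$i; done
[root@node01]# ls /brick1 | wc -l
[root@node02]# ls /brick1 | wc -l

The two counts should each be roughly 50, give or take a few files depending on how the names hash.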

Another important point is that there is no centralized metadata server. The hash is computed, for example by the client, for every access, which removes a huge single point of failure compared to competitive architectures: if their metadata is broken, the data (and it can be petabytes of it!) is simply gone, with no way to find it back. Gluster, on the other hand, has no such centralized component, and the beauty of it is that there is no proprietary file system underneath. Every file can be accessed from a standard XFS file system, even if the Gluster daemon is shut down on the machine.
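
To illustrate, assuming /gluster is the client’s mount point of volume01 and the hash placed myfile.odt on node01, you could stop the Gluster daemon on that node and still find the file as a plain file on the XFS brick (the paths are of course just examples):

[root@node01]# service glusterd stop
[root@node01]# ls -l /brick1/myplace/myfile.odt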

Mount type glusterfs?

As you can see in the examples above, it is possible to mount the volumes with the mount type NFS or glusterfs. Indeed, in order to mount a Gluster volume in “native” mode, the client needs a specific package installed. The advantages of this native client are that high availability is built in (i.e. if a node fails, access to the replicated data is possible without any disruption) and that the client is able to use the EHA to calculate itself the position of a certain file inside the cluster. It then talks directly to the node that contains the data, which speeds up data access and reduces network traffic.
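
In practice, the two access methods differ only by the mount command on the client. As a sketch, assuming the Gluster NFS export speaks NFSv3 and that the native client package (typically called glusterfs-fuse on RHEL) is installed for the second variant:

[root@client1]# mount -t nfs -o vers=3 node01:/volume01 /mnt/volume01
[root@client1]# mount -t glusterfs node01:/volume01 /mnt/volume01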

What about High-Availability?

Gluster offers the possibility to mirror bricks across the network. This means that if a node fails, the data will still be available via another node. It is of course also possible to combine the distribution of files with replication, for example with four bricks: two used to store the data and two that hold their replicas. Once the node or the brick is available again, Gluster uses a technology called self-healing to update, in the background, all the data that was modified during the downtime, so that the data is identical on both replicas once the self-healing process is done.
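
As a sketch, a distributed and replicated volume over those four bricks (the names are again just examples) would be created with the replica keyword; with this brick ordering, each replica pair spans the two nodes, so either node can fail without losing access to the data:

[root@node01]# gluster volume create volume02 replica 2 node01:/brick1 node02:/brick1 node01:/brick2 node02:/brick2
[root@node01]# gluster volume start volume02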

When it comes to disaster recovery, it is also possible to use a technology called geo-replication, which asynchronously maintains a copy of the data at another site. The recovery site can be a Gluster cluster or another type of storage.
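
As a rough sketch, and keeping in mind that the exact slave syntax depends on the Gluster version, starting geo-replication of volume01 towards a hypothetical remote host could look like this:

[root@node01]# gluster volume geo-replication volume01 root@backupserver:/data/remote-copy start
[root@node01]# gluster volume geo-replication volume01 root@backupserver:/data/remote-copy status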

What are the advantages for my organization ?

Gluster is a great technology that brings a lot of advantages. The highlights are definitely that Gluster:
– increases the availability of the data by replicating it and by having no metadata server, i.e. no single point of failure
– makes the data easier to manage: the command-line interface is very intuitive and can handle petabytes of data in an easy way
– scales to the petabyte level by spreading the data linearly across multiple nodes, hence avoiding the creation of bottlenecks
– lowers the cost of storage by using commodity hardware

I think that Red Hat was very smart to extend its portfolio to storage. Indeed, after the commoditization of the server market from proprietary Unix architectures to standard Linux servers, it is time for the storage vendors to become more open and dramatically increase their affordability. This is just the beginning…
