
Gluster: an open-source NAS solution

On October 7th 2011, Red Hat announced the acquisition of a company called Gluster, which has developed a distributed NAS technology based on open standards. This technology is included in a software appliance that can be deployed to share files over the network.

Why is this an interesting move? Because NAS is ideal for storing unstructured data, and that is the fastest-growing area in the storage industry.

But what is structured data, as opposed to unstructured? Structured data consists of entries that follow a strict definition, such as defined numbers (order numbers inside a company, for example) or character strings (such as customer IDs). SANs (Storage Area Networks) based on Fibre Channel or iSCSI are generally a good solution for storing this structured data, usually in a database. NAS, however, is the right solution for the incredible quantity of data that is produced every day from various sources (sensors, digital cameras, spreadsheets, presentations, etc.). According to an ESG report, the NAS market will grow at a 72% compound annual growth rate (CAGR) from 2010 to 2015!

IT departments wanting to set up a network-attached storage environment have so far had two main options:
– a simple NFS (Network File System) server. This simple and cheap solution can be installed on any Linux or Unix server. However, it concentrates all file accesses on one single server. Replication, failover and disaster recovery are limited, custom-built and cumbersome processes, and the single server can become a performance bottleneck.
– dedicated appliances based on proprietary technology from EMC, NetApp, etc. Although they have very powerful features and integrate nicely into enterprise environments, they are very expensive.

What most organizations have been asking for, though, is first to reduce the cost of storing this data that grows at incredible rates and, second, to be able to “burst” and leverage cloud capabilities, such as Amazon’s S3, while managing hybrid environments in an easy manner. Legacy solutions cannot cover these needs, and that is precisely why Gluster was developed.

I was fortunate enough to go to Mountain View, CA for a training with the Gluster people and discover their technology, which is now called Red Hat Storage.

The three strengths of Gluster are its scalability (a cluster can contain up to 64 nodes with official support, and can go way beyond that), its manageability (you manage your public and private cloud storage blocks from one single point) and its reliability (high availability is built in and there is no centralized metadata server, hence no single point of failure).

The Gluster infrastructure is based on commodity hardware, i.e. x86_64 servers from HP, Dell or SuperMicro with direct-attached storage (DAS), e.g. the disks that are shipped inside the server. The recommended configuration is to have the OS on two disks (RAID1) and the data on the twelve remaining disks (RAID6). This storage space is then made available within the Gluster environment over the network. No need for an expensive array: just take the servers you already know and Gluster will transform them into storage boxes!

From an architectural point of view, it is very important to mention that, although the technology is called GlusterFS, Gluster is not yet another file system. Gluster leverages standard file systems (such as XFS in the software appliance supported by Red Hat) and provides mechanisms to access the data across multiple servers.

The Gluster architecture is based on four elements (the bricks, the nodes, the volumes and the clients) and looks like this:

[Picture courtesy of Red Hat]

– the node: the Gluster software is installed on a commodity server running RHEL. This combination is called a storage node.
– the brick: the storage available to the OS, for example the RAID disk set, is formatted with standard XFS (with extended attributes) and mounted to a certain mount point. A brick equals a mount point.
– the volumes: Gluster plays a role somewhat similar to LVM by managing bricks distributed across several nodes as one single mount point over the network.
– the clients: the computers that access the data. They can be standard Windows clients (via CIFS), NFS clients, or they can use a specific Gluster client that provides enhancements over NFS, notably in terms of high availability.

Example 1 of a Gluster deployment

Let’s take an example: we have two servers -node01 and node02- running RHEL and Gluster. These two servers are identical and have, for the sake of simplicity, one drive each on which we want to store data. This drive is formatted with XFS and mounted to, for instance, /brick1. This directory (and mount point) is identical on the two servers node01 and node02, to make them easier to manage.
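For reference, preparing such a brick could look roughly like this on each node (assuming a hypothetical data disk /dev/sdb; the -i size=512 option gives XFS the larger inodes recommended for Gluster’s extended attributes):

[root@node01]# mkfs.xfs -i size=512 /dev/sdb
[root@node01]# mkdir -p /brick1
[root@node01]# mount /dev/sdb /brick1

The same commands would be run on node02.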

What happens next is that we create one Gluster volume, called volume01 -how creative!- from the bricks available on the two servers. As I mentioned above, Gluster plays a role similar to LVM, by creating one logical storage space from the two distributed disks attached to node01 and node02.
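As a rough sketch (the exact syntax may vary with the Gluster version), creating such a distributed volume could look like this:

[root@node01]# gluster peer probe node02
[root@node01]# gluster volume create volume01 node01:/brick1 node02:/brick1
[root@node01]# gluster volume start volume01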

Concretely, this means that if I want to mount the volume over the network from another computer, called client1 (for example via the Gluster client), to a local mount point such as /mnt/volume01, I would run the following command:

[root@client1]# mount -t glusterfs node01:/volume01 /mnt/volume01

and I would have access to the capacity of both drives over the network. From a client perspective, no matter which new files I store or which files I read, I would not know that the underlying data is actually distributed across multiple nodes. Moreover, if I were an administrator of the servers, I could access the files via their mount points without even knowing that Gluster is running, because it leverages the standard components of a Linux infrastructure.

Example 2 of a Gluster deployment

In this example, two business units (marketing and legal) need two different volumes, isolated from each other. We will have roughly the same configuration as before, but with two data disks per server. Each disk on each server will be dedicated either to legal or to marketing. From these two sets of disks, we will then create two volumes, one called marketing and the other called legal, which will be mounted by their respective clients.
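Sketched with hypothetical brick mount points (/brick_mkt and /brick_legal on each node), the creation of the two volumes could look like this:

[root@node01]# gluster volume create marketing node01:/brick_mkt node02:/brick_mkt
[root@node01]# gluster volume create legal node01:/brick_legal node02:/brick_legal
[root@node01]# gluster volume start marketing
[root@node01]# gluster volume start legal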

How are the files stored?

In our first example, when the client wants to store a word-processing file called “myfile.odt” at a specific location on the volume (for example /gluster/myplace), Gluster takes into account the complete path to the file (in our example /gluster/myplace/myfile.odt), and a mechanism called EHA (elastic hashing algorithm), based on the Davies-Meyer algorithm, computes a hash that indicates on which node and on which disk the file will be stored. When the file must be retrieved, the client provides the path to the file, Gluster computes the hash again and is then able to find the file on the right node.

The interesting part of this EHA is that if you store, for example, 100 files on a two-node cluster like the one in our first example, the distribution is quite even. After saving the 100 files on the volume, and regardless of the complexity of their names, we end up with roughly 50 files on node01 and roughly 50 files on node02. Why is that so powerful? Because instead of one single server becoming a bottleneck, the cluster spreads the files across its nodes and ensures that the network bandwidth does not become an issue, which makes for a highly scalable solution.
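This is easy to check for yourself. As a quick experiment (assuming the volume is mounted on /mnt/volume01 as in the first example), create 100 files from the client and count what lands on each brick:

[root@client1]# for i in $(seq 1 100); do touch /mnt/volume01/file$i; done
[root@node01]# ls /brick1 | wc -l
[root@node02]# ls /brick1 | wc -l

Each node should report roughly 50 files.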

Another important point is that there is no centralized metadata server. The hash is computed, for example by the client, for every access, which removes a huge single point of failure compared to competing architectures: there, if the metadata is broken, the data (and it can be petabytes of it!) is simply gone, with no way to get it back. Gluster, on the other hand, has no such centralized architecture, and the beauty of it is that there is no proprietary file system underneath. Every file can be accessed from a standard XFS file system, even if the Gluster daemon is shut down on the machine.
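You can see this for yourself with a quick check (sketched here on the setup of the first example): stop the Gluster management daemon on a node and look at the brick directly.

[root@node01]# service glusterd stop
[root@node01]# ls -lR /brick1

The files stored on that node are still plain files on a plain XFS file system.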

Mount type glusterfs?

As you can see in the examples above, it is possible to mount the volumes with either the NFS or the glusterfs mount type. Indeed, in order to mount a Gluster volume in “native” mode, the client needs a specific package installed. The advantages of this native client are that high availability is built in (i.e. if a node fails, access to the replicated data is possible without any disruption) and that the client can use the EHA to compute the location of a given file inside the cluster by itself, so it talks directly to the node that holds the data, thus speeding up data access and reducing network traffic.
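For comparison, the same volume can also be mounted without the native client, through Gluster’s built-in NFS server (NFSv3). Depending on the version, the command could look something like this:

[root@client1]# mount -t nfs -o vers=3,mountproto=tcp node01:/volume01 /mnt/volume01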

What about High-Availability?

Gluster offers the possibility to mirror bricks across the network. This means that if a node fails, the data is still available via another node. It is of course also possible to combine the distribution of files with replication, for example with four bricks: two used to store the data and two that are their replicas. Once the node or the brick is available again, Gluster uses a technology called self-healing to update -in the background- all the data that was modified during the downtime, so that the data is identical on both replicas once the self-healing process is done.
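Such a distributed-replicated volume could be sketched as follows (node03 and node04 are hypothetical additional nodes): with “replica 2”, the bricks are grouped into mirrored pairs in the order they are listed, and files are then distributed across those pairs.

[root@node01]# gluster volume create volume01 replica 2 node01:/brick1 node02:/brick1 node03:/brick1 node04:/brick1
[root@node01]# gluster volume start volume01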

When it comes to disaster recovery, it is also possible to use a technology called geo-replication, which asynchronously maintains a copy of the data at another site. The recovery site can be a Gluster cluster or another type of storage.
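As a rough sketch with a hypothetical remote host and path (the exact slave syntax depends on the Gluster version), starting and checking geo-replication could look like this:

[root@node01]# gluster volume geo-replication volume01 drsite:/data/dr-copy start
[root@node01]# gluster volume geo-replication volume01 drsite:/data/dr-copy status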

What are the advantages for my organization?

Gluster is a great technology that brings a lot of advantages. The highlights are definitely that Gluster:
– increases the availability of the data by replicating it and by having no metadata server, i.e. no single point of failure
– makes the data easier to manage: the command-line interface is very intuitive and can handle petabytes of data in an easy way
– scales to the petabyte level by spreading the data linearly across multiple nodes, hence avoiding bottlenecks
– lowers the cost of storage by using commodity hardware

I think that Red Hat was very smart to extend its portfolio to storage. Indeed, after the commoditization of the server market from proprietary Unix architectures to standard Linux servers, it is time for the storage vendors to become more open and dramatically more affordable. This is just the beginning…

Install HP Virtual Rooms on Fedora 16

As a partner of HP, I use their collaboration platform HP Virtual Rooms, which is also available on Red Hat Linux. As I use Fedora, I needed to install some more packages. Here is what I did:

# wget https://www.rooms.hp.com/vRoom_Cab/hpvirtualrooms-install64-F4-8.0.0.4282.tar.gz

# tar -xzvf hpvirtualrooms-install64-F4-8.0.0.4282.tar.gz

# cd hpvirtualrooms-install

# ./install-hpvirtualrooms
virtualrooms-install : /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory

Then I learned a cool feature of yum: you can simply give it the file that you need, and yum will download and install the package that provides that file. For example:

# yum -y install /lib/ld-linux.so.2

So, all in all, you need to install the following packages:

# yum -y install glibc-2.14.90-24 libXi.so.6 libSM.so.6 libXrender.so.1 libXrandr.so.2 libz.so.1 libglib-2.0.so.0 libXfixes.so.3 libasound.so.2 libfontconfig.so.1 libpng12.so.0 libGLU.so.1

and then test it.

Move to Red Hat

After four years spent at HP, I accepted an offer to work for Red Hat, thus moving from Stuttgart to Munich.

I spent four amazing years at HP, surrounded by fantastic and dedicated people, and I learned a tremendous amount. I acquired sound technical knowledge of enterprise IT environments and also learned a lot from my mentor and my colleagues. I also attended very useful and interesting sales and soft-skills training sessions.

This change to Red Hat is quite a challenge. First of all, the company’s business is radically different. I have always thought it right to sell Free and Open Source Software, and this is a great opportunity to do my job according to my ethical principles. Moreover, I am moving from the biggest IT company in the world, with 300,000 employees (not counting all the contractors and partners), to a company of roughly 4,000 people. Red Hat is clearly not a start-up any more, but it is way smaller and things need to be handled in a creative way.

I’ll take on a new role at Red Hat, supporting systems integrators, OEMs (such as HP) and ISVs from a presales perspective at the EMEA level. I look forward to passing my Red Hat Certified Engineer certification and to learning a lot. The fact that KVM is installed and ready to create virtual machines on all PCs inside the company is a great sign of geekiness, and that is already a good start!

HP brings x86 to the Superdome!

Big announcements from HP!
As already rumored internally, the next generation of Superdome 2 servers will be able to use x86 processors, such as the Intel Xeon, and run x86_64 Linux natively!

As stated in this press conference, HP has launched a project called “Odyssey” that will probably be a complete game changer in the x86 industry.

So far, only HP-UX could run on a Superdome, but now customers will be able to run both HP-UX and Linux in the same Superdome server. The lowest-level virtualization layer of the Superdome is the nPar (node partition), an electrically isolated group of Superdome cells (the picture on the right shows the SD2 enclosure populated with cell blades). As nPars are electrically isolated from each other, it will be possible to have some nPars equipped with Xeon CPUs and others with Itanium CPUs, just as the first generation of Superdomes could run PA-RISC and Itanium processors in different nPars in the same server. A mix of CPU types or families within a single nPar will not be possible.

Of course, the HP-UX cell blades will need Itanium CPUs and the Linux cell blades will need Xeon CPUs (as Linux is not supported on the latest Itanium-based servers). However, this opens the door to bringing Linux to new levels of availability, making use, for example, of the highly available crossbar of the Superdome 2, which routes all I/O signals from the I/O extenders (which contain the PCIe cards) to the cell blades. This crossbar is able to retry transactions and to reroute signals to make sure that every I/O is performed accurately.

HP-UX will not be ported to x86; it will continue to run on the Integrity blades and rx2800 i2 rack-mount servers, as well as on the Superdome cells with Itanium CPUs. Also, this integration will only cover Intel Xeon processors, not AMD Opterons. The development of HP-UX will continue, as the Itanium roadmap still has two CPUs, codenamed “Poulson” and “Kittson”, to be delivered in the future.

It would become possible to run Linux on 32 sockets, i.e. 320 cores or 640 threads with the current Xeon CPUs (the core count of Intel’s next server platform, codenamed Sandy Bridge, is not clear as of now)! That is huge, and great news for all the customers who want to move smoothly from Unix to Linux, or who need scale-up servers going beyond the 8 sockets offered by most vendors.

The Integrity blades, which are very modular (they can be extended from two sockets to four and even eight sockets by simply combining blades and linking them with the blade link pictured below), will also be made available with Xeon processors.

The new servers (Superdome 2 and scalable blades) are planned for 2013.

Finally, HP announced that the Linux high-availability portfolio would be aligned with the HP-UX one, which means that ServiceGuard for Linux (which was discontinued two years ago) will be reactivated.

I think that all these announcements are great news for Linux customers who wanted to push their Linux infrastructures to mission-critical levels. Although HP-UX still has a clear roadmap, the attractiveness of the Xeon processor with Linux on such a scalable and available platform will be very strong.

This could also be interesting for customers of other commercial Unix flavors, as it brings amazing scale-up capabilities to Linux on x86, the most open platform.

Switching from Gnome3 to KDE

I used Gnome for years – roughly since Ubuntu Warty Warthog came out seven years ago. I liked the way the desktop was organized, and I could even use 3D effects to make it very eye-catching. When I switched to Fedora last year, I stayed on the Gnome desktop, which, in the end, changed very little for me visually (besides having a blue theme rather than a brownish one).

I have never been too impressed with what Ubuntu came up with, such as the Ubuntu Netbook Edition, although it did the job on the HP Mini I had. Granted, I never tried Ubuntu Unity, but even so, I wanted to stick with Fedora and Gnome…

But then came Gnome3, the new version of this Linux desktop.
It is not that I dislike the new Gnome shell. It is very pretty, actually. The problem for me is that, although it is pretty, the Gnome developers and designers could not bring two more factors into the equation the way Apple, for example, does: intuitiveness and my style of working on the desktop -which I am sure I share with quite a few people-. I am certainly no Apple fan, quite the contrary actually, but I really missed something here that is not only beautiful and user-friendly, but also productive.

I need different fixed desktops for my music, for browsing the web, for working on documents and for reading my emails. This is simply not possible with Gnome3, since the desktops automatically close when they are empty, which changes their order. There is probably a way to fix that, but this was not the only issue…

Another issue is that the 3D effects were far too slow to be usable. After opening the fourth application, my desktop, which works like a charm under Gnome2 and KDE, became really too slow. Not an option for me.

The last thing that convinced me to switch to KDE was that my wireless device worked on the LiveCD but not after installation. That really upset me, especially given that it works well under KDE (hence not a Linux kernel problem).

For these three reasons: because it is not sufficiently intuitive, because it is too slow, and because I did not want to lose 20 hours fixing a wifi stick that works fine on another desktop, I switched to KDE.

I really think that the new Gnome shell is pretty and has value for some users, but not enough for me. That is something I love about Free Software: you don’t like what you have? Then switch to something else! Competition is definitely good for everybody…