hp msa1000 on Linux
I've been setting up an hp msa1000 Fibre Channel disk array with one of our Linux machines.
Here are some entirely unofficial notes. If you're actually setting such a unit up, please don't rely on these at all but rather go and read the careful and detailed hp instruction manuals. This is just an overview. (My first hint is that the manuals are mostly not in printed form, but rather on a CD in the little box.)
The msa1000 is basically a rack-mount box with controllers, ginormous power supply/blower units, and about 14 SCSI disks. It can be configured to have no single point of failure, with redundant power supplies and lines, redundant disks, mirrored controllers, and mirrored data paths. Or you can simply attach it directly to a single machine, as I am doing for testing.
There are two SCSI ports on the back which you can use to attach additional disk enclosures. You cannot use these to attach it to a computer. With a firmware upgrade you can also attach SATA enclosures which give you slower, possibly less reliable but much cheaper storage. This sounds like a good deal to me for some applications: RAID is meant to be a redundant array of inexpensive disks, and the msa1000 lets you do the redundancy without too much trouble.
The msa1000 attaches to a computer through the Fibre Channel port on the back of the controller. You can either connect this straight through to a FC Host Bus Adapter PCI card on the computer, or you can go through an FC switch, allowing several machines to share the storage.
From the point of view of the host, the msa is basically a big block device, like a SCSI disk. The computer can't directly see the 14-odd disks inside the array though: it sees a RAID abstraction of those disks presented by the controller. This logical disk is called a unit.
Before you can do anything useful from the host, you need to configure the msa and make a storage unit. There are two ways to do this: either by running some hp software (Insight Manager or Array Configuration Utility) on the host, or by connecting to the serial console on the front. (The serial console needs a special RJ11 cable which comes in the box.) I used the serial console.
The main operation you need to do on the serial console is an ADD UNIT command, which allocates some of the disks for storage. I made a single unit out of all of the disks, using RAID-5 (which costs one disk's worth of space for parity) plus one hot spare. This is exposed to the Linux host as a SCSI LUN.
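As a rough sanity check on how much space a layout like this leaves you, here is a minimal sketch of the arithmetic. It assumes a single RAID-5 unit, where one disk's worth of space goes to parity and hot spares hold no data until a disk fails. The function names and the 72 GB disk size are made up for illustration; the msa firmware will report slightly different numbers because of its own overhead.

```python
def raid5_usable_disks(total_disks: int, hot_spares: int = 1) -> int:
    """Number of disks' worth of usable space in a single RAID-5 unit."""
    active = total_disks - hot_spares  # spares sit idle until a failure
    if active < 3:
        raise ValueError("RAID-5 needs at least three active disks")
    return active - 1  # one disk's worth of space is used for parity


def raid5_usable_gb(total_disks: int, disk_gb: float, hot_spares: int = 1) -> float:
    """Usable capacity in GB, ignoring any controller or filesystem overhead."""
    return raid5_usable_disks(total_disks, hot_spares) * disk_gb


if __name__ == "__main__":
    # 14 disks of 72 GB each are example figures, not measurements.
    print(raid5_usable_disks(14))
    print(raid5_usable_gb(14, 72.0))
```

So with 14 disks, one spare and RAID-5, you would expect about 12 disks' worth of storage to show up on the host.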
You may be able to ask Linux to rescan and hot-add that device. That didn't seem to work with the old kernel on my machine so I just rebooted, and it discovered the disk as sda. It looks like there is no partition table for these units.
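For reference, this is roughly what a rescan request looks like. It is only a sketch: older kernels take an "add-single-device" command written to /proc/scsi/scsi, and newer ones have a per-host scan file under /sys. The host, channel, target and LUN numbers are placeholders you would have to work out for your own HBA, the helper names are mine, and as noted above this did not help on my old kernel. Writing to either file needs root.

```python
import os


def legacy_add_device_command(host: int, channel: int, target: int, lun: int) -> str:
    """Command string accepted by /proc/scsi/scsi on older kernels."""
    return f"scsi add-single-device {host} {channel} {target} {lun}"


def sysfs_scan_path(host: int) -> str:
    """Per-host scan file found on kernels that have sysfs."""
    return f"/sys/class/scsi_host/host{host}/scan"


def request_rescan(host: int, channel: int = 0, target: int = 0, lun: int = 0) -> str:
    """Ask the kernel to look for a new device. Returns the file written to."""
    path = sysfs_scan_path(host)
    if os.path.exists(path):
        payload = "- - -\n"  # wildcard channel, target and LUN
    else:
        path = "/proc/scsi/scsi"
        payload = legacy_add_device_command(host, channel, target, lun) + "\n"
    with open(path, "w") as f:
        f.write(payload)
    return path


if __name__ == "__main__":
    # Only prints the command; call request_rescan() yourself as root.
    print(legacy_add_device_command(0, 0, 0, 1))
```

If the rescan works, the new disk should appear in the kernel log without a reboot.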
At this point I suppose you have a choice of doing Linux LVM or simply creating a filesystem directly on the device. It may seem a bit redundant to run Linux LVM on top of hardware RAID, and perhaps it is.
On the other hand, LVM is more flexible than the RAID done in hardware: for example, LVM can reduce the size of a logical volume, but the msa firmware cannot.
I'm going to try XFS on this disk for testing; apparently it performs well on big arrays. I may have some results later.
This kind of array can also be simultaneously accessed by multiple machines running something like GFS, Lustre or ClusterFS but I'm not trying that now.
p.s. sneakums asks:
I'm not familiar with these units, but can one not just fdisk it like any other disk? Even if I were only going to use a single partition, I'd be inclined to partition anyway for consistency's sake.
I kind of agree about consistency, but on the other hand a partition table is one more thing you can get wrong if you try to expand the array later.
In a brief look I could not see any advice from hp on whether you should make a partition table on the logical unit or not.
I forget if it matters to LVM itself, but it's typical for LVM PVs to be partitions of type 0x8e. If you were to use MD to stripe across a bunch of these units, you would have to partition for array autostart to work, since MD will only consider partitions of type 0xfd.
Right, but since you cannot (?) boot off these devices I don't think autostart matters too much, and without autostart the partition types don't seem to matter.
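If you do partition, the type bytes discussed above are easy to inspect. Here is a small sketch that pulls them out of a DOS partition table: the table starts at byte 446 of the first sector, has four 16-byte entries with the type in the fifth byte of each, and ends with the 0x55 0xAA signature. The function name and the lookup table are just for illustration.

```python
from typing import List

TABLE_OFFSET = 446   # the partition table starts here in sector 0
ENTRY_SIZE = 16      # four entries of 16 bytes each
TYPE_NAMES = {0x83: "Linux", 0x8E: "Linux LVM", 0xFD: "Linux raid autodetect"}


def partition_types(sector: bytes) -> List[int]:
    """Return the type byte of each of the four primary partition entries."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("no DOS partition table signature")
    types = []
    for i in range(4):
        entry = sector[TABLE_OFFSET + i * ENTRY_SIZE:TABLE_OFFSET + (i + 1) * ENTRY_SIZE]
        types.append(entry[4])  # byte 4 of each entry is the partition type
    return types


if __name__ == "__main__":
    # Build a fake first sector with one LVM partition, rather than reading a real disk.
    fake = bytearray(512)
    fake[TABLE_OFFSET + 4] = 0x8E
    fake[510:512] = b"\x55\xaa"
    for t in partition_types(bytes(fake)):
        print(hex(t), TYPE_NAMES.get(t, "unused or other"))
```

To try it on a real device you would read the first 512 bytes of something like /dev/sda as root and pass them in.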
Trying to create a GPT table (common on ia64) fails with an I/O error, but creating a DOS partition table works. Micah Parrish tells me:
This is due to a longstanding 2.4 kernel bug where it is impossible to read or write to the last block on devices with odd numbers of blocks, such as an MSA1000 logical unit. There is a kernel patch floating around which adds a couple of ioctls allowing you to access these blocks directly, but it isn't in the kernel.org sources. One rather old version of it is at this url.
This should be fixed more elegantly in 2.6 kernels. Parted has some special code to use the ioctls on 2.4 kernels. I believe that Red Hat and SuSE include the patch in their 2.4 kernels, but Debian may not.
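Here is a sketch of why this would hit GPT but not a DOS table. GPT keeps a backup header in the very last sector of the device, whereas a DOS table lives entirely in sector 0. The 1024-byte I/O block size below is my assumption about the mechanism, not something stated above: if the kernel only does I/O in whole blocks of two sectors, the final sector of an odd-sized device is never covered. The function names and sector counts are made up for the example.

```python
SECTOR_BYTES = 512


def backup_gpt_lba(total_sectors: int) -> int:
    """GPT stores its backup header in the last sector of the device."""
    return total_sectors - 1


def last_sector_reachable(total_sectors: int, io_block_bytes: int = 1024) -> bool:
    """True if whole I/O blocks of the given size cover the last sector."""
    sectors_per_block = io_block_bytes // SECTOR_BYTES
    covered = (total_sectors // sectors_per_block) * sectors_per_block
    return backup_gpt_lba(total_sectors) < covered


if __name__ == "__main__":
    for sectors in (1000, 1001):  # an even-sized and an odd-sized toy device
        print(sectors, last_sector_reachable(sectors))
```

With 512-byte I/O blocks every sector is covered, so the same check passes for both sizes.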
posted Tue 24 Aug 2004 in /software/linux
Copyright (C) 1999-2007 Martin Pool.