To simplify creation and management of test configurations, the
dragon and x4550 configs have been integrated with udev. Our
current best guess as to how we'll actually manage the disks in
these systems is with a udev mapping scheme. The current leading
scheme is to map each drive to a simple <CHANNEL><RANK> id. In
this mapping each CHANNEL is represented by the letters a-z, and
the RANK is represented by the numbers 1-n. A CHANNEL identifies
a group of RANKs which are all attached to a single controller,
and each RANK represents a single disk. This provides a nice
mechanism for locating a specific drive given a known hardware
configuration, and various hardware vendors use a similar scheme.
A nice side effect of these changes is that it allowed me to make
the raid0/raid10/raidz/raidz2 setup functions generic. This makes
adding new test configs easy; you just need to create a udev rules
file for your test config which conforms to the naming scheme,
along the lines of the sketch below.
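For illustration only, a rules file implementing this scheme just
pins each bus position to a <CHANNEL><RANK> symlink. The rules file
name, the PCI/SCSI paths, and the /dev/disk/zpool/ directory below
are hypothetical, not taken from the dragon or x4550 configs:

  # 99-zpool-example.rules (hypothetical)
  # First controller -> channel 'a', disks in slot order -> rank 1..n
  KERNEL=="sd*[!0-9]", ENV{ID_PATH}=="pci-0000:04:00.0-scsi-0:0:0:0", SYMLINK+="disk/zpool/a1"
  KERNEL=="sd*[!0-9]", ENV{ID_PATH}=="pci-0000:04:00.0-scsi-0:0:1:0", SYMLINK+="disk/zpool/a2"
  # Second controller -> channel 'b'
  KERNEL=="sd*[!0-9]", ENV{ID_PATH}=="pci-0000:05:00.0-scsi-0:0:0:0", SYMLINK+="disk/zpool/b1"

The generic setup functions can then build their vdev lists from the
resulting /dev/disk/zpool/<CHANNEL><RANK> names without knowing
anything about the underlying hardware.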
After spending considerable time thinking about this, I've come to
the conclusion that on Linux systems we don't need Solaris-style
devid support. Instead we can simply use udev if we are careful;
there are even some advantages.
Solaris-style devids are designed to provide a mechanism by which a
device can be opened reliably regardless of its location in the
system. This is exactly what udev provides us on Linux: a flexible
mechanism for consistently identifying the same device regardless
of probing order. We just need to be careful to always open the
device by the path provided at creation time, and this path must be
stored in ZPOOL_CONFIG_PATH. This in fact has certain advantages.
For example, if you always want the zpool to be able to locate a
disk regardless of its physical location, you can create the pool
using /dev/disk/by-id/ paths. This is probably what you'd want on a
desktop system, where the exact location is not that important and
it's more critical that all the disks can be found.
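For instance (the device ids below are made up, the command is just
the usual zpool syntax):

  zpool create tank mirror \
      /dev/disk/by-id/ata-ST3500320AS_9QM03ATL \
      /dev/disk/by-id/ata-ST3500320AS_9QM03B56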
However, in an enterprise setup there's a good chance that the
physical location of each drive is important. You have likely set
things up such that your raid groups span multiple host adapters,
so that you can lose an adapter without downtime. In this case you
would want to use /dev/disk/by-path/ paths to ensure the path
information is preserved and you always open the disks at the right
physical locations. This would keep your system from getting
accidentally misconfigured yet appearing to just work simply
because the zpool was still able to locate the disk.
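For instance, a mirror deliberately spanning two host adapters
might be created like this (the PCI/SCSI paths are hypothetical):

  zpool create tank mirror \
      /dev/disk/by-path/pci-0000:04:00.0-scsi-0:0:0:0 \
      /dev/disk/by-path/pci-0000:05:00.0-scsi-0:0:0:0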
Finally, if you want to get really fancy, you can always create
your own udev rules. This way you can implement whatever lookup
scheme you want in user space for your drives. This includes nice
cosmetic things like being able to control the device names shown
by tools like zpool status, since those names are just based on the
device names.
I've yet to come up with a good reason to implement devid support
on Linux since we have udev. But I've still just commented it out
for now, because somebody might come up with a really good reason
I forgot.
The majority of this patch concerns itself with doing a direct
replacement of Solaris's libdiskmgt library with libblkid+libefi.
You'll notice that this patch removes all libdiskmgt code instead
of ifdef'ing it out. This was done to minimize confusion when
reading the code, because it seems unlikely we will ever port
libdiskmgt to Linux.
Despite the replacement, the behavior of the tools should have
remained the same, with one exception. For the moment, we are
unable to check the partitions of devices which have an MBR-style
partition table when creating a filesystem. If a non-EFI partition
scheme is detected on a whole-disk device, we prompt the user to
explicitly use the force option. It would not be a ton of work to
make the tool aware of MBR-style partitions if this becomes a
problem.
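In practice that looks something like this (the device name is just
an example):

  zpool create tank /dev/sdb      # refused, prompts for the force option
  zpool create -f tank /dev/sdb   # -f overrides the partition check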
I've done basic sanity checking for various configurations, and all
the issues I'm aware of have been addressed, even things like blkid
misidentifying a disk as ext3 when it is added to a zfs pool; I'm
careful to always zero out the first 4k of any new zfs partition.
That said, this is all new code, and while it looks like it's
working right for me, we should keep an eye on it for any strange
behavior.
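The zeroing done by the tools is roughly equivalent to the
following, shown here only to illustrate the idea (the partition
name is hypothetical):

  # clear the first 4k of a new zfs partition so stale signatures,
  # such as an old ext3 superblock, are not picked up by blkid
  dd if=/dev/zero of=/dev/sdb1 bs=4096 count=1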