This change should wrap up the last of the missing block device
support in the vdev_disk layer. With this change I can now
successfully create and use zpools which are layered on top of
md and lvm virtual devices. The changes include:
1) The big one: properly handle the case where a page cannot be added
to a bio due to a dynamic limit imposed by a merge_bdev handler. For
example the md device will limit a bio to the configured stripe
size. Our bio size may also end up being limited by the maximum
request size, and other factors determined during bio construction.
To handle all of the above cases the code has been updated to
handle failures from bio_add_page(). This had been hardcoded to
never fail for the prototype proof of concept implementation. In
the case of a failure the number of bytes which still need to be
added to a bio is returned. New bios are allocated and attached
to the dio until the entire data buffer is mapped to bios. The dio is
then submitted as before to the request queue, and once all the bios
attached to a dio have finished the completion callback is run.
2) The devid comments have been removed because it is not clear to
me that we will need devid support. They have been replaced
with a comment explaining that udev can and should be used.
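The bio_add_page() handling described in 1) can be sketched as a small userspace model. This is not the vdev_disk code: the Bio and Dio classes, map_bio(), map_buffer(), and the 4096 byte page size are all hypothetical names and values chosen for illustration, and the "returns 0 when the page is refused" behavior is an assumption about how the limit shows up.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size for this model


@dataclass
class Bio:
    """Toy stand-in for a kernel bio: a byte limit and a fill level."""
    limit: int      # max bytes this bio accepts (e.g. an md stripe size)
    size: int = 0

    def add_page(self, nbytes):
        """Model of bio_add_page(): bytes added, or 0 if refused."""
        if self.size + nbytes > self.limit:
            return 0
        self.size += nbytes
        return nbytes


@dataclass
class Dio:
    """Toy stand-in for the dio which tracks every bio of one request."""
    bios: list = field(default_factory=list)
    pending: int = 0
    completed: bool = False

    def attach(self, bio):
        self.bios.append(bio)
        self.pending += 1

    def bio_done(self):
        """Called once per finished bio; the last one completes the dio."""
        self.pending -= 1
        if self.pending == 0:
            self.completed = True


def map_bio(bio, remaining):
    """Fill one bio page by page; return the bytes still left to map."""
    while remaining > 0:
        chunk = min(PAGE_SIZE, remaining)
        if bio.add_page(chunk) != chunk:
            break  # the bio is full, the caller must allocate another
        remaining -= chunk
    return remaining


def map_buffer(total, limit):
    """Allocate bios until the whole buffer of 'total' bytes is mapped."""
    dio = Dio()
    remaining = total
    while remaining > 0:
        bio = Bio(limit=limit)
        left = map_bio(bio, remaining)
        if left == remaining:
            raise ValueError("limit too small to make any progress")
        dio.attach(bio)
        remaining = left
    return dio
```

In this model a 10000 byte buffer with an 8192 byte per-bio limit ends up in two bios, and the dio is only marked completed after bio_done() has been called for both.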
For the sake of completeness we need to validate that everything
works well not just on IDE or SCSI drives, but also for a
zpool configured on top of the Linux virtual block devices.
These scripts simplify that testing process, and have shown that
while everything is good on top of a ram disk, the code base
currently panics the kernel when layered on top of either an
md or dm style device. For the moment don't do that.
To simplify creation and management of test configurations the
dragon and x4550 configurations have been integrated with udev. Our
current best guess as to how we'll actually manage the disks in
these systems is with a udev mapping scheme. The current leading
scheme is to map each drive to a simple <CHANNEL><RANK> id. In
this mapping each CHANNEL is represented by the letters a-z, and
the RANK is represented by the numbers 1-n. A CHANNEL should
identify a group of RANKS which are all attached to a single
controller; each RANK represents a disk. This provides a nice
mechanism to locate a specific drive given a known hardware
configuration. Various hardware vendors use a similar scheme.
A nice side effect of these changes is that they allowed me to make
the raid0/raid10/raidz/raidz2 setup functions generic. This
makes adding new test configs easy: you just need to create
a udev rules file for your test config which conforms to the
naming scheme.
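As an illustration of the <CHANNEL><RANK> naming scheme, here is a minimal sketch of how ids could be generated and decoded. The function names drive_id(), parse_id(), and all_ids() are hypothetical and are not part of the scripts or the udev rules; the sketch also assumes channels are numbered from 0 and limited to the 26 letters a-z.

```python
import string


def drive_id(channel, rank):
    """Build an id such as 'a1' from a 0-based channel and 1-based rank."""
    if not 0 <= channel < len(string.ascii_lowercase):
        raise ValueError("channel must map to a letter a-z")
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return f"{string.ascii_lowercase[channel]}{rank}"


def parse_id(name):
    """Split an id such as 'c7' back into (channel, rank)."""
    letter, digits = name[:1], name[1:]
    if letter not in string.ascii_lowercase or not letter:
        raise ValueError("id must start with a letter a-z")
    if not digits.isdigit() or int(digits) < 1:
        raise ValueError("id must end with a rank of 1 or greater")
    return string.ascii_lowercase.index(letter), int(digits)


def all_ids(channels, ranks):
    """List every id for a box with the given channel and rank counts."""
    return [drive_id(c, r)
            for c in range(channels)
            for r in range(1, ranks + 1)]
```

For example, all_ids(2, 3) describes a hypothetical box with two controllers of three disks each, and a generic setup function only needs that list of names rather than any controller specific device paths.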
Remove the hard coded 512 byte SECTOR_SIZE and replace it with
bdev_hardsect_size() to get the correct hardware sector size.
Usage of get_capacity() was incorrect. When the block_device
references a partition we need to return bdev->part->nr_sects.
If get_capacity() is used the entire device size will be returned,
ignoring partition information. This is however the correct thing
to do when the block device in question has no partition table.
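The size selection described above can be modeled in a few lines. This is a userspace sketch, not the kernel code: the Partition and BlockDevice classes and vdev_sectors() are hypothetical, and disk_capacity simply stands in for whatever get_capacity() would report for the whole disk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Partition:
    nr_sects: int  # sectors covered by this partition only


@dataclass
class BlockDevice:
    disk_capacity: int                 # sectors in the whole disk
    part: Optional[Partition] = None   # set when opened via a partition


def vdev_sectors(bdev):
    """Return the partition size if there is one, else the disk size."""
    if bdev.part is not None:
        return bdev.part.nr_sects
    return bdev.disk_capacity


whole_disk = BlockDevice(disk_capacity=1000000)
first_part = BlockDevice(disk_capacity=1000000, part=Partition(250000))
```

Here vdev_sectors(first_part) gives 250000 rather than the 1000000 sectors of the underlying disk, which is the behavior the fix is after.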