18 Commits

Author SHA1 Message Date
sckobras d49d9c2bdc vdev_id: implement slot numbering by port id
With HPE hardware and hpsa-driven SAS adapters, only a single phy is
reported, but no individual per-port phys (i.e. no phy* entry below
port_dir), which breaks topology detection in the current sas_handler
code. Instead, slot information can be derived directly from the port
number. This change implements a new slot keyword "port" similar to
"id" and "lun", and assumes a default phy/port of 0 if no individual
phy entry can be found. This allows the "sas_direct" topology to be
used with current HPE Dxxxx and Apollo 45xx JBODs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Daniel Kobras <d.kobras@science-computing.de>
Closes #6484
2017-08-14 15:18:26 -07:00
Ned Bass 2a152383a2 vdev_id: fix failure due to multipath -l bug
Udev may fail to create the expected symbolic links in
/dev/disk/by-vdev on systems with the
device-mapper-multipath-0.4.9-100.el6 package installed. This affects
RHEL 6.9 and possibly other downstream distributions.

That version of the multipath command may incorrectly list a drive
state as "unknown" instead of "running". The issue was introduced
in the patch for https://bugzilla.redhat.com/show_bug.cgi?id=1401769

The vdev_id udev helper uses the state reported by "multipath -l" to
detect an online component disk of a multipath device in order to
resolve its physical slot and enclosure. Changing the command
invocation to "multipath -ll" works around the above issue by causing
multipath to consult additional sources of information to determine
the drive state.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Closes #6039
2017-04-20 12:10:55 -07:00
Andreas Buschmann bba365cfc8 Add extra keyword 'slot' to vdev_id.conf
Add new keyword 'slot' to vdev_id.conf.
This selects where the slot number for a SAS/SATA disk is taken from.
It is needed to enable access to the physical position of a disk in a
Supermicro 2027R-AR24NV.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Closes #3693
2015-08-30 10:03:56 -07:00
Ned Bass 2d9d57b0fb vdev_id: use mawk-compatible regular expression
Slot mapping in vdev_id doesn't work on systems using mawk as the 'awk'
alternative. A regular expression in map_slot() contains an unquoted
empty string following the alternation (|) operator, which results in a
"missing operand" error with mawk. The solution is to rearrange the
expression so the alternation has two operands.
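
The rearrangement can be illustrated with a minimal sketch. The helper name
channel_matches and the channel name "A" are hypothetical, and the sketch
assumes channel names contain no regular expression metacharacters; it is not
the code from map_slot() itself.

```sh
#!/bin/sh
# Hypothetical helper: channel_matches FIELD CHANNEL
# Succeeds if FIELD is empty or equals CHANNEL.
# A pattern such as /^(A|)$/ leaves the alternation without a right-hand
# operand; /^A$|^$/ expresses the same match with two operands.
channel_matches() {
    printf '%s\n' "$1" |
        awk -v re="^$2\$|^\$" '$0 ~ re { ok = 1 } END { exit !ok }'
}

channel_matches "A" "A" && echo "named channel matches"
channel_matches ""  "A" && echo "empty field matches"
channel_matches "B" "A" || echo "other channel does not match"
```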

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes zfsonlinux/pkg-zfs#136
Closes zfsonlinux/zfs#2965
2014-12-19 12:05:16 -08:00
Ned Bass 09d0b30fd1 vdev_id: support per-channel slot mappings
The vdev_id udev helper currently applies slot renumbering rules to
every channel (JBOD) in the system.  This is too inflexible for systems
with non-homogeneous storage topologies.  The "slot" keyword now takes
an optional third parameter which names a channel to which the mapping
will apply.  If the third parameter is omitted then the rule applies to
all channels.  The first-specified rule that can match a slot takes
precedence.  Therefore a channel-specific rule for a given slot should
generally appear before a generic rule for the same slot number.  In
this way a custom slot mapping can be applied to a particular channel
and a default mapping applied to the rest.
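
The first-match rule described above can be sketched as follows. This is only
an illustration: the helper name map_slot_sketch is hypothetical, and the rule
layout "slot <old> <new> [<channel>]" is an assumption based on the
description, not a copy of the vdev_id code.

```sh
#!/bin/sh
# Hypothetical helper: map_slot_sketch SLOT CHANNEL < rules
# Assumes rule lines of the form: slot <old> <new> [<channel>]
# Prints the new slot from the first rule that matches, or SLOT unchanged
# if no rule matches. A rule without a channel applies to every channel.
map_slot_sketch() {
    awk -v slot="$1" -v chan="$2" '
        $1 == "slot" && $2 == slot && (NF < 4 || $4 == chan) {
            print $3; found = 1; exit
        }
        END { if (!found) print slot }
    '
}

# The channel-specific rule appears before the generic rule for slot 1.
rules='slot 1 5 A
slot 1 9'

printf '%s\n' "$rules" | map_slot_sketch 1 A   # channel-specific rule applies
printf '%s\n' "$rules" | map_slot_sketch 1 B   # generic rule applies
printf '%s\n' "$rules" | map_slot_sketch 2 A   # no rule, slot is unchanged
```

If the two rules were swapped, the generic rule would match first and the
channel-specific mapping for channel A would never be used.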

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2056
2014-01-17 11:17:54 -08:00
Simon Guest 383efa5743 Fix multipath bug in vdev_id caused by inconsistent field numbering
The bug is caused by multipath output like this:

35000c50056bd77a7 dm-15 HP,MB3000FCWDH
size=2.7T features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| `- 2:0:16:0 sdq  65:0    active undef running
`-+- policy='round-robin 0' prio=0 status=enabled
  `- 4:0:52:0 sdfp 130:176 active undef running

Note that the pipe symbols mean that the field numbering is different
between the sdq and sdfp lines.  The fix edits out the pipe symbols.
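
A minimal sketch of the idea, using the two path lines quoted above. The
helper name path_dev is hypothetical and the sed/awk pipeline is an
assumption about how such a fix could look, not the actual patch.

```sh
#!/bin/sh
# Path lines copied from the multipath output above.
first_path='| `- 2:0:16:0 sdq  65:0    active undef running'
second_path='  `- 4:0:52:0 sdfp 130:176 active undef running'

# Hypothetical helper: print the device name from one path line.
# Pipe symbols are replaced with spaces first, so the device name is the
# third whitespace-separated field on both kinds of line.
path_dev() {
    printf '%s\n' "$1" | sed 's/|/ /g' | awk '{ print $3 }'
}

path_dev "$first_path"
path_dev "$second_path"
```

Without the sed step, the leading pipe on the first line counts as a field of
its own, so the third field there would be 2:0:16:0 rather than the device
name.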

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1692
2013-12-10 09:58:35 -08:00
Ned Bass ba43f4565a vdev_id: improve keyword parsing flexibility
The vdev_id udev helper strictly requires configuration file keywords
to always be anchored at the beginning of the line and to be followed
by a space character.  However, users may prefer to use indentation or
tab delimitation.  Improve flexibility by simply requiring a keyword
to be the first field on the line.
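
A minimal sketch of first-field matching, assuming awk is used for parsing.
The helper name match_keyword and the sample device paths are hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: match_keyword KEYWORD < config
# Prints the lines whose first field equals KEYWORD. With the default field
# separator, awk ignores leading whitespace and splits on runs of spaces and
# tabs, so indented and tab-delimited lines are accepted.
match_keyword() {
    awk -v kw="$1" '$1 == kw'
}

# Three sample lines: space-delimited, tab-indented, and a near miss.
printf 'alias d1 /dev/sda\n\talias\td2\t/dev/sdb\naliasx d3 /dev/sdc\n' |
    match_keyword alias
```

Only the first two sample lines are printed; "aliasx" is a different first
field and is not matched.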

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1239
2013-01-25 13:44:32 -08:00
Ned Bass 2957f38d78 vdev_id support for device link aliases
Add a vdev_id feature to map device names based on already defined
udev device links.  To increase the odds that vdev_id will run after
the rules it depends on, increase the vdev.rules rule number from 60
to 69.  With this change, vdev_id now provides functionality analogous
to zpool_id and zpool_layout, paving the way to retire those tools.

A defined alias takes precedence over a topology-derived name, but the
two naming methods can otherwise coexist. For example, one might name
drives in a JBOD with the sas_direct topology while naming an internal
L2ARC device with an alias.

For example, the following lines in vdev_id.conf will result in the
creation of links /dev/disk/by-vdev/{d1,d2}, each pointing to the same
target as the device link specified in the third field.

  #     by-vdev
  #     name     fully qualified or base name of device link
  alias d1       /dev/disk/by-id/wwn-0x5000c5002de3b9ca
  alias d2       wwn-0x5000c5002def789e

Also perform some minor vdev_id cleanup, such as removal of the unused
-s command line option.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #981
2012-12-03 14:04:47 -08:00
Cyril Plisko 38b344d22a vdev_id fails to handle complex device topologies
When expanding positional parameters, the shell requires multi-digit
parameter numbers to be enclosed in braces. When the SAS topology is
non-trivial, the number of positional parameters generated internally
by the vdev_id script (using set -- ...) easily exceeds nine,
and vdev_id fails to generate links.
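
The difference can be seen in a short sketch; the parameter values here are
placeholders chosen only for illustration.

```sh
#!/bin/sh
# Set eleven positional parameters, more than a single digit can address.
set -- a b c d e f g h i j k

# "$10" is parsed as "$1" followed by a literal "0".
wrong="$10"
# "${10}" refers to the tenth positional parameter.
right="${10}"

echo "wrong=$wrong right=$right"
```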

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1119
2012-11-29 13:07:47 -08:00
Ned Bass a6ef9522ea Make vdev_id POSIX sh compatible
Full bash may not be available in all environments where udev helpers
run, such as in an initial ramdisk.  To avoid breakage in this case,
remove use of bash-specific features such as variable arrays and the
`declare' keyword from the vdev_id script.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #870
2012-11-27 14:23:22 -08:00
Brian Behlendorf ca8b5af89d Remove autotools products
Remove all of the generated autotools products from the repository
and update the .gitignore files accordingly.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #718
2012-08-27 11:47:44 -07:00
Etienne Dechamps ee5fd0bb80 Set zvol discard_granularity to the volblocksize.
Currently, zvols have a discard granularity set to 0, which suggests to
the upper layer that discard requests of arbitrarily small size and
alignment can be made efficiently.

In practice however, ZFS does not handle unaligned discard requests
efficiently: indeed, it is unable to free a part of a block. It will
write zeros to the specified range instead, which is both useless and
inefficient (see dnode_free_range).

With this patch, zvol block devices expose volblocksize as their discard
granularity, so the upper layer is aware that it's not supposed to send
discard requests smaller than volblocksize.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #862
2012-08-07 14:55:31 -07:00
Richard Yao 739a1a82e0 Linux 3.5 compat, end_writeback() changed to clear_inode()
The end_writeback() function was changed by moving the call to
inode_sync_wait() earlier into evict().  This effectively changes
the ordering of the sync but it does not impact the details of
the zfs implementation.

However, as part of this change end_writeback() was renamed to
clear_inode() to reflect the new semantics.  This change does
impact us and clear_inode() now maps to end_writeback() for
kernels prior to 3.5.

Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #784
2012-07-23 12:29:36 -07:00
Richard Yao ea1fdf46e2 Linux 3.5 compat, iops->truncate_range() removed
The vmtruncate_range() support has been removed from the kernel in
favor of using the fallocate method in the file_operations table.

Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #784
2012-07-23 12:29:32 -07:00
Richard Yao 756c3e5a9c Linux 3.5 compat, eops->encode_fh() takes inodes
The export_operations member ->encode_fh() has been updated to
take both the child and parent inodes.  This interface used to
take the child dentry and a bool describing if the parent is needed.

NOTE: While updating this code I noticed that we do not currently
cleanly handle the case where we're passed a connectable parent.
This code should be audited to make sure we're doing the right thing.

Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #784
2012-07-23 12:29:23 -07:00
Etienne Dechamps b5a28807cd Move partition scanning from userspace to module.
Currently, zpool online -e (dynamic vdev expansion) doesn't work on
whole disks because we're invoking ioctl(BLKRRPART) from userspace
while ZFS still has a partition open on the disk, which results in
EBUSY.

This patch moves the BLKRRPART invocation from the zpool utility to the
module. Specifically, this is done just before opening the device in
vdev_disk_open() which is called inside vdev_reopen(). This requires
jumping through some hoops to get to the disk device from the partition
device, and to make sure we can still open the partition after the
BLKRRPART call.

Note that this new code path is triggered on dynamic vdev expansion
only; other actions, like creating a new pool, are unchanged and still
call BLKRRPART from userspace.

This change also depends on API changes which are available in 2.6.37
and later kernels.  The build system has been updated to detect this,
but there is no compatibility mode for older kernels.  This means that
online expansion will NOT be available in older kernels.  However, it
will still be possible to expand the vdev offline.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #808
2012-07-17 09:17:31 -07:00
Richard Yao 6a0936babc Linux 3.4 compat, d_make_root() replaces d_alloc_root()
torvalds/linux@adc0e91ab1 introduced
d_make_root() as a replacement for d_alloc_root(). Further
commits appear to have removed d_alloc_root() from the Linux source
tree. This causes the following failure:

  error: implicit declaration of function 'd_alloc_root'
  [-Werror=implicit-function-declaration]

To correct this we update the code to use the current d_make_root()
interface for readability.  Then we introduce an autotools check
to determine if d_make_root() is available.  If it isn't then we
define some compatibility logic which uses the older d_alloc_root()
interface.

Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #776
2012-06-11 10:04:49 -07:00
Ned A. Bass 821b683436 Add vdev_id for JBOD-friendly udev aliases
vdev_id parses the file /etc/zfs/vdev_id.conf to map a physical path
in a storage topology to a channel name.  The channel name is combined
with a disk enclosure slot number to create an alias that reflects the
physical location of the drive.  This is particularly helpful when it
comes to tasks like replacing failed drives.  Slot numbers may also be
re-mapped in case the default numbering is unsatisfactory.  The drive
aliases will be created as symbolic links in /dev/disk/by-vdev.

The only currently supported topologies are sas_direct and sas_switch:

o  sas_direct - a channel is uniquely identified by a PCI slot and a
   HBA port

o  sas_switch - a channel is uniquely identified by a SAS switch port

A multipath mode is supported in which dm-mpath devices are handled by
examining the first running component disk, as reported by 'multipath
-l'.  In multipath mode the configuration file should contain a
channel definition with the same name for each path to a given
enclosure.

vdev_id can replace the existing zpool_id script on systems where the
storage topology conforms to sas_direct or sas_switch.  The script
could be extended to support other topologies as well.  The advantage
of vdev_id is that it is driven by a single static input file that can
be shared across multiple nodes having a common storage topology.
zpool_id, on the other hand, requires a unique /etc/zfs/zdev.conf per
node and a separate slot-mapping file.  However, zpool_id provides the
flexibility of using any device names that show up in
/dev/disk/by-path, so it may still be needed on some systems.

vdev_id's functionality subsumes that of the sas_switch_id script, and
it is unlikely that anyone is using sas_switch_id, so it is removed.

Finally, /dev/disk/by-vdev is added to the list of directories that
'zpool import' will scan.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #713
2012-06-01 08:55:14 -07:00