OpenZFS on Linux and FreeBSD
Prakash Surya 99573cc053 Timeout waiting for ZVOL device to be created
We've seen cases where after creating a ZVOL, the ZVOL device node in
"/dev" isn't generated after 20 seconds of waiting, which is the point
at which our application gives up on waiting and reports an error.

The workload when this occurs is to "refresh" 400+ ZVOLs roughly at the
same time, based on a policy set by the user. This refresh operation
will destroy the ZVOL, and re-create it based on a snapshot.

When this occurs, we see many hundreds of entries on the "z_zvol" taskq
(based on inspection of the /proc/spl/taskq-all file). Many of the
entries on the taskq end up in the "zvol_remove_minors_impl" function,
and I've measured the latency of that function:

Function = zvol_remove_minors_impl
msecs               : count     distribution
  0 -> 1          : 0        |                                        |
  2 -> 3          : 0        |                                        |
  4 -> 7          : 1        |                                        |
  8 -> 15         : 0        |                                        |
 16 -> 31         : 0        |                                        |
 32 -> 63         : 0        |                                        |
 64 -> 127        : 1        |                                        |
128 -> 255        : 45       |****************************************|
256 -> 511        : 5        |****                                    |

That data is from a 10 second sample, using the BCC "funclatency" tool.
As we can see, in this 10 second sample, most calls took 128ms at a
minimum. Thus, some basic math tells us that in any 20 second interval,
we could only process at most about 150 removals, which is much less
than the 400+ that'll occur based on the workload.
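
The "basic math" above can be reproduced with a short sketch. It assumes a
single worker thread and uses the 128ms lower bucket bound from the histogram
as a per-call minimum; the function names are hypothetical and only
illustrate the arithmetic.

```python
# Sketch of the throughput estimate, assuming one worker thread and that
# every call costs at least the 128ms seen in the histogram above.

def max_removals(window_s, min_latency_ms):
    """Upper bound on calls one thread can finish in window_s seconds."""
    return int(window_s * 1000 // min_latency_ms)


def min_drain_time_s(n_calls, min_latency_ms):
    """Lower bound on the time one thread needs to finish n_calls calls."""
    return n_calls * min_latency_ms / 1000


WINDOW_S = 20         # how long the application waits for the device node
MIN_LATENCY_MS = 128  # lower bound of the dominant histogram bucket
ZVOLS = 400           # approximate number of ZVOLs refreshed at once

if __name__ == "__main__":
    print("removals per window:", max_removals(WINDOW_S, MIN_LATENCY_MS))
    print("seconds to drain:", min_drain_time_s(ZVOLS, MIN_LATENCY_MS))
```

Under these assumptions the bound is 156 removals per 20 second window,
in line with the "about 150" figure, and draining 400 removals takes at least
51.2 seconds.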

As a result of this, and since all ZVOL minor operations will go through
the single threaded "z_zvol" taskq, the latency for creating a single
ZVOL device can be unreasonably large due to other ZVOL activity on the
system. In our case, it's large enough to cause the application to
generate an error and fail the operation.

When profiling the "zvol_remove_minors_impl" function, I saw that most
of the time in the function was spent off-cpu, blocked in the function
"taskq_wait_outstanding". Here, "zvol_remove_minors_impl"
will dispatch calls to "zvol_free" using the "system_taskq", and then
the "taskq_wait_outstanding" function is used to wait for all of those
dispatched calls to occur before "zvol_remove_minors_impl" will return.

As far as I can tell, "zvol_remove_minors_impl" doesn't necessarily have
to wait for all calls to "zvol_free" to occur before it returns. Thus,
this change removes the call to "taskq_wait_outstanding", so that calls
to "zvol_free" don't affect the latency of "zvol_remove_minors_impl".
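
The effect of dropping the wait can be illustrated with a user-space
analogy. This is not the ZFS code: a Python thread pool stands in for the
"system_taskq", "free_resource" is a hypothetical stand-in for "zvol_free",
and the 50ms cost is an arbitrary assumption.

```python
# User-space analogy only: a thread pool stands in for the kernel taskq and
# free_resource() for "zvol_free". Names and timings are hypothetical.
import time
from concurrent.futures import ThreadPoolExecutor, wait

FREE_COST_S = 0.05  # assumed cost of one free operation


def free_resource(name):
    time.sleep(FREE_COST_S)  # simulate slow teardown work
    return name


def remove_waiting(pool, names):
    """Dispatch the frees, then block until all of them have finished."""
    futures = [pool.submit(free_resource, n) for n in names]
    wait(futures)  # the step the change removes
    return futures


def remove_async(pool, names):
    """Dispatch the frees and return immediately."""
    return [pool.submit(free_resource, n) for n in names]


def timed(fn, pool, names):
    """Return (seconds spent inside fn, futures it dispatched)."""
    start = time.perf_counter()
    futures = fn(pool, names)
    return time.perf_counter() - start, futures


if __name__ == "__main__":
    names = ["vol%d" % i for i in range(4)]
    # Leaving the "with" block waits for any frees still in flight.
    with ThreadPoolExecutor(max_workers=1) as pool:
        blocked, _ = timed(remove_waiting, pool, names)
        unblocked, _ = timed(remove_async, pool, names)
    print("waiting: %.3fs, not waiting: %.3fs" % (blocked, unblocked))
```

In the sketch the dispatched work still runs to completion; only the caller
stops paying for it, which is the property the change relies on.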

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Gallagher <john.gallagher@delphix.com>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #9380
2019-10-01 12:33:12 -07:00
.github Fix typos 2019-09-02 18:17:39 -07:00
cmd OpenZFS restructuring - zpool 2019-09-30 12:16:06 -07:00
config OpenZFS restructuring - move platform specific headers 2019-09-05 09:34:54 -07:00
contrib Fix typos in contrib/ 2019-08-30 09:44:43 -07:00
etc Fix typos in etc/ 2019-08-30 09:46:52 -07:00
include OpenZFS restructuring - zpool 2019-09-30 12:16:06 -07:00
lib OpenZFS restructuring - zpool 2019-09-30 12:16:06 -07:00
man Add warning for zfs_vdev_elevator option removal 2019-09-25 09:23:29 -07:00
module Timeout waiting for ZVOL device to be created 2019-10-01 12:33:12 -07:00
rpm Canonicalize Python shebangs 2019-09-12 13:32:32 -07:00
scripts kmodtool: depmod path 2019-09-11 11:14:50 -07:00
tests OpenZFS restructuring - zfs_ioctl 2019-09-27 10:46:28 -07:00
udev Restore :: in Makefile.am 2019-08-26 11:48:31 -07:00
.gitignore Linux 5.3 compat: Makefile subdir-m no longer supported 2019-08-19 15:22:52 -07:00
.gitmodules Add zimport.sh compatibility test script 2014-02-21 12:10:31 -08:00
.travis.yml Add .travis.yml 2017-11-13 09:18:18 -08:00
AUTHORS Update build system and packaging 2018-05-29 16:00:33 -07:00
CODE_OF_CONDUCT.md Add CODE_OF_CONDUCT.md 2019-04-30 10:58:45 -07:00
COPYRIGHT OpenZFS restructuring - move platform specific sources 2019-09-06 11:26:26 -07:00
LICENSE Update build system and packaging 2018-05-29 16:00:33 -07:00
META Tag 0.8.0 2019-05-21 11:11:41 -07:00
Makefile.am OpenZFS restructuring - move platform specific sources 2019-09-06 11:26:26 -07:00
NEWS Add NEWS file 2018-09-18 12:03:47 -07:00
NOTICE Update build system and packaging 2018-05-29 16:00:33 -07:00
README.md Explicitly state supported Linux versions 2018-05-30 20:11:19 -07:00
TEST Update build system and packaging 2018-05-29 16:00:33 -07:00
autogen.sh Cause autogen.sh to fail if autoreconf fails 2018-07-06 09:27:37 -07:00
configure.ac Add subcommand to wait for background zfs activity to complete 2019-09-13 18:09:06 -07:00
copy-builtin copy-builtin: SPL must be in Kbuild first (again) 2019-09-11 11:09:50 -07:00
zfs.release.in Move zfs.release generation to configure step 2012-07-12 12:22:51 -07:00

README.md

ZFS on Linux is an advanced file system and volume manager which was originally developed for Solaris and is now maintained by the OpenZFS community.

Official Resources

Installation

Full documentation for installing ZoL on your favorite Linux distribution can be found at our site.

Contribute & Develop

We have a separate document with contribution guidelines.

Release

ZFS on Linux is released under a CDDL license.
For more details see the NOTICE, LICENSE and COPYRIGHT files; UCRL-CODE-235197

Supported Kernels

  • The META file contains the officially recognized supported kernel versions.