zfs/include/sys
George Wilson ddc751d56b OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio
PROBLEM
=======
It's possible for a parent zio to complete even though it has children
which have not completed. This can result in the following panic:
    > $C
    ffffff01809128c0 vpanic()
    ffffff01809128e0 mutex_panic+0x58(fffffffffb94c904, ffffff597dde7f80)
    ffffff0180912950 mutex_vector_enter+0x347(ffffff597dde7f80)
    ffffff01809129b0 zio_remove_child+0x50(ffffff597dde7c58, ffffff32bd901ac0,
    ffffff3373370908)
    ffffff0180912a40 zio_done+0x390(ffffff32bd901ac0)
    ffffff0180912a70 zio_execute+0x78(ffffff32bd901ac0)
    ffffff0180912b30 taskq_thread+0x2d0(ffffff33bae44140)
    ffffff0180912b40 thread_start+8()
    > ::status
    debugging crash dump vmcore.2 (64-bit) from batfs0390
    operating system: 5.11 joyent_20170911T171900Z (i86pc)
    image uuid: (not set)
    panic message: mutex_enter: bad mutex, lp=ffffff597dde7f80
    owner=ffffff3c59b39480 thread=ffffff0180912c40
    dump content: kernel pages only
The problem is that dbuf_prefetch along with l2arc can create a zio tree
which confuses the parent zio and allows it to complete with while children
still exist. Here's the scenario:
    zio tree:
        pio
         |--- lio
The parent zio, pio, has entered the zio_done stage and begins to check its
children to see there are still some that have not completed. In zio_done(),
the children are checked in the following order:
    zio_wait_for_children(zio, ZIO_CHILD_VDEV, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_GANG, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_DDT, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_LOGICAL, ZIO_WAIT_DONE)
If pio, finds any child which has not completed then it stops executing and
goes to sleep. Each call to zio_wait_for_children() will grab the io_lock
while checking the particular child.
In this scenario, the pio has completed the first call to
zio_wait_for_children() to check for any ZIO_CHILD_VDEV children. Since
the only zio in the zio tree right now is the logical zio, lio, then it
completes that call and prepares to check the next child type.
In the meantime, the lio completes and in its callback creates a child vdev
zio, cio. The zio tree looks like this:
    zio tree:
        pio
         |--- lio
         |--- cio
The lio then grabs the parent's io_lock and removes itself.
    zio tree:
        pio
         |--- cio
The pio continues to run but has already completed its check for ZIO_CHILD_VDEV
and will erroneously complete. When the child zio, cio, completes it will panic
the system trying to reference the parent zio which has been destroyed.
SOLUTION
========
The fix is to rework the zio_wait_for_children() logic to accept a bitfield
for all the children types that it's interested in checking. The
io_lock will is held the entire time we check all the children types. Since
the function now accepts a bitfield, a simple ZIO_CHILD_BIT() macro is provided
to allow for the conversion between a ZIO_CHILD type and the bitfield used by
the zio_wiat_for_children logic.

Authored by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Youzhong Yang <youzhong@gmail.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@omniti.com>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/8857
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/862ff6d99c
Issue #5918
Closes #7168
2018-02-14 15:30:09 -08:00
..
crypto OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
fm Extend deadman logic 2018-01-25 13:40:38 -08:00
fs Project Quota on ZFS 2018-02-13 14:54:54 -08:00
lua OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
sysevent OpenZFS 8959 - Add notifications when a scrub is paused or resumed 2018-01-17 10:31:00 -08:00
Makefile.am Project Quota on ZFS 2018-02-13 14:54:54 -08:00
abd.h OpenZFS 8416 - abd.h is not C++ friendly 2017-06-30 11:11:01 -07:00
arc.h Fix ARC hit rate 2018-01-08 09:52:36 -08:00
arc_impl.h Support re-prioritizing asynchronous prefetches 2017-12-21 09:13:06 -08:00
avl.h Remove dead code from AVL tree 2017-10-05 19:28:00 -07:00
avl_impl.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
blkptr.h OpenZFS 8067 - zdb should be able to dump literal embedded block pointer 2017-07-07 11:28:01 -07:00
bplist.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
bpobj.h Illumos 5810 - zdb should print details of bpobj 2015-05-11 15:10:24 -07:00
bptree.h Illumos 4914 - zfs on-disk bookmark structure should be named *_phys_t 2014-08-06 14:48:41 -07:00
bqueue.h Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
dbuf.h OpenZFS 7531 - Assign correct flags to prefetched buffers 2017-11-11 20:24:34 -08:00
ddt.h Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
dmu.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
dmu_impl.h OpenZFS 7793 - ztest fails assertion in dmu_tx_willuse_space 2017-03-07 09:51:59 -08:00
dmu_objset.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
dmu_send.h Free objects when receiving full stream as clone 2017-10-10 15:30:51 -07:00
dmu_traverse.h Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
dmu_tx.h OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue 2018-01-26 20:19:46 -08:00
dmu_zfetch.h OpenZFS 6322 - ZFS indirect block predictive prefetch 2016-08-30 14:26:55 -07:00
dnode.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
dsl_bookmark.h Illumos 4368, 4369. 2014-07-29 10:55:29 -07:00
dsl_crypt.h Encryption Stability and On-Disk Format Fixes 2018-02-02 11:37:16 -08:00
dsl_dataset.h OpenZFS 8600 - ZFS channel programs - snapshot 2018-02-08 15:29:24 -08:00
dsl_deadlist.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
dsl_deleg.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
dsl_destroy.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
dsl_dir.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
dsl_pool.h Sequential scrub and resilvers 2017-11-15 17:27:01 -08:00
dsl_prop.h Illumos 6171 - dsl_prop_unregister() slows down dataset eviction. 2016-01-12 10:53:12 -08:00
dsl_scan.h Sequential scrub and resilvers 2017-11-15 17:27:01 -08:00
dsl_synctask.h Illumos 4951 - ZFS administrative commands should use reserved space 2015-05-04 09:41:10 -07:00
dsl_userhold.h Illumos #3740 2013-11-04 11:17:48 -08:00
edonr.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
efi_partition.h Fix spelling 2017-01-03 11:31:18 -06:00
frame.h Suppress incorrect objtool warnings 2017-12-07 10:28:50 -08:00
hkdf.h Encryption patch follow-up 2017-10-11 16:54:48 -04:00
metaslab.h OpenZFS 7303 - dynamic metaslab selection 2017-01-12 11:52:56 -08:00
metaslab_impl.h OpenZFS 7613 - ms_freetree[4] is only used in syncing context 2017-01-26 15:27:19 -08:00
mmp.h Add callback for zfs_multihost_interval 2017-07-25 13:22:20 -04:00
mntent.h Make zfs mount according to relatime config in dataset 2016-04-05 18:55:59 -07:00
multilist.h OpenZFS 7968 - multi-threaded spa_sync() 2017-03-20 18:36:00 -07:00
nvpair.h Replace __va_list with va_list 2014-08-13 10:35:00 -07:00
nvpair_impl.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
pathname.h Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
policy.h Add `zfs allow` and `zfs unallow` support 2016-06-07 09:16:52 -07:00
range_tree.h Sequential scrub and resilvers 2017-11-15 17:27:01 -08:00
refcount.h OpenZFS 8081 - Compiler warnings in zdb 2017-10-27 12:46:35 -07:00
rrwlock.h Illumos 5008 - lock contention (rrw_exit) while running a read only load 2015-07-06 09:34:13 -07:00
sa.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
sa_impl.h Implement large_dnode pool feature 2016-06-24 13:13:21 -07:00
sdt.h Add line info and SET_ERROR() to ZFS debug log 2017-07-25 23:09:48 -07:00
sha2.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
skein.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
spa.h Extend deadman logic 2018-01-25 13:40:38 -08:00
spa_boot.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
spa_checksum.h Implementation of AVX2 optimized Fletcher-4 2016-06-02 14:30:51 -07:00
spa_impl.h Extend deadman logic 2018-01-25 13:40:38 -08:00
space_map.h Illumos 5164-5165 - space map fixes 2014-10-23 15:30:32 -07:00
space_reftree.h Illumos #4101, #4102, #4103, #4105, #4106 2014-07-22 09:39:16 -07:00
sysevent.h OpenZFS 6939 - add sysevents to zfs core for commands 2017-07-12 21:28:13 -07:00
trace.h Remove duplicate typedefs from trace.h 2015-01-06 16:53:24 -08:00
trace_acl.h Linux 4.16 compat: inode_set_iversion() 2018-02-08 21:25:19 -08:00
trace_arc.h Support re-prioritizing asynchronous prefetches 2017-12-21 09:13:06 -08:00
trace_common.h OpenZFS 6531 - Provide mechanism to artificially limit disk performance 2016-05-26 10:11:51 -07:00
trace_dbgmsg.h Add line info and SET_ERROR() to ZFS debug log 2017-07-25 23:09:48 -07:00
trace_dbuf.h Crash in dbuf_evict_one with DTRACE_PROBE 2017-08-09 11:04:41 -07:00
trace_dmu.h tx_waited -> tx_dirty_delayed in trace_dmu.h 2018-01-31 16:13:26 -08:00
trace_dnode.h Fix build-it compilation regression 2017-01-24 08:50:15 -08:00
trace_multilist.h Fix build-it compilation regression 2017-01-24 08:50:15 -08:00
trace_txg.h Fix build-it compilation regression 2017-01-24 08:50:15 -08:00
trace_zil.h OpenZFS 8585 - improve batching done in zil_commit() 2017-12-05 09:39:16 -08:00
trace_zio.h Use cstyle -cpP in `make cstyle` check 2016-12-12 10:46:26 -08:00
trace_zrlock.h Use cstyle -cpP in `make cstyle` check 2016-12-12 10:46:26 -08:00
txg.h OpenZFS 8063 - verify that we do not attempt to access inactive txg 2017-05-10 13:52:22 -04:00
txg_impl.h Fix spelling 2017-01-03 11:31:18 -06:00
u8_textprep.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
u8_textprep_data.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
uberblock.h Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
uberblock_impl.h OpenZFS 8491 - uberblock on-disk padding to reserve space for smoothly merging zpool checkpoint & MMP in ZFS 2017-07-24 13:47:51 -04:00
uio_impl.h Add basic uio support 2011-02-10 09:21:43 -08:00
unique.h Illumos #3742 2013-11-04 10:55:25 -08:00
uuid.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
vdev.h Extend deadman logic 2018-01-25 13:40:38 -08:00
vdev_disk.h Remove custom root pool import code 2016-08-11 11:19:34 -07:00
vdev_file.h Use a dedicated taskq for vdev_file 2016-12-21 10:47:15 -08:00
vdev_impl.h Sequential scrub and resilvers 2017-11-15 17:27:01 -08:00
vdev_raidz.h Use cstyle -cpP in `make cstyle` check 2016-12-12 10:46:26 -08:00
vdev_raidz_impl.h Revert raidz_map and _col structure types 2018-01-09 14:46:52 -08:00
xvattr.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zap.h OpenZFS 1300 - filename normalization doesn't work for removes 2017-02-02 14:13:41 -08:00
zap_impl.h OpenZFS 7793 - ztest fails assertion in dmu_tx_willuse_space 2017-03-07 09:51:59 -08:00
zap_leaf.h Handle zap_add() failures in mixed case mode 2018-02-09 10:15:53 -08:00
zcp.h OpenZFS 8677 - Open-Context Channel Programs 2018-02-08 16:05:57 -08:00
zcp_global.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_iter.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_prop.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zfeature.h Revert "zhack: Add 'feature disable' command" 2016-05-17 11:52:07 -07:00
zfs_acl.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_context.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zfs_ctldir.h Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_debug.h Add line info and SET_ERROR() to ZFS debug log 2017-07-25 23:09:48 -07:00
zfs_delay.h cstyle: Resolve C style issues 2013-12-18 16:46:35 -08:00
zfs_dir.h Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_fuid.h Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_ioctl.h OpenZFS 8604 - Simplify snapshots unmounting code 2018-02-08 15:29:44 -08:00
zfs_onexit.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
zfs_project.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_ratelimit.h Add missing *_destroy/*_fini calls 2017-05-04 19:26:28 -04:00
zfs_rlock.h Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_sa.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_stat.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
zfs_vfsops.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_vnops.h Rename zfs_* functions 2017-03-10 09:51:35 -08:00
zfs_znode.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zil.h OpenZFS 8909 - 8585 can cause a use-after-free kernel panic 2017-12-28 10:18:04 -08:00
zil_impl.h OpenZFS 8909 - 8585 can cause a use-after-free kernel panic 2017-12-28 10:18:04 -08:00
zio.h OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio 2018-02-14 15:30:09 -08:00
zio_checksum.h Remove dependency on linear ABD 2017-03-29 12:24:51 -07:00
zio_compress.h DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
zio_crypt.h Encryption Stability and On-Disk Format Fixes 2018-02-02 11:37:16 -08:00
zio_impl.h Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
zio_priority.h Add -lhHpw options to "zpool iostat" for avg latency, histograms, & queues 2016-05-12 12:36:32 -07:00
zpl.h Use cstyle -cpP in `make cstyle` check 2016-12-12 10:46:26 -08:00
zrlock.h OpenZFS 6328 - Fix cstyle errors in zfs codebase 2017-01-12 09:42:11 -08:00
zvol.h Add port of FreeBSD 'volmode' property 2017-07-12 13:05:37 -07:00