zfs/include/sys
Alex Reece 463a8cfe2b Illumos 6844 - dnode_next_offset can detect fictional holes
6844 dnode_next_offset can detect fictional holes
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>

dnode_next_offset is used in a variety of places to iterate over the
holes or allocated blocks in a dnode. It operates under the premise that
it can iterate over the blockpointers of a dnode in open context while
holding only the dn_struct_rwlock as reader. Unfortunately, this premise
does not hold.

When we create the zio for a dbuf, we pass in the actual block pointer
in the indirect block above that dbuf. When we later zero the bp in
zio_write_compress, we are directly modifying the bp. The state of the
bp is now inconsistent from the perspective of dnode_next_offset: the bp
will appear to be a hole until zio_dva_allocate finally finishes filling
it in. In the meantime, dnode_next_offset can detect a hole in the dnode
when none exists.

I was able to experimentally demonstrate this behavior with the
following setup:
1. Create a file with 1 million dbufs.
2. Create a thread that randomly dirties L2 blocks by writing to the
first L0 block under them.
3. Observe dnode_next_offset, waiting for it to skip over a hole in the
middle of a file.
4. Do dnode_next_offset in a loop until we skip over such a non-existent
hole.

The fix is to ensure that it is valid to iterate over the indirect
blocks in a dnode while holding the dn_struct_rwlock by passing the zio
a copy of the BP and updating the actual BP in dbuf_write_ready while
holding the lock.

References:
  https://www.illumos.org/issues/6844
  https://github.com/openzfs/openzfs/pull/82
  DLPX-35372

Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4548
2016-04-27 16:24:15 -07:00
..
fm Illumos 3749 - zfs event processing should work on R/O root filesystems 2016-01-12 14:42:32 -08:00
fs Illumos 4929 - want prevsnap property 2016-01-11 11:58:26 -08:00
Makefile.am Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
arc.h Illumos 5987 - zfs prefetch code needs work 2016-01-12 09:02:33 -08:00
arc_impl.h Illumos 6214 - zpools going south 2015-09-11 11:14:38 -07:00
avl.h Illumos 4745 - fix AVL code misspellings 2015-07-10 11:58:37 -07:00
avl_impl.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
blkptr.h Illumos 4757, 4913 2014-08-01 14:28:05 -07:00
bplist.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
bpobj.h Illumos 5810 - zdb should print details of bpobj 2015-05-11 15:10:24 -07:00
bptree.h Illumos 4914 - zfs on-disk bookmark structure should be named *_phys_t 2014-08-06 14:48:41 -07:00
bqueue.h Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
dbuf.h Illumos 6844 - dnode_next_offset can detect fictional holes 2016-04-27 16:24:15 -07:00
ddt.h Add ddt, ddt_entry, and l2arc_hdr caches 2014-01-07 10:33:11 -08:00
dmu.h Illumos 4950 - files sometimes can't be removed from a full filesystem 2016-01-21 16:59:30 -08:00
dmu_impl.h Illumos 4757, 4913 2014-08-01 14:28:05 -07:00
dmu_objset.h Illumos 6267 - dn_bonus evicted too early 2015-10-13 14:12:02 -07:00
dmu_send.h Illumos 5765 - add support for estimating send stream size with lzc_send_space when source is a bookmark 2015-05-13 09:03:59 -07:00
dmu_traverse.h Illumos 4914 - zfs on-disk bookmark structure should be named *_phys_t 2014-08-06 14:48:41 -07:00
dmu_tx.h dmu_tx kstat cleanup 2014-03-04 12:22:24 -08:00
dmu_zfetch.h Illumos 5987 - zfs prefetch code needs work 2016-01-12 09:02:33 -08:00
dnode.h Illumos 5141 - zfs minimum indirect block size is 4K 2016-01-12 14:11:31 -08:00
dsl_bookmark.h Illumos 4368, 4369. 2014-07-29 10:55:29 -07:00
dsl_dataset.h Illumos 6171 - dsl_prop_unregister() slows down dataset eviction. 2016-01-12 10:53:12 -08:00
dsl_deadlist.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
dsl_deleg.h Illumos 4368, 4369. 2014-07-29 10:55:29 -07:00
dsl_destroy.h Illumos #3888 2013-11-04 11:18:14 -08:00
dsl_dir.h Illumos 6171 - dsl_prop_unregister() slows down dataset eviction. 2016-01-12 10:53:12 -08:00
dsl_pool.h Illumos 5981 - Deadlock in dmu_objset_find_dp 2015-07-06 09:31:35 -07:00
dsl_prop.h Illumos 6171 - dsl_prop_unregister() slows down dataset eviction. 2016-01-12 10:53:12 -08:00
dsl_scan.h Illumos 4914 - zfs on-disk bookmark structure should be named *_phys_t 2014-08-06 14:48:41 -07:00
dsl_synctask.h Illumos 4951 - ZFS administrative commands should use reserved space 2015-05-04 09:41:10 -07:00
dsl_userhold.h Illumos #3740 2013-11-04 11:17:48 -08:00
efi_partition.h Ext4's typical GPT partition type not recognized 2015-12-04 09:27:00 -08:00
metaslab.h Illumos 5213 - panic in metaslab_init due to space_map_open returning ENXIO 2014-11-14 15:37:45 -08:00
metaslab_impl.h Remove fastwrite mutex 2016-01-15 15:38:35 -08:00
mntent.h Make zfs mount according to relatime config in dataset 2016-04-05 18:55:59 -07:00
multilist.h Illumos 5497 - lock contention on arcs_mtx 2015-06-11 10:27:25 -07:00
nvpair.h Replace __va_list with va_list 2014-08-13 10:35:00 -07:00
nvpair_impl.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
pathname.h Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
range_tree.h Illumos #4374 2014-07-30 09:20:35 -07:00
refcount.h Illumos 5045 - use atomic_{inc,dec}_* instead of atomic_add_* 2016-01-15 15:38:36 -08:00
rrwlock.h Illumos 5008 - lock contention (rrw_exit) while running a read only load 2015-07-06 09:34:13 -07:00
sa.h Prevent SA length overflow 2015-12-30 13:20:12 -08:00
sa_impl.h Illumos 5056 - ZFS deadlock on db_mtx and dn_holds 2015-04-28 16:25:34 -07:00
sdt.h Swap DTRACE_PROBE* with Linux tracepoints 2014-11-17 11:13:55 -08:00
spa.h Illumos 5746 - more checksumming in zfs send 2015-12-30 14:24:14 -08:00
spa_boot.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
spa_impl.h Add support for asynchronous zvol minor operations 2016-03-10 09:49:22 -08:00
space_map.h Illumos 5164-5165 - space map fixes 2014-10-23 15:30:32 -07:00
space_reftree.h Illumos #4101, #4102, #4103, #4105, #4106 2014-07-22 09:39:16 -07:00
trace.h Remove duplicate typedefs from trace.h 2015-01-06 16:53:24 -08:00
trace_acl.h Fix atime handling and relatime 2016-04-05 18:54:55 -07:00
trace_arc.h Illumos 5987 - zfs prefetch code needs work 2016-01-12 09:02:33 -08:00
trace_dbgmsg.h SET_ERROR should print strings 2016-01-15 15:38:35 -08:00
trace_dbuf.h SET_ERROR should print strings 2016-01-15 15:38:35 -08:00
trace_dmu.h Fix build failure with Linux 4.1 and FTRACE 2015-07-29 07:35:06 -07:00
trace_dnode.h Fix build failure with Linux 4.1 and FTRACE 2015-07-29 07:35:06 -07:00
trace_multilist.h Fix build failure with Linux 4.1 and FTRACE 2015-08-18 16:47:21 -07:00
trace_txg.h Fix build failure with Linux 4.1 and FTRACE 2015-07-29 07:35:06 -07:00
trace_zil.h Fix build failure with Linux 4.1 and FTRACE 2015-07-29 07:35:06 -07:00
trace_zrlock.h SET_ERROR should print strings 2016-01-15 15:38:35 -08:00
txg.h Illumos 4753 - increase number of outstanding async writes when sync task is waiting 2014-09-23 13:50:55 -07:00
txg_impl.h Illumos #4045 write throttle & i/o scheduler performance work 2013-12-06 09:32:43 -08:00
u8_textprep.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
u8_textprep_data.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
uberblock.h Illumos 5347 - idle pool may run itself out of space 2015-07-14 10:35:21 -07:00
uberblock_impl.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
uio_impl.h Add basic uio support 2011-02-10 09:21:43 -08:00
unique.h Illumos #3742 2013-11-04 10:55:25 -08:00
uuid.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
vdev.h FreeBSD r256956: Improve ZFS N-way mirror read performance by using load and locality information. 2016-02-26 11:24:35 -08:00
vdev_disk.h cstyle: Resolve C style issues 2013-12-18 16:46:35 -08:00
vdev_file.h Update all default taskq settings 2015-06-25 08:58:16 -07:00
vdev_impl.h FreeBSD r256956: Improve ZFS N-way mirror read performance by using load and locality information. 2016-02-26 11:24:35 -08:00
xvattr.h Add xvattr support 2011-03-02 11:43:50 -08:00
zap.h Add zap_prefetch() interface 2015-12-04 09:39:20 -08:00
zap_impl.h Illumos 5027 - zfs large block support 2015-05-11 12:23:16 -07:00
zap_leaf.h Illumos 5056 - ZFS deadlock on db_mtx and dn_holds 2015-04-28 16:25:34 -07:00
zfeature.h Illumos 4370, 4371 2014-07-28 14:29:58 -07:00
zfs_acl.h Illumos #3742 2013-11-04 10:55:25 -08:00
zfs_context.h Ensure correct return value type 2016-03-29 18:33:17 -07:00
zfs_ctldir.h Use spa as key besides objsetid for snapentry 2015-12-08 16:38:56 -08:00
zfs_debug.h Add dbgmsg kstat 2015-09-04 16:08:14 -07:00
zfs_delay.h cstyle: Resolve C style issues 2013-12-18 16:46:35 -08:00
zfs_dir.h Prototype/structure update for Linux 2011-02-10 09:27:21 -08:00
zfs_fuid.h Prototype/structure update for Linux 2011-02-10 09:27:21 -08:00
zfs_ioctl.h Illumos 5746 - more checksumming in zfs send 2015-12-30 14:24:14 -08:00
zfs_onexit.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
zfs_rlock.h Illumos #3742 2013-11-04 10:55:25 -08:00
zfs_sa.h Illumos 5027 - zfs large block support 2015-05-11 12:23:16 -07:00
zfs_stat.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
zfs_vfsops.h Fix zsb->z_hold_mtx deadlock 2016-01-15 15:33:45 -08:00
zfs_vnops.h Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
zfs_znode.h Fix atime handling and relatime 2016-04-05 18:54:55 -07:00
zil.h Illumos 5269 - zpool import slow 2015-06-09 13:48:02 -07:00
zil_impl.h Illumos 5027 - zfs large block support 2015-05-11 12:23:16 -07:00
zio.h Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
zio_checksum.h Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
zio_compress.h Illumos #3742 2013-11-04 10:55:25 -08:00
zio_impl.h Illumos #3836 2013-11-05 12:14:56 -08:00
zio_priority.h Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
zpl.h Add temporary mount options 2015-09-03 14:14:55 -07:00
zrlock.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
zvol.h Add support for asynchronous zvol minor operations 2016-03-10 09:49:22 -08:00