Commit Graph

294 Commits

Author SHA1 Message Date
John Wren Kennedy 435637d1ed ZTS: user_property_002_pos fails to destroy volume
During the cleanup function of this test, an attempt to destroy a volume
can fail because the volume is busy. This leaves the system with
unexpected datasets which in turn causes subsequent failures.

Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #8422
2019-02-19 11:12:47 -08:00
Serapheim Dimitropoulos c853f382db Change target size of metaslabs from 256GB to 16GB
= Old behavior

For vdev sizes 100GB to 50TB we keep ~200 metaslabs per
vdev and the metaslab size grows from 512MB to 256GB.
For vdev's bigger than that we start increasing the
number of metaslabs until we hit the 128K limit.

= New Behavior

For vdev sizes 100GB to 3TB we keep ~200 metaslabs per
vdev and the metaslab size grows from 512MB to 16GB.
For vdev's bigger than that we start increasing the
number of metaslabs until we hit the 128K limit.

= Reasoning

The old behavior makes metaslabs grow in size when
the vdev range is between 3TB (ms_size 16GB) and
32PB (ms_size 256GB). Even though keeping the number
of metaslabs is good in terms of potential number of
I/Os per TXG, these bigger metaslabs take longer
to be loaded and after they are loaded they can
take up a lot of memory because of their range trees.

This change tries to put a boundary in memory and
loading time for the specific range of vdev sizes.

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #8324
2019-01-25 16:38:27 -08:00
loli10K 60b0a963f5 Off-by-one in zap_leaf_array_create()
Trying to set user properties with their length 1 byte shorter than the
maximum size triggers an assertion failure in zap_leaf_array_create():

  panic[cpu0]/thread=ffffff000a092c40:
  assertion failed: num_integers * integer_size < (8<<10) (0x2000 < 0x2000), file: ../../common/fs/zfs/zap_leaf.c, line: 233

  ffffff000a092500 genunix:process_type+167c35 ()
  ffffff000a0925a0 zfs:zap_leaf_array_create+1d2 ()
  ffffff000a092650 zfs:zap_entry_create+1be ()
  ffffff000a092720 zfs:fzap_update+ed ()
  ffffff000a0927d0 zfs:zap_update+1a5 ()
  ffffff000a0928d0 zfs:dsl_prop_set_sync_impl+5c6 ()
  ffffff000a092970 zfs:dsl_props_set_sync_impl+fc ()
  ffffff000a0929b0 zfs:dsl_props_set_sync+79 ()
  ffffff000a0929f0 zfs:dsl_sync_task_sync+10a ()
  ffffff000a092a80 zfs:dsl_pool_sync+3a3 ()
  ffffff000a092b50 zfs:spa_sync+4e6 ()
  ffffff000a092c20 zfs:txg_sync_thread+297 ()
  ffffff000a092c30 unix:thread_start+8 ()

This patch simply corrects the assertion.

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8278
2019-01-18 09:58:46 -08:00
Brian Behlendorf 6e91a72fe3
Disable 'zfs remap' command
The implementation of 'zfs remap' has proven to be problematic since
it modifies the objset (but not its logical contents) by dirtying
metadata without owning it.  The consequence of which is that
dmu_objset_remap_indirects() is vulnerable to certain races.

For example, if we are in the middle of receiving into the filesystem
while it is being remapped.  Then it is possible we could evict the
objset when the receive completes (see dsl_dataset_clone_swap_sync_impl,
or dmu_recv_end_sync), but dmu_objset_remap_indirects() may be still
using the objset.  The result of which would be a panic.

Extended runs of ztest(8) have exposed other possible races which
can occur when using 'zfs remap'.  Several of these have been fixed
but there may be others which have not yet been encountered and
diagnosed.

Furthermore, the ability to manually remap a filesystem is no longer
particularly useful now that the removal code can map large chunks.
Coupled with the fact that explaining what this command does and why
it may be useful requires a detailed understanding of the internals
of device removal.  These are details users should not be bothered
with.

Therefore, the 'zfs remap' command is being disabled but not entirely
removed.  It may be removed in the future or potentially reworked
to address the issues described above.  Since 'zfs remap' has never
been part of a tagged release its removal is expected to have
minimal impact.

The ZTS tests have been updated to continue to exercise the command
to prevent atrophy, but it has been removed entirely from ztest(8).

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8238
2019-01-15 15:46:58 -08:00
Brian Behlendorf 99b0b5bc3f
ZTS: zpool_resilver_restart
Since the vdev initialize feature was integrated the ZTS
zpool_resilver_restart test has been hitting its internal
timeout more frequently.  This happens most often on
the coverage builder but not exclusively.  Increasing the
timeout for this test case prevents any false positives.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8273
2019-01-13 10:01:31 -08:00
Brian Behlendorf a769fb53a1 Add 'zpool status -i' option
Only display the full details of the vdev initialization state
in 'zpool status' output when requested with the -i option.
By default display '(initializing)' after vdevs when they are
being actively initialized.  This is consistent with the
established precident of appending '(resilvering), etc' and
fits within the default 80 column terminal width making it
easy to read.

Additionally, updated the 'zpool initialize' documentation to
make it clear the options are mutually exclusive, but allow
duplicate options like all other zfs/zpool commands.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8230
2019-01-07 11:03:18 -08:00
George Wilson 619f097693 OpenZFS 9102 - zfs should be able to initialize storage devices
PROBLEM
========

The first access to a block incurs a performance penalty on some platforms
(e.g. AWS's EBS, VMware VMDKs). Therefore we recommend that volumes are
"thick provisioned", where supported by the platform (VMware). This can
create a large delay in getting a new virtual machines up and running (or
adding storage to an existing Engine). If the thick provision step is
omitted, write performance will be suboptimal until all blocks on the LUN
have been written.

SOLUTION
=========

This feature introduces a way to 'initialize' the disks at install or in the
background to make sure we don't incur this first read penalty.

When an entire LUN is added to ZFS, we make all space available immediately,
and allow ZFS to find unallocated space and zero it out. This works with
concurrent writes to arbitrary offsets, ensuring that we don't zero out
something that has been (or is in the middle of being) written. This scheme
can also be applied to existing pools (affecting only free regions on the
vdev). Detailed design:
        - new subcommand:zpool initialize [-cs] <pool> [<vdev> ...]
                - start, suspend, or cancel initialization
        - Creates new open-context thread for each vdev
        - Thread iterates through all metaslabs in this vdev
        - Each metaslab:
                - select a metaslab
                - load the metaslab
                - mark the metaslab as being zeroed
                - walk all free ranges within that metaslab and translate
                  them to ranges on the leaf vdev
                - issue a "zeroing" I/O on the leaf vdev that corresponds to
                  a free range on the metaslab we're working on
                - continue until all free ranges for this metaslab have been
                  "zeroed"
                - reset/unmark the metaslab being zeroed
                - if more metaslabs exist, then repeat above tasks.
                - if no more metaslabs, then we're done.

        - progress for the initialization is stored on-disk in the vdev’s
          leaf zap object. The following information is stored:
                - the last offset that has been initialized
                - the state of the initialization process (i.e. active,
                  suspended, or canceled)
                - the start time for the initialization

        - progress is reported via the zpool status command and shows
          information for each of the vdevs that are initializing

Porting notes:
- Added zfs_initialize_value module parameter to set the pattern
  written by "zpool initialize".
- Added zfs_vdev_{initializing,removal}_{min,max}_active module options.

Authored by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Signed-off-by: Tim Chase <tim@chase2k.com>
Ported-by: Tim Chase <tim@chase2k.com>

OpenZFS-issue: https://www.illumos.org/issues/9102
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c3963210eb
Closes #8230
2019-01-07 10:37:26 -08:00
bunder2015 5365b0747a Add missing MMP status code to libzfs_status
When MMP was merged the status codes in libzfs_status were not
updated to add the status code for ZPOOL_STATUS_IO_FAILURE_MMP.  This
commit corrects this and adds comments to help keep track of which
code is used for which status.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #8148
Closes #8222
2019-01-03 12:15:46 -08:00
LOLi bdbd5477bc Fix ASSERT in zfs_receive_one()
This commit fixes the following ASSERT in zfs_receive_one() when
receiving a send stream from a root dataset with the "-e" option:

    $ sudo zfs snap source@snap
    $ sudo zfs send source@snap | sudo zfs recv -e destination/recv
    chopprefix > drrb->drr_toname
    ASSERT at libzfs_sendrecv.c:3804:zfs_receive_one()

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8121
2018-12-04 09:38:55 -08:00
Tom Caputi cef48f14da Remove races from scrub / resilver tests
Currently, several tests in the ZFS Test Suite that attempt to
test scrub and resilver behavior occasionally fail. A big reason
for this is that these tests use a combination of zinject and
zfs_scan_vdev_limit to attempt to slow these operations enough
to verify their test commands. This method works most of the time,
but provides no guarantees and leads to flaky behavior. This patch
adds a new tunable, zfs_scan_suspend_progress, that ensures that
scans make no progress, guaranteeing that tests can be run without
racing.

This patch also changes zfs_remove_max_bytes_pause to match this
new tunable. This provides some consistency between these two
similar tunables and ensures that the tunable will not misbehave
on 32-bit systems.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8111
2018-11-28 10:12:08 -08:00
LOLi 00369f3338 ZTS: fix "not found" errors
This commit fixes several "not found" errors caused by calling undefined
or incorrect shell functions in the following ZFS Test Suite groups:

   * alloc_class
   * channel_program/lua_core
   * channel_program/synctask_core
   * cli_root/zpool_import
   * cli_user/misc

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8152
2018-11-27 09:39:37 -08:00
LOLi 0cd5c941d0 zpool: allow split with whole-disk devices
This change allows 'zpool split' to work with whole-disk devices and
updates the ZFS Test Suite with a new script to exercise this
functionality.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6643 
Closes #8133
2018-11-20 10:22:53 -08:00
Sebastien Roy a10d50f999 OpenZFS 8115 - parallel zfs mount
Porting Notes:
* Use thread pools (tpool) API instead of introducing taskq interfaces
  to libzfs.
* Use pthread_mutext for locks as mutex_t isn't available.
* Ignore alternative libshare initialization since OpenZFS-7955 is
  not present on zfsonlinux.

Authored by: Sebastien Roy <seb@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Authored by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Matt Ahrens <mahrens@delphix.com>
Ported-by: Don Brady <don.brady@delphix.com>

OpenZFS-issue: https://www.illumos.org/issues/8115
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/a3f0e2b569
Closes #8092
2018-11-15 11:33:58 -08:00
loli10K d48091de81 zed: detect and offline physically removed devices
This commit adds a new test case to the ZFS Test Suite to verify ZED
can detect when a device is physically removed from a running system:
the device will be offlined if a spare is not available in the pool.

We implement this by using the existing libudev functionality and
without relying solely on the FM kernel module capabilities which have
been observed to be unreliable with some kernels.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #1537
Closes #7926
2018-11-09 11:17:24 -08:00
Tom Caputi d8244d34bd ZTS: Fix and reenable zfs_rename tests
zfs_rename_006_pos has been flaky in the past because it was
missing a call to block_device_wait to ensure the zvols it creates
are present before running dd. Whenever this this happened,
zfs_rename_009_neg would also fail because the first test would
leak a zvol clone that it did not know how to clean up. This patch
fixes the root cause and reenables the test. It also fixes some
minor grammar errors.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #5647 
Closes #5648 
Closes #8088
2018-11-07 16:59:27 -08:00
Paul Zuchowski c2bcfa71f4 ZTS: Fix test zfs_mount_006_pos
For Linux, place a file in the mount point folder so it will be
considered "busy".  Fix the while loop so it doesn't rm in
directories above the testdir.  Add Linux-specific code to test
overlay on|off.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #4990 
Closes #8081
2018-11-07 16:54:08 -08:00
Tom Caputi 80a91e7469 Defer new resilvers until the current one ends
Currently, if a resilver is triggered for any reason while an
existing one is running, zfs will immediately restart the existing
resilver from the beginning to include the new drive. This causes
problems for system administrators when a drive fails while another
is already resilvering. In this case, the optimal thing to do to
reduce risk of data loss is to wait for the current resilver to end
before immediately replacing the second failed drive, which allows
the system to operate with two incomplete drives for the minimum
amount of time.

This patch introduces the resilver_defer feature that essentially
does this for the admin without forcing them to wait and monitor
the resilver manually. The change requires an on-disk feature
since we must mark drives that are part of a deferred resilver in
the vdev config to ensure that we do not assume they are done
resilvering when an existing resilver completes.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: @mmaybee 
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7732
2018-10-18 21:06:18 -07:00
Alek P 50a343d85c Fix changelist mounted-dataset iteration
Commit 0c6d093 caused a regression in the inherit codepath.
The fix is to restrict the changelist iteration on mountpoints and
add proper handling for 'legacy' mountpoints

Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Closes #7988 
Closes #7991
2018-10-10 21:13:13 -07:00
Tony Hutter 2ef0f8c329 Print "(repairing)" in zpool status again
Historically, zpool status prints "(repairing)" for any drives that
have errors during a scrub:

        NAME            STATE     READ WRITE CKSUM
        mypool          ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            /tmp/file1  ONLINE      13     0     0  (repairing)
            /tmp/file2  ONLINE       0     0     0
            /tmp/file3  ONLINE       0     0     0

This was accidentally broken in "OpenZFS 9166 - zfs storage pool
checkpoint" (d2734cc).  This patch adds it back in.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7779
Closes #7978
2018-10-09 20:30:32 -07:00
Prakash Surya 54eb2c410e Verify 'zfs destroy' will unshare the dataset
This change adds a new test case to the zfs-test suite to verify that
when 'zfs destroy' is used on a shared dataset, the dataset will be
unshared after the destroy operation completes.

Reviewed by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #7941
2018-10-03 10:17:58 -07:00
Alek P 0c6d09361d changelist should be able to iter on mounts
Modified changelist_gather()ing for the mountpoint property.
Now instead of iterating on all dataset descendants, we read
/proc/self/mounts and iterate on the mounted descendant datasets only.

Switched changelist implementation from a uu_list_* to uu_avl_* in
order to  reduce changlist code-path's worst case time complexity.

Reviewed by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Closes #7967
2018-10-02 12:30:58 -07:00
LOLi 5140a58f3b zpool should detect invalid fs property on create
This change improve the handling of invalid filesystem properties when
specified at pool creation: this is useful when 'zpool create -n'
(dry run) is executed to detect invalid fs-level options (-O) before
the actual command is run.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7620 
Closes #7878
2018-09-13 13:37:42 -07:00
Roman Strashkin 733b5722b4 zpool split can create a corrupted pool
Added vdev_resilver_needed() check to verify VDEVs are fully
synced, so that after split the new pool will not be corrupted.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Roman Strashkin <roman.strashkin@nexenta.com>
Closes #7865
Closes #7881
2018-09-12 18:14:42 -07:00
Don Brady cc99f275a2 Pool allocation classes
Allocation Classes add the ability to have allocation classes in a
pool that are dedicated to serving specific block categories, such
as DDT data, metadata, and small file blocks. A pool can opt-in to
this feature by adding a 'special' or 'dedup' top-level VDEV.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Reviewed-by: Håkan Johansson <f96hajo@chalmers.se>
Reviewed-by: Andreas Dilger <andreas.dilger@chamcloud.com>
Reviewed-by: DHE <git@dehacked.net>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Gregor Kopka <gregor@kopka.net>
Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #5182
2018-09-05 18:33:36 -07:00
Don Brady e8bcb693d6 Add zfs module feature and property info to sysfs
This extends our sysfs '/sys/module/zfs' entry to include feature 
and property attributes. The primary consumer of this information 
is user processes, like the zfs CLI, that need to know what the 
current loaded ZFS module supports. The libzfs binary will consult 
this information when instantiating the zfs and zpool property 
tables and the pool features table.

This introduces 4 kernel objects (dirs) into '/sys/module/zfs'
with corresponding attributes (files):
  features.runtime
  features.pool
  properties.dataset
  properties.pool

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #7706
2018-09-02 12:09:53 -07:00
Brian Behlendorf bb91178e60
ZTS: Fix EBUSY volume destroy failures
It's possible for an unrelated process, like blkid, to have the
volume open when 'zfs destroy' is run.  Switch the cleanup functions
to the destroy_dataset() helper which handles this case by retrying
the destroy when the dataset is busy.  This was done not only for
volumes but also for file systems for consistency.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7854
2018-08-31 15:30:44 -07:00
bernie1995 0fe7c953b3 ZTS: path cleanup
Removing hardcoded paths in many scripts.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bernie1995 <bernie.pikes@gmail.com>
Issue #7507 
Closes #7843
2018-08-30 13:46:55 -07:00
Brian Behlendorf 6c6949acae
ZTS: Fix zfs_create_013_pos
It's possible for an unrelated process, like blkid, to have the
volume open when 'zfs destroy' is run.  Switch the cleanup function
to the destroy_dataset() helper which handles this case by retrying
the destroy when the dataset is busy.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7847
2018-08-30 13:38:09 -07:00
Tom Caputi 47ab01a18f Always wait for txg sync when umounting dataset
Currently, when unmounting a filesystem, ZFS will only wait for
a txg sync if the dataset is dirty and not readonly. However, this
can be problematic in cases where a dataset is remounted readonly
immediately before being unmounted, which often happens when the
system is being shut down. Since encrypted datasets require that
all I/O is completed before the dataset is disowned, this issue
causes problems when write I/Os leak into the txgs after the
dataset is disowned, which can happen when sync=disabled.

While looking into fixes for this issue, it was discovered that
dsl_dataset_is_dirty() does not return B_TRUE when the dataset has
been removed from the txg dirty datasets list, but has not actually
been processed yet. Furthermore, the implementation is comletely
different from dmu_objset_is_dirty(), adding to the confusion.
Rather than relying on this function, this patch forces the umount
code path (and the remount readonly code path) to always perform a
txg sync on read-write datasets and removes the function altogether.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7753
Closes #7795
2018-08-27 10:16:28 -07:00
LOLi c434d8806c Stack overflow when destroying deeply nested clones
Destroy operations on deeply nested chains of clones can overflow
the stack:

        Depth    Size   Location    (221 entries)
        -----    ----   --------
  0)    15664      48   mutex_lock+0x5/0x30
  1)    15616       8   mutex_lock+0x5/0x30
...
 26)    13576      72   dsl_dataset_remove_clones_key.isra.4+0x124/0x1e0 [zfs]
 27)    13504      72   dsl_dataset_remove_clones_key.isra.4+0x18a/0x1e0 [zfs]
 28)    13432      72   dsl_dataset_remove_clones_key.isra.4+0x18a/0x1e0 [zfs]
...
185)     2128      72   dsl_dataset_remove_clones_key.isra.4+0x18a/0x1e0 [zfs]
186)     2056      72   dsl_dataset_remove_clones_key.isra.4+0x18a/0x1e0 [zfs]
187)     1984      72   dsl_dataset_remove_clones_key.isra.4+0x18a/0x1e0 [zfs]
188)     1912     136   dsl_destroy_snapshot_sync_impl+0x4e0/0x1090 [zfs]
189)     1776      16   dsl_destroy_snapshot_check+0x0/0x90 [zfs]
...
218)      304     128   kthread+0xdf/0x100
219)      176      48   ret_from_fork+0x22/0x40
220)      128     128   kthread+0x0/0x100

Fix this issue by converting dsl_dataset_remove_clones_key() from
recursive to iterative.

Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7279 
Closes #7810
2018-08-22 11:03:31 -07:00
Serapheim Dimitropoulos a448a2557e Introduce read/write kstats per dataset
The following patch introduces a few statistics on reads and writes
grouped by dataset. These statistics are implemented as kstats
(backed by aggregate sums for performance) and can be retrieved by
using the dataset objset ID number. The motivation for this change is
to provide some preliminary analytics on dataset usage/performance.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #7705
2018-08-20 09:52:37 -07:00
Brian Behlendorf 089b16f48d
ZTS: Fix import_cache_device_replaced
Allow the 'zpool replace' to run slowly without overwhelming the vdev
queues by setting zfs_scan_vdev_limit=128k.  This limits the number of
concurrent slow IOs which need to be handled.  The net effect is the
test case runs approximately 3x faster putting it well under the 10
minute per-test time limit.

Rename import_cache* test cases to imprt_cachefile*.  Originally
these were renamed due to a maximum tar name limit, this limit was
removed by commit 1dfde3d9b.

Replaced instances of /var/tmp in zpool_import.cfg with $TEST_BASE_DIR.

Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7765 
Closes #7802
2018-08-18 21:16:12 -07:00
Brian Behlendorf 1dfde3d9b2
Use posix format for dist tarballs
Traditionally Automake has defaulted to the V7 tar format when
creating tarballs for distributions.  One of the many limitions
of this format is a 99 character maximum path + file name limit.
This can cause problems when adding new test cases to the ZTS
due to the depth of the sub-tree and descriptive test names.

This change switches the build system to the posix (aliased as
pax) tar format which conforms to the POSIX.1-2001 specification.
This format does not suffer from the V7 limitations, was designed
to be compatible, and will become the default format in future
versions of GNU tar.

https://www.gnu.org/software/tar/manual/html_chapter/tar_8.html

As part of this change the blockfiles directories which were
originally removed due to this limit have been readded.

Reviewed by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7767
2018-08-15 09:52:28 -07:00
LOLi c8c308362c Allow inherited properties in zfs_check_settable()
This change modifies how 'checksum' and 'dedup' properties are verified
in zfs_check_settable() handling the case where they are explicitly
inherited in the dataset hierarchy when receiving a recursive send
stream.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7755 
Closes #7576 
Closes #7757
2018-08-03 14:56:25 -07:00
Brian Behlendorf 6da0998f59
ZTS: Fix zfs_create_007_pos
It's possible for an unrelated process, like blkid, to have the
volume open when 'zfs destroy' is run.  Switch the cleanup function
to the destroy_dataset() helper which handles this case by retrying
the destroy when the dataset is busy.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7763
2018-08-03 10:21:50 -07:00
Brian Behlendorf d441e85dd7
Add support for autoexpand property
While the autoexpand property may seem like a small feature it
depends on a significant amount of system infrastructure.  Enough
of that infrastructure is now in place that with a few modifications
for Linux it can be supported.

Auto-expand works as follows; when a block device is modified
(re-sized, closed after being open r/w, etc) a change uevent is
generated for udev.  The ZED, which is monitoring udev events,
passes the change event along to zfs_deliver_dle() if the disk
or partition contains a zfs_member as identified by blkid.

From here the device is matched against all imported pool vdevs
using the vdev_guid which was read from the label by blkid.  If
a match is found the ZED reopens the pool vdev.  This re-opening
is important because it allows the vdev to be briefly closed so
the disk partition table can be re-read.  Otherwise, it wouldn't
be possible to report the maximum possible expansion size.

Finally, if the property autoexpand=on a vdev expansion will be
attempted.  After performing some sanity checks on the disk to
verify that it is safe to expand,  the primary partition (-part1)
will be expanded and the partition table updated.  The partition
is then re-opened (again) to detect the updated size which allows
the new capacity to be used.

In order to make all of the above possible the following changes
were required:

* Updated the zpool_expand_001_pos and zpool_expand_003_pos tests.
  These tests now create a pool which is layered on a loopback,
  scsi_debug, and file vdev.  This allows for testing of non-
  partitioned block device (loopback), a partition block device
  (scsi_debug), and a file which does not receive udev change
  events.  This provided for better test coverage, and by removing
  the layering on ZFS volumes there issues surrounding layering
  one pool on another are avoided.

* zpool_find_vdev_by_physpath() updated to accept a vdev guid.
  This allows for matching by guid rather than path which is a
  more reliable way for the ZED to reference a vdev.

* Fixed zfs_zevent_wait() signal handling which could result
  in the ZED spinning when a signal was not handled.

* Removed vdev_disk_rrpart() functionality which can be abandoned
  in favor of kernel provided blkdev_reread_part() function.

* Added a rwlock which is held as a writer while a disk is being
  reopened.  This is important to prevent errors from occurring
  for any configuration related IOs which bypass the SCL_ZIO lock.
  The zpool_reopen_007_pos.ksh test case was added to verify IO
  error are never observed when reopening.  This is not expected
  to impact IO performance.

Additional fixes which aren't critical but were discovered and
resolved in the course of developing this functionality.

* Added PHYS_PATH="/dev/zvol/dataset" to the vdev configuration for
  ZFS volumes.  This is as good as a unique physical path, while the
  volumes are not used in the test cases anymore for other reasons
  this improvement was included.

Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Sara Hartse <sara.hartse@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #120
Closes #2437
Closes #5771
Closes #7366
Closes #7582
Closes #7629
2018-07-23 15:40:15 -07:00
Serapheim Dimitropoulos a7ed98d8b5 OpenZFS 9330 - stack overflow when creating a deeply nested dataset
Datasets that are deeply nested (~100 levels) are impractical. We just
put a limit of 50 levels to newly created datasets. Existing datasets
should work without a problem.

The problem can be seen by attempting to create a dataset using the -p
option with many levels:

    panic[cpu0]/thread=ffffff01cd282c20: BAD TRAP: type=8 (#df Double fault) rp=ffffffff

    fffffffffbc3aa60 unix:die+100 ()
    fffffffffbc3ab70 unix:trap+157d ()
    ffffff00083d7020 unix:_patch_xrstorq_rbx+196 ()
    ffffff00083d7050 zfs:dbuf_rele+2e ()
    ...
    ffffff00083d7080 zfs:dsl_dir_close+32 ()
    ffffff00083d70b0 zfs:dsl_dir_evict+30 ()
    ffffff00083d70d0 zfs:dbuf_evict_user+4a ()
    ffffff00083d7100 zfs:dbuf_rele_and_unlock+87 ()
    ffffff00083d7130 zfs:dbuf_rele+2e ()
    ... The block above repeats once per directory in the ...
    ... create -p command, working towards the root ...
    ffffff00083db9f0 zfs:dsl_dataset_drop_ref+19 ()
    ffffff00083dba20 zfs:dsl_dataset_rele+42 ()
    ffffff00083dba70 zfs:dmu_objset_prefetch+e4 ()
    ffffff00083dbaa0 zfs:findfunc+23 ()
    ffffff00083dbb80 zfs:dmu_objset_find_spa+38c ()
    ffffff00083dbbc0 zfs:dmu_objset_find+40 ()
    ffffff00083dbc20 zfs:zfs_ioc_snapshot_list_next+4b ()
    ffffff00083dbcc0 zfs:zfsdev_ioctl+347 ()
    ffffff00083dbd00 genunix:cdev_ioctl+45 ()
    ffffff00083dbd40 specfs:spec_ioctl+5a ()
    ffffff00083dbdc0 genunix:fop_ioctl+7b ()
    ffffff00083dbec0 genunix:ioctl+18e ()
    ffffff00083dbf10 unix:brand_sys_sysenter+1c9 ()

Porting notes:
* Added zfs_max_dataset_nesting module option with documentation.
* Updated zfs_rename_014_neg.ksh for Linux.
* Increase the zfs.sh stack warning to 15K.  Enough time has passed
  that 16K can be reasonably assumed to be the default value.  It
  was increased in the 3.15 kernel released in June of 2014.

Authored by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Garrett D'Amore <garrett@damore.org>

OpenZFS-issue: https://www.illumos.org/issues/9330
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/757a75a
Closes #7681
2018-07-09 13:02:50 -07:00
Serapheim Dimitropoulos 4d044c4c1d OpenZFS 9238 - ZFS Spacemap Encoding V2
Motivation
==========

The current space map encoding has the following disadvantages:
[1] Assuming 512 sector size each entry can represent at most 16MB for a segment.
    This makes the encoding very inefficient for large regions of space.
[2] As vdev-wide space maps have started to be used by new features (i.e.
    device removal, zpool checkpoint) we've started imposing limits in the
    vdevs that can be used with them based on the maximum addressable offset
    (currently 64PB for a top-level vdev).

New encoding
============

The layout can be found at space_map.h and it remains backwards compatible with
the old one. The introduced two-word entry format, besides extending the limits
imposed by the single-entry layout, also includes a vdev field and some extra
padding after its prefix.

The extra padding after the prefix should is reserved for future usage (e.g.
new prefixes for future encodings or new fields for flags). The new vdev field
not only makes the space maps more self-descriptive, but also opens the doors
for pool-wide space maps (expected to be used in the log spacemap project).

One final important note is that the number of bits used for vdevs is reduced
to 24 bits for blkptrs. That was decided as we don't know of any setups that
use more than 16M vdevs for the time being and we wanted to fit the vdev field
in the space map. In addition that gives us some extra bits in dva_t.

Other references:
=================

The new encoding is also discussed towards the end of the Log Space Map
presentation from 2017's OpenZFS summit.
Link: https://www.youtube.com/watch?v=jj2IxRkl5bQ

Authored by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <gwilson@zfsmail.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Gordon Ross <gwr@nexenta.com>
Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Tim Chase <tim@chase2k.com>

OpenZFS-commit: https://github.com/openzfs/openzfs/commit/90a56e6d
OpenZFS-issue: https://www.illumos.org/issues/9238
Closes #7665
2018-07-05 12:02:34 -07:00
Serapheim Dimitropoulos d2734cce68 OpenZFS 9166 - zfs storage pool checkpoint
Details about the motivation of this feature and its usage can
be found in this blogpost:

    https://sdimitro.github.io/post/zpool-checkpoint/

A lightning talk of this feature can be found here:
https://www.youtube.com/watch?v=fPQA8K40jAM

Implementation details can be found in big block comment of
spa_checkpoint.c

Side-changes that are relevant to this commit but not explained
elsewhere:

* renames members of "struct metaslab trees to be shorter without
  losing meaning

* space_map_{alloc,truncate}() accept a block size as a
  parameter. The reason is that in the current state all space
  maps that we allocate through the DMU use a global tunable
  (space_map_blksz) which defauls to 4KB. This is ok for metaslab
  space maps in terms of bandwirdth since they are scattered all
  over the disk. But for other space maps this default is probably
  not what we want. Examples are device removal's vdev_obsolete_sm
  or vdev_chedkpoint_sm from this review. Both of these have a
  1:1 relationship with each vdev and could benefit from a bigger
  block size.

Porting notes:

* The part of dsl_scan_sync() which handles async destroys has
  been moved into the new dsl_process_async_destroys() function.

* Remove "VERIFY(!(flags & FWRITE))" in "kernel.c" so zhack can write
  to block device backed pools.

* ZTS:
  * Fix get_txg() in zpool_sync_001_pos due to "checkpoint_txg".

  * Don't use large dd block sizes on /dev/urandom under Linux in
    checkpoint_capacity.

  * Adopt Delphix-OS's setting of 4 (spa_asize_inflation =
    SPA_DVAS_PER_BP + 1) for the checkpoint_capacity test to speed
    its attempts to fill the pool

  * Create the base and nested pools with sync=disabled to speed up
    the "setup" phase.

  * Clear labels in test pool between checkpoint tests to avoid
    duplicate pool issues.

  * The import_rewind_device_replaced test has been marked as "known
    to fail" for the reasons listed in its DISCLAIMER.

  * New module parameters:

      zfs_spa_discard_memory_limit,
      zfs_remove_max_bytes_pause (not documented - debugging only)
      vdev_max_ms_count (formerly metaslabs_per_vdev)
      vdev_min_ms_count

Authored by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Tim Chase <tim@chase2k.com>

OpenZFS-issue: https://illumos.org/issues/9166
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/7159fdb8
Closes #7570
2018-06-26 10:07:42 -07:00
Brian Behlendorf e4a3297a04
ZTS: Adopt OpenZFS test analysis script
Adopt and extend the OpenZFS ZTS results analysis script for use
with ZFS on Linux.  This allows for automatic analysis of tests
which may be skipped for a variety or reasons or which are not
entirely reliable.

In addition to the list of 'known' failures, which have been updated
for ZFS on Linux, there in a new 'maybe' section.  This mapping
include tests which might be correctly skipped depending on the
test environment.  This may be because of a missing dependency or
lack of required kernel support.  This list also includes tests
which normally pass but might on occasion fail for a harmless
reason.

The script was also extended include a reason for why a given test
might be skipped or may fail.  The reason will be included after
the test in the "results other than PASS that are expected" section.
For failures it is preferable to set the reason to the GitHub issue
number and for skipped tests several generic reasons are available.
You may also specify a custom reason if needed.

All tests were added back in to the linux.run file even if they are
expected to failed.  There is value in running tests which may not
pass, the expected results for these tests has been encoded in
the new analysis script.

All tests which were disabled because they ran more slowly on a
32-bit system have been re-enabled.  Developers working on 32-bit
systems should assess what it reasonable for their environment.

The unnecessary dependency on physical block devices was removed for
the checksum, grow_pool, and grow_replicas test groups so they are
no longer skipped.  Updated the filetest_001_pos test case to run
properly now that it is enabled and moved the grow tests in to a
single directory.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7638
2018-06-20 14:03:13 -07:00
Alek P 6969afcefd Always continue recursive destroy after error
Currently, during a recursive zfs destroy the first error that is
encountered will stop the destruction of the datasets. Errors may
happen for a variety of reasons including competing deletions
and busy datasets.
This patch switches recursive destroy to always do a best-effort
recursive dataset destroy.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Closes #7574
2018-06-06 10:14:52 -07:00
Sara Hartse 74d42600d8 zpool reopen should detect expanded devices
Update bdev_capacity to have wholedisk vdevs query the
size of the underlying block device (correcting for the size
of the efi parition and partition alignment) and therefore detect
expanded space.

Correct vdev_get_stats_ex so that the expandsize is aligned
to metaslab size and new space is only reported if it is large
enough for a new metaslab.

Reviewed by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Wren Kennedy <jwk404@gmail.com>
Signed-off-by: sara hartse <sara.hartse@delphix.com>
External-issue: LX-165
Closes #7546 
Issue #7582
2018-05-31 10:36:37 -07:00
Brian Behlendorf ab6a2b5cd7
ZTS: Improve zpool_scrub_004_pos reliability
It's possible for the `zpool attach` portion of this test case
to complete before the `zpool scrub` can be issued.  Update the
test case to force the resilvering phase to take longer.

Reviewed-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5444 
Closes #7541
2018-05-15 08:58:46 -07:00
Pavel Zakharov 189bd0b670 OpenZFS 9190 - Fix cleanup routine in import_cachefile_device_replaced.ksh
Must clear slow-disk zinject injections in test cleanup routine.
Otherwise, when this test fails, it causes most subsequent tests
to fail.

Authored by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://illumos.org/issues/9190
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/762c6b4
Closes #7530
2018-05-14 14:30:28 -04:00
bunder2015 29badadd4e Fix shebangs on import tests
Incorrect shebangs were used when porting.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #7523  
Closes #7524
2018-05-11 12:37:44 -07:00
Tim Chase f3d28f0a59 Streamline the zpool_import tests
Don't create an ext4 file system atop $DEV_DISKDIR/$DISK2.
There's likely to not be sufficient space for it to succeed.
Instead, simply create the vdev files in the directory where it
would have been mounted.

Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7459
2018-05-08 21:40:13 -07:00
Pavel Zakharov 6cb8e5306d OpenZFS 9075 - Improve ZFS pool import/load process and corrupted pool recovery
Some work has been done lately to improve the debugability of the ZFS pool
load (and import) process. This includes:

	7638 Refactor spa_load_impl into several functions
	8961 SPA load/import should tell us why it failed
	7277 zdb should be able to print zfs_dbgmsg's

To iterate on top of that, there's a few changes that were made to make the
import process more resilient and crash free. One of the first tasks during the
pool load process is to parse a config provided from userland that describes
what devices the pool is composed of. A vdev tree is generated from that config,
and then all the vdevs are opened.

The Meta Object Set (MOS) of the pool is accessed, and several metadata objects
that are necessary to load the pool are read. The exact configuration of the
pool is also stored inside the MOS. Since the configuration provided from
userland is external and might not accurately describe the vdev tree
of the pool at the txg that is being loaded, it cannot be relied upon to safely
operate the pool. For that reason, the configuration in the MOS is read early
on. In the past, the two configurations were compared together and if there was
a mismatch then the load process was aborted and an error was returned.

The latter was a good way to ensure a pool does not get corrupted, however it
made the pool load process needlessly fragile in cases where the vdev
configuration changed or the userland configuration was outdated. Since the MOS
is stored in 3 copies, the configuration provided by userland doesn't have to be
perfect in order to read its contents. Hence, a new approach has been adopted:
The pool is first opened with the untrusted userland configuration just so that
the real configuration can be read from the MOS. The trusted MOS configuration
is then used to generate a new vdev tree and the pool is re-opened.

When the pool is opened with an untrusted configuration, writes are disabled
to avoid accidentally damaging it. During reads, some sanity checks are
performed on block pointers to see if each DVA points to a known vdev;
when the configuration is untrusted, instead of panicking the system if those
checks fail we simply avoid issuing reads to the invalid DVAs.

This new two-step pool load process now allows rewinding pools accross
vdev tree changes such as device replacement, addition, etc. Loading a pool
from an external config file in a clustering environment also becomes much
safer now since the pool will import even if the config is outdated and didn't,
for instance, register a recent device addition.

With this code in place, it became relatively easy to implement a
long-sought-after feature: the ability to import a pool with missing top level
(i.e. non-redundant) devices. Note that since this almost guarantees some loss
of data, this feature is for now restricted to a read-only import.

Porting notes (ZTS):
* Fix 'make dist' target in zpool_import

* The maximum path length allowed by tar is 99 characters.  Several
  of the new test cases exceeded this limit resulting in them not
  being included in the tarball.  Shorten the names slightly.

* Set/get tunables using accessor functions.

* Get last synced txg via the "zfs_txg_history" mechanism.

* Clear zinject handlers in cleanup for import_cache_device_replaced
  and import_rewind_device_replaced in order that the zpool can be
  exported if there is an error.

* Increase FILESIZE to 8G in zfs-test.sh to allow for a larger
  ext4 file system to be created on ZFS_DISK2.  Also, there's
  no need to partition ZFS_DISK2 at all.  The partitioning had
  already been disabled for multipath devices.  Among other things,
  the partitioning steals some space from the ext4 file system,
  makes it difficult to accurately calculate the paramters to
  parted and can make some of the tests fail.

* Increase FS_SIZE and FILE_SIZE in the zpool_import test
  configuration now that FILESIZE is larger.

* Write more data in order that device evacuation take lonnger in
  a couple tests.

* Use mkdir -p to avoid errors when the directory already exists.

* Remove use of sudo in import_rewind_config_changed.

Authored by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Tim Chase <tim@chase2k.com>

OpenZFS-issue: https://illumos.org/issues/9075
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/619c0123
Closes #7459
2018-05-08 21:35:27 -07:00
Paul Dagnelie ca0845d59e OpenZFS 9256 - zfs send space estimation off by > 10% on some datasets
Authored by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

Porting Notes:
* Added tuning to man page.
* Test case changes dropped, default behavior unchanged.

OpenZFS-issue: https://www.illumos.org/issues/9256
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/32356b3c56
Closes #7470
2018-05-08 08:59:24 -07:00
LOLi 4ceb8dd6fd Fix 'zpool create -t <tempname>'
Creating a pool with a temporary name fails when we also specify custom
dataset properties: this is because we mistakenly call
zfs_set_prop_nvlist() on the "real" pool name which, as expected,
cannot be found because the SPA is present in the namespace with the
temporary name.

Fix this by specifying the correct pool name when setting the dataset
properties.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7502 
Closes #7509
2018-05-07 21:11:58 -07:00
Brian Behlendorf c02c1becce
ZTS: Re-enable MMP tests
Commit 7fab6361 inadvertently disabled the MMP test cases by creating
and not removing an /etc/hostid file in the new zpool_split_props test
case.  When the file exists the ZTS skips the entire MMP test group
rather than modify what may be a system which is already configured.
Update the test case to remove the file.

Additionally, because the MMP tests were disabled a regression slipped
in as part of commit 9eb7b46ed0.  Fix it.

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7514
2018-05-07 21:08:33 -07:00
LOLi fb0da71fd9 ZTS: Fix zfs_diff_timestamp
When using mawk instead of gawk zfs_diff_timestamp fails consistently:
this is due to a subtle difference in how mawk handles substr().

From awk(1):
---
Finally, here is how mawk handles exceptional cases not discussed in
the AWK book or the Posix draft.  It is unsafe to assume consistency
across awks and safe to skip to the next section.
substr(s,  i,  n) returns the characters of s in the intersection of
the closed interval [1, length(s)] and the half-open interval [i, i+n).
When this intersection is empty, the empty string is returned; so
substr("ABC", 1, 0) = "" and substr("ABC", -4, 6) = "A".
---

To support running zfs_diff_timestamp with both gawk and mawk change
the second parameter passed to substr().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7503 
Closes #7510
2018-05-06 20:42:29 -07:00
Tom Caputi 2c24b5b148 Fix issues found with zfs diff
Two deadlocks / ASSERT failures were introduced in a2c2ed1b which
would occur whenever arc_buf_fill() failed to decrypt a block of
data. This occurred because the call to arc_buf_destroy() which
was responsible for cleaning up the newly created buffer would
attempt to take out the hdr lock that it was already holding. This
was resolved by calling the underlying functions directly without
retaking the lock.

In addition, the dmu_diff() code did not properly ensure that keys
were loaded and mapped before begining dataset traversal. It turns
out that this code does not need to look at any encrypted values,
so the code was altered to perform raw IO only.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7354 
Closes #7456
2018-05-01 11:24:20 -07:00
LOLi 3cbe89b12a Fix zfs incremental send remove '-o' properties
When receiving an incremental send stream with intermediary snapshots
zfs_receive_one() does not correctly identify the top-level dataset:
consequently we restore said snapshots as if they were children
datasets in the hierarchy, forcing inheritance of any property received
with 'zfs send -o' and effectively removing any locally set value.

The test case did not correctly verify this situation because it uses
adjacent snapshots, basically testing 'zfs send -i' instead of
'zfs send -I': this commit adds an additional intermediary snapshot to
the test script.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7478
2018-04-30 20:58:29 -07:00
LOLi b4555c777a Fix 'zfs remap <poolname@snapname>'
Only filesystems and volumes are valid 'zfs remap' parameters: when
passed a snapshot name zfs_remap_indirects() does not handle the
EINVAL returned from libzfs_core, which results in failing an assertion
and consequently crashing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7454
2018-04-19 09:45:17 -07:00
bunder2015 3eba666332 ZTS: zpool_create_002 clean up leftover filedisk
zpool_create_002_pos did not clean up filedisk files left over from
running the test.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #7435
Closes #7439
2018-04-15 15:17:44 -07:00
Matthew Ahrens a1d477c24c OpenZFS 7614, 9064 - zfs device evacuation/removal
OpenZFS 7614 - zfs device evacuation/removal
OpenZFS 9064 - remove_mirror should wait for device removal to complete

This project allows top-level vdevs to be removed from the storage pool
with "zpool remove", reducing the total amount of storage in the pool.
This operation copies all allocated regions of the device to be removed
onto other devices, recording the mapping from old to new location.
After the removal is complete, read and free operations to the removed
(now "indirect") vdev must be remapped and performed at the new location
on disk.  The indirect mapping table is kept in memory whenever the pool
is loaded, so there is minimal performance overhead when doing operations
on the indirect vdev.

The size of the in-memory mapping table will be reduced when its entries
become "obsolete" because they are no longer used by any block pointers
in the pool.  An entry becomes obsolete when all the blocks that use
it are freed.  An entry can also become obsolete when all the snapshots
that reference it are deleted, and the block pointers that reference it
have been "remapped" in all filesystems/zvols (and clones).  Whenever an
indirect block is written, all the block pointers in it will be "remapped"
to their new (concrete) locations if possible.  This process can be
accelerated by using the "zfs remap" command to proactively rewrite all
indirect blocks that reference indirect (removed) vdevs.

Note that when a device is removed, we do not verify the checksum of
the data that is copied.  This makes the process much faster, but if it
were used on redundant vdevs (i.e. mirror or raidz vdevs), it would be
possible to copy the wrong data, when we have the correct data on e.g.
the other side of the mirror.

At the moment, only mirrors and simple top-level vdevs can be removed
and no removal is allowed if any of the top-level vdevs are raidz.

Porting Notes:

* Avoid zero-sized kmem_alloc() in vdev_compact_children().

    The device evacuation code adds a dependency that
    vdev_compact_children() be able to properly empty the vdev_child
    array by setting it to NULL and zeroing vdev_children.  Under Linux,
    kmem_alloc() and related functions return a sentinel pointer rather
    than NULL for zero-sized allocations.

* Remove comment regarding "mpt" driver where zfs_remove_max_segment
  is initialized to SPA_MAXBLOCKSIZE.

  Change zfs_condense_indirect_commit_entry_delay_ticks to
  zfs_condense_indirect_commit_entry_delay_ms for consistency with
  most other tunables in which delays are specified in ms.

* ZTS changes:

    Use set_tunable rather than mdb
    Use zpool sync as appropriate
    Use sync_pool instead of sync
    Kill jobs during test_removal_with_operation to allow unmount/export
    Don't add non-disk names such as "mirror" or "raidz" to $DISKS
    Use $TEST_BASE_DIR instead of /tmp
    Increase HZ from 100 to 1000 which is more common on Linux

    removal_multiple_indirection.ksh
        Reduce iterations in order to not time out on the code
        coverage builders.

    removal_resume_export:
        Functionally, the test case is correct but there exists a race
        where the kernel thread hasn't been fully started yet and is
        not visible.  Wait for up to 1 second for the removal thread
        to be started before giving up on it.  Also, increase the
        amount of data copied in order that the removal not finish
        before the export has a chance to fail.

* MMP compatibility, the concept of concrete versus non-concrete devices
  has slightly changed the semantics of vdev_writeable().  Update
  mmp_random_leaf_impl() accordingly.

* Updated dbuf_remap() to handle the org.zfsonlinux:large_dnode pool
  feature which is not supported by OpenZFS.

* Added support for new vdev removal tracepoints.

* Test cases removal_with_zdb and removal_condense_export have been
  intentionally disabled.  When run manually they pass as intended,
  but when running in the automated test environment they produce
  unreliable results on the latest Fedora release.

  They may work better once the upstream pool import refectoring is
  merged into ZoL at which point they will be re-enabled.

Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alex Reece <alex@delphix.com>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Richard Laager <rlaager@wiktel.com>
Reviewed by: Tim Chase <tim@chase2k.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Garrett D'Amore <garrett@damore.org>
Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Tim Chase <tim@chase2k.com>

OpenZFS-issue: https://www.illumos.org/issues/7614
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/f539f1eb
Closes #6900
2018-04-14 12:16:17 -07:00
Tim Chase 4b0f5b2d7b Wait for resilver after online
This test performs a rapid offline/online cycle of each of several
mirror vdevs.  It can run so quickly that there isn't sufficient pool
redundancy to perform an offline.  The solution is to wait until the
pool is resilvered following the online operation.

Also, add a pool sync before the offline operation to help reduce
spurious errors.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Issue #6900
2018-04-13 18:04:10 -07:00
Seth Forshee 93b43af10d Allow mounting datasets more than once
Currently mounting an already mounted zfs dataset results in an
error, whereas it is typically allowed with other filesystems.
This causes some bad interactions with mount namespaces. Take
this sequence for example:

- Create a dataset
- Create a snapshot of the dataset
- Create a clone of the snapshot
- Create a new mount namespace
- Rename the original dataset

The rename results in unmounting and remounting the clone in the
original mount namespace, however the remount fails because the
dataset is still mounted in the new mount namespace. (Note that
this means the mount in the new mount namespace is never being
unmounted, so perhaps the unmount/remount of the clone isn't
actually necessary.)

The problem here is a result of the way mounting is implemented
in the kernel module. Since it is not mounting block devices it
uses mount_nodev() instead of the usual mount_bdev(). However,
mount_nodev() is written for filesystems for which each mount is
a new instance (i.e. a new super block), and zfs should be able
to detect when a mount request can be satisfied using an existing
super block.

Change zpl_mount() to call sget() directly with it's own test
callback. Passing the objset_t object as the fs data allows
checking if a superblock already exists for the dataset, and in
that case we just need to return a new reference for the sb's
root dentry.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Closes #5796
Closes #7207
2018-04-13 10:44:05 -07:00
bunder2015 1e37dee03f ZTS: clean up leftover ibackup_trunc files
zfs_receive_raw_incremental did not clean up ibackup_trunc.* files
left over from running the test.

Also changing the path of the ibackup files so they can be placed
in the correct directories when /var/tmp is not the temporary
directory.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #7430
2018-04-13 10:35:55 -07:00
LOLi 7fab636188 Add 'zpool split' coverage to the ZFS Test Suite
This change adds five new tests to the ZTS:

 * zpool_split_cliargs: verify command line options and arguments
 * zpool_split_devices: verify zpool split accepts a device list
 * zpool_split_encryption: verify zpool can split encrypted pools
 * zpool_split_props: verify zpool split can set property values
 * zpool_split_vdevs: verify vdev layout when splitting the pool

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7409
2018-04-12 10:57:24 -07:00
LOLi 9966754ac5 Fix zpool set feature@<feature>=disabled
Commit e4010f2 accidentally allows zpool to set pool features to
"disabled"; this should only be allowed at pool creation. This commit
adds additional checks and test coverage to 'zpool set'.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7402
2018-04-11 14:45:58 -07:00
Giuseppe Di Natale 7b47628acb Clean up (k)shlib and cfg file shebangs
Most kshlib files are imported by other scripts
and do not have a shebang at the top of their files.
Make all kshlib follow this convention.

Remove shebangs from cfg files as well.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Close #7406
2018-04-08 19:37:22 -07:00
Tony Hutter 6c9af9e8f4 Fix "file is executable, but no shebang" warnings
Fedora 28's RPM build checks warn when executable files don't have a
shebang line.  These warnings are caused when we (incorrectly)
include data & config files in the_SCRIPTS automake lines. Files in
_SCRIPTS are marked executable by automake. This patch fixes the
issue by including non-executable scripts in a _DATA line instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7359 
Closes #7395
2018-04-06 16:34:21 -07:00
Tom Caputi 1bf9a552bb Make encrypted "zfs mount -a" failures consistent
Currently, "zfs mount -a" will print a warning and fail to mount
any encrypted datasets that do not have a key loaded. This patch
makes the behavior of this failure consistent with other failure
modes ("zfs mount -a" will silently continue, explict "zfs mount"
will print a message and return an error code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7382
2018-04-06 13:28:15 -07:00
Tony Hutter 21a4f5cc86 Fedora 28: Fix misc bounds check compiler warnings
Fix a bunch of (mostly) sprintf/snprintf truncation compiler
warnings that show up on Fedora 28 (GCC 8.0.1).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7361 
Closes #7368
2018-04-04 10:16:47 -07:00
LOLi f119d00c1f Fix add_nested_replacing_spare test case
Use 'zpool reopen' instead of 'zpool scrub' to kick in the spare device:
this is required to avoid spurious failures caused by a race condition
in events processing by the ZFS Event Daemon:

P1 (zpool scrub)                            P2 (zed)
---
zfs_ioc_pool_scan()
 -> dsl_scan()
  -> vdev_reopen()
   -> vdev_set_state(VDEV_STATE_CANT_OPEN)
                                            zfs_ioc_vdev_attach()
                                             -> spa_vdev_attach()
                                              -> dsl_resilver_restart()
  -> dsl_sync_task()
   -> dsl_scan_setup_check()
   <- dsl_scan_setup_check(): EBUSY

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7247 
Closes #7342
2018-04-03 17:30:14 -07:00
Andriy Gapon 5e00213e43 OpenZFS 9164 - assert: newds == os->os_dsl_dataset
Authored by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Don Brady <don.brady@delphix.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

Porting Notes:
* Re-enabled and tweaked the zpool_upgrade_007_pos test case
  to successfully run in under 5 minutes.

OpenZFS-issue: https://www.illumos.org/issues/9164
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/0e776dc06a
Closes #6112
Closes #7336
2018-03-30 12:00:40 -07:00
Alek P 272b5d730f Add JSON output support to channel programs
The changes piggyback JSON output support on top of channel programs 
(#6558).  This way the JSON output support is targeted to scripting 
use cases and is easily maintainable since it really only touches 
one function (zfs_do_channel_program()).

This patch ports Joyent's JSON nvlist library from illumos to enable 
easy JSON printing of channel program output nvlist.  To keep the 
delta small I also took advantage of the fact that printing in
zfs_do_channel_program() was almost always done before exiting 
the program.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Closes #7281
2018-03-19 12:40:58 -07:00
Paul Zuchowski 8e5d14844d zdb and inuse tests don't pass with real disks
Due to zpool create auto-partioning in Linux (i.e. sdb1),
certain utilities need to use the parition (sdb1) while
others use the whole disk name (sdb).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #6939 
Closes #7261
2018-03-07 17:03:33 -08:00
Giuseppe Di Natale c7b55e71b0 Introduce a destroy_dataset helper
Datasets can be busy when calling zfs destroy. Introduce
a helper function to destroy datasets and use it to destroy
datasets in zfs_allow_004_pos, zfs_promote_008_pos, and
zfs_destroy_002_pos.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7224 
Closes #7246 
Closes #7249 
Closes #7267
2018-03-06 14:54:57 -08:00
LOLi 4af6873af6 Fix segfault in zfs_do_bookmark()
When invoked with wrong parameters 'zfs bookmark' fails to gracefully
validate user input and crashes.

This is a regression accidentally introduced in 587e228; this commit
adds additional tests to the ZFS Test Suite to exercise this codepath.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: KireinaHoro <i@jsteward.moe>
Signed-off-by:  loli10K <ezomori.nozomu@gmail.com>
Closes #7228 
Closes #7229
2018-02-26 09:55:18 -08:00
LOLi faa97c1619 Want 'zfs send -b'
This change implements 'zfs send -b' which can be used to send only
received property values whether or not they are overridden by local
settings.

This can be very useful during "restore" operations from a backup pool
because it allows to send only the property values originally sent
from the backup source, even though they were later modified on the
destination either by a 'zfs set' operation, explicit 'zfs inherit' or
overridden during the receive process via 'zfs receive -o|-x'.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7156
2018-02-21 12:32:06 -08:00
Tom Caputi b0918402dc Raw receive should change key atomically
Currently, raw zfs sends transfer the encrypted master keys and
objset_phys_t encryption parameters in the DRR_BEGIN payload of
each send file. Both of these are processed as soon as they are
read in dmu_recv_stream(), meaning that the new keys are set
before the new snapshot is received. In addition to the fact that
this changes the user's keys for the dataset earlier than they
might expect, the keys were never reset to what they originally
were in the event that the receive failed. This patch splits the
processing into objset handling and key handling, the later of
which is moved to dmu_recv_end() so that they key change can be
done atomically.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7200
2018-02-21 12:31:03 -08:00
Nasf-Fan 9c5167d19f Project Quota on ZFS
Project quota is a new ZFS system space/object usage accounting
and enforcement mechanism. Similar as user/group quota, project
quota is another dimension of system quota. It bases on the new
object attribute - project ID.

Project ID is a numerical value to indicate to which project an
object belongs. An object only can belong to one project though
you (the object owner or privileged user) can change the object
project ID via 'chattr -p' or 'zfs project [-s] -p' explicitly.
The object also can inherit the project ID from its parent when
created if the parent has the project inherit flag (that can be
set via 'chattr +P' or 'zfs project -s [-p]').

By accounting the spaces/objects belong to the same project, we
can know how many spaces/objects used by the project. And if we
set the upper limit then we can control the spaces/objects that
are consumed by such project. It is useful when multiple groups
and users cooperate for the same project, or a user/group needs
to participate in multiple projects.

Support the following commands and functionalities:

zfs set projectquota@project
zfs set projectobjquota@project

zfs get projectquota@project
zfs get projectobjquota@project
zfs get projectused@project
zfs get projectobjused@project

zfs projectspace

zfs allow projectquota
zfs allow projectobjquota
zfs allow projectused
zfs allow projectobjused

zfs unallow projectquota
zfs unallow projectobjquota
zfs unallow projectused
zfs unallow projectobjused

chattr +/-P
chattr -p project_id
lsattr -p

This patch also supports tree quota based on the project quota via
"zfs project" commands set as following:
zfs project [-d|-r] <file|directory ...>
zfs project -C [-k] [-r] <file|directory ...>
zfs project -c [-0] [-d|-r] [-p id] <file|directory ...>
zfs project [-p id] [-r] [-s] <file|directory ...>

For "df [-i] $DIR" command, if we set INHERIT (project ID) flag on
the $DIR, then the proejct [obj]quota and [obj]used values for the
$DIR's project ID will be shown as the total/free (avail) resource.
Keep the same behavior as EXT4/XFS does.

Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by  Ned Bass <bass6@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fan Yong <fan.yong@intel.com>
TEST_ZIMPORT_POOLS="zol-0.6.1 zol-0.6.2 master"
Change-Id: Ib4f0544602e03fb61fd46a849d7ba51a6005693c
Closes #6290
2018-02-13 14:54:54 -08:00
Chunwei Chen eb9c4532dd Fix zdb -ed on objset for exported pool
zdb -ed on objset for exported pool would failed with:
  failed to own dataset 'qq/fs0': No such file or directory

The reason is that zdb pass objset name to spa_import, it uses that
name to create a spa. Later, when dmu_objset_own tries to lookup the spa
using real pool name, it can't find one.

We fix this by make sure we pass pool name rather than objset name to
spa_import.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
Closes #6464
2018-02-09 10:11:34 -08:00
LOLi 9ca25e709b Verify ZFS Test Suite scripts executability
This change adds a make target 'testscheck' which verifies all ksh test
scripts, part of the ZFS Test Suite, have their executable bit set:
this should avoid adding new test scripts that cannot be executed by
the test-runner.py which would result in the following warning message:

   Warning: Test removed from TestGroup because it failed verification.

This also verifies both *.cfg and *.kshlib files used in the ZFS Test
Suite are not executable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7131
2018-02-07 12:43:24 -08:00
Tom Caputi ae76f45cda Encryption Stability and On-Disk Format Fixes
The on-disk format for encrypted datasets protects not only
the encrypted and authenticated blocks themselves, but also
the order and interpretation of these blocks. In order to
make this work while maintaining the ability to do raw
sends, the indirect bps maintain a secure checksum of all
the MACs in the block below it along with a few other
fields that determine how the data is interpreted.

Unfortunately, the current on-disk format erroneously
includes some fields which are not portable and thus cannot
support raw sends. It is not possible to easily work around
this issue due to a separate and much smaller bug which
causes indirect blocks for encrypted dnodes to not be
compressed, which conflicts with the previous bug. In
addition, the current code generates incompatible on-disk
formats on big endian and little endian systems due to an
issue with how block pointers are authenticated. Finally,
raw send streams do not currently include dn_maxblkid when
sending both the metadnode and normal dnodes which are
needed in order to ensure that we are correctly maintaining
the portable objset MAC.

This patch zero's out the offending fields when computing
the bp MAC and ensures that these MACs are always
calculated in little endian order (regardless of the host
system's byte order). This patch also registers an errata
for the old on-disk format, which we detect by adding a
"version" field to newly created DSL Crypto Keys. We allow
datasets without a version (version 0) to only be mounted
for read so that they can easily be migrated. We also now
include dn_maxblkid in raw send streams to ensure the MAC
can be maintained correctly.

This patch also contains minor bug fixes and cleanups.

Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #6845
Closes #6864
Closes #7052
2018-02-02 11:37:16 -08:00
LOLi bee7e4ff12 Fix 'zfs receive -o' when used with '-e|-d'
When used in conjunction with one of '-e' or '-d' zfs receive options
none of the properties requested to be set (-o) are actually applied:
this is caused by a wrong assumption made about the toplevel dataset
in zfs_receive_one().

Fix this by correctly detecting the toplevel dataset.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7088
2018-01-30 15:54:33 -08:00
Chunwei Chen 522db29275 zpool import -d to specify device path
When we know which devices have the pool we are looking for, sometime
it's better if we can directly pass those device paths to zpool import
instead of letting it to search through all unrelated stuff, which might
take a lot of time if you have hundreds of disks.

This patch allows option -d <dev_path> to zpool import. You can have
multiple pairs of -d <dev_path>, and zpool import will only search
through those devices. For example:

    zpool import -d /dev/sda -d /dev/sdb

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7077
2018-01-26 10:49:46 -08:00
Brian Behlendorf fed90353d7
Support -fsanitize=address with --enable-asan
When --enable-asan is provided to configure then build all user
space components with fsanitize=address.  For kernel support
use the Linux KASAN feature instead.

https://github.com/google/sanitizers/wiki/AddressSanitizer

When using gcc version 4.8 any test case which intentionally
generates a core dump will fail when using --enable-asan.
The default behavior is to disable core dumps and only newer
versions allow this behavior to be controled at run time with
the ASAN_OPTIONS environment variable.

Additionally, this patch includes some build system cleanup.

* Rules.am updated to set the minimum AM_CFLAGS, AM_CPPFLAGS,
  and AM_LDFLAGS.  Any additional flags should be added on a
  per-Makefile basic.  The --enable-debug and --enable-asan
  options apply to all user space binaries and libraries.

* Compiler checks consolidated in always-compiler-options.m4
  and renamed for consistency.

* -fstack-check compiler flag was removed, this functionality
  is provided by asan when configured with --enable-asan.

* Split DEBUG_CFLAGS in to DEBUG_CFLAGS, DEBUG_CPPFLAGS, and
  DEBUG_LDFLAGS.

* Moved default kernel build flags in to module/Makefile.in and
  split in to ZFS_MODULE_CFLAGS and ZFS_MODULE_CPPFLAGS.  These
  flags are set with the standard ccflags-y kbuild mechanism.

* -Wframe-larger-than checks applied only to binaries or
  libraries which include source files which are built in
  both user space and kernel space.  This restriction is
  relaxed for user space only utilities.

* -Wno-unused-but-set-variable applied only to libzfs and
  libzpool.  The remaining warnings are the result of an
  ASSERT using a variable when is always declared.

* -D_POSIX_PTHREAD_SEMANTICS and -D__EXTENSIONS__ dropped
  because they are Solaris specific and thus not needed.

* Ensure $GDB is defined as gdb by default in zloop.sh.

Signed-off-by: DHE <git@dehacked.net>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7027
2018-01-10 10:49:27 -08:00
LOLi 390d679acd Fix 'zpool add' handling of nested interior VDEVs
When replacing a faulted device which was previously handled by a spare
multiple levels of nested interior VDEVs will be present in the pool
configuration; the following example illustrates one of the possible
situations:

   NAME                          STATE     READ WRITE CKSUM
   testpool                      DEGRADED     0     0     0
     raidz1-0                    DEGRADED     0     0     0
       spare-0                   DEGRADED     0     0     0
         replacing-0             DEGRADED     0     0     0
           /var/tmp/fault-dev    UNAVAIL      0     0     0  cannot open
           /var/tmp/replace-dev  ONLINE       0     0     0
         /var/tmp/spare-dev1     ONLINE       0     0     0
       /var/tmp/safe-dev         ONLINE       0     0     0
   spares
     /var/tmp/spare-dev1         INUSE     currently in use

This is safe and allowed, but get_replication() needs to handle this
situation gracefully to let zpool add new devices to the pool.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6678 
Closes #6996
2017-12-28 10:15:32 -08:00
LOLi c30e34faa1 ZTS: Fix create-o_ashift test case
The function that fills the uberblock ring buffer on every device label
has been reworked to avoid occasional failures caused by a race
condition that prevents 'zpool sync' from writing some uberblock
sequentially: this happens when the pool sync ioctl dispatch code calls
txg_wait_synced() while we're already waiting for a TXG to sync.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6924 
Closes #6977
2017-12-19 10:49:33 -08:00
LOLi 4e9b156960 Various ZED fixes
* Teach ZED to handle spares usingi the configured ashift: if the zpool
   'ashift' property is set then ZED should use its value when kicking
   in a hotspare; with this change 512e disks can be used as spares
   for VDEVs that were created with ashift=9, even if ZFS natively
   detects them as 4K block devices.

 * Introduce an additional auto_spare test case which verifies that in
   the face of multiple device failures an appropiate number of spares
   are kicked in.

 * Fix zed_stop() in "libtest.shlib" which did not correctly wait the
   target pid.

 * Fix ZED crashing on startup caused by a race condition in libzfs
   when used in multi-threaded context.

 * Convert ZED over to using the tpool library which is already present
   in the Illumos FMA code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #2562 
Closes #6858
2017-12-08 16:58:41 -08:00
Brian Behlendorf 0c415a93d2
Disable create-o_ashift
Occasionally observed failure of create-o_ashift due to the test
case not being 100% reliable.  In order to prevent false positives
disable this test case until it can be made reliable.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #6924
Closes #6925
2017-12-06 10:13:54 -08:00
Brian Behlendorf ea39f75f64
Fix 'zpool create|add' replication level check
When the pool configuration contains a hole due to a previous device
removal ignore this top level vdev.  Failure to do so will result in
the current configuration being assessed to have a non-uniform
replication level and the expected warning will be disabled.

The zpool_add_010_pos test case was extended to cover this scenario.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6907 
Closes #6911
2017-12-04 11:50:35 -08:00
Tom Caputi d4a72f2386 Sequential scrub and resilvers
Currently, scrubs and resilvers can take an extremely
long time to complete. This is largely due to the fact
that zfs scans process pools in logical order, as
determined by each block's bookmark. This makes sense
from a simplicity perspective, but blocks in zfs are
often scattered randomly across disks, particularly
due to zfs's copy-on-write mechanisms.

This patch improves performance by splitting scrubs
and resilvers into a metadata scanning phase and an IO
issuing phase. The metadata scan reads through the
structure of the pool and gathers an in-memory queue
of I/Os, sorted by size and offset on disk. The issuing
phase will then issue the scrub I/Os as sequentially as
possible, greatly improving performance.

This patch also updates and cleans up some of the scan
code which has not been updated in several years.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Authored-by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Authored-by: Alek Pinchuk <apinchuk@datto.com>
Authored-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #3625 
Closes #6256
2017-11-15 17:27:01 -08:00
LOLi ee45fbd894 ZFS send fails to dump objects larger than 128PiB
When dumping objects larger than 128PiB it's possible for do_dump() to
miscalculate the FREE_RECORD offset due to an integer overflow
condition: this prevents the receiving end from correctly restoring
the dumped object.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6760
2017-10-26 16:58:38 -07:00
LOLi 88f9c9396b Allow 'zpool events' filtering by pool name
Additionally add four new tests:

 * zpool_events_clear: verify 'zpool events -c' functionality
 * zpool_events_cliargs: verify command line options and arguments
 * zpool_events_follow: verify 'zpool events -f'
 * zpool_events_poolname: verify events filtering by pool name

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #3285 
Closes #6762
2017-10-26 16:49:33 -07:00
Arkadiusz Bubała d3f2cd7e3b Added no_scrub_restart flag to zpool reopen
Added -n flag to zpool reopen that allows a running scrub
operation to continue if there is a device with Dirty Time Log.

By default if a component device has a DTL and zpool reopen
is executed all running scan operations will be restarted.

Added functional tests for `zpool reopen`

Tests covers following scenarios:
* `zpool reopen` without arguments,
* `zpool reopen` with pool name as argument,
* `zpool reopen` while scrubbing,
* `zpool reopen -n` while scrubbing,
* `zpool reopen -n` while resilvering,
* `zpool reopen` with bad arguments.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Closes #6076 
Closes #6746
2017-10-26 12:26:09 -07:00
Tom Caputi 4807c0badb Encryption patch follow-up
* PBKDF2 implementation changed to OpenSSL implementation.

* HKDF implementation moved to its own file and tests
  added to ensure correctness.

* Removed libzfs's now unnecessary dependency on libzpool
  and libicp.

* Ztest can now create and test encrypted datasets. This is
  currently disabled until issue #6526 is resolved, but
  otherwise functions as advertised.

* Several small bug fixes discovered after enabling ztest
  to run on encrypted datasets.

* Fixed coverity defects added by the encryption patch.

* Updated man pages for encrypted send / receive behavior.

* Fixed a bug where encrypted datasets could receive
  DRR_WRITE_EMBEDDED records.

* Minor code cleanups / consolidation.

Signed-off-by: Tom Caputi <tcaputi@datto.com>
2017-10-11 16:54:48 -04:00
Brian Behlendorf 70f02287f8 Fix ARC behavior on 32-bit systems
With the addition of the ABD changes consumption of the virtual
address space has been greatly reduced.  This exposed an issue on
CONFIG_HIGHMEM systems where free memory was being calculated
incorrectly.  Functionally this didn't cause any major problems
prior to ABD because a lack of available virtual address space
was used as an indicator of low memory.

This patch makes the following changes to address the issue and
in the process realigns the code further with OpenZFS.  There
are no substantive changes in behavior for 64-bit systems.

* Added CONFIG_HIGHMEM case to the arc_all_memory() and
  arc_free_memory() functions to only consider low memory pages
  on CONFIG_HIGHMEM systems.

* The arc_free_memory() function was updated to return bytes
  instead of pages to be consistent with the other helper
  functions.  In user space we make up some reasonable values
  since currently only testing is performed in this context.

* Adds three new values to the arcstats kstat to provide visibility
  in to the ARC's assessment of the memory situation:
  memory_all_bytes, memory_free_bytes, and memory_available_bytes.

* Added kmem_reap() call to arc_available_memory() for 32-bit
  builds to realign code with OpenZFS.

* Reduced size of test file in /async_destroy_001_pos.ksh to
  speed up test case.  Multiple txgs are still required.

* Move vdevs used by zpool_clear_001_pos and zpool_upgrade_002_pos
  to TEST_BASE_DIR location to speed up test cases.

Reviewed-by: David Quigley <david.quigley@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5352
Closes #6734
2017-10-10 15:19:19 -07:00
LOLi b59b22972d Add 'zfs diff' coverage to the ZFS Test Suite
This change adds four new tests to the ZTS:

 * zfs_diff_changes: verify type of changes diplayed (-, +, R and M)
 * zfs_diff_cliargs: verify command line options and arguments
 * zfs_diff_timestamp: verify 'zfs diff -t'
 * zfs_diff_types: verify type of objects (files, dirs, pipes...)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6686
2017-09-28 13:04:14 -07:00
LOLi 3fd3e56cfd Fix some ZFS Test Suite issues
* Add 'zfs bookmark' coverage (zfs_bookmark_cliargs)

 * Add OpenZFS 8166 coverage (zpool_scrub_offline_device)

 * Fix "busy" zfs_mount_remount failures

 * Fix bootfs_003_pos, bootfs_004_neg, zdb_005_pos local cleanup

 * Update usage of $KEEP variable, add get_all_pools() function

 * Enable history_008_pos and rsend_019_pos (non-32bit builders)

 * Enable zfs_copies_005_neg, update local cleanup

 * Fix zfs_send_007_pos (large_dnode + OpenZFS 8199)

 * Fix rollback_003_pos (use dataset name, not mountpoint, to unmount)

 * Update default_raidz_setup() to work properly with more than 3 disks

 * Use $TEST_BASE_DIR instead of hardcoded (/var)/tmp for file VDEVs

 * Update usage of /dev/random to /dev/urandom

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Issue #6086 
Closes #5658 
Closes #6143 
Closes #6421 
Closes #6627 
Closes #6632
2017-09-25 10:32:34 -07:00
LOLi 835db58592 Add -vnP support to 'zfs send' for bookmarks
This leverages the functionality introduced in cf7684b to expose
verbose, dry-run and parsable 'zfs send' options for bookmarks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #3666 
Closes #6601
2017-09-08 15:24:31 -07:00
Brian Behlendorf 5c214ae318 Fix volume WR_INDIRECT log replay
The portion of the zvol_replay_write() handler responsible for
replaying indirect log records for some reason never existed.
As a result indirect log records were not being correctly replayed.

This went largely unnoticed since the majority of zvol log records
were of the type WR_COPIED or WR_NEED_COPY prior to OpenZFS 7578.

This patch updates zvol_replay_write() to correctly handle these
log records and adds a new test case which verifies volume replay
to prevent any regression.  The existing test case which verified
replay on filesystem was renamed slog_replay_fs.ksh for clarity.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6603 
Closes #6615
2017-09-08 15:07:00 -07:00
LOLi 08de8c16f5 Fix remounting snapshots read-write
It's not enough to preserve/restore MS_RDONLY on the superblock flags
to avoid remounting a snapshot read-write: be explicit about our
intentions to the VFS layer so the readonly bit is updated correctly
in do_remount_sb().

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6510 
Closes #6515
2017-08-17 14:28:17 -07:00
Tom Caputi b525630342 Native Encryption for ZFS on Linux
This change incorporates three major pieces:

The first change is a keystore that manages wrapping
and encryption keys for encrypted datasets. These
commands mostly involve manipulating the new
DSL Crypto Key ZAP Objects that live in the MOS. Each
encrypted dataset has its own DSL Crypto Key that is
protected with a user's key. This level of indirection
allows users to change their keys without re-encrypting
their entire datasets. The change implements the new
subcommands "zfs load-key", "zfs unload-key" and
"zfs change-key" which allow the user to manage their
encryption keys and settings. In addition, several new
flags and properties have been added to allow dataset
creation and to make mounting and unmounting more
convenient.

The second piece of this patch provides the ability to
encrypt, decyrpt, and authenticate protected datasets.
Each object set maintains a Merkel tree of Message
Authentication Codes that protect the lower layers,
similarly to how checksums are maintained. This part
impacts the zio layer, which handles the actual
encryption and generation of MACs, as well as the ARC
and DMU, which need to be able to handle encrypted
buffers and protected data.

The last addition is the ability to do raw, encrypted
sends and receives. The idea here is to send raw
encrypted and compressed data and receive it exactly
as is on a backup system. This means that the dataset
on the receiving system is protected using the same
user key that is in use on the sending side. By doing
so, datasets can be efficiently backed up to an
untrusted system without fear of data being
compromised.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #494 
Closes #5769
2017-08-14 10:36:48 -07:00
Giuseppe Di Natale c1dd2f783a Disable zfs_send_007_pos
Test case zfs_send_007_pos regularly is killed
by test-runner during zfs-tests on buildbot. Disable
it for now until further investigation can be done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6422
2017-07-28 22:37:27 -07:00
LOLi 650258d7c7 zfs promote|rename .../%recv should be an error
If we are in the middle of an incremental 'zfs receive', the child
.../%recv will exist. If we run 'zfs promote' .../%recv, it will "work",
but then zfs gets confused about the status of the new dataset.
Attempting to do this promote should be an error.

Similarly renaming .../%recv datasets should not be allowed.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #4843 
Closes #6339
2017-07-28 14:12:34 -07:00
Giuseppe Di Natale 39554216df zfs_mount_001_neg: use log_must_busy in cleanup
Use log_must_busy when destroying the snapshot
and dataset during cleanup in zfs_mount_001_neg.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6382
2017-07-24 11:10:25 -07:00
Giuseppe Di Natale 0c656a964d Disable nbmand tests on kernels w/o support
This change allows mountpoint_003_pos and send-c_props
to run on Linux kernels that do not support mandatory
locking. Linux kernel versions greater than or equal to
4.4 no longer support mandatory locking and the test
suite will now account for that.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6346 
Closes #6347 
Closes #6362
2017-07-24 11:03:50 -07:00
Olaf Faaland 379ca9cf2b Multi-modifier protection (MMP)
Add multihost=on|off pool property to control MMP.  When enabled
a new thread writes uberblocks to the last slot in each label, at a
set frequency, to indicate to other hosts the pool is actively imported.
These uberblocks are the last synced uberblock with an updated
timestamp.  Property defaults to off.

During tryimport, find the "best" uberblock (newest txg and timestamp)
repeatedly, checking for change in the found uberblock.  Include the
results of the activity test in the config returned by tryimport.
These results are reported to user in "zpool import".

Allow the user to control the period between MMP writes, and the
duration of the activity test on import, via a new module parameter
zfs_multihost_interval.  The period is specified in milliseconds.  The
activity test duration is calculated from this value, and from the
mmp_delay in the "best" uberblock found initially.

Add a kstat interface to export statistics about Multiple Modifier
Protection (MMP) updates. Include the last synced txg number, the
timestamp, the delay since the last MMP update, the VDEV GUID, the VDEV
label that received the last MMP update, and the VDEV path.  Abbreviated
output below.

$ cat /proc/spl/kstat/zfs/mypool/multihost
31 0 0x01 10 880 105092382393521 105144180101111
txg   timestamp  mmp_delay   vdev_guid   vdev_label vdev_path
20468    261337  250274925   68396651780       3    /dev/sda
20468    261339  252023374   6267402363293     1    /dev/sdc
20468    261340  252000858   6698080955233     1    /dev/sdx
20468    261341  251980635   783892869810      2    /dev/sdy
20468    261342  253385953   8923255792467     3    /dev/sdd
20468    261344  253336622   042125143176      0    /dev/sdab
20468    261345  253310522   1200778101278     2    /dev/sde
20468    261346  253286429   0950576198362     2    /dev/sdt
20468    261347  253261545   96209817917       3    /dev/sds
20468    261349  253238188   8555725937673     3    /dev/sdb

Add a new tunable zfs_multihost_history to specify the number of MMP
updates to store history for. By default it is set to zero meaning that
no MMP statistics are stored.

When using ztest to generate activity, for automated tests of the MMP
function, some test functions interfere with the test.  For example, the
pool is exported to run zdb and then imported again.  Add a new ztest
function, "-M", to alter ztest behavior to prevent this.

Add new tests to verify the new functionality.  Tests provided by
Giuseppe Di Natale.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Ned Bass <bass6@llnl.gov>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #745
Closes #6279
2017-07-13 13:54:00 -04:00
LOLi cf8738d853 Add port of FreeBSD 'volmode' property
The volmode property may be set to control the visibility of ZVOL
block devices.

This allow switching ZVOL between three modes:
   full - existing fully functional behaviour (default)
   dev  - hide partitions on ZVOL block devices
   none - not exposing volumes outside ZFS

Additionally the new zvol_volmode module parameter can be used to
control the default behaviour.

This functionality can be used, for instance, on "backup" pools to
avoid cluttering /dev with unneeded zd* devices.

Original-patch-by: mav <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>

FreeBSD-commit: https://github.com/freebsd/freebsd/commit/dd28e6bb
Closes #1796 
Closes #3438 
Closes #6233
2017-07-12 13:05:37 -07:00
LOLi 92e43c1718 Fix 'zpool clear' on readonly pools
Illumos 4080 inadvertently allows 'zpool clear' on readonly pools: fix
this by reintroducing a check (POOL_CHECK_READONLY) in zfs_ioc_clear
registration code.

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6306
2017-07-07 10:39:53 -07:00
Alek P 0ea05c64f8 Implemented zpool scrub pause/resume
Currently, there is no way to pause a scrub. Pausing may
be useful when the pool is busy with other I/O to preserve
bandwidth.

This patch adds the ability to pause and resume scrubbing.
This is achieved by maintaining a persistent on-disk scrub state.
While the state is 'paused' we do not scrub any more blocks.
We do however perform regular scan housekeeping such as
freeing async destroyed and deadlist blocks while paused.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Reviewed-by: Serapheim Dimitropoulos <serapheimd@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Closes #6167
2017-07-06 22:16:13 -07:00
bunder2015 47770d30f2 Fix zpool_add_005_pos
Under Linux the existence of a block device in /etc/fstab is
not sufficient to prevent the use of the force flag.  Without
the force flag a warning will be printed that the device has
a filesystem of a given type.  Providing the force option
will overwrite that filesystem as long as it is not actively
mounted.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Tested-by: bunder2015 <omfgbunder@gmail.com>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #6267 
Closes #6272
2017-06-27 10:06:07 -07:00
bunder2015 7517376f93 Fix arithmetic error message in zfs_clone_010_pos
zfs_clone_010_pos.ksh: line 234: ZFS_MAXPROPLEN: arithmetic syntax error

Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #6268
2017-06-26 14:48:54 -07:00
Giuseppe Di Natale 1fbfcf1159 Fix zpool_import_all_001_pos
Cleanup zpool_import_all_001_pos to no longer use devices.
The test is meant to test zpool import -a and by no longer
requiring devices, a number of dependencies are no longer
necessary.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6198
2017-06-13 09:05:55 -07:00
Giuseppe Di Natale 829aaf2801 Skip tests that are slow on 32-bit builders
zpool_create_024_pos, zvol_misc_002_pos, write_dirs_002_pos are slow
on the buildbot 32-bit builder. Skip the test cases for now on 32-bit
builders.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6195
2017-06-06 19:04:01 -07:00
Håkan Johansson 6eb6073a04 Allow add of raidz and mirror with same redundancy
Allow new members to be added to a pool mixing raidz and mirror vdevs
without giving -f, as long as they have matching redundancy.  This case
was missed in #5915, which only handled zpool create.

Add zfstest zpool_add_010_pos.ksh, with test of zpool create
followed by zpool add of mixed raidz and mirror vdevs.

Add some more mixed raidz and mirror cases to zpool_create_006_pos.ksh.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Haakan Johansson <f96hajo@chalmers.se>
Issue #5915 
Closes #6181
2017-06-05 13:53:09 -07:00
Giuseppe Di Natale 099700d9df zpool iostat/status -c improvements
Users can now provide their own scripts to be run
with 'zpool iostat/status -c'. User scripts should be
placed in ~/.zpool.d to be included in zpool's
default search path.

Provide a script which can be used with
'zpool iostat|status -c' that will return the type of
device (hdd, sdd, file).

Provide a script to get various values from smartctl
when using 'zpool iostat/status -c'.

Allow users to define the ZPOOL_SCRIPTS_PATH
environment variable which can be used to override
the default 'zpool iostat/status -c' search path.

Allow the ZPOOL_SCRIPTS_ENABLED environment
variable to enable or disable 'zpool status/iostat -c'
functionality.

Use the new smart script to provide the serial command.

Install /etc/sudoers.d/zfs file which contains the sudoer
rule for smartctl as a sample.

Allow 'zpool iostat/status -c' tests to run in tree.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6121 
Closes #6153
2017-06-05 10:52:15 -07:00
Yuri Pankov bda77af11c OpenZFS 8077 - zfs-tests suite fails zpool_get_002_pos
Authored by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <jwk404@gmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: bunder2015 <omfgbunder@gmail.com>

Porting Notes:
* Also corrected a quoting mistake found in our copy

OpenZFS-issue: https://www.illumos.org/issues/8077
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/481467d
Closes #6163
2017-05-25 17:52:27 -07:00
Yuri Pankov ea8c83fdda OpenZFS 8071 - zfs-tests: 7290 missed some cases
Authored by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <jwk404@gmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: bunder2015 <omfgbunder@gmail.com>

OpenZFS-issue: https://www.illumos.org/issues/8071
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/e84991e
Closes #6161
2017-05-25 17:22:55 -07:00
Yuri Pankov 7bc181e6db OpenZFS 8072 - zfs-tests: several test cases incorrectly spell TESTPOOL
Authored by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <jwk404@gmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/8072
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/56e4733
Closes #6137
2017-05-25 10:33:12 -07:00
Brian Behlendorf 3f03fc8df3 Add zpool events tests
* events_001_pos - Verify the expected events are generated when
  invoking the various zpool sub-commands.  These events must
  appear in `zpool event` and be consumed by the ZED.

* events_002_pos - Verify the ZED consumes events which were
  generated while it wasn't running when it is started.
  Additionally, verify that events are only processed once.

As part of this change the default.cfg used by the test suite
was changed to a default.cfg.in file.  This was needed so the
install location of all zed scripts, not only the enabled ones,
could be reliably determined.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6128
2017-05-22 12:34:42 -04:00
Brian Behlendorf 95401cb6f7 Enable remaining tests
Enable most of the remaining test cases which were previously
disabled.  The required fixes are as follows:

* cache_001_pos - No changes required.

* cache_010_neg - Updated to use losetup under Linux.  Loopback
  cache devices are allowed, ZVOLs as cache devices are not.
  Disabled until all the builders pass reliably.

* cachefile_001_pos, cachefile_002_pos, cachefile_003_pos,
  cachefile_004_pos - Set set_device_dir path in cachefile.cfg,
  updated CPATH1 and CPATH2 to reference unique files.

* zfs_clone_005_pos - Wait for udev to create volumes.

* zfs_mount_007_pos - Updated mount options to expected Linux names.

* zfs_mount_009_neg, zfs_mount_all_001_pos - No changes required.

* zfs_unmount_005_pos, zfs_unmount_009_pos, zfs_unmount_all_001_pos -
  Updated to expect -f to not unmount busy mount points under Linux.

* rsend_019_pos - Observed to occasionally take a long time on both
  32-bit systems and the kmemleak builder.

* zfs_written_property_001_pos - Switched sync(1) to sync_pool.

* devices_001_pos, devices_002_neg - Updated create_dev_file() helper
  for Linux.

* exec_002_neg.ksh - Fixed mmap_exec.c to preserve errno.  Updated
  test case to expect EPERM from Linux as described by mmap(2).

* grow_pool_001_pos - Adding missing setup.ksh and cleanup.ksh
  scripts from OpenZFS.

* grow_replicas_001_pos.ksh - Added missing $SLICE_* variables.

* history_004_pos, history_006_neg, history_008_pos - Fixed by
  previous commits and were not enabled.  No changes required.

* zfs_allow_010_pos - Added missing spaces after assorted zfs
  commands in delegate_common.kshlib.

* inuse_* - Illumos dump device tests skipped.  Remaining test
  cases updated to correctly create required partitions.

* large_files_001_pos - Fixed largest_file.c to accept EINVAL
  as well as EFBIG as described in write(2).

* link_count_001 - Added nproc to required commands.

* umountall_001 - Updated to use umount -a.

* online_offline_001_* - Pull in OpenZFS change to file_trunc.c
  to make the '-c 0' option run the test in a loop.  Included
  online_offline.cfg file in all test cases.

* rename_dirs_001_pos - Updated to use the rename_dir test binary,
  pkill restricted to exact matches and total runtime reduced.

* slog_013_neg, write_dirs_002_pos - No changes required.

* slog_013_pos.ksh - Updated to use losetup under Linux.

* slog_014_pos.ksh - ZED will not be running, manually degrade
  the damaged vdev as expected.

* nopwrite_varying_compression, nopwrite_volume - Forced pool
  sync with sync_pool to ensure up to date property values.

* Fixed typos in ZED log messages.  Refactored zed_* helper
  functions to resolve all-syslog exit=1 errors in zedlog.

* zfs_copies_005_neg, zfs_get_004_pos, zpool_add_004_pos,
  zpool_destroy_001_pos, largest_pool_001_pos, clone_001_pos.ksh,
  clone_001_pos, - Skip until layering pools on zvols is solid.

* largest_pool_001_pos - Limited to 7eb pool, maximum
  supported size in 8eb-1 on Linux.

* zpool_expand_001_pos, zpool_expand_003_neg - Requires
  additional support from the ZED, updated skip reason.

* zfs_rollback_001_pos, zfs_rollback_002_pos - Properly cleanup
  busy mount points under Linux between test loops.

* privilege_001_pos, privilege_003_pos, rollback_003_pos,
  threadsappend_001_pos - Skip with log_unsupported.

* snapshot_016_pos - No changes required.

* snapshot_008_pos - Increased LIMIT from 512K to 2M and added
  sync_pool to avoid false positives.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6128
2017-05-22 12:34:32 -04:00
Alek P bec1067d54 Implemented zpool sync command
This addition will enable us to sync an open TXG to the main pool
on demand. The functionality is similar to 'sync(2)' but 'zpool sync'
will return when data has hit the main storage instead of potentially
just the ZIL as is the case with the 'sync(2)' cmd.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Closes #6122
2017-05-19 12:33:11 -07:00
Tony Hutter 4a283c7f77 Force fault a vdev with 'zpool offline -f'
This patch adds a '-f' option to 'zpool offline' to fault a vdev
instead of bringing it offline.  Unlike the OFFLINE state, the
FAULTED state will trigger the FMA code, allowing for things like
autoreplace and triggering the slot fault LED.  The -f faults
persist across imports, unless they were set with the temporary
(-t) flag.  Both persistent and temporary faults can be cleared
with zpool clear.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #6094
2017-05-19 12:30:16 -07:00
Brian Behlendorf 8c54ddd33a Enable additional test cases
Enable additional test cases, in most cases this required a few
minor modifications to the test scripts.  In a few cases a real
bug was uncovered and fixed.  And in a handful of cases where pools
are layered on pools the test case will be skipped until this is
supported.  Details below for each test case.

* zpool_add_004_pos - Skip test on Linux until adding zvols to pools
  is fully supported and deadlock free.

* zpool_add_005_pos.ksh - Skip dumpadm portion of the test which isn't
  relevant for Linux.  The find_vfstab_dev, find_mnttab_dev, and
  save_dump_dev functions were updated accordingly for Linux.  Add
  O_EXCL to the in-use check to prevent the -f (force) option from
  working for mounted filesystems and improve the resulting error.

* zpool_add_006_pos - Update test case such that it doesn't depend
  on nested pools.  Switch to truncate from mkfile to reduce space
  requirements and speed up the test case.

* zpool_clear_001_pos - Speed up test case by filling filesystem to
  25% capacity.

* zpool_create_002_pos, zpool_create_004_pos - Use sparse files for
  file vdevs in order to avoid increasing the partition size.

* zpool_create_006_pos - 6ba1ce9 allows raidz+mirror configs with
  similar redundancy.  Updating the valid_args and forced_args cases.

* zpool_create_008_pos - Disable overlapping partition portion.

* zpool_create_011_neg - Fix to correctly create the extra partition.
  Modified zpool_vdev.c to use fstat64_blk() wrapper which includes
  the st_size even for block devices.

* zpool_create_012_neg - Updated to properly find swap devices.

* zpool_create_014_neg, zpool_create_015_neg - Updated to use
  swap_setup() and swap_cleanup() wrappers which do the right thing
  on Linux and Illumos.  Removed '-n' option which succeeds under
  Linux due to differences in the in-use checks.

* zpool_create_016_pos.ksh - Skipped test case isn't useful.

* zpool_create_020_pos - Added missing / to cleanup() function.
  Remove cache file prior to test to ensure a clean environment
  and avoid false positives.

* zpool_destroy_001_pos - Removed test case which creates a pool on
  a zvol.  This is more likely to deadlock under Linux and has never
  been completely supported on any platform.

* zpool_destroy_002_pos - 'zpool destroy -f' is unsupported on Linux.
  Mount point must not be busy in order to unmount them.

* zfs_destroy_001_pos - Handle EBUSY error which can occur with
  volumes when racing with udev.

* zpool_expand_001_pos, zpool_expand_003_neg - Skip test on Linux
  until adding zvols to pools is fully supported and deadlock free.
  The test could be modified to use loop-back devices but it would
  be preferable to use the test case as is for improved coverage.

* zpool_export_004_pos - Updated test case to such that it doesn't
  depend on nested pools.  Normal file vdev under /var/tmp are fine.

* zpool_import_all_001_pos - Updated to skip partition 1, which is
  known as slice 2, on Illumos.  This prevents overwriting the
  default TESTPOOL which was causing the failure.

* zpool_import_002_pos, zpool_import_012_pos - No changes needed.

* zpool_remove_003_pos - No changes needed

* zpool_upgrade_002_pos, zpool_upgrade_004_pos - Root cause addressed
  by upstream OpenZFS commit 3b7f360.

* zpool_upgrade_007_pos - Disabled in test case due to known failure.
  Opened issue https://github.com/zfsonlinux/zfs/issues/6112

* zvol_misc_002_pos - Updated to to use ext2.

* zvol_misc_001_neg, zvol_misc_003_neg, zvol_misc_004_pos,
  zvol_misc_005_neg, zvol_misc_006_pos - Moved to skip list, these
  test case could be updated to use Linux's crash dump facility.

* zvol_swap_* - Updated to use swap_setup/swap_cleanup helpers.
  File creation switched from /tmp to /var/tmp.  Enabled minimal
  useful tests for Linux, skip test cases which aren't applicable.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #3484
Issue #5634
Issue #2437
Issue #5202
Issue #4034
Closes #6095
2017-05-11 14:27:57 -07:00
LOLi a3eeab2de6 Add property overriding (-o|-x) to 'zfs receive'
This allows users to specify "-o property=value" to override and
"-x property" to exclude properties when receiving a zfs send stream.
Both native and user properties can be specified.

This is useful when using zfs send/receive for periodic
backup/replication because it lets users change properties such as
canmount, mountpoint, or compression without modifying the source.

References:
   https://www.illumos.org/issues/2745
   https://www.illumos.org/issues/3753

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #1350 
Closes #5349
2017-05-09 16:21:09 -07:00
Brian Behlendorf 35b7842f68 Enable all zfs_destroy test cases
* zfs_destroy_001_pos - Unable to reproduce the failures locally.
  Re-enabled to determine observed buildbot failure rate.

* zfs_destroy_005_neg - Updated for expected Linux behavior.
  Busy mount points, even snapshots, are expected to fail.

* zfs_destroy_010_pos - Resolved transient EBUSY with retry.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #5635 
Issue #5893 
Closes #6091
2017-05-03 18:27:59 -07:00
LOLi dddef7d600 More ashift improvements
This commit allow higher ashift values (up to 16) in 'zpool create'

The ashift value was previously limited to 13 (8K block) in b41c990
because the limited number of uberblocks we could fit in the
statically sized (128K) vdev label ring buffer could prevent the
ability the safely roll back a pool to recover it.

Since b02fe35 the largest uberblock size we support is 8K: this
allow us to store a minimum number of 16 uberblocks in the vdev
label, even with higher ashift values.

Additionally change 'ashift' pool property behaviour: if set it will
be used as the default hint value in subsequent vdev operations
('zpool add', 'attach' and 'replace'). A custom ashift value can still
be specified from the command line, if desired.

Finally, fix a bug in add-o_ashift.ksh caused by a missing variable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #2024 
Closes #4205 
Closes #4740 
Closes #5763
2017-05-03 09:31:05 -07:00
Olaf Faaland 9d3f7b8791 Write label 2,3 uberblocks when vdev expands
When vdev_psize increases, the location of labels 2 and 3 changes
because their location is relative to the end of the device.

The configs for labels 2 and 3 are written during the next spa_sync()
because the vdev is added to the dirty config list.  However, the
uberblock rings are not re-written in their new location, leaving the
device vulnerable to the beginning of the device being overwritten or
damaged.

This patch copies the uberblock ring from label 0 to labels 2 and 3,
in their new locations, at the next sync after vdev_psize increases.

Also, add a test zpool_expand_004_pos.ksh to confirm the uberblocks
are copied.

Reviewed-by: BearBabyLiu <liu.huang@zte.com.cn>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #5108
2017-05-02 13:55:24 -07:00
LOLi e7fbeb606a Add zfs_nicebytes() to print human-readable sizes
* Add zfs_nicebytes() to print human-readable sizes

Some 'zfs', 'zpool' and 'zdb' output strings can be confusing to the
user when no units are specified. This add a new zfs_nicenum_format
"ZFS_NICENUM_BYTES" used to print bytes in their human-readable form.

Additionally, update some test cases to use machine-parsable 'zfs get'.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #2414 
Closes #3185 
Closes #3594 
Closes #6032
2017-05-02 13:43:53 -07:00
Tony Hutter d6418de057 Prebaked scripts for zpool status/iostat -c
This patch updates the "zpool status/iostat -c" commands to only run
"pre-baked" scripts from the /etc/zfs/zpool.d directory (or wherever
you install to).  The scripts can only be run from -c as an unprivileged
user (unless the ZPOOL_SCRIPTS_AS_ROOT environment var is
set by root).  This was done to encourage scripts to be written is such
a way that normal users can use them, and to be cautious.  If your
script needs to run a privileged command, consider adding the
appropriate line in /etc/sudoers.  See zpool(8) for an example of how
to do this.

The patch also allows the scripts to output custom column names.  If
the script outputs a line like:

name=value

then "name" is used for the column name, and "value" is its value.
Multiple columns can be specified by outputting multiple lines.  Column
names and values can have spaces.  If the value is empty, a dash (-) is
printed instead.

After all the "name=value" lines are read (if any), zpool will take the
next the next line of output (if any) and print it without a column
header.  After that, no more lines will be processed. This can be
useful for printing errors.

Lastly, this patch also disables the -c option with the latency and
request size histograms, since it produced awkward output and made the
code harder to maintain.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #5852
2017-04-21 09:27:04 -07:00
George Melikov 3e67c38c34 zfs_receive_010_pos: change dd arguments
The  `dd` command as written will not create a hole in the file.
Additionally, the `stride` argument isn't understood by `dd` so
it's replaced with `seek` which isn't equivilant but will result in
a single whole which is sufficient for the test case.  Finally,
`conv=notrunc` is added to avoid truncating the file.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #6023
2017-04-20 12:05:39 -07:00
Richard Yao 066753103f OpenZFS 6392 - zdb: introduce -V for verbatim import
Authored by: Richard Yao <ryao@gentoo.org>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

Porting Notes:
  This was already implemented in ZFS on Linux. This patch
  is to resolved the deltas present in our version.

OpenZFS-issue: https://www.illumos.org/issues/6392
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/9bb97de
Closes #6020
2017-04-18 09:50:15 -07:00
Giuseppe Di Natale f995e5ec43 Clean up correctly in zpool_scrub_004_pos
Ensure `zinject -c` all gets called whenever
zpool_scrub_004_pos exits.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Issue #5444
Closes #6021
2017-04-18 09:45:45 -07:00
Brian Behlendorf dd49132a1d OpenZFS 7535 - need test for resumed send of top most filesystem
Authored by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting Notes:
- zfs_share_001_pos.ksh - Older versions of exportfs will match
  multiple exports that share a common prefix.  Reorder the 'fs'
  list so unshares occur from most to least unique.
- zfs_share_005_pos.ksh - Enabled and updated for Linux.

OpenZFS-issue: https://www.illumos.org/issues/7535
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ac89d1e
Closes #5979
2017-04-12 08:47:42 -07:00
Yuri Pankov dbb38f6605 OpenZFS 6865 - want zfs-tests cases for zpool labelclear command
Authored by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting Notes:
- Updated 'zpool labelclear' and 'zdb -l' such that they attempt
  to find a vdev given solely its short name.  This behavior is
  consistent with the upstream OpenZFS code and the test cases
  depend on it.  The actual implementation differs slightly due
  to device naming conventions on Linux.
- auto_online_001_pos, auto_replace_001_pos and add-o_ashift
  test cases updated to expect failure when no label exists.
- read_efi_label() and zpool_label_disk_check() are read-only
  operations and should use O_RDONLY at open time to enforce this.
- zpool_label_disk() and zpool_relabel_disk() write the partition
  information using O_DIRECT an fsync() and page cache invalidation
  to ensure a consistent view of the device.
- dump_label() in zdb should invalidate the page cache in order
  to get the authoritative label from disk.

OpenZFS-issue: https://www.illumos.org/issues/6865
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c95076c
Closes #5981
2017-04-11 09:54:39 -07:00
Giuseppe Di Natale 42db43e982 OpenZFS 2932 - support crash dumps to raidz, etc. pools
Authored by: Bill Pijewski <wdp@joyent.com>
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@nexenta.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/2932
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/810e43b
Closes #5984
Closes #5216
2017-04-10 10:24:17 -07:00
George Melikov a8d6ae1e16 zfstest - replace dircmp with diff
`dircmp` doesn't exist in Linux while `diff` is already used
by zfstests on all platforms.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #5996
2017-04-09 16:17:55 -07:00
George Melikov 316da92825 zfs_receive_010_pos.ksh local => typeset change
Ksh uses `typeset`, `local` is a Bash analog.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #5995
2017-04-09 16:01:54 -07:00
Brian Behlendorf 10f251191f OpenZFS 7629 - Fix for 7290 neglected to remove some escape sequences
Authored by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting Notes:
- Multiple changes in this commit were applied in c1d9abf.

OpenZFS-issue: https://www.illumos.org/issues/7629
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/f5fb56d
Closes #5980
2017-04-07 09:30:05 -07:00
John Wren Kennedy c1d9abf905 OpenZFS 7290 - ZFS test suite needs to control what utilities it can run
Authored by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>

Porting Notes:
- Utilities which aren't available under Linux have been removed.
- Because of sudo's default secure path behavior PATH must be
  explicitly reset at the top of libtest.shlib.  This avoids the
  need for all users to customize secure path on their system.
- Updated ZoL infrastructure to manage constrained path
- Updated all test cases
- Check permissions for usergroup tests
- When testing in-tree create links under bin/
- Update fault cleanup such that missing files during
  cleanup aren't fatal.
- Configure su environment with constrained path

OpenZFS-issue: https://www.illumos.org/issues/7290
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/1d32ba6
Closes #5903
2017-04-06 09:25:36 -07:00
George Melikov e55ebf6afd zfs_get_005_neg.ksh fix typos
`test_options_bookmark` function must have an `s` at the end.

Reviewed-by: Marcel Telka <marcel@telka.sk>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #5957
2017-04-03 11:06:04 -07:00
Brian Behlendorf 8be64caabc Fix add-o_ashift.ksh permissions
Test cases must be executable or they will be skipped.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5947
2017-03-31 09:25:23 -07:00
LOLi ff61d1a495 Check ashift validity in 'zpool add'
df83110 added the ability to specify a custom "ashift" value from the command
line in 'zpool add' and 'zpool attach'. This commit adds additional checks to
the provided ashift to prevent invalid values from being used, which could
result in disastrous consequences for the whole pool.

Additionally provide ASHIFT_MAX and ASHIFT_MIN definitions in spa.h.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #5878
2017-03-28 17:21:11 -07:00
Olaf Faaland 3c9e0d673e Dump unique configurations and Uberblocks in zdb -lu
For zdb -l, detect when the configuration nvlist in some label l (l>0)
is the same as a configuration already dumped.  If so, do not dump it.

Make a similar check when dumping Uberblocks for zdb -lu.  Check whether
a label already dumped contains an identical Uberblock.  If so, do not
dump the Uberblock.

When dumping a configuration or Uberblock, state which labels it is
found in (0-3), for example: labels = 1 2 3

Detecting redundant uberblocks or configurations is accomplished by
calculating checksums of the uberblocks and the packed nvlists
containing the configuration.

If there is nothing unique to be dumped for a label (ie the
configuration and uberblocks have checksums matching those already
dumped) print nothing for that label.

With additional l's or u's, increase verbosity as follows:

-l      Dump each unique configuration only once.
        Indicate which labels it appears in.
-ll     In addition, dump label space usage stats.
-lll    Dump every configuration, unique or not.

-u      Dump each unique, valid, uberblock only once.
        Indicate which labels it appears in.
-uu     In addition, state which slots are invalid.
-uuu    Dump every uberblock, unique or not.
-uuuu   Dump the uberblock blockpointer (used to be -uuu)

Make exit values conform to the manual page.  Failing to unpack a
configuration nvlist is considered an error, as well as failing to open
or read from the device.

Add three tests, zdb_00{3,4,5}_pos to verify the above functionality.

An example of the output:
	------------------------------------
	LABEL 0
	------------------------------------
	    version: 5000
	    name: 'pool'
	    state: 1
	    txg: 880
	    < ... redacted ... >
	    features_for_read:
		com.delphix:hole_birth
		com.delphix:embedded_data
	    labels = 0
	    Uberblock[0]
		magic = 0000000000bab10c
		version = 5000
		txg = 0
		guid_sum = 3038694082047428541
		timestamp = 1487715500 UTC = Tue Feb 21 14:18:20 2017
		labels = 0 1 2 3
	    Uberblock[4]
		magic = 0000000000bab10c
		version = 5000
		txg = 772
		guid_sum = 9045970794941528051
		timestamp = 1487727291 UTC = Tue Feb 21 17:34:51 2017
		labels = 0
	    < ... redacted ... >
	------------------------------------
	LABEL 1
	------------------------------------
	    version: 5000
	    name: 'pool'
	    state: 1
	    txg: 14
	    < ... redacted ... >
		com.delphix:embedded_data
	    labels = 1 2 3
	    Uberblock[4]
		magic = 0000000000bab10c
		version = 5000
		txg = 4
		guid_sum = 7793930272573252584
		timestamp = 1487727521 UTC = Tue Feb 21 17:38:41 2017
		labels = 1 2 3
	    < ... redacted ... >

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #5738
2017-03-06 16:01:45 -08:00
Prakash Surya cbeeb4afb3 OpenZFS 7761 - bootfs_005_neg's pool destruction must handle EBUSY
Authored by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>

OpenZFS-issue: https://www.illumos.org/issues/7761
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ad309d3
Closes #5818
2017-02-24 11:06:14 -08:00
Prakash Surya ce456d483c OpenZFS 7762 - avoid division by zero in property_alias_001_pos
Authored by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/7762
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ebaf15cb
Closes #5798
2017-02-15 17:19:54 -08:00
John Wren Kennedy 2171eb7112 OpenZFS 7260 - disable libdiskmgmt in zfstest unless it's required
Authored by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>

OpenZFS-issue: https://www.illumos.org/issues/7260
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/447b1e1
Closes #5794

Porting notes:
- The library libdiskmgmt is specific to illumos so these changes
  currently have no impact under Linux.  This mechanism could be
  potentially leveraged in the future.
2017-02-15 11:09:33 -08:00
Olaf Faaland a454868b0c Use file-based pools for zpool_expand test 002 and enable it
Use -pH flags in get_pool_prop so that numeric properties such as size
can be compared.  The zpool_expand test suite is currently the only one
which uses get_pool_prop for a numeric property.

Add TEMPFILE and TEMPFILE{0,1,2} to default.cfg for tests that must
build pools on top of files, such as this one where expansion is
necessary but the entries in DISKS may not point to entities that can be
expanded.

Base the pool used for testing on file-type VDEVs instead of using zvols
within an underlying pool, to avoid issues that come up when pools are
backed by other pools.

Remove shell variables EX_1GB and EX_2GB used to recognize correct expansion,
and instead calculate the appropriate values based on the variables used to
control file or volume size, org_size and exp_size.  This change is also
made in test 001 although that test is not enabled because it depends on
FMA.

Finally, enable zpool_expand_002_pos.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #5757
2017-02-13 15:30:22 -08:00
Akash Ayare 6dd95a910a OpenZFS 7027 - zfs_written_property_001_pos makes unreasonable assumptions about metadata space usage
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Gordon Ross <gordon.ross@nexenta.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/7027
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/48cb8b9
Issue #2441
Closes #5778
2017-02-13 12:21:13 -08:00
Marcel Telka 37bb2fc7dc OpenZFS 7398 - zfs test zfs_get_005_neg does not work as expected
Authored by: Marcel Telka <marcel@telka.sk>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>

OpenZFS-issue: https://www.illumos.org/issues/7398
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/4220fdc
Closes #5782
2017-02-13 11:32:56 -08:00
Marcel Telka 94cc33f017 OpenZFS 7103 - failed test cli_root/zfs_snapshot/zfs_snapshot_009_pos
Authored by: Marcel Telka <marcel@telka.sk>
Reviewed by: - Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: - Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: - John Kennedy <john.kennedy@delphix.com>
Approved by: - Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>

OpenZFS-issue: https://www.illumos.org/issues/7103
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/3bfdbb4
Closes #5780
2017-02-13 11:28:54 -08:00
Matthew Ahrens a115cf35f8 OpenZFS 7162 - Intermittent failures from ro_props_001_pos
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/7162
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/9ec0cbeb
Closes #5511
Closes #5779
2017-02-13 11:26:45 -08:00
Matthew Ahrens d7958b4cda OpenZFS 7104 - increase indirect block size
Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>

OpenZFS-issue: https://www.illumos.org/issues/7104
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/4b5c8e9
Closes #5679
2017-02-09 10:27:02 -08:00
George Melikov d834b9ce5b Add `wait_freeing` helper function to ZTS
Sometimes the ZTS checks freed space just after `zfs destroy snapshot` and
gets an unexpected value because of space being freed asynchronously.
For cases like this add a `wait_freeing` function which blocks until the
pools `freeing` property drops to zero.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #5740
2017-02-08 15:27:37 -08:00
George Melikov 2e0e443ac4 OpenZFS 7247 - zfs receive of deduplicated stream fails
Authored by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>

OpenZFS-issue: https://www.illumos.org/issues/7247
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/2ad25b4
Closes #5689 

Porting notes:
- tests/zfs-tests/tests/functional/cli_root/zfs_receive/zfs_receive_013_pos.ksh
  renamed as zfs_receive_015_pos.ksh, zfs_receive_013_pos.ksh is now
  used for OpenZFS test.
- libzfs_sendrecv.c: SMALLEST_POSSIBLE_MAX_DDT_MB is always used
  for all 32-bit builds.
2017-02-04 09:10:24 -08:00