Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/7162
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/9ec0cbebCloses#5511Closes#5779
Authored by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7496
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/5dc1fd7Closes#5781
Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7104
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/4b5c8e9Closes#5679
The ziltest.sh script is a test case designed to verify the correct
functioning of the ZIL. For historical reasons it was never added
to the test suite and was always run independantly.
This change rectifies that. The existing ziltest.sh has been
translated in to `slog_015_pos.ksh` and added to the existing
slog test cases.
Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5758
Sometimes the ZTS checks freed space just after `zfs destroy snapshot` and
gets an unexpected value because of space being freed asynchronously.
For cases like this add a `wait_freeing` function which blocks until the
pools `freeing` property drops to zero.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes#5740
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#5669Closes#5743
Authored by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7247
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/2ad25b4Closes#5689
Porting notes:
- tests/zfs-tests/tests/functional/cli_root/zfs_receive/zfs_receive_013_pos.ksh
renamed as zfs_receive_015_pos.ksh, zfs_receive_013_pos.ksh is now
used for OpenZFS test.
- libzfs_sendrecv.c: SMALLEST_POSSIBLE_MAX_DDT_MB is always used
for all 32-bit builds.
Commit 539d33c seems to have significantly increased the run time
of the sparse_001_pos.ksh and truncate_001_pos.ksh test cases on
32-bit systems. This is likely due to dirty blocks from frees
being deferred to later txgs.
At the moment this is resulting in frequent failures on the
32-bit builders. Disable this test case until the issue can be
analyzed and resolved.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #5727Closes#5728
Authored by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Ported-by: Don Brady <don.brady@intel.com>
OpenZFS-issue: https://www.illumos.org/issues/6866
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/3e4fae5Closes#5730
Porting Notes:
- Omitted the illumos-specific `/dev/dsk` and `/dev/rdsk`
path conversions since they don't apply on linux.
Earlier commit added a test that created tmpfile_003_pos,
but did not add it to the appropriate .gitignore.
ace1eae Add support for O_TMPFILE
Fix the omission.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes#5735
Convert explicit `typeset -i` and `typeset -l` declarations to
`typeset` in order to prevent 32-bit overflow from occurs with
disks >2G.
TEST_ZFSTESTS_DISKSIZE=4G
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5715Closes#5714
Running tests locally were failing on cleanup scripts due to having a
pool named "pool". Match on word so the cleanup logic will cleanup
"testpool.*" while ignoring "pool".
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Tim Crawford <tcrawford@datto.com>
Closes#5703
The .write/.read file operations callbacks can be retired since
support for .read_iter/.write_iter and .aio_read/.aio_write has
been added. The vfs_write()/vfs_read() entry functions will
select the correct interface for the kernel. This is desirable
because all VFS write/read operations now rely on common code.
This change also add the generic write checks to make sure that
ulimits are enforced correctly on write.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes#5587Closes#5673
Porting notes:
- statvfs64 is replaced by statfs64.
- ZFS_SUPER_MAGIC definition moved in include/sys/fs/zfs.h
to share it between user and kernel space.
Authored by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7336
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/dd862f6dCloses#5651
After volume creation wait until the new block devices have settled
before destroying them. Failure to do some can result in EBUSY
being returned and the test case failing.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5636Closes#5637
Enable zpool_clear_001_pos, zpool_create_024_pos and inherit_001_pos. These
are no longer slow.
Also disable zfs_destroy_001_pos, zfs_allow_010_pos and snapused_004_pos,
as they fail very often.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes#5613
Fix dmu_object_next() to correctly handle unallocated objects on
large_dnode datasets.
We implement this by scanning the dnode block until we find the correct
offset to be used in dnode_next_offset(). This is necessary because we
can't assume *objectp is a hole even if dmu_object_info() returns
ENOENT.
This fixes a couple of issues with zfs receive on large_dnode datasets.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#5027Closes#5532
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Haakan T Johansson <f96hajo@chalmers.se>
Closes#5547Closes#5543
Fix a regression accidentally introduced by e0ab3ab.
Additionally, add a new script zpool_import_014_pos.ksh to
the ZFS test suite to exercise 'zpool import -t' functionality.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#5466Closes#5515
When iterating over the input nvlist in dsl_props_set_sync_impl() when we don't
preserve the nvpair name before looking up ZPROP_VALUE, so when we later go to
process it nvpair_name() is always "value" and not the actual property name.
This fixes a couple of bugs in zfs_ioc_recv():
* Received properties were not restored correctly when failing to receive an
incremental send stream
* Received properties were not completely replaced by the new ones when
successfully receiving an incremental send stream
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#5497
This branch contains the following fixes/improvements.
* Fix setting i_flags
* Fix wrong operator in xvattr.h
* Fix fchange macro in zpl_ioctl_setflags()
* Added configure check to use inode_set_flags()
* Added a test case for chattr for better test coverage
Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes#5486Closes#5470Closes#5469
zpool iostat allows you to specify only certain vdevs to display.
Currently, if you run 'zpool iostat -c CMD vdev1 vdev2 ...'
on specific vdevs, it will actually run the command on *all* vdevs,
and just display the results for the vdevs you specify. This patch
corrects the behavior to only run the command on the specified vdevs,
and also enables the zpool_iostat_005_pos.ksh tests.
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#5443
When running the ZFS Test Suite with a kmemleak enabled kernel
the following test cases run far slower than usual and may hit
their timeout threshold. Skip the following test cases.
Test: cli_root/zfs_get/zfs_get_009_pos (run as root) [55:43]
Test: cli_root/zpool_clear/zpool_clear_001_pos (run as root) [11:32]
Test: cli_root/zpool_create/zpool_create_024_pos (run as root) [11:01]
Test: features/async_destroy/async_destroy_001_pos (run as root) [41:15]
Test: inheritance/inherit_001_pos (run as root) [09:08]
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #5479Closes#5480
Enable picky cstyle checks and resolve the new warnings. The vast
majority of the changes needed were to handle minor issues with
whitespace formatting. This patch contains no functional changes.
Non-whitespace changes are as follows:
* 8 times ; to { } in for/while loop
* fix missing ; in cmd/zed/agents/zfs_diagnosis.c
* comment (confim -> confirm)
* change endline , to ; in cmd/zpool/zpool_main.c
* a number of /* BEGIN CSTYLED */ /* END CSTYLED */ blocks
* /* CSTYLED */ markers
* change == 0 to !
* ulong to unsigned long in module/zfs/dsl_scan.c
* rearrangement of module_param lines in module/zfs/metaslab.c
* add { } block around statement after for_each_online_node
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Håkan Johansson <f96hajo@chalmers.se>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5465
Don't count '@' for dataset namelen if not a snapshot. This
fixes making a pool unimportable when the dataset namelen
is 255.
Add test file for zfs create name length 255.
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes#5432Closes#5456
Update the test case to correctly interpret how Linux reports
the mount options.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: ChaoyuZhang <zhang.chaoyu@zte.com.cn>
Closes#5410
The zpool_scrub_004_pos test case currently fails when testing on
a 32-bit system. Conditionally skip this test case on 32-bit
systems until the root cause is identified and resolved.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #5444Closes#5445
This script was disabled as the avail/used space changed slightly.
Add sync_pool() and a short delay after snapshots are created to
ensure everything in flight has been written.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: ChaoyuZhang <zhang.chaoyu@zte.com.cn>
Closes#5201Closes#5419
This patch adds a command (-c) option to zpool status and zpool iostat. The
-c option allows you to run an arbitrary command on each vdev and display
the first line of output in zpool status/iostat. The environment vars
VDEV_PATH and VDEV_UPATH are set to the vdev's path and "underlying path"
before running the command. For device mapper, multipath, or partitioned
vdevs, VDEV_UPATH is the actual underlying /dev/sd* disk. This can be useful
if the command you're running requires a /dev/sd* device.
The patch also uses /sys/block/<dev>/slaves/ to lookup the underlying device
instead of using libdevmapper. This not only removes the libdevmapper
requirement at build time, but also allows you to resolve device mapper
devices without being root. This means that UDEV_UPATH get set correctly
when running zpool status/iostat as an unprivileged user.
Example:
$ zpool status -c 'echo I am $VDEV_PATH, $VDEV_UPATH'
NAME STATE READ WRITE CKSUM
mypool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
mpatha ONLINE 0 0 0 I am /dev/mapper/mpatha, /dev/sdc
sdb ONLINE 0 0 0 I am /dev/sdb1, /dev/sdb
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#5368
Allow `zfs unshare <protocol> -a` command to share or unshare all datasets
of a given protocol, nfs or smb.
Additionally, enable most of ZFS Test Suite zfs_share/zfs_unshare test cases.
To work around some Illumos-specific functionalities ($SHARE/$UNSHARE) some
function wrappers were added around them.
Finally, fix and issue in smb_is_share_active() that would leave SMB shares
exported when invoking 'zfs unshare -a'
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#3238Closes#5367
Each test in the performance regression test suite
creates a pool and a dataset for use. Unfortunately,
these tests do not cleanup the pool and dataset
correctly once they complete. Each test now kills
fio and iostat, destroys the dataset, and finally
destroys the pool. Each test also now traps the
SIGTERM signal to handle cases where test-runner
kills a test.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Requires-builders: all
Closes#5407
The isainfo(1) utility was used by the ZFS Test Suite to determine
when running on a 32-bit platform. This non-portable check has been
replaced with an is_32bit helper function which uses getconf(1).
The getconf(1) utility is available for Linux, FreeBSD, and Illumos.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Linux 3.11 add O_TMPFILE to open(2), which allow creating an unlinked file on
supported filesystem. It's basically doing open(2) and unlink(2) atomically.
The filesystem support is added through i_op->tmpfile. We basically copy the
create operation except we get rid of the link and name related stuff and add
the new node to unlinked set.
We also add support for linkat(2) to link tmpfile. However, since all previous
file operation will skip ZIL, we force a txg_wait_synced to make sure we are
sync safe.
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
The async_destroy_001_pos test case currently hangs when testing on
a 32-bit system. Conditionally skip this test case on 32-bit
systems until the root cause is identified and resolved.
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #5352
Issue #5347
Due to the instability of the migration tests, the test will skip.
The migration tests focus on migrating test file from fs to ZFS fs.
We can create zpool and ext2 directly by loop device, rather than
by set_partition
Reviewed-by: Sydney Vanda <sydney.m.vanda@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: legend-hua <liu.hua130@zte.com.cn>
Closes#5315
Sometimes it is desirable to specifically disable one or several
features directly on the 'zpool create' command line.
$ zpool create -o feature@<feature>=disabled ...
Original-patch-by: Turbo Fredriksson <turbo@bayour.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#3460Closes#5142Closes#5324
When creating and destroying pools in tight loop it's possible to
exhaust the number of allowed threads on a system. This results
in taskq_create() failling and a NULL dereference.
Resolve the issue by falling back to opening the vdevs all
synchronously.
Reviewed-by: Denys Rtveliashvili <denys@rtveliashvili.name>
Reviewed-by: Håkan Johansson <f96hajo@chalmers.se>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes zfsonlinux/spl#521
Closes#4637
Log function should be "log_fail", rather than "log_failED"
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: legend-hua <liu.hua130@zte.com.cn>
Closes#5300
The file tests/zfs-tests/tests/stress/Makefile.am gets mistakenly
removed by the distclean target because it's empty. Adding a
`SUBDIRS =` line prevents the removal.
This directory is being preserved as the location to add assorted
stress tests. These may include but are not limited to.
http://kernel.ubuntu.com/~cking/stress-ng/https://github.com/zfsonlinux/zfsstress/
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5308
CID 147643: Type: String not null terminated
- make sure that the string is null terminated before strlen
and fprintf.
CID 152204: Type: Copy into fixed size buffer
- since strlcpy isn't availabe here, use strncpy and terminate
the string manually.
CID 49339: Type: Buffer not null terminated
- since strlcpy isn't availabe here, terminate the string
manually before fprintf.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: GeLiXin <ge.lixin@zte.com.cn>
Closes#5283
Authored by: Akash Ayare <aayare@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Reviewed-by: yuxiang <guo.yong33@zte.com.cn>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Bug was caused due to a change in functionality. At some point, ZFS
snapshots no longer created associated device files which were being
used in the test. To resolve this issue, a clone of the snapshot can be
produced which will also create the expected device files; then, the
test will behave as it did historically.
OpenZFS-issue: https://www.illumos.org/issues/6877
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/2200f27Closes#5275
Porting Notes:
- Hardcoded /dev/zvol/rdsk changed to $ZVOL_RDEVDIR for compatibility.
- Enabled in linux runfile.
These tests all pass once updated to wait for udev to create the
expected linked under /dev/zvol/.
Reviewed-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Reviewed-by: yuxiang <guo.yong33@zte.com.cn>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5275
In this test the 'ls -ls' command was used to print testfile size in
blocks. Because the environment variable BLOCK_SIZE was set
the 'ls -ls' command detected this and output its block count as the
number of 8192 blocks. Rather than change the variable name
the -k was was added to force ls to return 1k blocks. This has the
additional advantage of behaving consistently across platforms.
For additional details on GNU 'ls' behavior regarding block size:
https://www.gnu.org/software/coreutils/manual/html_node/Block-size.html
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: yuxiang <guo.yong33@zte.com.cn>
Closes#5269
The following new test cases need to have execute permissions set:
userquota/groupspace_003_pos.ksh
userquota/userquota_013_pos.ksh
userquota/userspace_003_pos.ksh
upgrade/upgrade_userobj_001_pos.ksh
upgrade/setup.ksh
upgrade/cleanup.ksh
The following source files accidentally were marked executable:
lib/libzpool/kernel.c
lib/libshare/nfs.c
lib/libzfs/libzfs_dataset.c
lib/libzfs/libzfs_util.c
tests/zfs-tests/cmd/rm_lnkcnt_zero_file/rm_lnkcnt_zero_file.c
tests/zfs-tests/cmd/dir_rd_update/dir_rd_update.c
cmd/zed/zed_exec.c
module/icp/core/kcf_sched.c
module/zfs/dsl_pool.c
module/zfs/arc.c
module/nvpair/nvpair.c
man/man5/zfs-module-parameters.5
Reviewed-by: GeLiXin <ge.lixin@zte.com.cn>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5241
The variable snapprops_nvlist was never initialized, so properties
were not applied to the received snapshot.
Additionally, add zfs_receive_013_pos.ksh script to ZFS test suite to exercise
'zfs receive' functionality for user properties.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#4338
Introduce a make recipe for flake8 to enable python
style checking. Ensure all python scripts pass flake8.
Return an error code of 0 for arcstat.py -v and
dbufstat.py -v. Add test cases for python scripts.
Reviewed by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ian Lee <IanLee1521@gmail.com>
Closes#5230
This patch tracks dnode usage for each user/group in the
DMU_USER/GROUPUSED_OBJECT ZAPs. ZAP entries dedicated to dnode
accounting have the key prefixed with "obj-" followed by the UID/GID
in string format (as done for the block accounting).
A new SPA feature has been added for dnode accounting as well as
a new ZPL version. The SPA feature must be enabled in the pool
before upgrading the zfs filesystem. During the zfs version upgrade,
a "quotacheck" will be executed by marking all dnode as dirty.
ZoL-bug-id: https://github.com/zfsonlinux/zfs/issues/3500
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Implement tests to ensure that python scripts
that are distributed with ZFS continue to at
minimum run without errors. This will help prevent
accidental breaking of these scripts.
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
As part of its tests, filetest_001_pos.ksh wipes the entire vdev to
create checksum errors. This patch uses the setup/cleanup scripts from
the scrub_mirror test to create a custom 100MB pool, rather than
using the entire device size that is passed into zfs-tests.sh
(which defaults to 2GB). This speeds up the buildbot tests, and also
makes it possible for someone to use real disks (say, 1TB) without the
test taking an insanely long amount of time.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Garrett D'Amore <garrett@damore.org>
Ported by: Tony Hutter <hutter2@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/4185
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/45818ee
Porting Notes:
This code is ported on top of the Illumos Crypto Framework code:
b5e030c8db
The list of porting changes includes:
- Copied module/icp/include/sha2/sha2.h directly from illumos
- Removed from module/icp/algs/sha2/sha2.c:
#pragma inline(SHA256Init, SHA384Init, SHA512Init)
- Added 'ctx' to lib/libzfs/libzfs_sendrecv.c:zio_checksum_SHA256() since
it now takes in an extra parameter.
- Added CTASSERT() to assert.h from for module/zfs/edonr_zfs.c
- Added skein & edonr to libicp/Makefile.am
- Added sha512.S. It was generated from sha512-x86_64.pl in Illumos.
- Updated ztest.c with new fletcher_4_*() args; used NULL for new CTX argument.
- In icp/algs/edonr/edonr_byteorder.h, Removed the #if defined(__linux) section
to not #include the non-existant endian.h.
- In skein_test.c, renane NULL to 0 in "no test vector" array entries to get
around a compiler warning.
- Fixup test files:
- Rename <sys/varargs.h> -> <varargs.h>, <strings.h> -> <string.h>,
- Remove <note.h> and define NOTE() as NOP.
- Define u_longlong_t
- Rename "#!/usr/bin/ksh" -> "#!/bin/ksh -p"
- Rename NULL to 0 in "no test vector" array entries to get around a
compiler warning.
- Remove "for isa in $($ISAINFO); do" stuff
- Add/update Makefiles
- Add some userspace headers like stdio.h/stdlib.h in places of
sys/types.h.
- EXPORT_SYMBOL *_Init/*_Update/*_Final... routines in ICP modules.
- Update scripts/zfs2zol-patch.sed
- include <sys/sha2.h> in sha2_impl.h
- Add sha2.h to include/sys/Makefile.am
- Add skein and edonr dirs to icp Makefile
- Add new checksums to zpool_get.cfg
- Move checksum switch block from zfs_secpolicy_setprop() to
zfs_check_settable()
- Fix -Wuninitialized error in edonr_byteorder.h on PPC
- Fix stack frame size errors on ARM32
- Don't unroll loops in Skein on 32-bit to save stack space
- Add memory barriers in sha2.c on 32-bit to save stack space
- Add filetest_001_pos.ksh checksum sanity test
- Add option to write psudorandom data in file_write utility
coverity scan CID:147536, type: Argument cannot be negative
- may write or close fd which is negative
coverity scan CID:147537, type: Argument cannot be negative
- may call dup2 with a negative fd
coverity scan CID:147538, type: Argument cannot be negative
- may read or fchown with a negative fd
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: GeLiXin <ge.lixin@zte.com.cn>
Closes#5185
Because the macro ZFS_MAXPROPLEN used in function print_dataset
differs between platforms set it appropriately and calculate the expected
number of passes.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: yuxiang <guo.yong33@zte.com.cn>
Closes#5154
The default blocksize in Linux is 1024 due to a GNU-ism. Setting the
expected blocksize resolves the issue. As mentioned in the PR an
alternate solution would be to set POSIXLY_CORRECT=1.
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: yuxiang <guo.yong33@zte.com.cn>
Closes#5167
1.coverity scan CID:147445 function zfs_do_send in zfs_main.c
Buffer not null terminated (BUFFER_SIZE_WARNING)
2.coverity scan CID:147443 function zfs_do_bookmark in zfs_main.c
Buffer not null terminated (BUFFER_SIZE_WARNING)
3.coverity scan CID:147660 function main in zinject.c
Passing string argv[0] of unknown size to strcpy
By the way, the leak of g_zfs is fixed.
4.coverity scan CID: 147442 function make_disks in zpool_vdev.c
Buffer not null terminated (BUFFER_SIZE_WARNING)
5.coverity scan CID: 147661 function main in dir_rd_update.c
passing string cp1 of unknown size to strcpy
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes#5130
Due to how the Linux VFS was designed busy mount points
cannot be destroyed even when given the force option. Update
the zfs_destroy_001_pos test case to expect this behavior when
running under Linux.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: candychencan <chen.can2@zte.com.cn>
Closes#5132
zfs_commands.cfg have printed "No such file or directory", When executing
script/zfs-test.sh. The script is a symlink to ../../../zfs-script-config.sh
So delete the symlink, and directly source $SRCDIR/zfs-script-config.sh
when it exists from default.cfg.in
Reviewed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: legend-hua <liu.hua130@zte.com.cn>
Closes#5133
Fix misleading error message:
"The /dev/zfs device is missing and must be created.", if /etc/mtab is missing.
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Eric Desrochers <eric.desrochers@canonical.com>
Closes#4680Closes#5029
The FALLOC_FL_PUNCH_HOLE flag was introduced in the 2.6.38
kernel. To prevent breaking the build on older systems wrap its use
in a conditional. When FALLOC_FL_PUNCH_HOLE isn't available
return a non-zero status and error message.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: legend-hua <liu.hua130@zte.com.cn>
Closes#5101
Author: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: David Quigley <david.quigley@intel.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Don Brady <don.brady@intel.com>
OpenZFS-issue: https://www.illumos.org/issues/6950
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/dcbf3bd6
Delphix-commit: https://github.com/delphix/delphix-os/commit/978ed49Closes#4929
ZFS Test Suite Performance Regression Tests
This was pulled into OpenZFS via the compressed arc featureand was
separated out in zfsonlinux as a separate pull request from PR-4768.
It originally came in as QA-4903 in Delphix-OS from John Kennedy.
Expected Usage:
$ DISKS="sdb sdc sdd" zfs-tests.sh -r perf-regression.run
Porting Notes:
1. Added assertions in the setup script to make sure required tools
(fio, mpstat, ...) are present.
2. For the config.json generation in perf.shlib used arcstats and
other binaries instead of dtrace to query the values.
3. For the perf data collection:
- use "zpool iostat -lpvyL" instead of the io.d dtrace script
(currently not collecting zfs_read/write latency stats)
- mpstat and iostat take different arguments
- prefetch_io.sh is a placeholder that uses arcstats instead of
dtrace
4. Build machines require fio, mdadm and sysstat pakage (YMMV).
Future Work:
- Need a way to measure zfs_read and zfs_write latencies per pool.
- Need tools to takes two sets of output and display/graph the
differences
- Bring over additional regression tests from Delphix
When using real devices, specify DISKS="sdb sdc sdd" opposed to
/dev/sdb in zfs-tests.sh - otherwise errors with directory names and
disk names registering as "/dev//dev/sdb" for some tests. The same
goes for mpath: DISK="mpatha mpathad mpathb"
Expected Usage:
$ DISKS="sdb sdc sdd" zfs-tests.sh
SLICE_PREFIX is now set as "p" for a loop device (ie loop0p2) or
"" for a real device (ie sdb2), or either for multipath devices
(ie mpatha1 or mpath1p1) instead of only "p" by default. Note that
kpartx partitioning is not currently supported in this patch
(ie "partx") and may need to be disabled on Debian distributions.
Functions added for determining test directory (/dev or /dev/mapper)
as well as slice prefix are determined and exported mostly in the cfg
file of each test group directory.
Currently zpools cannot be created on whole mpath devices that have
been partitioned. In order to fix this tests have either been revised
to use a partition instead, or if there is a size constraint and the
pool needs to be created on the whole disk, partitions are then deleted
if the device is a multipath device. This functionality is added to
default_cleanup() or to individual cleanup scripts if a non-default
cleanup method is used.
The max partitions is currently set at 8 to account for all of the
tests thus far.
Patch changes are generally encompassed in "if is_linux" construct.
Signed-off-by: Sydney Vanda <sydney.m.vanda@intel.com>
Reviewed-by: John Salinas <John.Salinas@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: David Quigley <david.quigley@intel.com>
Closes#4447Closes#4964Closes#5074
Older versions of blkid may not promptly detect ZFS labels when
they're located on partitions. In order to ensure this test passes
reliably always perform a scan of default search paths (-s).
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: liaoyuxiangqin <guo.yong33@zte.com.cn>
Closes#4987Closes#5047
Issues:
Under Linux, when executing zfs_destroy_004.ksh destroy $fs is an
error. The key issue here is that illumos kernel treats this case
differently than the Linux kernel. On illumos you can unmount and
destroy a filesystem which is busy and all consumers of it get EIO.
On Linux the expected behavior is to prevent the unmount and destroy.
Cause analysis:
When create $fs file system and mount file system to $mntp.
cd $mntp, linux isn't allow to destroy $fs in this mount contents.
No matter what destroy with parameters.
Solution:
So log_mustnot $ZFS destroy $fs is ok.
cd $olddir and destroy $fs.
Signed-off-by: caoxuewen cao.xuewen@zte.com.cn
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5012
As the scripts zfs_create_003_pos.ksh and zfs_create_006_pos.ksh can
run successfully in the linux, add them to the <linux.run> file to
increase test scene.
Signed-off-by: ChaoyuZhang <zhang.chaoyu@zte.com.cn>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5002
Update zfs_mount_005_pos.ksh and zfs_mount_010_neg.ksh to reflect
the expected Linux behavior. The is_linux wrapper is used so the
test case may be used on Linux and non-Linux platforms.
Signed-off-by: liuhuang <liu.huang@zte.com.cn>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5000
Just cleanup the new fs created during the test, so the "$found"
should be "true".
Signed-off-by: ChaoyuZhang <zhang.chaoyu@zte.com.cn>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#4978
From the man page of dirname: " Both dirname() and basename()
may modify the contents of path, so it may be desirable to pass
a copy when calling one of these functions." And in fact on linux
using dirname actually changes the contents of the passed parameter as
evident from the following failure when running the ctime test:
link(/root/zfs-mount, /root/zfs-mount/link_file)
Fix this by creating a copy of the input parameter and passing that
to dirname, thus not compromising the original parameter, allowing
the creation of hard link to succeed.
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#4977
Updated test case history_001_pos.ksh so it can run in tree. The
original test case assumed /usr/sbin/zfs and /usr/sbin/zpool were
the only valid locations for these utilities. The same modification
has already been made too history_common.kshlib.
The only other failing test case was history_010_pos and that was
the result of the ":linux" suffix not being appended when checking
the long output in the test case.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#4882
Creating the pool in a striped rather than mirrored configuration
provides enough space for all upgrade tests to run. Test case
zpool_upgrade_007_pos still fails and must be investigated so
it has been left disabled.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#4852
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Calling dsl_dataset_name on a dataset with a 256 byte buffer is asking
for trouble. We should check every dataset on import, using a 1024 byte
buffer and checking each time to see if the dataset's new name is longer
than 256 bytes.
OpenZFS-issue: https://www.illumos.org/issues/6876
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ca8674e
2605 want to resume interrupted zfs send
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Xin Li <delphij@freebsd.org>
Reviewed by: Arne Jansen <sensille@gmx.net>
Approved by: Dan McDonald <danmcd@omniti.com>
Ported-by: kernelOfTruth <kerneloftruth@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/2605
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/9c3fd12
6980 6902 causes zfs send to break due to 32-bit/64-bit struct mismatch
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/6980
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ea4a67f
Porting notes:
- All rsend and snapshop tests enabled and updated for Linux.
- Fix misuse of input argument in traverse_visitbp().
- Fix ISO C90 warnings and errors.
- Fix gcc 'missing braces around initializer' in
'struct send_thread_arg to_arg =' warning.
- Replace 4 argument fletcher_4_native() with 3 argument version,
this change was made in OpenZFS 4185 which has not been ported.
- Part of the sections for 'zfs receive' and 'zfs send' was
rewritten and reordered to approximate upstream.
- Fix mktree xattr creation, 'user.' prefix required.
- Minor fixes to newly enabled test cases
- Long holds for volumes allowed during receive for minor registration.
Justification
-------------
This feature adds support for variable length dnodes. Our motivation is
to eliminate the overhead associated with using spill blocks. Spill
blocks are used to store system attribute data (i.e. file metadata) that
does not fit in the dnode's bonus buffer. By allowing a larger bonus
buffer area the use of a spill block can be avoided. Spill blocks
potentially incur an additional read I/O for every dnode in a dnode
block. As a worst case example, reading 32 dnodes from a 16k dnode block
and all of the spill blocks could issue 33 separate reads. Now suppose
those dnodes have size 1024 and therefore don't need spill blocks. Then
the worst case number of blocks read is reduced to from 33 to two--one
per dnode block. In practice spill blocks may tend to be co-located on
disk with the dnode blocks so the reduction in I/O would not be this
drastic. In a badly fragmented pool, however, the improvement could be
significant.
ZFS-on-Linux systems that make heavy use of extended attributes would
benefit from this feature. In particular, ZFS-on-Linux supports the
xattr=sa dataset property which allows file extended attribute data
to be stored in the dnode bonus buffer as an alternative to the
traditional directory-based format. Workloads such as SELinux and the
Lustre distributed filesystem often store enough xattr data to force
spill bocks when xattr=sa is in effect. Large dnodes may therefore
provide a performance benefit to such systems.
Other use cases that may benefit from this feature include files with
large ACLs and symbolic links with long target names. Furthermore,
this feature may be desirable on other platforms in case future
applications or features are developed that could make use of a
larger bonus buffer area.
Implementation
--------------
The size of a dnode may be a multiple of 512 bytes up to the size of
a dnode block (currently 16384 bytes). A dn_extra_slots field was
added to the current on-disk dnode_phys_t structure to describe the
size of the physical dnode on disk. The 8 bits for this field were
taken from the zero filled dn_pad2 field. The field represents how
many "extra" dnode_phys_t slots a dnode consumes in its dnode block.
This convention results in a value of 0 for 512 byte dnodes which
preserves on-disk format compatibility with older software.
Similarly, the in-memory dnode_t structure has a new dn_num_slots field
to represent the total number of dnode_phys_t slots consumed on disk.
Thus dn->dn_num_slots is 1 greater than the corresponding
dnp->dn_extra_slots. This difference in convention was adopted
because, unlike on-disk structures, backward compatibility is not a
concern for in-memory objects, so we used a more natural way to
represent size for a dnode_t.
The default size for newly created dnodes is determined by the value of
a new "dnodesize" dataset property. By default the property is set to
"legacy" which is compatible with older software. Setting the property
to "auto" will allow the filesystem to choose the most suitable dnode
size. Currently this just sets the default dnode size to 1k, but future
code improvements could dynamically choose a size based on observed
workload patterns. Dnodes of varying sizes can coexist within the same
dataset and even within the same dnode block. For example, to enable
automatically-sized dnodes, run
# zfs set dnodesize=auto tank/fish
The user can also specify literal values for the dnodesize property.
These are currently limited to powers of two from 1k to 16k. The
power-of-2 limitation is only for simplicity of the user interface.
Internally the implementation can handle any multiple of 512 up to 16k,
and consumers of the DMU API can specify any legal dnode value.
The size of a new dnode is determined at object allocation time and
stored as a new field in the znode in-memory structure. New DMU
interfaces are added to allow the consumer to specify the dnode size
that a newly allocated object should use. Existing interfaces are
unchanged to avoid having to update every call site and to preserve
compatibility with external consumers such as Lustre. The new
interfaces names are given below. The versions of these functions that
don't take a dnodesize parameter now just call the _dnsize() versions
with a dnodesize of 0, which means use the legacy dnode size.
New DMU interfaces:
dmu_object_alloc_dnsize()
dmu_object_claim_dnsize()
dmu_object_reclaim_dnsize()
New ZAP interfaces:
zap_create_dnsize()
zap_create_norm_dnsize()
zap_create_flags_dnsize()
zap_create_claim_norm_dnsize()
zap_create_link_dnsize()
The constant DN_MAX_BONUSLEN is renamed to DN_OLD_MAX_BONUSLEN. The
spa_maxdnodesize() function should be used to determine the maximum
bonus length for a pool.
These are a few noteworthy changes to key functions:
* The prototype for dnode_hold_impl() now takes a "slots" parameter.
When the DNODE_MUST_BE_FREE flag is set, this parameter is used to
ensure the hole at the specified object offset is large enough to
hold the dnode being created. The slots parameter is also used
to ensure a dnode does not span multiple dnode blocks. In both of
these cases, if a failure occurs, ENOSPC is returned. Keep in mind,
these failure cases are only possible when using DNODE_MUST_BE_FREE.
If the DNODE_MUST_BE_ALLOCATED flag is set, "slots" must be 0.
dnode_hold_impl() will check if the requested dnode is already
consumed as an extra dnode slot by an large dnode, in which case
it returns ENOENT.
* The function dmu_object_alloc() advances to the next dnode block
if dnode_hold_impl() returns an error for a requested object.
This is because the beginning of the next dnode block is the only
location it can safely assume to either be a hole or a valid
starting point for a dnode.
* dnode_next_offset_level() and other functions that iterate
through dnode blocks may no longer use a simple array indexing
scheme. These now use the current dnode's dn_num_slots field to
advance to the next dnode in the block. This is to ensure we
properly skip the current dnode's bonus area and don't interpret it
as a valid dnode.
zdb
---
The zdb command was updated to display a dnode's size under the
"dnsize" column when the object is dumped.
For ZIL create log records, zdb will now display the slot count for
the object.
ztest
-----
Ztest chooses a random dnodesize for every newly created object. The
random distribution is more heavily weighted toward small dnodes to
better simulate real-world datasets.
Unused bonus buffer space is filled with non-zero values computed from
the object number, dataset id, offset, and generation number. This
helps ensure that the dnode traversal code properly skips the interior
regions of large dnodes, and that these interior regions are not
overwritten by data belonging to other dnodes. A new test visits each
object in a dataset. It verifies that the actual dnode size matches what
was stored in the ztest block tag when it was created. It also verifies
that the unused bonus buffer space is filled with the expected data
patterns.
ZFS Test Suite
--------------
Added six new large dnode-specific tests, and integrated the dnodesize
property into existing tests for zfs allow and send/recv.
Send/Receive
------------
ZFS send streams for datasets containing large dnodes cannot be received
on pools that don't support the large_dnode feature. A send stream with
large dnodes sets a DMU_BACKUP_FEATURE_LARGE_DNODE flag which will be
unrecognized by an incompatible receiving pool so that the zfs receive
will fail gracefully.
While not implemented here, it may be possible to generate a
backward-compatible send stream from a dataset containing large
dnodes. The implementation may be tricky, however, because the send
object record for a large dnode would need to be resized to a 512
byte dnode, possibly kicking in a spill block in the process. This
means we would need to construct a new SA layout and possibly
register it in the SA layout object. The SA layout is normally just
sent as an ordinary object record. But if we are constructing new
layouts while generating the send stream we'd have to build the SA
layout object dynamically and send it at the end of the stream.
For sending and receiving between pools that do support large dnodes,
the drr_object send record type is extended with a new field to store
the dnode slot count. This field was repurposed from unused padding
in the structure.
ZIL Replay
----------
The dnode slot count is stored in the uppermost 8 bits of the lr_foid
field. The bits were unused as the object id is currently capped at
48 bits.
Resizing Dnodes
---------------
It should be possible to resize a dnode when it is dirtied if the
current dnodesize dataset property differs from the dnode's size, but
this functionality is not currently implemented. Clearly a dnode can
only grow if there are sufficient contiguous unused slots in the
dnode block, but it should always be possible to shrink a dnode.
Growing dnodes may be useful to reduce fragmentation in a pool with
many spill blocks in use. Shrinking dnodes may be useful to allow
sending a dataset to a pool that doesn't support the large_dnode
feature.
Feature Reference Counting
--------------------------
The reference count for the large_dnode pool feature tracks the
number of datasets that have ever contained a dnode of size larger
than 512 bytes. The first time a large dnode is created in a dataset
the dataset is converted to an extensible dataset. This is a one-way
operation and the only way to decrement the feature count is to
destroy the dataset, even if the dataset no longer contains any large
dnodes. The complexity of reference counting on a per-dnode basis was
too high, so we chose to track it on a per-dataset basis similarly to
the large_block feature.
Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#3542
- Use a fixed buffer of random bytes when random xattr values are in
effect. This eliminates the potential performance bottleneck of
reading from /dev/urandom for each file. This also allows us to
verify xattrs in random value mode.
- Show the rate of operations per second in addition to elapsed time
for each phase of the test. This may be useful for benchmarking.
- Set default xattr size to 6 so that verify doesn't fail if user
doesn't specify a size. We need at least six bytes to store the
leading "size=X" string that is used for verification.
- Allow user to execute just one phase of the test. Acceptable
values for -o and their meanings are:
1 - run the create phase
2 - run the setxattr phase
3 - run the getxattr phase
4 - run the unlink phase
Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
This is a new implementation of RAIDZ1/2/3 routines using x86_64
scalar, SSE, and AVX2 instruction sets. Included are 3 parity
generation routines (P, PQ, and PQR) and 7 reconstruction routines,
for all RAIDZ level. On module load, a quick benchmark of supported
routines will select the fastest for each operation and they will
be used at runtime. Original implementation is still present and
can be selected via module parameter.
Patch contains:
- specialized gen/rec routines for all RAIDZ levels,
- new scalar raidz implementation (unrolled),
- two x86_64 SIMD implementations (SSE and AVX2 instructions sets),
- fastest routines selected on module load (benchmark).
- cmd/raidz_test - verify and benchmark all implementations
- added raidz_test to the ZFS Test Suite
New zfs module parameters:
- zfs_vdev_raidz_impl (str): selects the implementation to use. On
module load, the parameter will only accept first 3 options, and
the other implementations can be set once module is finished
loading. Possible values for this option are:
"fastest" - use the fastest math available
"original" - use the original raidz code
"scalar" - new scalar impl
"sse" - new SSE impl if available
"avx2" - new AVX2 impl if available
See contents of `/sys/module/zfs/parameters/zfs_vdev_raidz_impl` to
get the list of supported values. If an implementation is not supported
on the system, it will not be shown. Currently selected option is
enclosed in `[]`.
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#4328
ZFS allows for specific permissions to be delegated to normal users
with the `zfs allow` and `zfs unallow` commands. In addition, non-
privileged users should be able to run all of the following commands:
* zpool [list | iostat | status | get]
* zfs [list | get]
Historically this functionality was not available on Linux. In order
to add it the secpolicy_* functions needed to be implemented and mapped
to the equivalent Linux capability. Only then could the permissions on
the `/dev/zfs` be relaxed and the internal ZFS permission checks used.
Even with this change some limitations remain. Under Linux only the
root user is allowed to modify the namespace (unless it's a private
namespace). This means the mount, mountpoint, canmount, unmount,
and remount delegations cannot be supported with the existing code. It
may be possible to add this functionality in the future.
This functionality was validated with the cli_user and delegation test
cases from the ZFS Test Suite. These tests exhaustively verify each
of the supported permissions which can be delegated and ensures only
an authorized user can perform it.
Two minor bug fixes were required for test-running.py. First, the
Timer() object cannot be safely created in a `try:` block when there
is an unconditional `finally` block which references it. Second,
when running as a normal user also check for scripts using the
both the .ksh and .sh suffixes.
Finally, existing users who are simulating delegations by setting
group permissions on the /dev/zfs device should revert that
customization when updating to a version with this change.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#362Closes#434Closes#4100Closes#4394Closes#4410Closes#4487
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Ported by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/6531
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/97e8130
Porting notes:
- Added new IO delay tracepoints, and moved common ZIO tracepoint macros
to a new trace_common.h file.
- Used zio_delay_taskq() in place of OpenZFS's timeout_generic() function.
- Updated zinject man page
- Updated zpool_scrub test files
This is a purely cosmetical change, to consistently prefer one of
two (both acceptable) choises for the word parsable in documentation and
code. I don't really care which to use, but acording to wiktionary
https://en.wiktionary.org/wiki/parsable#English parsable is preferred.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#4682
Commit e0ab3ab introduced new per-vdev ZAP tests which should have
used the $ZPOOL and $ZDB variabled. The tests passed the automated
testing since both utilities but when running in-tree all of the new
tests fail.
Signed-off-by: Don Brady <don.brady@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#4515
6736 ZFS per-vdev ZAPs
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
References:
https://www.illumos.org/issues/6736https://github.com/openzfs/openzfs/commit/215198a
Ported-by: Don Brady <don.brady@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#4515
Call block_device_wait when creating/destroying volumes in order
to make the operations synchronous as expected by the test cases.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#4560
Commit 4967a3e introduced a typo that caused the ZPL to store the
intended default ACL as an access ACL. Due to caching this problem
may not become visible until the filesystem is remounted or the inode
is evicted from the cache. Fix the typo and add a regression test.
Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Closes#4520