Updating based on PR Feedback (4)

1. When testing out installing a VM with virt-manager on Linux onto a
   dataset with direct=always, there was an ASSERT failure in
   abd_alloc_from_pages(). Originally zfs_setup_direct() did an
   alignment check of the UIO using SPA_MINBLOCKSIZE with
   zfs_uio_aligned(). The idea behind this was that the page alignment
   restriction could maybe be changed to use ashift as the alignment
   check in the future. However, this idea never came to be. The
   alignment restrictions for Direct I/O are based on PAGE_SIZE.
   Updating the check in zfs_setup_direct() for the UIO to use
   PAGE_SIZE fixed the issue.
2. Updated the other alignment check in dmu_read_impl() to also use
   PAGE_SIZE.
3. As a consequence of updating the UIO alignment checks, the ZTS test
   case dio_unaligned_filesize began to fail. This is because there was
   no way to detect reading past the end of the file before issuing
   EINVAL in the ZPL and VOPs layers on FreeBSD. This was resolved by
   moving zfs_setup_direct() into zfs_write() and zfs_read(), which
   allows other error checking to take place before checking any Direct
   I/O limitations. Updating the call site of zfs_setup_direct() did
   require some changes to the logic in that function. In particular,
   Direct I/O can simply be avoided altogether depending on the checks
   in zfs_setup_direct(), so there is no reason to return EAGAIN at all
   (a simplified sketch of the resulting logic follows this list).
4. After moving zfs_setup_direct() into zfs_write() and zfs_read(),
   there was no reason to call zfs_check_direct_enabled() in the ZPL
   layer on Linux or in the VNOPS layer of FreeBSD. This function was
   completely removed, which allowed much of the code in both of those
   layers to return to its original form.
5. Updated the checksum verify module parameter for Direct I/O writes
   to be a boolean only and to return EIO in the event a checksum
   verify failure occurs. By default, this module parameter is set to 1
   for Linux and 0 for FreeBSD. The module parameter has been renamed
   to zfs_vdev_direct_write_verify. There are still counters on the
   top-level VDEV for checksum verify failures, but these could be
   removed. It would still be good to leave the ZED event dio_verify
   for checksum failures as a notification that an application was
   manipulating the contents of a buffer after issuing it for Direct
   I/O. As part of this change, man pages were updated, the ZTS test
   case dio_write_verify was updated, and all comments relating to the
   module parameter were updated as well.
6. Updated comments in dio_property ZTS test to properly reflect that
   stride_dd is being called with check_write and check_read.
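
Since the side-by-side diff below is hard to read, here is a simplified
sketch of the reworked zfs_setup_direct() flow described in items 1 and
3, reconstructed from the zfs_vnops.c hunk in this commit. Locking
(zfs_enter/zfs_exit), asserts, and some details are elided, so treat it
as illustrative rather than verbatim:

    /*
     * Sketch: Direct I/O is quietly skipped (never EAGAIN) when it
     * does not apply, and EINVAL is only returned for an explicit
     * O_DIRECT request that is not PAGE_SIZE aligned.
     */
    static int
    zfs_setup_direct(struct znode *zp, zfs_uio_t *uio, zfs_uio_rw_t rw,
        int *ioflagp)
    {
            objset_t *os = ZTOZSB(zp)->z_os;
            int ioflag = *ioflagp;
            int error = 0;

            if (os->os_direct == ZFS_DIRECT_DISABLED ||
                zn_has_cached_data(zp, zfs_uio_offset(uio),
                zfs_uio_offset(uio) + zfs_uio_resid(uio) - 1)) {
                    /* Disabled or mmap'ed region: route through the ARC. */
                    ioflag &= ~O_DIRECT;
            } else if (os->os_direct == ZFS_DIRECT_ALWAYS &&
                zfs_uio_page_aligned(uio) &&
                zfs_uio_aligned(uio, PAGE_SIZE)) {
                    /* direct=always promotes properly aligned requests. */
                    if (rw == UIO_READ ||
                        (rw == UIO_WRITE && zfs_uio_resid(uio) >= zp->z_blksz))
                            ioflag |= O_DIRECT;
            } else if (os->os_direct == ZFS_DIRECT_ALWAYS &&
                (ioflag & O_DIRECT)) {
                    /* Misaligned direct=always request falls back to the ARC. */
                    ioflag &= ~O_DIRECT;
            }

            if (ioflag & O_DIRECT) {
                    /* Explicit O_DIRECT must be PAGE_SIZE aligned. */
                    if (!zfs_uio_page_aligned(uio) ||
                        !zfs_uio_aligned(uio, PAGE_SIZE))
                            error = SET_ERROR(EINVAL);
                    else
                            error = zfs_uio_get_dio_pages_alloc(uio, rw);
            }

            *ioflagp = ioflag;
            return (error);
    }

zfs_read() and zfs_write() now call this directly after their own
argument checks, and at the end of the I/O they release the pinned
pages with zfs_uio_free_dio_pages() when UIO_DIRECT was set.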

Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Brian Atkinson 2024-08-22 11:58:50 -06:00
parent 6e0ffaf627
commit 71ce314930
33 changed files with 252 additions and 708 deletions

View File

@ -655,9 +655,9 @@ int param_set_min_auto_ashift(ZFS_MODULE_PARAM_ARGS);
int param_set_max_auto_ashift(ZFS_MODULE_PARAM_ARGS); int param_set_max_auto_ashift(ZFS_MODULE_PARAM_ARGS);
/* /*
* VDEV checksum verification precentage for Direct I/O writes * VDEV checksum verification for Direct I/O writes
*/ */
extern uint_t zfs_vdev_direct_write_verify_pct; extern uint_t zfs_vdev_direct_write_verify;
#ifdef __cplusplus #ifdef __cplusplus
} }

View File

@ -46,9 +46,6 @@ extern int mappedread(znode_t *, int, zfs_uio_t *);
extern int mappedread_sf(znode_t *, int, zfs_uio_t *); extern int mappedread_sf(znode_t *, int, zfs_uio_t *);
extern void update_pages(znode_t *, int64_t, int, objset_t *); extern void update_pages(znode_t *, int64_t, int, objset_t *);
extern int zfs_check_direct_enabled(znode_t *, int, boolean_t *);
extern int zfs_setup_direct(znode_t *, zfs_uio_t *, zfs_uio_rw_t, int *);
/* /*
* Platform code that asynchronously drops zp's inode / vnode_t. * Platform code that asynchronously drops zp's inode / vnode_t.
* *

View File

@ -416,12 +416,10 @@ May be increased up to
.Sy ASHIFT_MAX Po 16 Pc , .Sy ASHIFT_MAX Po 16 Pc ,
but this may negatively impact pool space efficiency. but this may negatively impact pool space efficiency.
. .
.It Sy zfs_vdev_direct_write_verify_pct Ns = Ns Sy Linux 2 | FreeBSD 0 Pq uint .It Sy zfs_vdev_direct_write_verify Ns = Ns Sy Linux 1 | FreeBSD 0 Pq uint
If non-zero, then a Direct I/O write's checksum will be verified every If non-zero, then a Direct I/O write's checksum will be verified every
percentage (pct) of Direct I/O writes that are issued to a top-level VDEV time the write is issued and before it is committed to the block pointer.
before it is committed and the block pointer is updated. In the event the checksum is not valid then the I/O operation will return EIO.
In the event the checksum is not valid then the I/O operation will be
redirected through the ARC.
This module parameter can be used to detect if the This module parameter can be used to detect if the
contents of the users buffer have changed in the process of doing a Direct I/O contents of the users buffer have changed in the process of doing a Direct I/O
write. write.
@ -432,7 +430,7 @@ Each verify error causes a
zevent. zevent.
Direct Write I/O checksum verify errors can be seen with Direct Write I/O checksum verify errors can be seen with
.Nm zpool Cm status Fl d . .Nm zpool Cm status Fl d .
The default value for this is 2 percent on Linux, but is 0 for The default value for this is 1 on Linux, but is 0 for
.Fx .Fx
because user pages can be placed under write protection in because user pages can be placed under write protection in
.Fx .Fx

View File

@ -100,14 +100,14 @@ The number of delay events is ratelimited by the
module parameter. module parameter.
.It Sy dio_verify .It Sy dio_verify
Issued when there was a checksum verify error after a Direct I/O write has been Issued when there was a checksum verify error after a Direct I/O write has been
issued and is redirected through the ARC. issued.
This event can only take place if the module parameter This event can only take place if the module parameter
.Sy zfs_vdev_direct_write_verify_pct .Sy zfs_vdev_direct_write_verify
is not set to zero. is not set to zero.
See See
.Xr zfs 4 .Xr zfs 4
for more details on the for more details on the
.Sy zfs_vdev_direct_write_verify_pct .Sy zfs_vdev_direct_write_verify
module parameter. module parameter.
.It Sy config .It Sy config
Issued every time a vdev change have been done to the pool. Issued every time a vdev change have been done to the pool.

View File

@ -85,7 +85,7 @@ to set pool GUID as key for pool objects instead of pool names.
Display the number of Direct I/O write checksum verify errors that have occurred Display the number of Direct I/O write checksum verify errors that have occurred
on a top-level VDEV. on a top-level VDEV.
See See
.Sx zfs_vdev_direct_write_verify_pct .Sx zfs_vdev_direct_write_verify
in in
.Xr zfs 4 .Xr zfs 4
for details about the conditions that can cause Direct I/O write checksum for details about the conditions that can cause Direct I/O write checksum

View File

@ -4274,29 +4274,6 @@ ioflags(int ioflags)
return (flags); return (flags);
} }
static int
zfs_freebsd_read_direct(znode_t *zp, zfs_uio_t *uio, zfs_uio_rw_t rw,
int ioflag, cred_t *cr)
{
int ret;
int flags = ioflag;
ASSERT3U(rw, ==, UIO_READ);
/* On error, return to fallback to the buffred path */
ret = zfs_setup_direct(zp, uio, rw, &flags);
if (ret)
return (ret);
ASSERT(uio->uio_extflg & UIO_DIRECT);
ret = zfs_read(zp, uio, flags, cr);
zfs_uio_free_dio_pages(uio, rw);
return (ret);
}
#ifndef _SYS_SYSPROTO_H_ #ifndef _SYS_SYSPROTO_H_
struct vop_read_args { struct vop_read_args {
struct vnode *a_vp; struct vnode *a_vp;
@ -4311,85 +4288,37 @@ zfs_freebsd_read(struct vop_read_args *ap)
{ {
zfs_uio_t uio; zfs_uio_t uio;
int error = 0; int error = 0;
znode_t *zp = VTOZ(ap->a_vp);
int ioflag = ioflags(ap->a_ioflag);
boolean_t is_direct;
zfs_uio_init(&uio, ap->a_uio); zfs_uio_init(&uio, ap->a_uio);
error = zfs_read(VTOZ(ap->a_vp), &uio, ioflags(ap->a_ioflag),
error = zfs_check_direct_enabled(zp, ioflag, &is_direct); ap->a_cred);
/*
if (error) { * XXX We occasionally get an EFAULT for Direct I/O reads on
return (error); * FreeBSD 13. This still needs to be resolved. The EFAULT comes
} else if (is_direct) { * from:
error = * zfs_uio_get__dio_pages_alloc() ->
zfs_freebsd_read_direct(zp, &uio, UIO_READ, ioflag, * zfs_uio_get_dio_pages_impl() ->
ap->a_cred); * zfs_uio_iov_step() ->
/* * zfs_uio_get_user_pages().
* XXX We occasionally get an EFAULT for Direct I/O reads on * We return EFAULT from zfs_uio_iov_step(). When a Direct I/O
* FreeBSD 13. This still needs to be resolved. The EFAULT comes * read fails to map in the user pages (returning EFAULT) the
* from: * Direct I/O request is broken up into two separate IO requests
* zfs_uio_get__dio_pages_alloc() -> * and issued separately using Direct I/O.
* zfs_uio_get_dio_pages_impl() -> */
* zfs_uio_iov_step() ->
* zfs_uio_get_user_pages().
* We return EFAULT from zfs_uio_iov_step(). When a Direct I/O
* read fails to map in the user pages (returning EFAULT) the
* Direct I/O request is broken up into two separate IO requests
* and issued separately using Direct I/O.
*/
#ifdef ZFS_DEBUG #ifdef ZFS_DEBUG
if (error == EFAULT) { if (error == EFAULT && uio.uio_extflg & UIO_DIRECT) {
#if 0 #if 0
printf("%s(%d): Direct I/O read returning EFAULT " printf("%s(%d): Direct I/O read returning EFAULT "
"uio = %p, zfs_uio_offset(uio) = %lu " "uio = %p, zfs_uio_offset(uio) = %lu "
"zfs_uio_resid(uio) = %lu\n", "zfs_uio_resid(uio) = %lu\n",
__FUNCTION__, __LINE__, &uio, zfs_uio_offset(&uio), __FUNCTION__, __LINE__, &uio, zfs_uio_offset(&uio),
zfs_uio_resid(&uio)); zfs_uio_resid(&uio));
#endif #endif
}
#endif
/*
* On error we will return unless the error is EAGAIN, which
* just tells us to fallback to the buffered path.
*/
if (error != EAGAIN)
return (error);
else
ioflag &= ~O_DIRECT;
} }
#endif
error = zfs_read(zp, &uio, ioflag, ap->a_cred);
return (error); return (error);
} }
static int
zfs_freebsd_write_direct(znode_t *zp, zfs_uio_t *uio, zfs_uio_rw_t rw,
int ioflag, cred_t *cr)
{
int ret;
int flags = ioflag;
ASSERT3U(rw, ==, UIO_WRITE);
/* On error, return to fallback to the buffred path */
ret = zfs_setup_direct(zp, uio, rw, &flags);
if (ret)
return (ret);
ASSERT(uio->uio_extflg & UIO_DIRECT);
ret = zfs_write(zp, uio, flags, cr);
zfs_uio_free_dio_pages(uio, rw);
return (ret);
}
#ifndef _SYS_SYSPROTO_H_ #ifndef _SYS_SYSPROTO_H_
struct vop_write_args { struct vop_write_args {
struct vnode *a_vp; struct vnode *a_vp;
@ -4403,36 +4332,9 @@ static int
zfs_freebsd_write(struct vop_write_args *ap) zfs_freebsd_write(struct vop_write_args *ap)
{ {
zfs_uio_t uio; zfs_uio_t uio;
int error = 0;
znode_t *zp = VTOZ(ap->a_vp);
int ioflag = ioflags(ap->a_ioflag);
boolean_t is_direct;
zfs_uio_init(&uio, ap->a_uio); zfs_uio_init(&uio, ap->a_uio);
return (zfs_write(VTOZ(ap->a_vp), &uio, ioflags(ap->a_ioflag),
error = zfs_check_direct_enabled(zp, ioflag, &is_direct); ap->a_cred));
if (error) {
return (error);
} else if (is_direct) {
error =
zfs_freebsd_write_direct(zp, &uio, UIO_WRITE, ioflag,
ap->a_cred);
/*
* On error we will return unless the error is EAGAIN, which
* just tells us to fallback to the buffered path.
*/
if (error != EAGAIN)
return (error);
else
ioflag &= ~O_DIRECT;
}
error = zfs_write(zp, &uio, ioflag, ap->a_cred);
return (error);
} }
/* /*

View File

@ -309,7 +309,7 @@ zpl_uio_init(zfs_uio_t *uio, struct kiocb *kiocb, struct iov_iter *to,
} }
static ssize_t static ssize_t
zpl_iter_read_buffered(struct kiocb *kiocb, struct iov_iter *to) zpl_iter_read(struct kiocb *kiocb, struct iov_iter *to)
{ {
cred_t *cr = CRED(); cred_t *cr = CRED();
fstrans_cookie_t cookie; fstrans_cookie_t cookie;
@ -322,15 +322,14 @@ zpl_iter_read_buffered(struct kiocb *kiocb, struct iov_iter *to)
crhold(cr); crhold(cr);
cookie = spl_fstrans_mark(); cookie = spl_fstrans_mark();
int flags = (filp->f_flags | zfs_io_flags(kiocb)) & ~O_DIRECT; ssize_t ret = -zfs_read(ITOZ(filp->f_mapping->host), &uio,
int error = -zfs_read(ITOZ(filp->f_mapping->host), &uio, filp->f_flags | zfs_io_flags(kiocb), cr);
flags, cr);
spl_fstrans_unmark(cookie); spl_fstrans_unmark(cookie);
crfree(cr); crfree(cr);
if (error < 0) if (ret < 0)
return (error); return (ret);
ssize_t read = count - uio.uio_resid; ssize_t read = count - uio.uio_resid;
kiocb->ki_pos += read; kiocb->ki_pos += read;
@ -340,71 +339,6 @@ zpl_iter_read_buffered(struct kiocb *kiocb, struct iov_iter *to)
return (read); return (read);
} }
static ssize_t
zpl_iter_read_direct(struct kiocb *kiocb, struct iov_iter *to)
{
cred_t *cr = CRED();
struct file *filp = kiocb->ki_filp;
struct inode *ip = filp->f_mapping->host;
ssize_t count = iov_iter_count(to);
int flags = filp->f_flags | zfs_io_flags(kiocb);
zfs_uio_t uio;
ssize_t ret;
zpl_uio_init(&uio, kiocb, to, kiocb->ki_pos, count, 0);
/* On error, return to fallback to the buffered path. */
ret = zfs_setup_direct(ITOZ(ip), &uio, UIO_READ, &flags);
if (ret)
return (-ret);
ASSERT(uio.uio_extflg & UIO_DIRECT);
crhold(cr);
fstrans_cookie_t cookie = spl_fstrans_mark();
int error = -zfs_read(ITOZ(ip), &uio, flags, cr);
spl_fstrans_unmark(cookie);
crfree(cr);
zfs_uio_free_dio_pages(&uio, UIO_READ);
if (error < 0)
return (error);
ssize_t read = count - uio.uio_resid;
kiocb->ki_pos += read;
zpl_file_accessed(filp);
return (read);
}
static ssize_t
zpl_iter_read(struct kiocb *kiocb, struct iov_iter *to)
{
struct inode *ip = kiocb->ki_filp->f_mapping->host;
struct file *filp = kiocb->ki_filp;
int flags = filp->f_flags | zfs_io_flags(kiocb);
boolean_t is_direct;
int error = zfs_check_direct_enabled(ITOZ(ip), flags, &is_direct);
if (error) {
return (-error);
} else if (is_direct) {
ssize_t read = zpl_iter_read_direct(kiocb, to);
if (read >= 0 || read != -EAGAIN)
return (read);
/* Otherwise fallback to buffered read */
}
return (zpl_iter_read_buffered(kiocb, to));
}
static inline ssize_t static inline ssize_t
zpl_generic_write_checks(struct kiocb *kiocb, struct iov_iter *from, zpl_generic_write_checks(struct kiocb *kiocb, struct iov_iter *from,
size_t *countp) size_t *countp)
@ -430,249 +364,57 @@ zpl_generic_write_checks(struct kiocb *kiocb, struct iov_iter *from,
return (0); return (0);
} }
static ssize_t
zpl_iter_write_buffered(struct kiocb *kiocb, struct iov_iter *from)
{
cred_t *cr = CRED();
struct file *filp = kiocb->ki_filp;
struct inode *ip = filp->f_mapping->host;
size_t wrote;
size_t count = iov_iter_count(from);
zfs_uio_t uio;
zpl_uio_init(&uio, kiocb, from, kiocb->ki_pos, count, from->iov_offset);
crhold(cr);
fstrans_cookie_t cookie = spl_fstrans_mark();
int flags = (filp->f_flags | zfs_io_flags(kiocb)) & ~O_DIRECT;
int error = -zfs_write(ITOZ(ip), &uio, flags, cr);
spl_fstrans_unmark(cookie);
crfree(cr);
if (error < 0)
return (error);
wrote = count - uio.uio_resid;
kiocb->ki_pos += wrote;
if (wrote > 0)
iov_iter_advance(from, wrote);
return (wrote);
}
static ssize_t
zpl_iter_write_direct(struct kiocb *kiocb, struct iov_iter *from)
{
cred_t *cr = CRED();
struct file *filp = kiocb->ki_filp;
struct inode *ip = filp->f_mapping->host;
size_t wrote;
int flags = filp->f_flags | zfs_io_flags(kiocb);
size_t count = iov_iter_count(from);
zfs_uio_t uio;
zpl_uio_init(&uio, kiocb, from, kiocb->ki_pos, count, from->iov_offset);
/* On error, return to fallback to the buffered path. */
ssize_t ret = zfs_setup_direct(ITOZ(ip), &uio, UIO_WRITE, &flags);
if (ret)
return (-ret);
ASSERT(uio.uio_extflg & UIO_DIRECT);
crhold(cr);
fstrans_cookie_t cookie = spl_fstrans_mark();
int error = -zfs_write(ITOZ(ip), &uio, flags, cr);
spl_fstrans_unmark(cookie);
crfree(cr);
zfs_uio_free_dio_pages(&uio, UIO_WRITE);
if (error < 0)
return (error);
wrote = count - uio.uio_resid;
kiocb->ki_pos += wrote;
return (wrote);
}
static ssize_t static ssize_t
zpl_iter_write(struct kiocb *kiocb, struct iov_iter *from) zpl_iter_write(struct kiocb *kiocb, struct iov_iter *from)
{ {
struct inode *ip = kiocb->ki_filp->f_mapping->host; cred_t *cr = CRED();
fstrans_cookie_t cookie;
struct file *filp = kiocb->ki_filp; struct file *filp = kiocb->ki_filp;
int flags = filp->f_flags | zfs_io_flags(kiocb); struct inode *ip = filp->f_mapping->host;
zfs_uio_t uio;
size_t count = 0; size_t count = 0;
boolean_t is_direct; ssize_t ret;
ssize_t ret = zpl_generic_write_checks(kiocb, from, &count); ret = zpl_generic_write_checks(kiocb, from, &count);
if (ret) if (ret)
return (ret); return (ret);
loff_t offset = kiocb->ki_pos; zpl_uio_init(&uio, kiocb, from, kiocb->ki_pos, count, from->iov_offset);
ret = zfs_check_direct_enabled(ITOZ(ip), flags, &is_direct); crhold(cr);
cookie = spl_fstrans_mark();
if (ret) { ret = -zfs_write(ITOZ(ip), &uio,
return (-ret); filp->f_flags | zfs_io_flags(kiocb), cr);
} else if (is_direct) {
ssize_t wrote = zpl_iter_write_direct(kiocb, from);
if (wrote >= 0 || wrote != -EAGAIN) { spl_fstrans_unmark(cookie);
return (wrote); crfree(cr);
}
/* if (ret < 0)
* If we are falling back to a buffered write, then the return (ret);
* file position should not be updated at this point.
*/
ASSERT3U(offset, ==, kiocb->ki_pos);
}
return (zpl_iter_write_buffered(kiocb, from)); ssize_t wrote = count - uio.uio_resid;
kiocb->ki_pos += wrote;
return (wrote);
} }
#else /* !HAVE_VFS_RW_ITERATE */ #else /* !HAVE_VFS_RW_ITERATE */
static ssize_t
zpl_aio_read_buffered(struct kiocb *kiocb, const struct iovec *iov,
unsigned long nr_segs, loff_t pos)
{
cred_t *cr = CRED();
fstrans_cookie_t cookie;
struct file *filp = kiocb->ki_filp;
size_t count;
ssize_t ret;
ret = generic_segment_checks(iov, &nr_segs, &count, VERIFY_WRITE);
if (ret)
return (ret);
zfs_uio_t uio;
zfs_uio_iovec_init(&uio, iov, nr_segs, kiocb->ki_pos, UIO_USERSPACE,
count, 0);
crhold(cr);
cookie = spl_fstrans_mark();
int flags = (filp->f_flags | zfs_io_flags(kiocb)) & ~O_DIRECT;
int error = -zfs_read(ITOZ(filp->f_mapping->host), &uio,
flags, cr);
spl_fstrans_unmark(cookie);
crfree(cr);
if (error < 0)
return (error);
ssize_t read = count - uio.uio_resid;
kiocb->ki_pos += read;
zpl_file_accessed(filp);
return (read);
}
static ssize_t
zpl_aio_read_direct(struct kiocb *kiocb, const struct iovec *iov,
unsigned long nr_segs, loff_t pos)
{
cred_t *cr = CRED();
fstrans_cookie_t cookie;
struct file *filp = kiocb->ki_filp;
struct inode *ip = filp->f_mapping->host;
int flags = filp->f_flags | zfs_io_flags(kiocb);
size_t count;
ssize_t ret;
ret = generic_segment_checks(iov, &nr_segs, &count, VERIFY_WRITE);
if (ret)
return (ret);
zfs_uio_t uio;
zfs_uio_iovec_init(&uio, iov, nr_segs, kiocb->ki_pos, UIO_USERSPACE,
count, 0);
/* On error, return to fallback to the buffered path */
ret = zfs_setup_direct(ITOZ(ip), &uio, UIO_READ, &flags);
if (ret)
return (-ret);
ASSERT(uio.uio_extflg & UIO_DIRECT);
crhold(cr);
cookie = spl_fstrans_mark();
int error = -zfs_read(ITOZ(ip), &uio, flags, cr);
spl_fstrans_unmark(cookie);
crfree(cr);
zfs_uio_free_dio_pages(&uio, UIO_READ);
if (error < 0)
return (error);
ssize_t read = count - uio.uio_resid;
kiocb->ki_pos += read;
zpl_file_accessed(filp);
return (read);
}
static ssize_t static ssize_t
zpl_aio_read(struct kiocb *kiocb, const struct iovec *iov, zpl_aio_read(struct kiocb *kiocb, const struct iovec *iov,
unsigned long nr_segs, loff_t pos) unsigned long nr_segs, loff_t pos)
{ {
struct inode *ip = kiocb->ki_filp->f_mapping->host; cred_t *cr = CRED();
fstrans_cookie_t cookie;
struct file *filp = kiocb->ki_filp; struct file *filp = kiocb->ki_filp;
int flags = filp->f_flags | zfs_io_flags(kiocb);
size_t count; size_t count;
ssize_t ret; ssize_t ret;
boolean_t is_direct;
ret = generic_segment_checks(iov, &nr_segs, &count, VERIFY_WRITE); ret = generic_segment_checks(iov, &nr_segs, &count, VERIFY_WRITE);
if (ret) if (ret)
return (ret); return (ret);
ret = zfs_check_direct_enabled(ITOZ(ip), flags, &is_direct);
if (ret) {
return (-ret);
} else if (is_direct) {
ssize_t read = zpl_aio_read_direct(kiocb, iov, nr_segs, pos);
if (read >= 0 || read != -EAGAIN)
return (read);
/* Otherwise fallback to buffered read */
}
return (zpl_aio_read_buffered(kiocb, iov, nr_segs, pos));
}
static ssize_t
zpl_aio_write_buffered(struct kiocb *kiocb, const struct iovec *iov,
unsigned long nr_segs, loff_t pos)
{
cred_t *cr = CRED();
fstrans_cookie_t cookie;
struct file *filp = kiocb->ki_filp;
struct inode *ip = filp->f_mapping->host;
size_t count;
ssize_t ret;
ret = generic_segment_checks(iov, &nr_segs, &count, VERIFY_READ);
if (ret)
return (ret);
zfs_uio_t uio; zfs_uio_t uio;
zfs_uio_iovec_init(&uio, iov, nr_segs, kiocb->ki_pos, UIO_USERSPACE, zfs_uio_iovec_init(&uio, iov, nr_segs, kiocb->ki_pos, UIO_USERSPACE,
count, 0); count, 0);
@ -680,110 +422,64 @@ zpl_aio_write_buffered(struct kiocb *kiocb, const struct iovec *iov,
crhold(cr); crhold(cr);
cookie = spl_fstrans_mark(); cookie = spl_fstrans_mark();
int flags = (filp->f_flags | zfs_io_flags(kiocb)) & ~O_DIRECT; ret = -zfs_read(ITOZ(filp->f_mapping->host), &uio,
int error = -zfs_write(ITOZ(ip), &uio, flags, cr); filp->f_flags | zfs_io_flags(kiocb), cr);
spl_fstrans_unmark(cookie); spl_fstrans_unmark(cookie);
crfree(cr); crfree(cr);
if (error < 0) if (ret < 0)
return (error);
ssize_t wrote = count - uio.uio_resid;
kiocb->ki_pos += wrote;
return (wrote);
}
static ssize_t
zpl_aio_write_direct(struct kiocb *kiocb, const struct iovec *iov,
unsigned long nr_segs, loff_t pos)
{
cred_t *cr = CRED();
fstrans_cookie_t cookie;
struct file *filp = kiocb->ki_filp;
struct inode *ip = filp->f_mapping->host;
int flags = filp->f_flags | zfs_io_flags(kiocb);
size_t count;
ssize_t ret;
ret = generic_segment_checks(iov, &nr_segs, &count, VERIFY_READ);
if (ret)
return (ret); return (ret);
zfs_uio_t uio; ssize_t read = count - uio.uio_resid;
zfs_uio_iovec_init(&uio, iov, nr_segs, kiocb->ki_pos, UIO_USERSPACE, kiocb->ki_pos += read;
count, 0);
/* On error, return to fallback to the buffered path. */ zpl_file_accessed(filp);
ret = zfs_setup_direct(ITOZ(ip), &uio, UIO_WRITE, &flags);
if (ret)
return (-ret);
ASSERT(uio.uio_extflg & UIO_DIRECT); return (read);
crhold(cr);
cookie = spl_fstrans_mark();
int error = -zfs_write(ITOZ(ip), &uio, flags, cr);
spl_fstrans_unmark(cookie);
crfree(cr);
zfs_uio_free_dio_pages(&uio, UIO_WRITE);
if (error < 0)
return (error);
ssize_t wrote = count - uio.uio_resid;
kiocb->ki_pos += wrote;
return (wrote);
} }
static ssize_t static ssize_t
zpl_aio_write(struct kiocb *kiocb, const struct iovec *iov, zpl_aio_write(struct kiocb *kiocb, const struct iovec *iov,
unsigned long nr_segs, loff_t pos) unsigned long nr_segs, loff_t pos)
{ {
cred_t *cr = CRED();
fstrans_cookie_t cookie;
struct file *filp = kiocb->ki_filp; struct file *filp = kiocb->ki_filp;
struct inode *ip = filp->f_mapping->host; struct inode *ip = filp->f_mapping->host;
int flags = filp->f_flags | zfs_io_flags(kiocb);
size_t ocount;
size_t count; size_t count;
ssize_t ret; ssize_t ret;
boolean_t is_direct;
ret = generic_segment_checks(iov, &nr_segs, &ocount, VERIFY_READ); ret = generic_segment_checks(iov, &nr_segs, &count, VERIFY_READ);
if (ret) if (ret)
return (ret); return (ret);
count = ocount; ret = generic_write_checks(filp, &pos, &count, S_ISBLK(ip->i_mode));
ret = generic_write_checks(filp, &pos, &count, S_ISBLK(ip->i_mode));
if (ret) if (ret)
return (ret); return (ret);
kiocb->ki_pos = pos; kiocb->ki_pos = pos;
ret = zfs_check_direct_enabled(ITOZ(ip), flags, &is_direct); zfs_uio_t uio;
zfs_uio_iovec_init(&uio, iov, nr_segs, kiocb->ki_pos, UIO_USERSPACE,
count, 0);
if (ret) { crhold(cr);
return (-ret); cookie = spl_fstrans_mark();
} else if (is_direct) {
ssize_t wrote = zpl_aio_write_direct(kiocb, iov, nr_segs, pos);
if (wrote >= 0 || wrote != -EAGAIN) { ret = -zfs_write(ITOZ(ip), &uio,
return (wrote); filp->f_flags | zfs_io_flags(kiocb), cr);
}
/* spl_fstrans_unmark(cookie);
* If we are falling back to a buffered write, then the crfree(cr);
* file position should not be updated at this point.
*/
ASSERT3U(pos, ==, kiocb->ki_pos);
}
return (zpl_aio_write_buffered(kiocb, iov, nr_segs, pos)); if (ret < 0)
return (ret);
ssize_t wrote = count - uio.uio_resid;
kiocb->ki_pos += wrote;
return (wrote);
} }
#endif /* HAVE_VFS_RW_ITERATE */ #endif /* HAVE_VFS_RW_ITERATE */

View File

@ -1191,7 +1191,7 @@ dmu_read_impl(dnode_t *dn, uint64_t offset, uint64_t size,
/* Allow Direct I/O when requested and properly aligned */ /* Allow Direct I/O when requested and properly aligned */
if ((flags & DMU_DIRECTIO) && zfs_dio_page_aligned(buf) && if ((flags & DMU_DIRECTIO) && zfs_dio_page_aligned(buf) &&
zfs_dio_aligned(offset, size, SPA_MINBLOCKSIZE)) { zfs_dio_aligned(offset, size, PAGESIZE)) {
abd_t *data = abd_get_from_buf(buf, size); abd_t *data = abd_get_from_buf(buf, size);
err = dmu_read_abd(dn, offset, size, data, flags); err = dmu_read_abd(dn, offset, size, data, flags);
abd_free(data); abd_free(data);

View File

@ -104,7 +104,7 @@ dmu_write_direct_done(zio_t *zio)
if (zio->io_error != 0) { if (zio->io_error != 0) {
if (zio->io_flags & ZIO_FLAG_DIO_CHKSUM_ERR) if (zio->io_flags & ZIO_FLAG_DIO_CHKSUM_ERR)
ASSERT3U(zio->io_error, ==, EAGAIN); ASSERT3U(zio->io_error, ==, EIO);
/* /*
* In the event of an I/O error this block has been freed in * In the event of an I/O error this block has been freed in

View File

@ -159,14 +159,14 @@ uint_t zfs_vdev_max_auto_ashift = 14;
uint_t zfs_vdev_min_auto_ashift = ASHIFT_MIN; uint_t zfs_vdev_min_auto_ashift = ASHIFT_MIN;
/* /*
* VDEV checksum verification percentage for Direct I/O writes. This is * VDEV checksum verification for Direct I/O writes. This is necessary for
* neccessary for Linux, because user pages can not be placed under write * Linux, because anonymous pages can not be placed under write protection
* protection during Direct I/O writes. * during Direct I/O writes.
*/ */
#if !defined(__FreeBSD__) #if !defined(__FreeBSD__)
uint_t zfs_vdev_direct_write_verify_pct = 2; uint_t zfs_vdev_direct_write_verify = 1;
#else #else
uint_t zfs_vdev_direct_write_verify_pct = 0; uint_t zfs_vdev_direct_write_verify = 0;
#endif #endif
void void
@ -6527,9 +6527,9 @@ ZFS_MODULE_PARAM(zfs, zfs_, dio_write_verify_events_per_second, UINT, ZMOD_RW,
"Rate Direct I/O write verify events to this many per second"); "Rate Direct I/O write verify events to this many per second");
/* BEGIN CSTYLED */ /* BEGIN CSTYLED */
ZFS_MODULE_PARAM(zfs_vdev, zfs_vdev_, direct_write_verify_pct, UINT, ZMOD_RW, ZFS_MODULE_PARAM(zfs_vdev, zfs_vdev_, direct_write_verify, UINT, ZMOD_RW,
"Percentage of Direct I/O writes per top-level VDEV for checksum " "Direct I/O writes will perform for checksum verification before "
"verification to be performed"); "commiting write");
ZFS_MODULE_PARAM(zfs, zfs_, checksum_events_per_second, UINT, ZMOD_RW, ZFS_MODULE_PARAM(zfs, zfs_, checksum_events_per_second, UINT, ZMOD_RW,
"Rate limit checksum events to this many checksum errors per second " "Rate limit checksum events to this many checksum errors per second "

View File

@ -202,28 +202,6 @@ zfs_access(znode_t *zp, int mode, int flag, cred_t *cr)
return (error); return (error);
} }
int
zfs_check_direct_enabled(znode_t *zp, int ioflags, boolean_t *is_direct)
{;
zfsvfs_t *zfsvfs = ZTOZSB(zp);
*is_direct = B_FALSE;
int error;
if ((error = zfs_enter(zfsvfs, FTAG)) != 0)
return (error);
if (ioflags & O_DIRECT &&
zfsvfs->z_os->os_direct != ZFS_DIRECT_DISABLED) {
*is_direct = B_TRUE;
} else if (zfsvfs->z_os->os_direct == ZFS_DIRECT_ALWAYS) {
*is_direct = B_TRUE;
}
zfs_exit(zfsvfs, FTAG);
return (0);
}
/* /*
* Determine if Direct I/O has been requested (either via the O_DIRECT flag or * Determine if Direct I/O has been requested (either via the O_DIRECT flag or
* the "direct" dataset property). When inherited by the property only apply * the "direct" dataset property). When inherited by the property only apply
@ -236,12 +214,11 @@ zfs_check_direct_enabled(znode_t *zp, int ioflags, boolean_t *is_direct)
* synchronized with the ARC. * synchronized with the ARC.
* *
* It is possible that a file's pages could be mmap'ed after it is checked * It is possible that a file's pages could be mmap'ed after it is checked
* here. If so, that is handled according in zfs_read() and zfs_write(). See * here. If so, that is handled accordingly in zfs_write(). See comments in the
* comments in the following two areas for how this handled: * following area for how this is handled:
* zfs_read() -> mappedread()
* zfs_write() -> update_pages() * zfs_write() -> update_pages()
*/ */
int static int
zfs_setup_direct(struct znode *zp, zfs_uio_t *uio, zfs_uio_rw_t rw, zfs_setup_direct(struct znode *zp, zfs_uio_t *uio, zfs_uio_rw_t rw,
int *ioflagp) int *ioflagp)
{ {
@ -250,49 +227,49 @@ zfs_setup_direct(struct znode *zp, zfs_uio_t *uio, zfs_uio_rw_t rw,
int ioflag = *ioflagp; int ioflag = *ioflagp;
int error = 0; int error = 0;
if ((error = zfs_enter_verify_zp(zfsvfs, zp, FTAG)) != 0) if (os->os_direct == ZFS_DIRECT_DISABLED ||
return (error); zn_has_cached_data(zp, zfs_uio_offset(uio),
zfs_uio_offset(uio) + zfs_uio_resid(uio) - 1)) {
if (os->os_direct == ZFS_DIRECT_DISABLED) { /*
error = EAGAIN; * Direct I/O is disabled or the region is mmap'ed. In either
* case the I/O request will just directed through the ARC.
*/
ioflag &= ~O_DIRECT;
goto out; goto out;
} else if (os->os_direct == ZFS_DIRECT_ALWAYS && } else if (os->os_direct == ZFS_DIRECT_ALWAYS &&
zfs_uio_page_aligned(uio) && zfs_uio_page_aligned(uio) &&
zfs_uio_aligned(uio, SPA_MINBLOCKSIZE)) { zfs_uio_aligned(uio, PAGE_SIZE)) {
if ((rw == UIO_WRITE && zfs_uio_resid(uio) >= zp->z_blksz) || if ((rw == UIO_WRITE && zfs_uio_resid(uio) >= zp->z_blksz) ||
(rw == UIO_READ)) { (rw == UIO_READ)) {
ioflag |= O_DIRECT; ioflag |= O_DIRECT;
} }
} else if (os->os_direct == ZFS_DIRECT_ALWAYS && (ioflag & O_DIRECT)) {
/*
* Direct I/O was requested through the direct=always, but it
* is not properly PAGE_SIZE aligned. The request will be
* directed through the ARC.
*/
ioflag &= ~O_DIRECT;
} }
if (ioflag & O_DIRECT) { if (ioflag & O_DIRECT) {
if (!zfs_uio_page_aligned(uio) || if (!zfs_uio_page_aligned(uio) ||
!zfs_uio_aligned(uio, SPA_MINBLOCKSIZE)) { !zfs_uio_aligned(uio, PAGE_SIZE)) {
error = SET_ERROR(EINVAL); error = SET_ERROR(EINVAL);
goto out; goto out;
} }
if (zn_has_cached_data(zp, zfs_uio_offset(uio), error = zfs_uio_get_dio_pages_alloc(uio, rw);
zfs_uio_offset(uio) + zfs_uio_resid(uio) - 1)) { if (error) {
error = SET_ERROR(EAGAIN);
goto out; goto out;
} }
error = zfs_uio_get_dio_pages_alloc(uio, rw);
if (error)
goto out;
} else {
error = EAGAIN;
goto out;
} }
IMPLY(ioflag & O_DIRECT, uio->uio_extflg & UIO_DIRECT); IMPLY(ioflag & O_DIRECT, uio->uio_extflg & UIO_DIRECT);
ASSERT0(error); ASSERT0(error);
*ioflagp = ioflag;
out: out:
zfs_exit(zfsvfs, FTAG); *ioflagp = ioflag;
return (error); return (error);
} }
@ -380,8 +357,16 @@ zfs_read(struct znode *zp, zfs_uio_t *uio, int ioflag, cred_t *cr)
error = 0; error = 0;
goto out; goto out;
} }
ASSERT(zfs_uio_offset(uio) < zp->z_size); ASSERT(zfs_uio_offset(uio) < zp->z_size);
/*
* Setting up Direct I/O if requested.
*/
error = zfs_setup_direct(zp, uio, UIO_READ, &ioflag);
if (error) {
goto out;
}
#if defined(__linux__) #if defined(__linux__)
ssize_t start_offset = zfs_uio_offset(uio); ssize_t start_offset = zfs_uio_offset(uio);
#endif #endif
@ -424,22 +409,7 @@ zfs_read(struct znode *zp, zfs_uio_t *uio, int ioflag, cred_t *cr)
#endif #endif
if (zn_has_cached_data(zp, zfs_uio_offset(uio), if (zn_has_cached_data(zp, zfs_uio_offset(uio),
zfs_uio_offset(uio) + nbytes - 1)) { zfs_uio_offset(uio) + nbytes - 1)) {
/*
* It is possible that a files pages have been mmap'ed
* since our check for Direct I/O reads and the read
* being issued. In this case, we will use the ARC to
* keep it synchronized with the page cache. In order
* to do this we temporarily remove the UIO_DIRECT
* flag.
*/
boolean_t uio_direct_mmap = B_FALSE;
if (uio->uio_extflg & UIO_DIRECT) {
uio->uio_extflg &= ~UIO_DIRECT;
uio_direct_mmap = B_TRUE;
}
error = mappedread(zp, nbytes, uio); error = mappedread(zp, nbytes, uio);
if (uio_direct_mmap)
uio->uio_extflg |= UIO_DIRECT;
} else { } else {
error = dmu_read_uio_dbuf(sa_get_db(zp->z_sa_hdl), error = dmu_read_uio_dbuf(sa_get_db(zp->z_sa_hdl),
uio, nbytes); uio, nbytes);
@ -494,6 +464,12 @@ zfs_read(struct znode *zp, zfs_uio_t *uio, int ioflag, cred_t *cr)
out: out:
zfs_rangelock_exit(lr); zfs_rangelock_exit(lr);
/*
* Cleanup for Direct I/O if requested.
*/
if (uio->uio_extflg & UIO_DIRECT)
zfs_uio_free_dio_pages(uio, UIO_READ);
ZFS_ACCESSTIME_STAMP(zfsvfs, zp); ZFS_ACCESSTIME_STAMP(zfsvfs, zp);
zfs_exit(zfsvfs, FTAG); zfs_exit(zfsvfs, FTAG);
return (error); return (error);
@ -631,6 +607,15 @@ zfs_write(znode_t *zp, zfs_uio_t *uio, int ioflag, cred_t *cr)
return (SET_ERROR(EINVAL)); return (SET_ERROR(EINVAL));
} }
/*
* Setting up Direct I/O if requested.
*/
error = zfs_setup_direct(zp, uio, UIO_WRITE, &ioflag);
if (error) {
zfs_exit(zfsvfs, FTAG);
return (SET_ERROR(error));
}
/* /*
* Pre-fault the pages to ensure slow (eg NFS) pages * Pre-fault the pages to ensure slow (eg NFS) pages
* don't hold up txg. * don't hold up txg.
@ -641,6 +626,7 @@ zfs_write(znode_t *zp, zfs_uio_t *uio, int ioflag, cred_t *cr)
return (SET_ERROR(EFAULT)); return (SET_ERROR(EFAULT));
} }
/* /*
* If in append mode, set the io offset pointer to eof. * If in append mode, set the io offset pointer to eof.
*/ */
@ -676,6 +662,7 @@ zfs_write(znode_t *zp, zfs_uio_t *uio, int ioflag, cred_t *cr)
lr = zfs_rangelock_enter(&zp->z_rangelock, woff, n, RL_WRITER); lr = zfs_rangelock_enter(&zp->z_rangelock, woff, n, RL_WRITER);
} }
if (zn_rlimit_fsize_uio(zp, uio)) { if (zn_rlimit_fsize_uio(zp, uio)) {
zfs_rangelock_exit(lr); zfs_rangelock_exit(lr);
zfs_exit(zfsvfs, FTAG); zfs_exit(zfsvfs, FTAG);
@ -896,15 +883,27 @@ zfs_write(znode_t *zp, zfs_uio_t *uio, int ioflag, cred_t *cr)
zfs_uioskip(uio, nbytes); zfs_uioskip(uio, nbytes);
tx_bytes = nbytes; tx_bytes = nbytes;
} }
/* /*
* There is a a window where a file's pages can be mmap'ed after * There is a window where a file's pages can be mmap'ed after
* the Direct I/O write has started. In this case we will still * zfs_setup_direct() is called. This is due to the fact that
* call update_pages() to make sure there is consistency * the rangelock in this function is acquired after calling
* between the ARC and the page cache. This is unfortunate * zfs_setup_direct(). This is done so that
* zfs_uio_prefaultpages() does not attempt to fault in pages
* on Linux for Direct I/O requests. This is not necessary as
* the pages are pinned in memory and can not be faulted out.
* Ideally, the rangelock would be held before calling
* zfs_setup_direct() and zfs_uio_prefaultpages(); however,
* this can lead to a deadlock as zfs_getpage() also acquires
* the rangelock as a RL_WRITER and prefaulting the pages can
* lead to zfs_getpage() being called.
*
* In the case of the pages being mapped after
* zfs_setup_direct() is called, the call to update_pages()
* will still be made to make sure there is consistency between
* the ARC and the Linux page cache. This is an unfortunate
* situation as the data will be read back into the ARC after * situation as the data will be read back into the ARC after
* the Direct I/O write has completed, but this is the pentalty * the Direct I/O write has completed, but this is the penalty
* for writing to a mmap'ed region of the file using O_DIRECT. * for writing to a mmap'ed region of a file using Direct I/O.
*/ */
if (tx_bytes && if (tx_bytes &&
zn_has_cached_data(zp, woff, woff + tx_bytes - 1)) { zn_has_cached_data(zp, woff, woff + tx_bytes - 1)) {
@ -987,6 +986,12 @@ zfs_write(znode_t *zp, zfs_uio_t *uio, int ioflag, cred_t *cr)
zfs_znode_update_vfs(zp); zfs_znode_update_vfs(zp);
zfs_rangelock_exit(lr); zfs_rangelock_exit(lr);
/*
* Cleanup for Direct I/O if requested.
*/
if (uio->uio_extflg & UIO_DIRECT)
zfs_uio_free_dio_pages(uio, UIO_WRITE);
/* /*
* If we're in replay mode, or we made no progress, or the * If we're in replay mode, or we made no progress, or the
* uio data is inaccessible return an error. Otherwise, it's * uio data is inaccessible return an error. Otherwise, it's

View File

@ -804,7 +804,7 @@ zio_notify_parent(zio_t *pio, zio_t *zio, enum zio_wait_type wait,
ASSERT3U(*countp, >, 0); ASSERT3U(*countp, >, 0);
if (zio->io_flags & ZIO_FLAG_DIO_CHKSUM_ERR) { if (zio->io_flags & ZIO_FLAG_DIO_CHKSUM_ERR) {
ASSERT3U(*errorp, ==, EAGAIN); ASSERT3U(*errorp, ==, EIO);
ASSERT3U(pio->io_child_type, ==, ZIO_CHILD_LOGICAL); ASSERT3U(pio->io_child_type, ==, ZIO_CHILD_LOGICAL);
pio->io_flags |= ZIO_FLAG_DIO_CHKSUM_ERR; pio->io_flags |= ZIO_FLAG_DIO_CHKSUM_ERR;
} }
@ -4521,13 +4521,12 @@ zio_vdev_io_assess(zio_t *zio)
/* /*
* If a Direct I/O write checksum verify error has occurred then this * If a Direct I/O write checksum verify error has occurred then this
* I/O should not attempt to be issued again. Instead the EAGAIN will * I/O should not attempt to be issued again. Instead the EIO will
* be returned and this write will attempt to be issued through the * be returned.
* ARC instead.
*/ */
if (zio->io_flags & ZIO_FLAG_DIO_CHKSUM_ERR) { if (zio->io_flags & ZIO_FLAG_DIO_CHKSUM_ERR) {
ASSERT3U(zio->io_child_type, ==, ZIO_CHILD_LOGICAL); ASSERT3U(zio->io_child_type, ==, ZIO_CHILD_LOGICAL);
ASSERT3U(zio->io_error, ==, EAGAIN); ASSERT3U(zio->io_error, ==, EIO);
zio->io_pipeline = ZIO_INTERLOCK_PIPELINE; zio->io_pipeline = ZIO_INTERLOCK_PIPELINE;
return (zio); return (zio);
} }
@ -4850,6 +4849,7 @@ static zio_t *
zio_dio_checksum_verify(zio_t *zio) zio_dio_checksum_verify(zio_t *zio)
{ {
zio_t *pio = zio_unique_parent(zio); zio_t *pio = zio_unique_parent(zio);
int error;
ASSERT3P(zio->io_vd, !=, NULL); ASSERT3P(zio->io_vd, !=, NULL);
ASSERT3P(zio->io_bp, !=, NULL); ASSERT3P(zio->io_bp, !=, NULL);
@ -4858,38 +4858,28 @@ zio_dio_checksum_verify(zio_t *zio)
ASSERT3B(pio->io_prop.zp_direct_write, ==, B_TRUE); ASSERT3B(pio->io_prop.zp_direct_write, ==, B_TRUE);
ASSERT3U(pio->io_child_type, ==, ZIO_CHILD_LOGICAL); ASSERT3U(pio->io_child_type, ==, ZIO_CHILD_LOGICAL);
if (zfs_vdev_direct_write_verify_pct == 0 || zio->io_error != 0) if (zfs_vdev_direct_write_verify == 0 || zio->io_error != 0)
goto out; goto out;
/* if ((error = zio_checksum_error(zio, NULL)) != 0) {
* A Direct I/O write checksum verification will only be zio->io_error = error;
* performed based on the top-level VDEV percentage for checks. if (error == ECKSUM) {
*/ mutex_enter(&zio->io_vd->vdev_stat_lock);
uint32_t rand = random_in_range(100); zio->io_vd->vdev_stat.vs_dio_verify_errors++;
int error; mutex_exit(&zio->io_vd->vdev_stat_lock);
zio->io_error = SET_ERROR(EIO);
zio->io_flags |= ZIO_FLAG_DIO_CHKSUM_ERR;
if (rand < zfs_vdev_direct_write_verify_pct) { /*
if ((error = zio_checksum_error(zio, NULL)) != 0) { * The EIO error must be propagated up to the logical
zio->io_error = error; * parent ZIO in zio_notify_parent() so it can be
if (error == ECKSUM) { * returned to dmu_write_abd().
mutex_enter(&zio->io_vd->vdev_stat_lock); */
zio->io_vd->vdev_stat.vs_dio_verify_errors++; zio->io_flags &= ~ZIO_FLAG_DONT_PROPAGATE;
mutex_exit(&zio->io_vd->vdev_stat_lock);
zio->io_error = SET_ERROR(EAGAIN);
zio->io_flags |= ZIO_FLAG_DIO_CHKSUM_ERR;
/* (void) zfs_ereport_post(FM_EREPORT_ZFS_DIO_VERIFY,
* The EAGAIN error must be propagated up to the zio->io_spa, zio->io_vd, &zio->io_bookmark,
* logical parent ZIO in zio_notify_parent() so zio, 0);
* it can be returned to dmu_write_abd().
*/
zio->io_flags &= ~ZIO_FLAG_DONT_PROPAGATE;
(void) zfs_ereport_post(
FM_EREPORT_ZFS_DIO_VERIFY,
zio->io_spa, zio->io_vd, &zio->io_bookmark,
zio, 0);
}
} }
} }
@ -5243,8 +5233,8 @@ zio_done(zio_t *zio)
} }
if ((zio->io_error == EIO || !(zio->io_flags & if ((zio->io_error == EIO || !(zio->io_flags &
(ZIO_FLAG_SPECULATIVE | ZIO_FLAG_DONT_PROPAGATE | (ZIO_FLAG_SPECULATIVE | ZIO_FLAG_DONT_PROPAGATE))) &&
ZIO_FLAG_DIO_CHKSUM_ERR))) && !(zio->io_flags & ZIO_FLAG_DIO_CHKSUM_ERR) &&
zio == zio->io_logical) { zio == zio->io_logical) {
/* /*
* For logical I/O requests, tell the SPA to log the * For logical I/O requests, tell the SPA to log the

View File

@ -41,6 +41,7 @@
static char *outputfile = NULL; static char *outputfile = NULL;
static int blocksize = 131072; /* 128K */ static int blocksize = 131072; /* 128K */
static int wr_err_expected = 0;
static int numblocks = 100; static int numblocks = 100;
static char *execname = NULL; static char *execname = NULL;
static int print_usage = 0; static int print_usage = 0;
@ -56,28 +57,33 @@ static void
usage(void) usage(void)
{ {
(void) fprintf(stderr, (void) fprintf(stderr,
"usage %s -o outputfile [-b blocksize] [-n numblocks]\n" "usage %s -o outputfile [-b blocksize] [-e wr_error_expected]\n"
" [-p randpattern] [-h help]\n" " [-n numblocks] [-p randpattern] [-h help]\n"
"\n" "\n"
"Testing whether checksum verify works correctly for O_DIRECT.\n" "Testing whether checksum verify works correctly for O_DIRECT.\n"
"when manipulating the contents of a userspace buffer.\n" "when manipulating the contents of a userspace buffer.\n"
"\n" "\n"
" outputfile: File to write to.\n" " outputfile: File to write to.\n"
" blocksize: Size of each block to write (must be at \n" " blocksize: Size of each block to write (must be at \n"
" least >= 512).\n" " least >= 512).\n"
" numblocks: Total number of blocksized blocks to write.\n" " wr_err_expected: Whether pwrite() is expected to return EIO\n"
" randpattern: Fill data buffer with random data. Default \n" " while manipulating the contents of the\n"
" behavior is to fill the buffer with the \n" " buffer.\n"
" known data pattern (0xdeadbeef).\n" " numblocks: Total number of blocksized blocks to\n"
" help: Print usage information and exit.\n" " write.\n"
" randpattern: Fill data buffer with random data. Default\n"
" behavior is to fill the buffer with the \n"
" known data pattern (0xdeadbeef).\n"
" help: Print usage information and exit.\n"
"\n" "\n"
" Required parameters:\n" " Required parameters:\n"
" outputfile\n" " outputfile\n"
"\n" "\n"
" Default Values:\n" " Default Values:\n"
" blocksize -> 131072\n" " blocksize -> 131072\n"
" numblocks -> 100\n" " wr_err_expexted -> false\n"
" randpattern -> false\n", " numblocks -> 100\n"
" randpattern -> false\n",
execname); execname);
(void) exit(1); (void) exit(1);
} }
@ -91,12 +97,16 @@ parse_options(int argc, char *argv[])
extern int optind, optopt; extern int optind, optopt;
execname = argv[0]; execname = argv[0];
while ((c = getopt(argc, argv, "b:hn:o:p")) != -1) { while ((c = getopt(argc, argv, "b:ehn:o:p")) != -1) {
switch (c) { switch (c) {
case 'b': case 'b':
blocksize = atoi(optarg); blocksize = atoi(optarg);
break; break;
case 'e':
wr_err_expected = 1;
break;
case 'h': case 'h':
print_usage = 1; print_usage = 1;
break; break;
@ -153,8 +163,10 @@ write_thread(void *arg)
while (!args->entire_file_written) { while (!args->entire_file_written) {
wrote = pwrite(ofd, buf, blocksize, offset); wrote = pwrite(ofd, buf, blocksize, offset);
if (wrote != blocksize) { if (wrote != blocksize) {
perror("write"); if (wr_err_expected)
exit(2); assert(errno == EIO);
else
exit(2);
} }
offset = ((offset + blocksize) % total_data); offset = ((offset + blocksize) % total_data);

View File

@ -212,7 +212,6 @@ read_entire_file(int ifd, int ofd, void *buf)
} }
} }
if (stride > 1) { if (stride > 1) {
if (lseek(ifd, (stride - 1) * bsize, SEEK_CUR) == -1) { if (lseek(ifd, (stride - 1) * bsize, SEEK_CUR) == -1) {
perror("input lseek"); perror("input lseek");

View File

@ -93,7 +93,7 @@ VDEV_FILE_LOGICAL_ASHIFT vdev.file.logical_ashift vdev_file_logical_ashift
VDEV_FILE_PHYSICAL_ASHIFT vdev.file.physical_ashift vdev_file_physical_ashift VDEV_FILE_PHYSICAL_ASHIFT vdev.file.physical_ashift vdev_file_physical_ashift
VDEV_MAX_AUTO_ASHIFT vdev.max_auto_ashift zfs_vdev_max_auto_ashift VDEV_MAX_AUTO_ASHIFT vdev.max_auto_ashift zfs_vdev_max_auto_ashift
VDEV_MIN_MS_COUNT vdev.min_ms_count zfs_vdev_min_ms_count VDEV_MIN_MS_COUNT vdev.min_ms_count zfs_vdev_min_ms_count
VDEV_DIRECT_WR_VERIFY_PCT vdev.direct_write_verify_pct zfs_vdev_direct_write_verify_pct VDEV_DIRECT_WR_VERIFY vdev.direct_write_verify zfs_vdev_direct_write_verify
VDEV_VALIDATE_SKIP vdev.validate_skip vdev_validate_skip VDEV_VALIDATE_SKIP vdev.validate_skip vdev_validate_skip
VOL_INHIBIT_DEV UNSUPPORTED zvol_inhibit_dev VOL_INHIBIT_DEV UNSUPPORTED zvol_inhibit_dev
VOL_MODE vol.mode zvol_volmode VOL_MODE vol.mode zvol_volmode

View File

@ -43,7 +43,6 @@ function cleanup
{ {
zfs set recordsize=$rs $TESTPOOL/$TESTFS zfs set recordsize=$rs $TESTPOOL/$TESTFS
log_must rm -f $tmp_file log_must rm -f $tmp_file
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_onexit cleanup log_onexit cleanup

View File

@ -44,7 +44,6 @@ function cleanup
{ {
zfs set direct=standard $TESTPOOL/$TESTFS zfs set direct=standard $TESTPOOL/$TESTFS
rm $tmp_file rm $tmp_file
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_assert "Verify direct=always mixed small async requests" log_assert "Verify direct=always mixed small async requests"

View File

@ -44,7 +44,6 @@ verify_runnable "global"
function cleanup function cleanup
{ {
log_must rm -f "$mntpnt/direct-*" log_must rm -f "$mntpnt/direct-*"
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
function check_fio_ioengine function check_fio_ioengine

View File

@ -46,7 +46,6 @@ function cleanup
{ {
log_must rm -f "$mntpnt/direct-*" log_must rm -f "$mntpnt/direct-*"
log_must zfs set compression=off $TESTPOOL/$TESTFS log_must zfs set compression=off $TESTPOOL/$TESTFS
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_assert "Verify compression works using Direct I/O." log_assert "Verify compression works using Direct I/O."

View File

@ -45,7 +45,6 @@ function cleanup
{ {
log_must rm -f "$mntpnt/direct-*" log_must rm -f "$mntpnt/direct-*"
log_must zfs set dedup=off $TESTPOOL/$TESTFS log_must zfs set dedup=off $TESTPOOL/$TESTFS
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_assert "Verify deduplication works using Direct I/O." log_assert "Verify deduplication works using Direct I/O."

View File

@ -59,6 +59,4 @@ for bs in "4k" "128k" "1m"; do
done done
done done
check_dio_write_chksum_verify_failures $TESTPOOL1 "stripe" 0
log_pass "Verified encryption works using Direct I/O" log_pass "Verified encryption works using Direct I/O"

View File

@ -41,7 +41,6 @@ function cleanup
{ {
zfs set recordsize=$rs $TESTPOOL/$TESTFS zfs set recordsize=$rs $TESTPOOL/$TESTFS
log_must rm -f $tmp_file log_must rm -f $tmp_file
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_assert "Verify the number direct/buffered requests when growing a file" log_assert "Verify the number direct/buffered requests when growing a file"

View File

@ -57,14 +57,6 @@ for type in "" "mirror" "raidz" "draid"; do;
verify_dio_write_count $TESTPOOL1 $recsize $((4 * recsize)) \ verify_dio_write_count $TESTPOOL1 $recsize $((4 * recsize)) \
$mntpnt $mntpnt
if [[ "$type" == "" ]]; then
check_dio_write_chksum_verify_failures $TESTPOOL1 \
"stripe" 0
else
check_dio_write_chksum_verify_failures $TESTPOOL1 \
"$type" 0
fi
destroy_pool $TESTPOOL1 destroy_pool $TESTPOOL1
done done
done done

View File

@ -42,7 +42,6 @@ verify_runnable "global"
function cleanup function cleanup
{ {
log_must rm -f $src_file $new_file $tmp_file log_must rm -f $src_file $new_file $tmp_file
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_assert "Verify mixed buffered and Direct I/O are coherent." log_assert "Verify mixed buffered and Direct I/O are coherent."

View File

@ -45,7 +45,6 @@ function cleanup
{ {
zfs set recordsize=$rs $TESTPOOL/$TESTFS zfs set recordsize=$rs $TESTPOOL/$TESTFS
log_must rm -f "$tmp_file" log_must rm -f "$tmp_file"
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_assert "Verify mixed Direct I/O and mmap I/O" log_assert "Verify mixed Direct I/O and mmap I/O"

View File

@ -43,7 +43,6 @@ function cleanup
{ {
zfs set recordsize=$rs $TESTPOOL/$TESTFS zfs set recordsize=$rs $TESTPOOL/$TESTFS
log_must rm -f "$tmp_file" log_must rm -f "$tmp_file"
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_assert "Verify Direct I/O overwrites" log_assert "Verify Direct I/O overwrites"

View File

@ -44,7 +44,6 @@ function cleanup
{ {
zfs set direct=standard $TESTPOOL/$TESTFS zfs set direct=standard $TESTPOOL/$TESTFS
log_must rm -f $tmp_file log_must rm -f $tmp_file
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_assert "Verify the direct=always|disabled|standard property" log_assert "Verify the direct=always|disabled|standard property"
@ -61,7 +60,8 @@ count=8
# #
# Check when "direct=always" any aligned IO is done as direct. # Check when "direct=always" any aligned IO is done as direct.
# Note that "flag=direct" is not set in the following calls to dd(1). # Note that the "-D" and "-d" flags are not set in the following calls to
# stride_dd.
# #
log_must zfs set direct=always $TESTPOOL/$TESTFS log_must zfs set direct=always $TESTPOOL/$TESTFS
@ -92,7 +92,8 @@ log_must rm -f $tmp_file
# #
# Check when "direct=disabled" there are never any direct requests. # Check when "direct=disabled" there are never any direct requests.
# Note that "flag=direct" is always set in the following calls to dd(1). # Note that the "-D" and "-d" flags are always set in the following calls to
# stride_dd.
# #
log_must zfs set direct=disabled $TESTPOOL/$TESTFS log_must zfs set direct=disabled $TESTPOOL/$TESTFS

View File

@ -45,7 +45,6 @@ verify_runnable "global"
function cleanup function cleanup
{ {
log_must rm -f "$tmp_file" log_must rm -f "$tmp_file"
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_assert "Verify randomly sized mixed Direct I/O and buffered I/O" log_assert "Verify randomly sized mixed Direct I/O and buffered I/O"

View File

@ -61,14 +61,6 @@ for type in "" "mirror" "raidz" "draid"; do
done done
done done
if [[ "$type" == "" ]]; then
check_dio_write_chksum_verify_failures $TESTPOOL1 \
"stripe" 0
else
check_dio_write_chksum_verify_failures $TESTPOOL1 \
"$type" 0
fi
destroy_pool $TESTPOOL1 destroy_pool $TESTPOOL1
done done
done done

View File

@ -44,7 +44,6 @@ function cleanup
zfs set recordsize=$rs $TESTPOOL/$TESTFS zfs set recordsize=$rs $TESTPOOL/$TESTFS
zfs set direct=standard $TESTPOOL/$TESTFS zfs set direct=standard $TESTPOOL/$TESTFS
log_must rm -f $tmp_file log_must rm -f $tmp_file
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_onexit cleanup log_onexit cleanup

View File

@ -49,7 +49,6 @@ function cleanup
{ {
log_must rm -f "$filename" log_must rm -f "$filename"
log_must set recordsize=$rs $TESTPOOL/$TESTFS log_must set recordsize=$rs $TESTPOOL/$TESTFS
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
} }
log_assert "Verify Direct I/O reads can read an entire file that is not \ log_assert "Verify Direct I/O reads can read an entire file that is not \

View File

@ -77,7 +77,7 @@ do
# Manipulate the user's buffer while running O_DIRECT write # Manipulate the user's buffer while running O_DIRECT write
# workload with the buffer. # workload with the buffer.
log_must manipulate_user_buffer -o "$mntpnt/direct-write.iso" \ log_must manipulate_user_buffer -o "$mntpnt/direct-write.iso" \
-n $NUMBLOCKS -b $BS -n $NUMBLOCKS -b $BS
# Reading back the contents of the file # Reading back the contents of the file
log_must stride_dd -i $mntpnt/direct-write.iso -o /dev/null \ log_must stride_dd -i $mntpnt/direct-write.iso -o /dev/null \

View File

@ -33,7 +33,7 @@
# Verify checksum verify works for Direct I/O writes. # Verify checksum verify works for Direct I/O writes.
# #
# STRATEGY: # STRATEGY:
# 1. Set the module parameter zfs_vdev_direct_write_verify_pct to 30. # 1. Set the module parameter zfs_vdev_direct_write_verify to 0.
# 2. Check that manipulating the user buffer while Direct I/O writes are # 2. Check that manipulating the user buffer while Direct I/O writes are
# taking place does not cause any panics with compression turned on. # taking place does not cause any panics with compression turned on.
# 3. Start a Direct I/O write workload while manipulating the user buffer # 3. Start a Direct I/O write workload while manipulating the user buffer
@ -42,7 +42,7 @@
# zpool status -d and checking for zevents. We also make sure there # zpool status -d and checking for zevents. We also make sure there
# are reported data errors when reading the file back. # are reported data errors when reading the file back.
# 5. Repeat steps 3 and 4 for 3 iterations. # 5. Repeat steps 3 and 4 for 3 iterations.
# 6. Set zfs_vdev_direct_write_verify_pct set to 1 and repeat 3. # 6. Set zfs_vdev_direct_write_verify set to 1 and repeat 3.
# 7. Verify there are Direct I/O write verify failures using # 7. Verify there are Direct I/O write verify failures using
# zpool status -d and checking for zevents. We also make sure there # zpool status -d and checking for zevents. We also make sure there
# there are no reported data errors when reading the file back because # there are no reported data errors when reading the file back because
@ -58,22 +58,22 @@ function cleanup
log_must zpool clear $TESTPOOL log_must zpool clear $TESTPOOL
# Clearing out dio_verify from event logs # Clearing out dio_verify from event logs
log_must zpool events -c log_must zpool events -c
log_must set_tunable32 VDEV_DIRECT_WR_VERIFY_PCT 2 log_must set_tunable32 VDEV_DIRECT_WR_VERIFY $DIO_WR_VERIFY_TUNABLE
} }
log_assert "Verify checksum verify works for Direct I/O writes." log_assert "Verify checksum verify works for Direct I/O writes."
if is_freebsd; then if is_freebsd; then
log_unsupported "FeeBSD is capable of stable pages for O_DIRECT writes" log_unsupported "FreeBSD is capable of stable pages for O_DIRECT writes"
fi fi
log_onexit cleanup log_onexit cleanup
ITERATIONS=3 ITERATIONS=3
NUMBLOCKS=300 NUMBLOCKS=300
VERIFY_PCT=30
BS=$((128 * 1024)) # 128k BS=$((128 * 1024)) # 128k
mntpnt=$(get_prop mountpoint $TESTPOOL/$TESTFS) mntpnt=$(get_prop mountpoint $TESTPOOL/$TESTFS)
typeset DIO_WR_VERIFY_TUNABLE=$(get_tunable VDEV_DIRECT_WR_VERIFY)
# Get a list of vdevs in our pool # Get a list of vdevs in our pool
set -A array $(get_disklist_fullpath $TESTPOOL) set -A array $(get_disklist_fullpath $TESTPOOL)
@ -82,7 +82,7 @@ set -A array $(get_disklist_fullpath $TESTPOOL)
firstvdev=${array[0]} firstvdev=${array[0]}
log_must zfs set recordsize=128k $TESTPOOL/$TESTFS log_must zfs set recordsize=128k $TESTPOOL/$TESTFS
log_must set_tunable32 VDEV_DIRECT_WR_VERIFY_PCT $VERIFY_PCT log_must set_tunable32 VDEV_DIRECT_WR_VERIFY 0
# First we will verify there are no panics while manipulating the contents of # First we will verify there are no panics while manipulating the contents of
# the user buffer during Direct I/O writes with compression. The contents # the user buffer during Direct I/O writes with compression. The contents
@ -101,25 +101,21 @@ if [[ $total_dio_wr -lt 1 ]]; then
log_fail "No Direct I/O writes $total_dio_wr" log_fail "No Direct I/O writes $total_dio_wr"
fi fi
log_must rm -f "$mntpnt/direct-write.iso"
# Clearing out DIO counts for Zpool # Clearing out DIO counts for Zpool
log_must zpool clear $TESTPOOL log_must zpool clear $TESTPOOL
# Clearing out dio_verify from event logs # Clearing out dio_verify from event logs
log_must zpool events -c log_must zpool events -c
log_must rm -f "$mntpnt/direct-write.iso"
# Next we will verify there are checksum errors for Direct I/O writes while # Next we will verify there are checksum errors for Direct I/O writes while
# manipulating the contents of the user pages. # manipulating the contents of the user pages.
log_must zfs set compression=off $TESTPOOL/$TESTFS log_must zfs set compression=off $TESTPOOL/$TESTFS
for i in $(seq 1 $ITERATIONS); do for i in $(seq 1 $ITERATIONS); do
log_note "Verifying 30% of Direct I/O write checksums iteration \ log_note "Verifying Direct I/O write checksums iteration \
$i of $ITERATIONS with \ $i of $ITERATIONS with zfs_vdev_direct_write_verify=0"
zfs_vdev_direct_write_verify_pct=$VERIFY_PCT"
prev_dio_wr=$(get_iostats_stat $TESTPOOL direct_write_count) prev_dio_wr=$(get_iostats_stat $TESTPOOL direct_write_count)
prev_arc_wr=$(get_iostats_stat $TESTPOOL arc_write_count)
log_must manipulate_user_buffer -o "$mntpnt/direct-write.iso" \ log_must manipulate_user_buffer -o "$mntpnt/direct-write.iso" \
-n $NUMBLOCKS -b $BS -n $NUMBLOCKS -b $BS
@ -131,9 +127,7 @@ for i in $(seq 1 $ITERATIONS); do
# Getting new Direct I/O and ARC write counts. # Getting new Direct I/O and ARC write counts.
curr_dio_wr=$(get_iostats_stat $TESTPOOL direct_write_count) curr_dio_wr=$(get_iostats_stat $TESTPOOL direct_write_count)
curr_arc_wr=$(get_iostats_stat $TESTPOOL arc_write_count)
total_dio_wr=$((curr_dio_wr - prev_dio_wr)) total_dio_wr=$((curr_dio_wr - prev_dio_wr))
total_arc_wr=$((curr_arc_wr - prev_arc_wr))
# Verifying there are checksum errors # Verifying there are checksum errors
log_note "Making sure there are checksum errors for the ZPool" log_note "Making sure there are checksum errors for the ZPool"
@ -144,23 +138,13 @@ for i in $(seq 1 $ITERATIONS); do
log_fail "No checksum failures for ZPool $TESTPOOL" log_fail "No checksum failures for ZPool $TESTPOOL"
fi fi
# Getting checksum verify failures
verify_failures=$(get_zpool_status_chksum_verify_failures $TESTPOOL "raidz")
log_note "Making sure we have Direct I/O writes logged" log_note "Making sure we have Direct I/O writes logged"
if [[ $total_dio_wr -lt 1 ]]; then if [[ $total_dio_wr -lt 1 ]]; then
log_fail "No Direct I/O writes $total_dio_wr" log_fail "No Direct I/O writes $total_dio_wr"
fi fi
log_note "Making sure we have Direct I/O write checksum verifies with ZPool" log_note "Making sure we have no Direct I/O write checksum verifies \
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 1 with ZPool"
check_dio_write_chksum_verify_failures $TESTPOOL "raidz" 0
# In the event of checksum verify error, the write will be redirected
# through the ARC. We check here that we have ARC writes.
log_note "Making sure we have ARC writes have taken place in the event \
a Direct I/O checksum verify failures occurred"
if [[ $total_arc_wr -lt $verify_failures ]]; then
log_fail "ARC writes $total_arc_wr < $verify_failures"
fi
log_must rm -f "$mntpnt/direct-write.iso" log_must rm -f "$mntpnt/direct-write.iso"
done done
@ -168,19 +152,22 @@ done
log_must zpool status -v $TESTPOOL log_must zpool status -v $TESTPOOL
log_must zpool sync $TESTPOOL log_must zpool sync $TESTPOOL
# Finally we will verify that with checking every Direct I/O write we have no # Finally we will verify that with checking every Direct I/O write we have no
# errors at all. # errors at all.
VERIFY_PCT=100 # Create the file before trying to manipulate the contents
log_must set_tunable32 VDEV_DIRECT_WR_VERIFY_PCT $VERIFY_PCT log_must file_write -o create -f "$mntpnt/direct-write.iso" -b $BS \
-c $NUMBLOCKS -w
log_must set_tunable32 VDEV_DIRECT_WR_VERIFY 1
for i in $(seq 1 $ITERATIONS); do for i in $(seq 1 $ITERATIONS); do
log_note "Verifying every Direct I/O write checksums iteration $i of \ log_note "Verifying every Direct I/O write checksums iteration $i of \
$ITERATIONS with zfs_vdev_direct_write_verify_pct=$VERIFY_PCT" $ITERATIONS with zfs_vdev_direct_write_verify=1"
prev_dio_wr=$(get_iostats_stat $TESTPOOL direct_write_count) prev_dio_wr=$(get_iostats_stat $TESTPOOL direct_write_count)
prev_arc_wr=$(get_iostats_stat $TESTPOOL arc_write_count)
log_must manipulate_user_buffer -o "$mntpnt/direct-write.iso" \ log_must manipulate_user_buffer -o "$mntpnt/direct-write.iso" \
-n $NUMBLOCKS -b $BS -n $NUMBLOCKS -b $BS -e
# Reading file back to verify there no are checksum errors # Reading file back to verify there no are checksum errors
filesize=$(get_file_size "$mntpnt/direct-write.iso") filesize=$(get_file_size "$mntpnt/direct-write.iso")
@ -190,16 +177,11 @@ for i in $(seq 1 $ITERATIONS); do
# Getting new Direct I/O and ARC Write counts. # Getting new Direct I/O and ARC Write counts.
curr_dio_wr=$(get_iostats_stat $TESTPOOL direct_write_count) curr_dio_wr=$(get_iostats_stat $TESTPOOL direct_write_count)
curr_arc_wr=$(get_iostats_stat $TESTPOOL arc_write_count)
total_dio_wr=$((curr_dio_wr - prev_dio_wr)) total_dio_wr=$((curr_dio_wr - prev_dio_wr))
total_arc_wr=$((curr_arc_wr - prev_arc_wr))
log_note "Making sure there are no checksum errors with the ZPool" log_note "Making sure there are no checksum errors with the ZPool"
log_must check_pool_status $TESTPOOL "errors" "No known data errors" log_must check_pool_status $TESTPOOL "errors" "No known data errors"
# Geting checksum verify failures
verify_failures=$(get_zpool_status_chksum_verify_failures $TESTPOOL "raidz")
log_note "Making sure we have Direct I/O writes logged" log_note "Making sure we have Direct I/O writes logged"
if [[ $total_dio_wr -lt 1 ]]; then if [[ $total_dio_wr -lt 1 ]]; then
log_fail "No Direct I/O writes $total_dio_wr" log_fail "No Direct I/O writes $total_dio_wr"
@ -207,16 +189,8 @@ for i in $(seq 1 $ITERATIONS); do
log_note "Making sure we have Direct I/O write checksum verifies with ZPool" log_note "Making sure we have Direct I/O write checksum verifies with ZPool"
check_dio_write_chksum_verify_failures "$TESTPOOL" "raidz" 1 check_dio_write_chksum_verify_failures "$TESTPOOL" "raidz" 1
# In the event of checksum verify error, the write will be redirected
# through the ARC. We check here that we have ARC writes.
log_note "Making sure we have ARC writes have taken place in the event \
a Direct I/O checksum verify failures occurred"
if [[ $total_arc_wr -lt $verify_failures ]]; then
log_fail "ARC writes $total_arc_wr < $verify_failures"
fi
log_must rm -f "$mntpnt/direct-write.iso"
done done
log_must rm -f "$mntpnt/direct-write.iso"
log_pass "Verified checksum verify works for Direct I/O writes." log_pass "Verified checksum verify works for Direct I/O writes."