zfs/module
Alexander Motin b0cbc1aa9a
Use big transactions for small recordsize writes.
When ZFS appends files in chunks bigger than recordsize, it borrows
buffer from ARC and fills it before opening transaction.  This
supposed to help in case of page faults to not hold transaction open
indefinitely.  The problem appears when recordsize is set lower than
default 128KB. Since each block is committed in separate transaction,
per-transaction overhead becomes significant, and what is even worse,
active use of of per-dataset and per-pool locks to protect space use
accounting for each transaction badly hurts the code SMP scalability.
The same transaction size limitation applies in case of file rewrite,
but without even excuse of buffer borrowing.

To address the issue, disable the borrowing mechanism if recordsize
is smaller than default and the write request is 4x bigger than it.
In such case writes up to 32MB are executed in single transaction,
that dramatically reduces overhead and lock contention.  Since the
borrowing mechanism is not used for file rewrites, and it was never
used by zvols, which seem to work fine, I don't think this change
should create significant problems, partially because in addition to
the borrowing mechanism there are also used pre-faults.

My tests with 4/8 threads writing several files same time on datasets
with 32KB recordsize in 1MB requests show reduction of CPU usage by
the user threads by 25-35%.  I would measure it in GB/s, but at that
block size we are now limited by the lock contention of single write
issue taskqueue, which is a separate problem we are going to work on.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #14964
2023-06-27 17:00:30 -07:00
..
avl Suppress Clang Static Analyzer false positive in the AVL tree code. 2023-03-08 13:51:21 -08:00
icp powerpc64: Support ELFv2 asm on Big Endian 2023-04-27 12:49:21 -07:00
lua Add loongarch64 support 2023-04-25 16:05:45 -07:00
nvpair nvpair: Constify string functions 2023-03-14 15:25:50 -07:00
os Add a delay to tearing down threads. 2023-06-26 13:57:12 -07:00
unicode Illumos #15286: do_composition() needs sign awareness 2023-01-05 11:16:21 -08:00
zcommon Create zap for root vdev 2023-04-20 10:07:56 -07:00
zfs Use big transactions for small recordsize writes. 2023-06-27 17:00:30 -07:00
zstd Resolve WS-2021-0184 vulnerability in zstd 2023-02-02 15:12:51 -08:00
.gitignore FreeBSD: Ignore symlink to i386 includes 2022-08-02 16:34:23 -07:00
Kbuild.in Finally drop long disabled vdev cache. 2023-06-09 12:40:55 -07:00
Makefile.bsd Finally drop long disabled vdev cache. 2023-06-09 12:40:55 -07:00
Makefile.in autoconf: use include directives instead of recursing down lib 2022-05-10 10:18:11 -07:00