zfs/module
Matthew Ahrens 1dc32a67e9
Improve zfs send performance by bypassing the ARC
When doing a zfs send on a dataset with small recordsize (e.g. 8K),
performance is dominated by the per-block overheads.  This is especially
true with `zfs send --compressed`, which further reduces the amount of
data sent, for the same number of blocks.  Several threads are involved,
but the limiting factor is the `send_prefetch` thread, which is 100% on
CPU.

The main job of the `send_prefetch` thread is to issue zio's for the
data that will be needed by the main thread.  It does this by calling
`arc_read(ARC_FLAG_PREFETCH)`.  This has an immediate cost of creating
an arc_hdr, which takes around 14% of one CPU.  It also induces later
costs by other threads:

 * Since the data was only prefetched, dmu_send()->dmu_dump_write() will
   need to call arc_read() again to get the data.  This will have to
   look up the arc_hdr in the hash table and copy the data from the
   scatter ABD in the arc_hdr to a linear ABD in arc_buf.  This takes
   27% of one CPU.

 * dmu_dump_write() needs to arc_buf_destroy()  This takes 11% of one
   CPU.

 * arc_adjust() will need to evict this arc_hdr, taking about 50% of one
   CPU.

All of these costs can be avoided by bypassing the ARC if the data is
not already cached.  This commit changes `zfs send` to check for the
data in the ARC, and if it is not found then we directly call
`zio_read()`, reading the data into a linear ABD which is used by
dmu_dump_write() directly.

The performance improvement is best expressed in terms of how many
blocks can be processed by `zfs send` in one second.  This change
increases the metric by 50%, from ~100,000 to ~150,000.  When the amount
of data per block is small (e.g. 2KB), there is a corresponding
reduction in the elapsed time of `zfs send >/dev/null` (from 86 minutes
to 58 minutes in this test case).

In addition to improving the performance of `zfs send`, this change
makes `zfs send` not pollute the ARC cache.  In most cases the data will
not be reused, so this allows us to keep caching useful data in the MRU
(hit-once) part of the ARC.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10067
2020-03-10 10:51:04 -07:00
..
avl Wrap Linux module macros 2019-11-01 10:41:03 -07:00
icp Fix icp include directories for in-tree build 2020-02-20 08:10:47 -08:00
lua cppcheck: (error) Address of local auto-variable assigned 2019-12-18 17:25:42 -08:00
nvpair cppcheck: (warning) Possible null pointer dereference: nvh 2019-12-18 17:25:57 -08:00
os Improve performance of zio_taskq_member 2020-03-03 10:29:38 -08:00
spl OpenZFS restructuring - move platform specific sources 2019-09-06 11:26:26 -07:00
unicode Wrap Linux module macros 2019-11-01 10:41:03 -07:00
zcommon Change default to overlay=on 2020-03-06 09:28:19 -08:00
zfs Improve zfs send performance by bypassing the ARC 2020-03-10 10:51:04 -07:00
.gitignore Adapt gitignore for modules 2019-12-02 13:23:47 -08:00
Makefile.in module/Makefile.in: don't run xargs if empty 2019-10-08 10:10:23 -07:00