OpenZFS on Linux and FreeBSD
Go to file
Brian Behlendorf 82a37189aa Implement SA based xattrs
The current ZFS implementation stores xattrs on disk using a hidden
directory.  In this directory a file name represents the xattr name
and the file contexts are the xattr binary data.  This approach is
very flexible and allows for arbitrarily large xattrs.  However,
it also suffers from a significant performance penalty.  Accessing
a single xattr can requires up to three disk seeks.

  1) Lookup the dnode object.
  2) Lookup the dnodes's xattr directory object.
  3) Lookup the xattr object in the directory.

To avoid this performance penalty Linux filesystems such as ext3
and xfs try to store the xattr as part of the inode on disk.  When
the xattr is to large to store in the inode then a single external
block is allocated for them.  In practice most xattrs are small
and this approach works well.

The addition of System Attributes (SA) to zfs provides us a clean
way to make this optimization.  When the dataset property 'xattr=sa'
is set then xattrs will be preferentially stored as System Attributes.
This allows tiny xattrs (~100 bytes) to be stored with the dnode and
up to 64k of xattrs to be stored in the spill block.  If additional
xattr space is required, which is unlikely under Linux, they will be
stored using the traditional directory approach.

This optimization results in roughly a 3x performance improvement
when accessing xattrs which brings zfs roughly to parity with ext4
and xfs (see table below).  When multiple xattrs are stored per-file
the performance improvements are even greater because all of the
xattrs stored in the spill block will be cached.

However, by default SA based xattrs are disabled in the Linux port
to maximize compatibility with other implementations.  If you do
enable SA based xattrs then they will not be visible on platforms
which do not support this feature.

----------------------------------------------------------------------
   Time in seconds to get/set one xattr of N bytes on 100,000 files
------+--------------------------------+------------------------------
      |            setxattr            |            getxattr
bytes |  ext4     xfs zfs-dir  zfs-sa  |  ext4     xfs zfs-dir  zfs-sa
------+--------------------------------+------------------------------
1     |  2.33   31.88   21.50    4.57  |  2.35    2.64    6.29    2.43
32    |  2.79   30.68   21.98    4.60  |  2.44    2.59    6.78    2.48
256   |  3.25   31.99   21.36    5.92  |  2.32    2.71    6.22    3.14
1024  |  3.30   32.61   22.83    8.45  |  2.40    2.79    6.24    3.27
4096  |  3.57  317.46   22.52   10.73  |  2.78   28.62    6.90    3.94
16384 |   n/a 2342.39   34.30   19.20  |   n/a   45.44  145.90    7.55
65536 |   n/a 2941.39  128.15  131.32* |   n/a  141.92  256.85  262.12*

Legend:
* ext4      - Stock RHEL6.1 ext4 mounted with '-o user_xattr'.
* xfs       - Stock RHEL6.1 xfs mounted with default options.
* zfs-dir   - Directory based xattrs only.
* zfs-sa    - Prefer SAs but spill in to directories as needed, a
              trailing * indicates overflow in to directories occured.

NOTE: Ext4 supports 4096 bytes of xattr name/value pairs per file.
NOTE: XFS and ZFS have no limit on xattr name/value pairs per file.
NOTE: Linux limits individual name/value pairs to 65536 bytes.
NOTE: All setattr/getattr's were done after dropping the cache.
NOTE: All tests were run against a single hard drive.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #443
2011-11-28 15:45:51 -08:00
cmd Simplify BDI integration 2011-11-08 10:19:03 -08:00
config In autoconf v2.68, AC_LANG_PROGRAM must be quoted 2011-11-28 11:16:33 -08:00
dracut Simplify BDI integration 2011-11-08 10:19:03 -08:00
etc Simplify BDI integration 2011-11-08 10:19:03 -08:00
include Implement SA based xattrs 2011-11-28 15:45:51 -08:00
lib Allow leading digits in userquota/groupquota names 2011-11-21 16:29:18 -08:00
man Simplify BDI integration 2011-11-08 10:19:03 -08:00
module Implement SA based xattrs 2011-11-28 15:45:51 -08:00
patches Add build system 2010-08-31 13:41:27 -07:00
scripts Simplify BDI integration 2011-11-08 10:19:03 -08:00
udev Simplify BDI integration 2011-11-08 10:19:03 -08:00
.gitignore Ignore unsigned module build products 2010-03-09 14:14:09 -08:00
AUTHORS Add "ashift" property to zpool create 2011-06-17 16:35:49 -07:00
COPYING Relocate COPYING+COPYRIGHT, remove README cruft 2008-12-01 15:34:53 -08:00
COPYRIGHT Update COPYRIGHT to reference zpios CDDL exceptions. 2010-05-18 14:25:28 -07:00
ChangeLog Add build system 2010-08-31 13:41:27 -07:00
DISCLAIMER Update COPYRIGHT and DISCLAIMER. 2010-05-18 10:32:23 -07:00
META Prep zfs-0.6.0-rc6 tag 2011-10-06 14:10:45 -07:00
Makefile.am Move udev rules from /etc/udev to /lib/udev 2011-08-08 16:21:10 -07:00
Makefile.in Simplify BDI integration 2011-11-08 10:19:03 -08:00
OPENSOLARIS.LICENSE Add CDDL license file 2008-12-01 14:49:34 -08:00
README.markdown Fix markdown rendering 2010-09-15 09:09:37 -07:00
ZFS.RELEASE Update to onnv_147 2010-08-26 14:24:34 -07:00
autogen.sh Make autogen.sh executable 2011-07-26 10:15:35 -07:00
configure Linux 3.1 compat, fops->fsync() 2011-11-10 10:03:08 -08:00
configure.ac Fix autoconf variable substitution in init scripts. 2011-08-19 16:26:14 -07:00
zfs-modules.spec.in Fix depmod warning 2011-11-10 10:26:06 -08:00
zfs-script-config.sh.in Unconditionally load core kernel modules 2010-11-11 11:38:25 -08:00
zfs.spec.in Include distribution in release 2011-10-19 11:43:27 -07:00
zfs_config.h.in Linux 3.1 compat, fops->fsync() 2011-11-10 10:03:08 -08:00

README.markdown

Native ZFS for Linux! ZFS is an advanced file system and volume manager which was originally developed for Solaris. It has been successfully ported to FreeBSD and now there is a functional Linux ZFS kernel port too. The port currently includes a fully functional and stable SPA, DMU, and ZVOL with a ZFS Posix Layer (ZPL) on the way!

$ ./configure
$ make pkg

Full documentation for building, configuring, and using ZFS can be found at: http://zfsonlinux.org