zfs/module/spl
Ned Bass ec2b41049f Taskq locking optimizations
Testing has shown that tq->tq_lock can be highly contended when a
large number of small work items are dispatched.  The lock hold time
is reduced by the following changes:

1) Use exclusive threads in the work_waitq

When a single work item is dispatched we only need to wake a single
thread to service it.  The current implementation uses non-exclusive
threads so all threads are woken when the dispatcher calls wake_up().
If a large number of threads are in the queue this overhead can become
non-negligible.

2) Conditionally add/remove threads from work waitq outside of tq_lock

Taskq threads need only add themselves to the work wait queue if there
are no pending work items.  Furthermore, the add and remove function
calls can be made outside of the taskq lock since the wait queues are
protected from concurrent access by their own spinlocks.

3) Call wake_up() outside of tq->tq_lock

Again, the wait queues are protected by their own spinlock, so the
dispatcher functions can drop tq->tq_lock before calling wake_up().

A new splat test taskq:contention was added in a prior commit to measure
the impact of these changes.  The following table summarizes the
results using data from the kernel lock profiler.

                        tq_lock time    %diff   Wall clock (s)  %diff
original:               39117614.10     0       41.72           0
exclusive threads:      31871483.61     18.5    34.2            18.0
unlocked add/rm waitq:  13794303.90     64.7    16.17           61.2
unlocked wake_up():     1589172.08      95.9    16.61           60.2

Each row reflects the average result over 5 test runs.
/proc/lock_stats was zeroed out before and collected after each run.
Column 1 is the cumulative hold time in microseconds for tq->tq_lock.
The tests are cumulative; each row reflects the code changes of the
previous rows.  %diff is calculated with respect to "original" as
100*(orig-new)/orig.

Although calling wake_up() outside of the taskq lock dramatically
reduced the taskq lock hold time, the test actually took slightly more
wall clock time.  This is because the point of contention shifts from
the taskq lock to the wait queue lock.  But the change still seems
worthwhile since it removes our taskq implementation as a bottleneck,
assuming the small increase in wall clock time to be statistical
noise.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #32
2012-01-18 10:36:57 -08:00
..
.gitignore sun-misc-gitignore 2010-01-08 09:37:54 -08:00
Makefile.in Fix zlib compression 2011-02-25 16:56:22 -08:00
spl-atomic.c Linux 2.6.39 compat, DEFINE_SPINLOCK() 2011-04-20 12:01:11 -07:00
spl-condvar.c Block in cv_destroy() on all waiters 2011-02-04 14:09:08 -08:00
spl-cred.c Add crgetfsuid()/crgetfsgid() helpers 2011-03-22 12:18:44 -07:00
spl-debug.c Prepend spl_ to all init/fini functions 2011-11-11 09:18:28 -08:00
spl-err.c Prefix all SPL debug macros with 'S' 2010-07-20 13:30:40 -07:00
spl-generic.c Prepend spl_ to all init/fini functions 2011-11-11 09:18:28 -08:00
spl-kmem.c Proxmox VE kernel compat, invalidate_inodes() 2011-12-21 14:29:45 -08:00
spl-kobj.c Remove VN_HOLD/VN_RELE/VOP_PUTPAGE 2011-01-12 11:38:05 -08:00
spl-kstat.c Prepend spl_ to all init/fini functions 2011-11-11 09:18:28 -08:00
spl-module.c Linux 2.6.39 compat, DEFINE_SPINLOCK() 2011-04-20 12:01:11 -07:00
spl-mutex.c Public Release Prep 2010-05-17 15:18:00 -07:00
spl-proc.c Prepend spl_ to all init/fini functions 2011-11-11 09:18:28 -08:00
spl-rwlock.c Public Release Prep 2010-05-17 15:18:00 -07:00
spl-taskq.c Taskq locking optimizations 2012-01-18 10:36:57 -08:00
spl-thread.c Add Thread Specific Data (TSD) Implementation 2010-12-07 10:02:32 -08:00
spl-time.c Minor 32-bit fix cast to hrtime_t before the mutliply. 2010-05-23 09:51:17 -07:00
spl-tsd.c Prepend spl_ to all init/fini functions 2011-11-11 09:18:28 -08:00
spl-vnode.c Linux 3.1 compat, kern_path_parent() 2011-11-09 16:51:25 -08:00
spl-xdr.c Prefix all SPL debug macros with 'S' 2010-07-20 13:30:40 -07:00
spl-zlib.c Prepend spl_ to all init/fini functions 2011-11-11 09:18:28 -08:00