Deep recursive call chains are contributing to segfaults in ztest due to
heavy stack use. Inlining zio_execute() helps reduce the stack depth of
the zio_notify_parent() -> zio_execute() -> zio_wait() recursive cycle.
I am no longer seeing ztest segfaults in this code path with this change.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Deep recursive call chains are contributing to segfaults in ztest due
to heavy stack use. Inlining dbuf_findbp() helps reduce the stack depth
of the dbuf_findbp() -> dbuf_hold_impl() cycle. However, segfaults are
still occurring in this code path, so further reductions are still needed.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Deep recursive call chains are contributing to segfaults in ztest due
to heavy stack use. Inlining zio_notify_parent() helps reduce the
stack depth of the zio_notify_parent() -> zio_execute() -> zio_done()
recursive cycle. I am no longer seeing ztest segfaults in this code
path with this change combined with the zio_done() stack reduction in
the previous commit.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
The spa_load function may call itself recursively through
the spa_load_impl function. This call path of spa_load->
spa_load_impl->spa_load->spa_load_impl takes 640 bytes of
stack. By forcing spa_load_impl to be inlined as part of
spa_load the can be reduced to 448 bytes, for a savings of
192 bytes,
The feature branch 'fix-taskq' in Linux's ZFS tree changes the taskq_dispatch()
flag from TQ_SLEEP to TQ_NOSLEEP to avoid sleeping in some circumstances.
However, this has the side effect that taskq_dispatch() now may fail, and since
the return code was not even being checked, it could lead to zio's not being
scheduled to execute.
I'm fixing this in a simplistic but not very elegant way, by just looping until
taskq_dispatch() succeeds.
Signed-off-by: Ricardo M. Correia <ricardo.correia@oracle.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Extend the Makefiles with an uninstall target to cleanly
remove a package which was installed with 'make install'.
Additionally, ensure a 'depmod -a' is run as part of the
install to update the module dependency information.
The ZFS update to onnv_141 brought with it support for a
security label attribute called mlslabel. This feature
depends on zones to work correctly and thus I am disabling
it under Linux. Equivilant functionality could be added
at some point in the future.
Due to limited stack space recursive functions are frowned upon in
the Linux kernel. However, they often are the most elegant solution
to a problem. The following code preserves the recursive function
traverse_visitbp() but moves the local variables AND function
arguments to the heap to minimize the stack frame size. Enough
space is initially allocated on the stack for 20 levels of recursion.
This change does ugly-up-the-code but it reduces the worst case
usage from roughly 4160 bytes to 960 bytes on x86_64 archs.
These changes are now taken care of by the fix-stack-traverse_impl
topic branch which not only solves the uninit problem but also
moves these locals off the stack and on to the heap.
Move dsl_dataset_t local variable from the stack to the heap.
This reduces the stack usage of this function from 2048 bytes
to 176 bytes for x84_64 arches.
This may not strictly be needed but it does keep gcc happy. We
should keep our eye on this though if the extra bcopy significantly
impacts performance. It may.
There are cases where under Linux it is not safe to sleep in
taskq_dispatch(). Rather than adding Linux specific code to
detect these cases I opted to keep it simple and just never
allow a sleep here. The impact of this should be minimal.
I missed a instanse of removing the & operator when reducing the
stack usage in this function. This unfortunately doesn't cause
a compile warning but it is does cause ztest failures. Anyway,
update the topic branch to correct this mistake.
Certain function must never be automatically inlined by gcc because
they are stack heavy or called recursively. This patch flags all
such functions I have found as 'noinline' to prevent gcc from making
the optimization.