Deep recursive call chains are contributing to segfaults in ztest due to
heavy stack use. Inlining zio_execute() helps reduce the stack depth of
the zio_notify_parent() -> zio_execute() -> zio_wait() recursive cycle.
I am no longer seeing ztest segfaults in this code path with this change.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Deep recursive call chains are contributing to segfaults in ztest due
to heavy stack use. Inlining dbuf_findbp() helps reduce the stack depth
of the dbuf_findbp() -> dbuf_hold_impl() cycle. However, segfaults are
still occurring in this code path, so further reductions are still needed.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Deep recursive call chains are contributing to segfaults in ztest due
to heavy stack use. Inlining zio_notify_parent() helps reduce the
stack depth of the zio_notify_parent() -> zio_execute() -> zio_done()
recursive cycle. I am no longer seeing ztest segfaults in this code
path with this change combined with the zio_done() stack reduction in
the previous commit.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
The spa_load function may call itself recursively through
the spa_load_impl function. This call path of spa_load->
spa_load_impl->spa_load->spa_load_impl takes 640 bytes of
stack. By forcing spa_load_impl to be inlined as part of
spa_load the can be reduced to 448 bytes, for a savings of
192 bytes,
A number of ztest functions create one or more 312B ztest_od_t data
structures. To conserve stack usage, this commit moves all of these data
structures to the heap. However, I am still seeing ztest segfaults due
to heavy stack usage of the dbuf_findbp() -> dbuf_hold_impl() recursion.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Using sparse files for the test configurations had atleast three
significant advantages.
1) Actually test sparse files to ensure they work.
2) Drastically reduce required disk space for the regression test
suite. This turns out to be fairly important when running the
test suite in a virtualized environment.
3) Significantly speed of the test suite. Run time of zconfig.sh
dropped from 2m:56s to 1m:00s on my test system, zpios-sanity.sh
nows runs in only 0m:26s.
This change updates zconfig.sh to reference /dev/zvol/ instead
of simply /dev/. It also extends the texts to verify correct
minor device creation for import/export and module load/unload.
The feature branch 'fix-taskq' in Linux's ZFS tree changes the taskq_dispatch()
flag from TQ_SLEEP to TQ_NOSLEEP to avoid sleeping in some circumstances.
However, this has the side effect that taskq_dispatch() now may fail, and since
the return code was not even being checked, it could lead to zio's not being
scheduled to execute.
I'm fixing this in a simplistic but not very elegant way, by just looping until
taskq_dispatch() succeeds.
Signed-off-by: Ricardo M. Correia <ricardo.correia@oracle.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
The splat module is only needed for the spl regression tests.
But if we add it to MODULES then 'zfs.sh -u' will be able to
unload it if needed, The downside if 'zfs.sh' will always
load it but it's overhead is minimal and in a production
setting you'll always be doing a 'modprobe zfs' anyway so
this is really just for testing.
Extend the Makefiles with an uninstall target to cleanly
remove a package which was installed with 'make install'.
Additionally, ensure a 'depmod -a' is run as part of the
install to update the module dependency information.
The common.sh script assumed that it was either being run from
in-tree or was installed under /usr/libexec/zfs. If this was
not the case, because of say the default --prefix=/usr/local,
then the paths would be wrong. To fix this common.sh is now
generated from common.sh.in with the correct path information
provided at configure time.
The long term fix for Debian and Slackware style packaging is
to add native support for building these packages. Unfortunately,
that is a large chunk of work I don't have time for right now.
That said it would be nice to have at least basic packages for
these distributions.
As a quick short/medium term solution I've settled on using alien
to convert the RPM packages to DEB or TGZ style packages. The
build system has been updated with the following build targets
which will first build RPM packages and then convert them as
needed to the target package type:
make rpm: Create .rpm packages
make deb: Create .deb packages
make tgz: Create .tgz packages
make pkg: Create the right package type for your distribution
The solution comes with lot of caveats and your mileage may vary.
But basically the big limitations are that the resulting packages:
1) Will not have the correct dependency information.
2) Will not include the kernel version in the release.
3) Will not handle all differences between distributions.
But the resulting packages should be easy to install and remove
from your system and take care of running 'depmod -a' and such.
As I said at the top this is not the right long term solution.
If any of the upstream distribution maintainers want to jump in
and help do this right for their distribution I'd love the help.
The dmu_objset_pool() and dmu_objset_name() symbols are needed
by Lustre and should be exported.
Signed-off-by: Ricardo M. Correia <ricardo.correia@oracle.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Some buggy NPTL threading implementations include the guard area within
the stack size allocations. In this case we need to allocate an extra
page to account for the guard area since we only have two pages of usable
stack on Linux. Added an autoconf test that detects such implementations
by running a test program designed to segfault if the bug is present.
Set a flag NPTL_GUARD_WITHIN_STACK that is tested to decide if extra
stack space must be allocated for the guard area.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Some buggy NPTL threading implementations include the guard area within
the stack size allocations. In this case we need to allocate an extra
page to account for the guard area since we only have two pages of usable
stack on Linux. Added an autoconf test that detects such implementations
by running a test program designed to segfault if the bug is present.
Set a flag NPTL_GUARD_WITHIN_STACK that is tested to decide if extra
stack space must be allocated for the guard area.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
The stack check implementation in older versions of gcc has
a fairly low default limit on STACK_CHECK_MAX_FRAME_SIZE of
roughly 4096. This results in numerous warning when it is
used with code which was designed to run in user space and
thus may be relatively stack heavy. The avoid these warnings,
which are fatal with -Werror, this patch targets the use of
-fstack-check to libraries which are compiled in both user
space and kernel space. The only utility which uses this
flag is ztest which is designed to simulate running in the
kernel and must meet the -fstack-check requirements. All
other user space utilities do not use -fstack-check.
warning: frame size too large for reliable stack checking
warning: try reducing the number of local variables
For some reason which remains mysterious to me the shared library
which calls pthread_create() must be linked with -pthread. If this
is not done on 32-bit system the default ulimit stack size is used.
Surprisingly, on a 64-bit system the stack limit specified by the
pthread_attr is honored even when -pthread is not passed when linking
the shared library.
While in theory I like the idea of compiler warnings always being
fatal. In practice this causes problems when small harmless errors
cause build failures for end users. To handle this I've updated
the build system such that -Werror is only used when --enable-debug
is passed to configure. This is how I always build when developing
so I'll catch all build warnings and end users will not get stuck
by minor issues.