Update links to refer to the official ZFS on Linux website instead of
@behlendorf's personal fork on github.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Until now the notion of an internal debug logging infrastructure
was conflated with enabling ASSERT()s. This patch clarifies things
by cleanly breaking the two subsystem apart. The result of this
is the following behavior.
--enable-debug - Enable/disable code wrapped in ASSERT()s.
--disable-debug ASSERT()s are used to check invariants and
are never required for correct operation.
They are disabled by default because they
may impact performance.
--enable-debug-log - Enable/disable the debug log infrastructure.
--disable-debug-log This infrastructure allows the spl code and
its consumer to log messages to an in-kernel
log. The granularity of the logging can be
controlled by a debug mask. By default the
mask disables most debug messages resulting
in a negligible performance impact. Because
of this the debug log is enabled by default.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
To avoid symbol conflicts with dependent packages the debug
header must be split in to several parts. The <sys/debug.h>
header now only contains the Solaris macro's such as ASSERT
and VERIFY. The spl-debug.h header contain the spl specific
debugging infrastructure and should be included by any package
which needs to use the spl logging. Finally the spl-trace.h
header contains internal data structures only used for the log
facility and should not be included by anythign by spl-debug.c.
This way dependent packages can include the standard Solaris
headers without picking up any SPL debug macros. However, if
the dependant package want to integrate with the SPL debugging
subsystem they can then explicitly include spl-debug.h.
Along with this change I have dropped the CHECK_STACK macros
because the upstream Linux kernel now has much better stack
depth checking built in and we don't need this complexity.
Additionally SBUG has been replaced with PANIC and provided as
part of the Solaris macro set. While the Solaris version is
really panic() that conflicts with the Linux kernel so we'll
just have to make due to PANIC. It should rarely be called
directly, the prefered usage would be an ASSERT or VERIFY.
There's lots of change here but this cleanup was overdue.
Updated AUTHORS, COPYING, DISCLAIMER, and INSTALL files. Added
standardized headers to all source file to clearly indicate the
copyright, license, and to give credit where credit is due.
The run time stack overflow checking is being disabled by default
because it is not safe for use with 2.6.29 and latter kernels. These
kernels do now have their own stack overflow checking so this support
has become redundant anyway. It can be re-enabled for older kernels or
arches without stack overflow checking by redefining CHECK_STACK().
The previous credential implementation simply provided the needed types and
a couple of dummy functions needed. This update correctly ties the basic
Solaris credential API in to one of two Linux kernel APIs.
Prior to 2.6.29 the linux kernel embeded all credentials in the task
structure. For these kernels, we pass around the entire task struct as if
it were the credential, then we use the helper functions to extract the
credential related bits.
As of 2.6.29 a new credential type was added which we can and do fairly
cleanly layer on top of. Once again the helper functions nicely hide
the implementation details from all callers.
Three tests were added to the splat test framework to verify basic
correctness. They should be extended as needed when need credential
functions are added.
- Initial SLES testing uncovered a long standing bug in the debug
tracing. The tcd_for_each() macro expected a NULL to terminate
the trace_data[i] array but this was only ever true due to luck.
All trace_data[] iterators are now properly capped by TCD_TYPE_MAX.
- SPLAT_MAJOR 229 conflicted with a 'hvc' device on my SLES system.
Since this was always an arbitrary choice I picked something else.
- The HAVE_PGDAT_LIST case should set pgdat_list_addr to the value stored
at the address of the memory location returned by kallsyms_lookup_name().
This fixes an oops when unloading the modules, in the case where memory
tracking was enabled and there were memory leaks. The comment in the
code explains what was the problem.
* spl-10-fix-assert-verify-ndebug.patch
This fixes ASSERT*() and VERIFY*() macros in non-debug builds. VERIFY*()
macros are supposed to check the condition and panic even in production
builds, and ASSERT*() macros don't need to evaluate the arguments.
Also some 32-bit fixes.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@165 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
already supprt atomic64_t types.
* spl-07-kmem-cleanup.patch
This moves all the debugging code from sys/kmem.h to spl-kmem.c, because
the huge macros were hard to debug and were bloating functions that
allocated memory. I also fixed some other minor problems, including
32-bit fixes and a reported memory leak which was just due to using the
wrong free function.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@163 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
from Ricardo which removes a dependency on the GPL-only symbol
set_cpus_allowed(). Using this symbol is simpler but in the name
of portability we are adopting a spinlock based solution here
to remove this dependency.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@160 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
1) Ensure mutex_init() never fails in the case of ENOMEM by retrying
forever. I don't think I've ever seen this happen but it was clear
after code inspection that if it did we would immediately crash.
2) Enable full debugging in check.sh for sanity tests. Might as well
get as much debug as we can in the case of a failure.
3) Reworked list of kmem caches tracked by SPL in to a hash with the
key based on the address of the kmem_cache_t. This should speed
up the constructor/destructor/shrinker lookup needed now for newer
kernel which removed the destructor support.
4) Updated kmem_cache_create to handle the case where CONFIG_SLUB
is defined. The slub would occasionally merge slab caches which
resulted in non-unique keys for our hash lookup in 3). To fix this
we detect if the slub is enabled and then set the needed flag
to prevent this merging from ever occuring.
5) New kernels removed the proc_dir_entry pointer from items
registered by sysctl. This means we can no long be sneaky and
manually insert things in to the sysctl tree simply by walking
the proc tree. So I'm forced to create a seperate tree for
all the things I can't easily support via sysctl interface.
I don't like it but it will do for now.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@124 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
compiled out when doing performance runs.
- Bite the bullet and fully autoconfize the debug options in the configure
time parameters. By default all the debug support is disable in the core
SPL build, but available to modules which enable it when building against
the SPL. To enable particular SPL debug support use the follow configure
options:
--enable-debug Internal ASSERTs
--enable-debug-kmem Detailed memory accounting
--enable-debug-mutex Detailed mutex tracking
--enable-debug_kstat Kstat info exported to /proc
--enable-debug-callb Additional callb debug
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@111 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
crashes but it's not clear to me yet if these are a problem with
the mutex implementation or ZFSs usage of it.
Minor taskq fixes to add new tasks to the end of the pending list.
Minor enhansements to the debug infrastructure.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@94 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
ensure I never add anything to the stack I don't absolutely need.
All this debug code could be removed from a production build
anyway so I'm not so worried about the performance impact. We
may also consider revisting the mutex and condvar implementation
to ensure no additional stack is used there.
Initial indications are I have reduced the worst case stack
usage to 9080 bytes. Still to large for the default 8k stacks
so I have been forced to run with 16k stacks until I can
reduce the worst offenders.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@83 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
process of destroying the stacks. Threshhold set fairly aggressively
top 80% of stack usage.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@82 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
- Replacing all BUG_ON()'s with proper ASSERT()'s
- Using ENTRY,EXIT,GOTO, and RETURN macro to instument call paths
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@78 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
changes bring over everything lustre had for debugging with
two exceptions. I dropped by the debug daemon and upcalls
just because it made things a little easier. They can be
readded easily enough if we feel they are needed.
Everything compiles and seems to work on first inspection
but I suspect there are a handful of issues still lingering
which I'll be sorting out right away. I just wanted to get
all these changes commited and safe. I'm getting a little
paranoid about losing them.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@75 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
muck with #includes in existing Solaris style source to get it
to find the right stuff.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@18 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c