There are 3 fixes in thie commit. First, update ztest_run() to store
the thread id and not the address of the kthread_t. This will be freed
on thread exit and is not safe to use. This is pretty close to how
things were done in the original ztest code before I got there.
Second, for extra paranoia update thread_exit() to return a special
TS_MAGIC value via pthread_exit(). This value is then verified in
pthread_join() to ensure the thread exited cleanly. This can be
done cleanly because the kthread doesn't provide a return code
mechanism we need to worry about.
Third, replace the ztest deadman thread with a signal handler. We
cannot use the previous approach because the correct behavior for
pthreads is to wait for all threads to exit before terminating the
process. Since the deadman thread won't call exit by design we
end up hanging in kernel_exit(). To avoid this we just setup a
SIGALRM signal handle and register a deadman alarm. IMHO this
is simpler and cleaner anyway.
Much to my surprise bcopy() under Linux appears to copy the data in
word sized chunks. It does the right thing but if you buffer is not
a multiple of the word size you will be reading past the end of your
buffer. Or at least that is what valgrind is reporting. We should
be using mempcy() anyway on Linux so replace bcopy() with memcpy()
to resolve the issue.
==305== Thread 211:
==305== Invalid read of size 8
==305== at 0x3BCD28357D: _wordcopy_fwd_dest_aligned (in /lib64/libc-2.11.1.so)
==305== by 0x3BCD282B05: bcopy (in /lib64/libc-2.11.1.so)
==305== by 0x58D7FEF: dmu_write (dmu.c:730)
==305== by 0x591C942: spa_history_write (spa_history.c:165)
==305== by 0x591D255: spa_history_log_sync (spa_history.c:277)
==305== by 0x591D545: log_internal (spa_history.c:450)
==305== by 0x591D5EC: spa_history_log_internal (spa_history.c:475)
==305== by 0x5902319: dsl_prop_set_sync (dsl_prop.c:707)
==305== by 0x5906A7D: dsl_sync_task_group_sync (dsl_synctask.c:199)
==305== by 0x58FF4EC: dsl_pool_sync (dsl_pool.c:376)
==305== by 0x591744C: spa_sync (spa.c:5365)
==305== by 0x5922C85: txg_sync_thread (txg.c:414)
Move create/destroy function to correct places. I'm not sure why
this wasn't caught upstream it should have been, regardless let's
just fix it here.
Personally I find it handy to be able to enable full debugging in
zfs with the 'debug=' command line option so I'm enabled that as
well.
On a Linux system simply use the native aprintf and vasprintf
functions respectively. Also update the call points to correctly
use va_copy() or va_start() as appropriate.
Accidentally dropped the zeroing of this structure in the
gcc-missing-braces topic branch which was causing a fall positive
space leak in ztest. Ensure the structure is zero'ed before use.