These changes are now taken care of by the fix-stack-traverse_impl
topic branch which not only solves the uninit problem but also
moves these locals off the stack and on to the heap.
Move dsl_dataset_t local variable from the stack to the heap.
This reduces the stack usage of this function from 2048 bytes
to 176 bytes for x84_64 arches.
Much to my surprise bcopy() under Linux appears to copy the data in
word sized chunks. It does the right thing but if you buffer is not
a multiple of the word size you will be reading past the end of your
buffer. Or at least that is what valgrind is reporting. We should
be using mempcy() anyway on Linux so replace bcopy() with memcpy()
to resolve the issue.
==305== Thread 211:
==305== Invalid read of size 8
==305== at 0x3BCD28357D: _wordcopy_fwd_dest_aligned (in /lib64/libc-2.11.1.so)
==305== by 0x3BCD282B05: bcopy (in /lib64/libc-2.11.1.so)
==305== by 0x58D7FEF: dmu_write (dmu.c:730)
==305== by 0x591C942: spa_history_write (spa_history.c:165)
==305== by 0x591D255: spa_history_log_sync (spa_history.c:277)
==305== by 0x591D545: log_internal (spa_history.c:450)
==305== by 0x591D5EC: spa_history_log_internal (spa_history.c:475)
==305== by 0x5902319: dsl_prop_set_sync (dsl_prop.c:707)
==305== by 0x5906A7D: dsl_sync_task_group_sync (dsl_synctask.c:199)
==305== by 0x58FF4EC: dsl_pool_sync (dsl_pool.c:376)
==305== by 0x591744C: spa_sync (spa.c:5365)
==305== by 0x5922C85: txg_sync_thread (txg.c:414)
This may not strictly be needed but it does keep gcc happy. We
should keep our eye on this though if the extra bcopy significantly
impacts performance. It may.
The following cleanup was missed in the first pass when the ZVOL
implementation was updated. An extra instance of a zvol_state_t
was removed from the stack and the error handling was simplified.
There are cases where under Linux it is not safe to sleep in
taskq_dispatch(). Rather than adding Linux specific code to
detect these cases I opted to keep it simple and just never
allow a sleep here. The impact of this should be minimal.
I missed a instanse of removing the & operator when reducing the
stack usage in this function. This unfortunately doesn't cause
a compile warning but it is does cause ztest failures. Anyway,
update the topic branch to correct this mistake.
Certain function must never be automatically inlined by gcc because
they are stack heavy or called recursively. This patch flags all
such functions I have found as 'noinline' to prevent gcc from making
the optimization.
Reduce stack usage in dsl_deleg_get, gcc flagged it as consuming a
whopping 1040 bytes or potentially 1/4 of a 4K stack. This patch
moves all the large structures and buffer off the stack and on to
the heap. This includes 2 zap_cursor_t structs each 52 bytes in
size, 2 zap_attribute_t structs each 280 bytes in size, and 1
256 byte char array. The total saves on the stack is 880 bytes
after you account for the 5 new pointers added.
Also the source buffer length has been increased from MAXNAMELEN
to MAXNAMELEN+strlen(MOS_DIR_NAME)+1 as described by the comment in
dsl_dir_name(). A buffer overrun may have been possible with the
slightly smaller buffer.
Reduce kernel stack usage by lzjb_compress() by moving uint16 array
off the stack and on to the heap. The exact performance implications
of this I have not measured but we absolutely need to keep stack
usage to a minimum. If/when this becomes and issue we optimize.
Move xiou stat structures from a header to the dmu.c source as is
done with all the other kstat interfaces. This information is local
to dmu.c registered the xuio kstat and should stay that way.
If your only going to allow one allocator to be used and it is defined
at compile time there is no point including the others in the build.
This patch could/should be refined for Linux to make the metaslab
configurable at run time. That might be a bit tricky however since
you would need to quiese all IO. Short of that making it configurable
as a module load option would be a reasonable compromise.
In the linux kernel 'current' is defined to mean the current process
and can never be used as a local variable in a function. Simply
replace all usage of 'current' with 'curr' in this function.
Updated fix to detect if we are in an interrupt and only sleep if it
is safe to do some. I guess it must be safe to sleep under Solaris
this must be handled in a sort interrupt handler there
The ZVOL interfaces changed significantly with the latest update. I've
updated the Linux version of the code to handle this and it looks like
the net result has been a simpler implementation which is good! Plus,
I'm relatively sure the ZIL integration is right this time although it
needs some serious crash testing to verify that.
Also minor additions to vdev_disk for .hold and .rele callbacks.
Currently, they do nothing and I may be able to simply stub them out
with NULLs for Linux since opening the device in Linux should have
much the same effort. More investigation is needed though since
the ZFS interface may make some demands here I'm overlooking.
Almost exclusively this patch handled the addition of another char
array to the zfs_cmd_t structure. Unfortunately c90 doesn't allow
zero filling the entire struct with the '= { 0 };' shorthand.
Originally these changes were on other gcc-* topic branches but
because c90 touches the same bit of code and I'd like to keep all
the gcc-* branches completely parallel I've moved these few bits
over here. This is one of the downsides of not just having one
big patch stack.
Fix new instances or changes in gcc flagged unused code. These are
mostly related to variables which are not used when debugging is
disabled and the ASSERTs are compiled out.