This was done becaese fix-stack-ztest was added to the stack
in series after fix-pthreads because fix-stack-ztest depends
on many of the pthreads changes.
While ztest does run in user space we run it with the same stack
restrictions it would have in kernel space. This ensures that any
stack related issues which would be hit in the kernel can be caught
and debugged in user space instead.
This patch is a first pass to limit the stack usage of every ztest
function to 1024 bytes. Subsequent updates can further reduce this
Due to limited stack space recursive functions are frowned upon in
the Linux kernel. However, they often are the most elegant solution
to a problem. The following code preserves the recursive function
traverse_visitbp() but moves the local variables AND function
arguments to the heap to minimize the stack frame size. Enough
space is initially allocated on the stack for 20 levels of recursion.
This change does ugly-up-the-code but it reduces the worst case
usage from roughly 4160 bytes to 960 bytes on x86_64 archs.
These changes are now taken care of by the fix-stack-traverse_impl
topic branch which not only solves the uninit problem but also
moves these locals off the stack and on to the heap.
Move dsl_dataset_t local variable from the stack to the heap.
This reduces the stack usage of this function from 2048 bytes
to 176 bytes for x84_64 arches.
For all module/library functions ensure so stack frame exceeds 1024
bytes. Ideally this should be set lower to say 512 bytes but there
are still numerous functions which exceed even this limit. For now
this is set to 1024 to ensure we catch the worst offenders.
Additionally, set the limit for ztest to 1024 bytes since the idea
here is to catch stack issues in user space before we find them by
overrunning a kernel stack. This should also be reduced to 512
bytes as soon as all the trouble makes are fixed.
Finally, add -fstack-check to gcc build options when --enable-debug
is specified at configure time. This ensures that each page on the
stack will be touched and we will generate a segfault on stack
overflow.
Over time we can gradually fix the following functions:
536 zfs:dsl_deadlist_regenerate
536 zfs:dsl_load_sets
536 zfs:zil_parse
544 zfs:zfs_ioc_recv
552 zfs:dsl_deadlist_insert_bpobj
552 zfs:vdev_dtl_sync
584 zfs:copy_create_perms
608 zfs:ddt_class_contains
608 zfs:ddt_prefetch
608 zfs:__dprintf
616 zfs:ddt_lookup
648 zfs:dsl_scan_ddt
696 zfs:dsl_deadlist_merge
736 zfs:ddt_zap_walk
744 zfs:dsl_prop_get_all_impl
872 zfs:dnode_evict_dbufs