The current test rig consists of two 60 disk dragon drawers in configured
in 4-x15 mode. Each drawer has 4 SAS connections to my node for a total
of 8 SAS connections spread over 4 dual-port LSI SAS adapters. The
configures are as follows:
- raid0: All 120 drives in a single pool.
- raidz: 15 RAIDZ groups of 7+1.
- raidz2: 15 RAIDZ2 groups of 6+2.
This change extends the existing in-tree test infrastructure such
that it can also be run as part of a the installed package. This
simplifies testing on multiple systems and is generally all around
useful. The scripts may still be run in-tree and will use the
in-tree build products as long as .script-config exists.
Supported and tested distros now include SLES10, SLES11, Chaos 4.x,
RHEL5, and Fedora 11. This update was mainly to address rebuildable
kernel module rpms, and correct rpm dependencies for each distro.
Under FC11 rpm builds by default add the --fortify-source option which
ensures that functions flagged with certain attributes must have their
return codes checked. Normally this is just a warning but we always
build with -Werror so this is fatal. Simply wrap the function in a
verify call to ensure we catch a failure if there is one.
With this patch applied I get the following failure 100% of the time,
I'd prefer to debug it and keep moving forward but I do not have the
time right now so I'm reverting the patch to the version which worked.
Ricardo please fix.
(gdb) bt
0 ztest_dmu_write_parallel (za=0x2aaaac898960) at
../../cmd/ztest/ztest.c:2566
1 0x0000000000405a79 in ztest_thread (arg=<value optimized out>)
at ../../cmd/ztest/ztest.c:3862
2 0x00002b2e6a7a841d in zk_thread_helper (arg=<value optimized out>)
at ../../lib/libzpool/kernel.c:131
3 0x000000379be06367 in start_thread (arg=<value optimized out>)
at pthread_create.c:297
4 0x000000379b2d30ad in clone () from /lib64/libc.so.6
This resolves previous scalabily concerns about the cost of calling
curthread which previously required a list walk. The kthread address
is now tracked as thread specific data which can be quickly returned.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
There is concern that READA may do more than simply reorder the queue.
There may be an increased chance that a requested marked READA will
fail because the elevator considers it optional. For this reason, all
read requests, even speculative ones, have been converted back to READ.