The upstream commit cb code had a few bugs:
1) The arguments of the list_move_tail() call in txg_dispatch_callbacks()
were reversed by mistake. This caused the commit callbacks to not be
called at all.
2) ztest had a bug in ztest_dmu_commit_callbacks() where "error" was not
initialized correctly. This seems to have caused the test to always take
the simulated error code path, which made ztest unable to detect whether
commit cbs were being called for transactions that successfully complete.
3) ztest had another bug in ztest_dmu_commit_callbacks() where the commit
cb threshold was not being compared correctly.
4) The commit cb taskq was using 'max_ncpus * 2' as the maxalloc argument
of taskq_create(), which could have caused unnecessary delays in the txg
sync thread.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Add two additional basic sanity tests to confirm zvol snapshots
and clones work. The snapshot test is basically the same as the
example provided in the wiki. The clone test goes one step farther
and clones the snapshot, then modifies it to match the original
modified volume. It then compares them to ensure everything was
modified as expected.
These are just meant to be sanity tests to catch obvious breakage
before tagging a release. They are still not a substitute for a
full regression test suite.
It appears that in earlier kernels the maximum name length of a
kobject was KOBJ_NAME_LEN (20) bytes. This was later extended to
dynamically allocate enough memory if it was over KOBJ_NAME_LEN,
and finally it was always made dynamic. Unfortunately, until this
last step happened it does not appear it was always safe to use
names larger than KOBJ_NAME_LEN. For example, under the RHEL5
2.6.18 kernel if the kobject name length exceeds KOBJ_NAME_LEN
a NULL dereference is tripped.
To avoid this issue the build system has been updated to check
to see if KOBJ_NAME_LEN is defined. If it is, we have to assume
the maximum kobject name length is only 20 bytes. This 20 byte
name must minimally include the following components.
<zpool>/<dataset>[@snapshot[partition]]
While the zfs utilities do block until the expected device appears
they can only do this for full devices, not partitions. This means
that once a device appears it still may take a little bit of time
before the kernel rescans the partition table, updates sysfs, udev
is notified and the partition devices are created. The test case
itself could block briefly waiting for the partition because it knows
what to expect. But for now the simpler thing to do is just delay.
See previous commit for details. But the gist is that with the removal
of the zvol path component the regression tests must be updated to use
the correct path name.
Interestingly this looks like an upstream bug as well. If for some
reason we are unable to get a zvol's statistics, because perhaps the
zpool is hopelessly corrupt, we would trigger the VERIFY. This
commit adds the proper error handling just to propagate the error
back to user space. Now the user space tools still must handle this
properly, but in the worst case the tool will crash or perhaps have
some missing output. That's far, far better than crashing the host.
Closes #45
Several folks have now remarked that when the regression tests
fail they leave a mess behind. This was done intentionally at
the time to facilitate debugging the wreckage.
However, this also means that you may need to do some manual
cleanup such as removing the loopback devices before re-running
the tests. To simplify this procedure I've added the '-c'
option to zconfig.sh which will attempt to cleanup the mess
from a previous test before starting.
This is somewhat dangerous because it must guess as to which
loopback devices you were using. But this risk is fairly minimal
because devices which are currently still in use cannot be
cleaned up. And because only devices with 'zpool' in the name
are considered for removal. That said, if you're running parallel
copies of, say, zconfig.sh this may cause you some trouble.
Update the zconfig.sh test script to verify not only that volumes,
snapshots, and clones are created and removed properly, but also
that the partition information for each of these types of
devices is properly enumerated by the kernel.
Tests 4 and 5 now also create two partitions on the original volume
and these partitions are expected to also exist on the snapshot and
the clone. Correctness is verified after import/export, module
load/unload, dataset creation, and pool destruction.
Additionally, the code to create a partition table was refactored
into a small helper function to simplify the test cases. And
finally all of the function variables were flagged 'local' to ensure
their scope is limited. This should have been done a while ago.
During spa_load the spa->spa_deferred_bpobj may be opened and closed
multiple times. It's critical that when the object is closed the
bpo->bpo_object is set to zero to indicate the object is closed.
If it's not, during spa_load_retry the spa->spa_deferred_bpobj can
be closed twice resulting in a NULL deref.
This appears to have been fixed upstream the same way.
This reverts commit 411dd65af1.
gcc version 4.1.2 does not like having differing prototypes
for zio_execute, one version in the .c with inline and one
version in the .h without. Thus I'm reverting this change
and we'll see how critical this particular stack reduction is.