splat taskq:front: Fix race

The taskq:front test has a race condition where task 4 and 8
race to complete, due to an incorrectly calculated set of delay
"factors" (T). If task 4 wins and actually finishes first, the
verification of the order of completion will fail.

The delays calculated to order task completion do not take into
account the terminal line in the table, and so are all off by
a factor of 1. This causes all the tasks in all queues to finish
sooner than expected and the accumulated error is the root cause
of tasks 4 and 8 racing to complete first. Before the change the
"actual" table looks like I commented in #130.

I changed:

* the table in the comment to correctly reflect the test and the
  factor timings needed.
* the individual task delay factors of T so that ONLY 1 task will
  every 2T. (on average)
* 1T was reduced from 100ms to 50ms. This halves the duration of
  the test and makes any remaining raciness more likely to cause
  failures, but it did not cause the test to fail.
* simplified the delay factor logic by using a table look-up
  instead of a switch.
* Added a "task started" message so that with -v it is possible
  to see the order tasks are started.
* Moved the "task completed" message inside the spinlock so that
  with -v the message truly reflects the absolute order of
  completion as guaranteed by the spinlock.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #130
This commit is contained in:
Steven Johnson 2012-12-05 11:39:32 +08:00 committed by Brian Behlendorf
parent 053678f3b0
commit 9b88fa165f
1 changed files with 32 additions and 30 deletions

View File

@ -754,50 +754,52 @@ splat_taskq_test5(struct file *file, void *arg)
* scheduled. Each rows represent one time unit and each column * scheduled. Each rows represent one time unit and each column
* one of the three worker threads. * one of the three worker threads.
* *
* +-----+ * NB: The Horizontal Line is the LAST Time unit consumed by the Task,
* | | * and must be included in the factor calculation.
* +-----+ | * T
* | | 5 +-----+ * 17-> +-----+
* | | | | * 16 | T6 |
* | +-----| | * 15-> +-----+ |
* | 4 | | | * 14 | T6 | |
* +-----+ | 8 | * 13-> | | 5 +-----+
* | | | | * 12 | | | T6 |
* | | 7 +-----+ * 11-> | +-----| |
* | | | | * 10 | 4 | T6 | |
* | |-----+ | * 9-> +-----+ | 8 |
* | 6 | | | * 8 | T5 | | |
* +-----+ | | * 7-> | | 7 +-----+
* | | | | * 6 | | | T7 |
* | 1 | 2 | 3 | * 5-> | +-----+ |
* +-----+-----+-----+ * 4 | 6 | T5 | |
* 3-> +-----+ | |
* 2 | T3 | | |
* 1 | 1 | 2 | 3 |
* 0 +-----+-----+-----+
* *
*/ */
static void static void
splat_taskq_test6_func(void *arg) splat_taskq_test6_func(void *arg)
{ {
/* Delays determined by above table */
static const int factor[SPLAT_TASKQ_ORDER_MAX+1] = {0,3,5,7,6,6,5,6,6};
splat_taskq_id_t *tq_id = (splat_taskq_id_t *)arg; splat_taskq_id_t *tq_id = (splat_taskq_id_t *)arg;
splat_taskq_arg_t *tq_arg = tq_id->arg; splat_taskq_arg_t *tq_arg = tq_id->arg;
int factor;
/* Delays determined by above table */
switch (tq_id->id) {
default: factor = 0; break;
case 1: factor = 2; break;
case 2: case 4: case 5: factor = 4; break;
case 6: case 7: case 8: factor = 5; break;
case 3: factor = 6; break;
}
msleep(factor * 100);
splat_vprint(tq_arg->file, tq_arg->name, splat_vprint(tq_arg->file, tq_arg->name,
"Taskqid %d complete for taskq '%s'\n", "Taskqid %d starting for taskq '%s'\n",
tq_id->id, tq_arg->name); tq_id->id, tq_arg->name);
if (tq_id->id < SPLAT_TASKQ_ORDER_MAX+1) {
msleep(factor[tq_id->id] * 50);
}
spin_lock(&tq_arg->lock); spin_lock(&tq_arg->lock);
tq_arg->order[tq_arg->flag] = tq_id->id; tq_arg->order[tq_arg->flag] = tq_id->id;
tq_arg->flag++; tq_arg->flag++;
splat_vprint(tq_arg->file, tq_arg->name,
"Taskqid %d complete for taskq '%s'\n",
tq_id->id, tq_arg->name);
spin_unlock(&tq_arg->lock); spin_unlock(&tq_arg->lock);
} }