Fix zpool on zvol lock inversion deadlock
In all but one case the spa_namespace_lock is taken before the bdev->bd_mutex lock. But the Linux __blkdev_get() function calls fops->open() with the bdev->bd_mutex lock held, and we must somehow still safely acquire the spa_namespace_lock.

To avoid a potential lock inversion deadlock we preemptively try to take the spa_namespace_lock. Normally it will not be contended, and this is safe because spa_open_common() handles the case where the caller already holds the spa_namespace_lock.

When it is contended we would risk a lock inversion if we blocked waiting for the lock. Luckily, the __blkdev_get() function allows us to return -ERESTARTSYS, which results in bdev->bd_mutex being dropped and reacquired, and fops->open() being called again. This process can be repeated safely until both locks are acquired.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #612
parent d5446cfc52
commit 65d56083b4
@@ -891,11 +891,39 @@ zvol_first_open(zvol_state_t *zv)
 {
 	objset_t *os;
 	uint64_t volsize;
+	int locked = 0;
 	int error;
 	uint64_t ro;
 
+	/*
+	 * In all other cases the spa_namespace_lock is taken before the
+	 * bdev->bd_mutex lock. But in this case the Linux __blkdev_get()
+	 * function calls fops->open() with the bdev->bd_mutex lock held.
+	 *
+	 * To avoid a potential lock inversion deadlock we preemptively
+	 * try to take the spa_namespace_lock(). Normally it will not
+	 * be contended and this is safe because spa_open_common() handles
+	 * the case where the caller already holds the spa_namespace_lock.
+	 *
+	 * When it is contended we risk a lock inversion if we were to
+	 * block waiting for the lock. Luckily, the __blkdev_get()
+	 * function allows us to return -ERESTARTSYS which will result in
+	 * bdev->bd_mutex being dropped, reacquired, and fops->open() being
+	 * called again. This process can be repeated safely until both
+	 * locks are acquired.
+	 */
+	if (!mutex_owned(&spa_namespace_lock)) {
+		locked = mutex_tryenter(&spa_namespace_lock);
+		if (!locked)
+			return (-ERESTARTSYS);
+	}
+
 	/* lie and say we're read-only */
 	error = dmu_objset_own(zv->zv_name, DMU_OST_ZVOL, 1, zvol_tag, &os);
+
+	if (locked)
+		mutex_exit(&spa_namespace_lock);
+
 	if (error)
 		return (-error);
 
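The try-lock-and-restart approach described in the commit message can be sketched in user space with POSIX threads. This is only an illustration under stated assumptions, not the ZFS or kernel code: `namespace_lock`, `device_lock`, `DEMO_RESTART`, `demo_first_open()`, and `demo_open_with_retry()` are hypothetical names standing in for `spa_namespace_lock`, `bdev->bd_mutex`, `ERESTARTSYS`, `zvol_first_open()`, and the retry behaviour of `__blkdev_get()`. Because pthreads has no equivalent of `mutex_owned()`, the sketch passes ownership in as a flag.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-ins for spa_namespace_lock and bdev->bd_mutex. */
static pthread_mutex_t namespace_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the kernel-internal ERESTARTSYS code. */
#define	DEMO_RESTART	512

/*
 * Models fops->open(): the caller is assumed to hold device_lock.
 * It never blocks on namespace_lock; on contention it asks the
 * caller to drop device_lock and call again.
 */
static int
demo_first_open(bool caller_holds_namespace_lock)
{
	bool locked = false;

	if (!caller_holds_namespace_lock) {
		if (pthread_mutex_trylock(&namespace_lock) != 0)
			return (-DEMO_RESTART);
		locked = true;
	}

	/* ... work that needs both locks would go here ... */

	/* Release only what this function took itself. */
	if (locked)
		pthread_mutex_unlock(&namespace_lock);

	return (0);
}

/*
 * Models the caller side: take device_lock, call open, drop
 * device_lock, and repeat while open requests a restart.
 */
static int
demo_open_with_retry(int max_attempts)
{
	int error = -DEMO_RESTART;

	assert(max_attempts > 0);

	for (int i = 0; i < max_attempts && error == -DEMO_RESTART; i++) {
		pthread_mutex_lock(&device_lock);
		error = demo_first_open(false);
		pthread_mutex_unlock(&device_lock);

		/* Give the current holder of namespace_lock a chance to run. */
		if (error == -DEMO_RESTART)
			sched_yield();
	}

	return (error);
}
```

The point of the design is that the thread holding `device_lock` never sleeps waiting for `namespace_lock`, so a thread that took the locks in the usual order (namespace first, device second) can always make progress. The attempt limit exists only to keep the sketch bounded; the commit itself relies on the caller restarting the open.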