* [PATCH] dm: fix AB-BA deadlock in __dm_destroy()
@ 2015-10-01 8:31 Junichi Nomura
2015-10-01 12:56 ` Mikulas Patocka
2015-10-01 20:45 ` Mike Snitzer
0 siblings, 2 replies; 4+ messages in thread
From: Junichi Nomura @ 2015-10-01 8:31 UTC
To: device-mapper development, Mikulas Patocka
__dm_destroy() takes io_barrier SRCU lock (dm_get_live_table) and
suspend_lock in reverse order. That can cause AB-BA deadlock:
Example:
  __dm_destroy                  dm_swap_table
  ---------------------------------------------------
                                mutex_lock(suspend_lock)
  dm_get_live_table()
    srcu_read_lock(io_barrier)
                                dm_sync_table()
                                  synchronize_srcu(io_barrier)
                                    .. waiting for dm_put_live_table()
  mutex_lock(suspend_lock)
    .. waiting for suspend_lock
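This patch fixes the lock ordering: __dm_destroy() now takes suspend_lock
before calling dm_get_live_table() and releases it only after
dm_put_live_table().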
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Fixes: ab7c7bb6f4ab ("dm: hold suspend_lock while suspending device during device deletion")
Cc: Mikulas Patocka <mpatocka@redhat.com>
---
The problem can be reproduced with the script below, but it may take a
long time. (In my environment, it took more than 10 minutes.)
-- cut here --
#!/bin/bash

t0="0 1024 zero"
t1="0 1024 error"
mapname=testmap

work1()
{
	while true; do
		dmsetup create --notable $mapname
		echo "$t0" | dmsetup load $mapname
		dmsetup resume $mapname
		dmsetup remove_all
	done
}

work2()
{
	while true; do
		echo "$t1" | dmsetup load $mapname
		dmsetup resume $mapname
		echo "$t0" | dmsetup load $mapname
		dmsetup resume $mapname
	done
}

work1 &
work2 &
wait
-- cut here --
Once started, the script emits many errors such as "No such device or
address" and stops making progress when the deadlock occurs.
The backtraces of the two stuck dmsetup processes look like this:
# ps auxw|grep dmsetup
root 32209 0.0 0.0 130024 3060 pts/0 D+ 03:26 0:00 dmsetup resume testmap
root 32210 0.0 0.0 130024 3048 pts/0 D+ 03:26 0:00 dmsetup remove_all
# cat /proc/32210/stack
[<ffffffffa00029ea>] __dm_destroy+0xba/0x280 [dm_mod]
[<ffffffffa0003ec3>] dm_destroy+0x13/0x20 [dm_mod]
[<ffffffffa0007edd>] dm_hash_remove_all+0x6d/0x130 [dm_mod]
[<ffffffffa0007fc2>] remove_all+0x22/0x30 [dm_mod]
[<ffffffffa0009a65>] ctl_ioctl+0x255/0x4d0 [dm_mod]
[<ffffffffa0009cf3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[<ffffffff81210c82>] do_vfs_ioctl+0x2d2/0x4b0
[<ffffffff81210ed9>] SyS_ioctl+0x79/0x90
[<ffffffff816859ee>] entry_SYSCALL_64_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff
# cat /proc/32209/stack
[<ffffffff810e1d34>] __synchronize_srcu+0xf4/0x130
[<ffffffff810e1d94>] synchronize_srcu+0x24/0x30
[<ffffffffa000406d>] dm_swap_table+0x17d/0x2e0 [dm_mod]
[<ffffffffa00090fa>] dev_suspend+0x9a/0x240 [dm_mod]
[<ffffffffa0009a65>] ctl_ioctl+0x255/0x4d0 [dm_mod]
[<ffffffffa0009cf3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[<ffffffff81210c82>] do_vfs_ioctl+0x2d2/0x4b0
[<ffffffff81210ed9>] SyS_ioctl+0x79/0x90
[<ffffffff816859ee>] entry_SYSCALL_64_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 6264781..7289ece 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2837,8 +2837,6 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
 
 	might_sleep();
 
-	map = dm_get_live_table(md, &srcu_idx);
-
 	spin_lock(&_minor_lock);
 	idr_replace(&_minor_idr, MINOR_ALLOCED, MINOR(disk_devt(dm_disk(md))));
 	set_bit(DMF_FREEING, &md->flags);
@@ -2852,14 +2850,14 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
 	 * do not race with internal suspend.
 	 */
 	mutex_lock(&md->suspend_lock);
+	map = dm_get_live_table(md, &srcu_idx);
 	if (!dm_suspended_md(md)) {
 		dm_table_presuspend_targets(map);
 		dm_table_postsuspend_targets(map);
 	}
-	mutex_unlock(&md->suspend_lock);
-
 	/* dm_put_live_table must be before msleep, otherwise deadlock is possible */
 	dm_put_live_table(md, srcu_idx);
+	mutex_unlock(&md->suspend_lock);
 
 	/*
 	 * Rare, but there may be I/O requests still going to complete,
* Re: [PATCH] dm: fix AB-BA deadlock in __dm_destroy()
2015-10-01 8:31 [PATCH] dm: fix AB-BA deadlock in __dm_destroy() Junichi Nomura
@ 2015-10-01 12:56 ` Mikulas Patocka
2015-10-01 20:45 ` Mike Snitzer
1 sibling, 0 replies; 4+ messages in thread
From: Mikulas Patocka @ 2015-10-01 12:56 UTC
To: Junichi Nomura; +Cc: device-mapper development, Mike Snitzer
I think this patch is OK.
It should also be backported to stable kernels starting with 3.11. I think
older versions are not affected because they don't use SRCU here.
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # 3.11+
Mikulas
* Re: dm: fix AB-BA deadlock in __dm_destroy()
2015-10-01 8:31 [PATCH] dm: fix AB-BA deadlock in __dm_destroy() Junichi Nomura
2015-10-01 12:56 ` Mikulas Patocka
@ 2015-10-01 20:45 ` Mike Snitzer
2015-10-02 2:45 ` Junichi Nomura
1 sibling, 1 reply; 4+ messages in thread
From: Mike Snitzer @ 2015-10-01 20:45 UTC
To: Junichi Nomura; +Cc: device-mapper development, Mikulas Patocka
On Thu, Oct 01 2015 at 4:31am -0400,
Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:
> __dm_destroy() takes io_barrier SRCU lock (dm_get_live_table) and
> suspend_lock in reverse order. That can cause AB-BA deadlock:
>
> Example:
>
>   __dm_destroy                  dm_swap_table
>   ---------------------------------------------------
>                                 mutex_lock(suspend_lock)
>   dm_get_live_table()
>     srcu_read_lock(io_barrier)
>                                 dm_sync_table()
>                                   synchronize_srcu(io_barrier)
>                                     .. waiting for dm_put_live_table()
>   mutex_lock(suspend_lock)
>     .. waiting for suspend_lock
>
> This patch fixes the lock ordering.
>
> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
> Fixes: ab7c7bb6f4ab ("dm: hold suspend_lock while suspending device during device deletion")
> Cc: Mikulas Patocka <mpatocka@redhat.com>
> ---
> The problem could be reproduced with this script but it might take long.
> (In my environment, it took more than 10 minutes)
Hi,
Thanks for fixing this. What prompted you to chase this down? Was it
the work you were doing to reproduce Bart's blk-mq mpath failure that
exposed this issue?
FYI, interestingly, your fix looks to be applicable to this issue too:
https://bugzilla.redhat.com/show_bug.cgi?id=1267650
Thanks again,
Mike