* RFC: try to avoid del_gendisk vs passthrough from ->release deadlocks
@ 2024-10-08 11:57 Christoph Hellwig
2024-10-08 11:57 ` [PATCH 1/2] block: also mark disk-owned queues as dying in __blk_mark_disk_dead Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Christoph Hellwig @ 2024-10-08 11:57 UTC (permalink / raw)
To: Jens Axboe; +Cc: Sergey Senozhatsky, YangYang, linux-block
Hi all,
this is my attempted fix for the problem reported by Sergey in the
"block: del_gendisk() vs blk_queue_enter() race condition" thread. As
I don't have a reproducer this is all just best guest so far, so handle
it with care!
Diffstat
block/genhd.c | 40 ++++++++++++++++++++++++++--------------
include/linux/blkdev.h | 1 +
2 files changed, 27 insertions(+), 14 deletions(-)
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH 1/2] block: also mark disk-owned queues as dying in __blk_mark_disk_dead 2024-10-08 11:57 RFC: try to avoid del_gendisk vs passthrough from ->release deadlocks Christoph Hellwig @ 2024-10-08 11:57 ` Christoph Hellwig 2024-10-09 5:06 ` Sergey Senozhatsky 2024-10-08 11:57 ` [PATCH 2/2] block: mark the disk dead before taking open_mutx in del_gendisk Christoph Hellwig 2024-10-08 15:23 ` RFC: try to avoid del_gendisk vs passthrough from ->release deadlocks Sergey Senozhatsky 2 siblings, 1 reply; 7+ messages in thread From: Christoph Hellwig @ 2024-10-08 11:57 UTC (permalink / raw) To: Jens Axboe; +Cc: Sergey Senozhatsky, YangYang, linux-block When del_gendisk shuts down access to a gendisk, it could lead to a deadlock with sd or, which try to submit passthrough SCSI commands from their ->release method under open_mutex. The submission can be blocked in blk_enter_queue while del_gendisk can't get to actually telling them top stop and wake them up. As the disk is going away there is no real point in sending these commands, but we have no really good way to distinguish between the cases. For now mark even standalone (aka SCSI queues) as dying in del_gendisk to avoid this deadlock, but the real fix will be to split freeing a disk from freezing a queue for not disk associated requests. Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Christoph Hellwig <hch@lst.de> --- block/genhd.c | 14 ++++++++++++-- include/linux/blkdev.h | 1 + 2 files changed, 13 insertions(+), 2 deletions(-) diff --git a/block/genhd.c b/block/genhd.c index 1c05dd4c6980b5..ac1c496ad43343 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -589,8 +589,16 @@ static void __blk_mark_disk_dead(struct gendisk *disk) if (test_and_set_bit(GD_DEAD, &disk->state)) return; - if (test_bit(GD_OWNS_QUEUE, &disk->state)) - blk_queue_flag_set(QUEUE_FLAG_DYING, disk->queue); + /* + * Also mark the disk dead if it is not owned by the gendisk. This + * means we can't allow /dev/sg passthrough or SCSI internal commands + * while unbinding a ULP. That is more than just a bit ugly, but until + * we untangle q_usage_counter into one owned by the disk and one owned + * by the queue this is as good as it gets. The flag will be cleared + * at the end of del_gendisk if it wasn't set before. + */ + if (!test_and_set_bit(QUEUE_FLAG_DYING, &disk->queue->queue_flags)) + set_bit(QUEUE_FLAG_RESURRECT, &disk->queue->queue_flags); /* * Stop buffered writers from dirtying pages that can't be written out. @@ -719,6 +727,8 @@ void del_gendisk(struct gendisk *disk) * again. Else leave the queue frozen to fail all I/O. */ if (!test_bit(GD_OWNS_QUEUE, &disk->state)) { + if (test_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags)) + clear_bit(QUEUE_FLAG_DYING, &q->queue_flags); blk_queue_flag_clear(QUEUE_FLAG_INIT_DONE, q); __blk_mq_unfreeze_queue(q, true); } else { diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 50c3b959da2816..391e3eb3bb5e61 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -590,6 +590,7 @@ struct request_queue { /* Keep blk_queue_flag_name[] in sync with the definitions below */ enum { QUEUE_FLAG_DYING, /* queue being torn down */ + QUEUE_FLAG_RESURRECT, /* temporarily dying */ QUEUE_FLAG_NOMERGES, /* disable merge attempts */ QUEUE_FLAG_SAME_COMP, /* complete on same CPU-group */ QUEUE_FLAG_FAIL_IO, /* fake timeout */ -- 2.45.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] block: also mark disk-owned queues as dying in __blk_mark_disk_dead 2024-10-08 11:57 ` [PATCH 1/2] block: also mark disk-owned queues as dying in __blk_mark_disk_dead Christoph Hellwig @ 2024-10-09 5:06 ` Sergey Senozhatsky 2024-10-09 7:34 ` Christoph Hellwig 0 siblings, 1 reply; 7+ messages in thread From: Sergey Senozhatsky @ 2024-10-09 5:06 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Jens Axboe, Sergey Senozhatsky, YangYang, linux-block On (24/10/08 13:57), Christoph Hellwig wrote: > When del_gendisk shuts down access to a gendisk, it could lead to a > deadlock with sd or, which try to submit passthrough SCSI commands from > their ->release method under open_mutex. The submission can be blocked > in blk_enter_queue while del_gendisk can't get to actually telling them > top stop and wake them up. > > As the disk is going away there is no real point in sending these > commands, but we have no really good way to distinguish between the > cases. For now mark even standalone (aka SCSI queues) as dying in > del_gendisk to avoid this deadlock, but the real fix will be to split > freeing a disk from freezing a queue for not disk associated requests. > > Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org> > Signed-off-by: Christoph Hellwig <hch@lst.de> FWIW Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> [..] > if (!test_bit(GD_OWNS_QUEUE, &disk->state)) { > + if (test_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags)) > + clear_bit(QUEUE_FLAG_DYING, &q->queue_flags); Don't know if we also want to clear QUEUE_FLAG_RESURRECT here, just in case. > blk_queue_flag_clear(QUEUE_FLAG_INIT_DONE, q); > __blk_mq_unfreeze_queue(q, true); > } else { ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] block: also mark disk-owned queues as dying in __blk_mark_disk_dead 2024-10-09 5:06 ` Sergey Senozhatsky @ 2024-10-09 7:34 ` Christoph Hellwig 0 siblings, 0 replies; 7+ messages in thread From: Christoph Hellwig @ 2024-10-09 7:34 UTC (permalink / raw) To: Sergey Senozhatsky; +Cc: Christoph Hellwig, Jens Axboe, YangYang, linux-block On Wed, Oct 09, 2024 at 02:06:02PM +0900, Sergey Senozhatsky wrote: > > if (!test_bit(GD_OWNS_QUEUE, &disk->state)) { > > + if (test_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags)) > > + clear_bit(QUEUE_FLAG_DYING, &q->queue_flags); > > Don't know if we also want to clear QUEUE_FLAG_RESURRECT here, just in > case. Yes, we really should do that. ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/2] block: mark the disk dead before taking open_mutx in del_gendisk 2024-10-08 11:57 RFC: try to avoid del_gendisk vs passthrough from ->release deadlocks Christoph Hellwig 2024-10-08 11:57 ` [PATCH 1/2] block: also mark disk-owned queues as dying in __blk_mark_disk_dead Christoph Hellwig @ 2024-10-08 11:57 ` Christoph Hellwig 2024-10-09 5:04 ` Sergey Senozhatsky 2024-10-08 15:23 ` RFC: try to avoid del_gendisk vs passthrough from ->release deadlocks Sergey Senozhatsky 2 siblings, 1 reply; 7+ messages in thread From: Christoph Hellwig @ 2024-10-08 11:57 UTC (permalink / raw) To: Jens Axboe; +Cc: Sergey Senozhatsky, YangYang, linux-block Now that we stop sd and sr from submitting passthrough commands from their ->release methods we can and should start the drain before taking ->open_mutex, so that we can entirely prevent this kind of deadlock by ensuring that the disk is clearly marked dead before open_mutex is taken in del_gendisk. This includes a revert of commit 7e04da2dc701 ("block: fix deadlock between sd_remove & sd_release"), which was a partial fix for a similar deadlock. Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org> Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Christoph Hellwig <hch@lst.de> --- block/genhd.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/block/genhd.c b/block/genhd.c index ac1c496ad43343..98c8d13171c642 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -655,16 +655,6 @@ void del_gendisk(struct gendisk *disk) if (WARN_ON_ONCE(!disk_live(disk) && !(disk->flags & GENHD_FL_HIDDEN))) return; - disk_del_events(disk); - - /* - * Prevent new openers by unlinked the bdev inode. - */ - mutex_lock(&disk->open_mutex); - xa_for_each(&disk->part_tbl, idx, part) - bdev_unhash(part); - mutex_unlock(&disk->open_mutex); - /* * Tell the file system to write back all dirty data and shut down if * it hasn't been notified earlier. @@ -673,10 +663,22 @@ void del_gendisk(struct gendisk *disk) blk_report_disk_dead(disk, false); /* - * Drop all partitions now that the disk is marked dead. + * Then mark the disk dead to stop new requests from being served ASAP. + * This needs to happen before taking ->open_mutex to prevent deadlocks + * with SCSI ULPs that send passthrough commands from their ->release + * methods. */ - mutex_lock(&disk->open_mutex); __blk_mark_disk_dead(disk); + + disk_del_events(disk); + + /* + * Prevent new openers by unlinking the bdev inode, and drop all + * partitions. + */ + mutex_lock(&disk->open_mutex); + xa_for_each(&disk->part_tbl, idx, part) + bdev_unhash(part); xa_for_each_start(&disk->part_tbl, idx, part, 1) drop_partition(part); mutex_unlock(&disk->open_mutex); -- 2.45.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] block: mark the disk dead before taking open_mutx in del_gendisk 2024-10-08 11:57 ` [PATCH 2/2] block: mark the disk dead before taking open_mutx in del_gendisk Christoph Hellwig @ 2024-10-09 5:04 ` Sergey Senozhatsky 0 siblings, 0 replies; 7+ messages in thread From: Sergey Senozhatsky @ 2024-10-09 5:04 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Jens Axboe, Sergey Senozhatsky, YangYang, linux-block On (24/10/08 13:57), Christoph Hellwig wrote: > Now that we stop sd and sr from submitting passthrough commands from > their ->release methods we can and should start the drain before taking > ->open_mutex, so that we can entirely prevent this kind of deadlock by > ensuring that the disk is clearly marked dead before open_mutex is > taken in del_gendisk. > > This includes a revert of commit 7e04da2dc701 ("block: fix deadlock > between sd_remove & sd_release"), which was a partial fix for a similar > deadlock. > > Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org> > Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org> > Signed-off-by: Christoph Hellwig <hch@lst.de> FWIW Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: RFC: try to avoid del_gendisk vs passthrough from ->release deadlocks 2024-10-08 11:57 RFC: try to avoid del_gendisk vs passthrough from ->release deadlocks Christoph Hellwig 2024-10-08 11:57 ` [PATCH 1/2] block: also mark disk-owned queues as dying in __blk_mark_disk_dead Christoph Hellwig 2024-10-08 11:57 ` [PATCH 2/2] block: mark the disk dead before taking open_mutx in del_gendisk Christoph Hellwig @ 2024-10-08 15:23 ` Sergey Senozhatsky 2 siblings, 0 replies; 7+ messages in thread From: Sergey Senozhatsky @ 2024-10-08 15:23 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Jens Axboe, Sergey Senozhatsky, YangYang, linux-block On (24/10/08 13:57), Christoph Hellwig wrote: > Hi all, > > this is my attempted fix for the problem reported by Sergey in the > "block: del_gendisk() vs blk_queue_enter() race condition" thread. As > I don't have a reproducer this is all just best guest so far, so handle > it with care! This looks promising. Thanks for working on this. Let me try to test it (albeit without a reproducer). ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-10-09 7:34 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-10-08 11:57 RFC: try to avoid del_gendisk vs passthrough from ->release deadlocks Christoph Hellwig 2024-10-08 11:57 ` [PATCH 1/2] block: also mark disk-owned queues as dying in __blk_mark_disk_dead Christoph Hellwig 2024-10-09 5:06 ` Sergey Senozhatsky 2024-10-09 7:34 ` Christoph Hellwig 2024-10-08 11:57 ` [PATCH 2/2] block: mark the disk dead before taking open_mutx in del_gendisk Christoph Hellwig 2024-10-09 5:04 ` Sergey Senozhatsky 2024-10-08 15:23 ` RFC: try to avoid del_gendisk vs passthrough from ->release deadlocks Sergey Senozhatsky
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).