[PATCH 1/2] block: call blk_mq_exit_queue from disk

linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks
@ 2022-07-18  6:29 Christoph Hellwig
  2022-07-18  6:29 ` [PATCH 2/2] Revert "ublk_drv: fix request queue leak" Christoph Hellwig
  2022-07-18  7:18 ` [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks Ming Lei
  0 siblings, 2 replies; 16+ messages in thread
From: Christoph Hellwig @ 2022-07-18  6:29 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, linux-block

To undo the all initialization from blk_mq_init_allocated_queue in case
of a probe failure where add_disk is never called we have to call
blk_mq_exit_queue from put_disk.

We should be doing this in general, but can't do it for the normal
teardown case (yet) as the tagset can be gone by the time the disk is
released once it was added.  I hope to sort this out properly eventual
but for now this isolated hack will do it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/genhd.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/block/genhd.c b/block/genhd.c
index 44dfcf67ed96a..ad8a3678d4480 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1138,6 +1138,18 @@ static void disk_release(struct device *dev)
 	might_sleep();
 	WARN_ON_ONCE(disk_live(disk));
 
+	/*
+	 * To undo the all initialization from blk_mq_init_allocated_queue in
+	 * case of a probe failure where add_disk is never called we have to
+	 * call blk_mq_exit_queue here. We can't do this for the more common
+	 * teardown case (yet) as the tagset can be gone by the time the disk
+	 * is released once it was added.
+	 */
+	if (queue_is_mq(disk->queue) &&
+	    test_bit(GD_OWNS_QUEUE, &disk->state) &&
+	    !test_bit(GD_ADDED, &disk->state))
+		blk_mq_exit_queue(disk->queue);
+
 	blkcg_exit_queue(disk->queue);
 
 	disk_release_events(disk);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/2] Revert "ublk_drv: fix request queue leak"
  2022-07-18  6:29 [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks Christoph Hellwig
@ 2022-07-18  6:29 ` Christoph Hellwig
  2022-07-19 12:37   ` Ming Lei
  2022-07-18  7:18 ` [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks Ming Lei
  1 sibling, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2022-07-18  6:29 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, linux-block

This was just a rather broken way to paper over a core block layer bug
that is now fixed.

This reverts commit cebbe577cb17ed9b04b50d9e6802a8bacffbadca.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/ublk_drv.c | 21 ++++++---------------
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index b633d268b90ae..4e0248ef6bec5 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -155,8 +155,6 @@ static DEFINE_MUTEX(ublk_ctl_mutex);
 
 static struct miscdevice ublk_misc;
 
-static struct lock_class_key ublk_bio_compl_lkclass;
-
 static inline bool ublk_can_use_task_work(const struct ublk_queue *ubq)
 {
 	if (IS_BUILTIN(CONFIG_BLK_DEV_UBLK) &&
@@ -633,7 +631,7 @@ static void ublk_commit_rqs(struct blk_mq_hw_ctx *hctx)
 static int ublk_init_hctx(struct blk_mq_hw_ctx *hctx, void *driver_data,
 		unsigned int hctx_idx)
 {
-	struct ublk_device *ub = driver_data;
+	struct ublk_device *ub = hctx->queue->queuedata;
 	struct ublk_queue *ubq = ublk_get_queue(ub, hctx->queue_num);
 
 	hctx->driver_data = ubq;
@@ -1074,8 +1072,6 @@ static void ublk_cdev_rel(struct device *dev)
 {
 	struct ublk_device *ub = container_of(dev, struct ublk_device, cdev_dev);
 
-	blk_mq_destroy_queue(ub->ub_queue);
-
 	put_disk(ub->ub_disk);
 
 	blk_mq_free_tag_set(&ub->tag_set);
@@ -1165,17 +1161,14 @@ static int ublk_add_dev(struct ublk_device *ub)
 	if (err)
 		goto out_deinit_queues;
 
-	ub->ub_queue = blk_mq_init_queue(&ub->tag_set);
-	if (IS_ERR(ub->ub_queue))
-		goto out_cleanup_tags;
-	ub->ub_queue->queuedata = ub;
-
-	disk = ub->ub_disk = blk_mq_alloc_disk_for_queue(ub->ub_queue,
-						 &ublk_bio_compl_lkclass);
+	disk = ub->ub_disk = blk_mq_alloc_disk(&ub->tag_set, ub);
 	if (IS_ERR(disk)) {
 		err = PTR_ERR(disk);
-		goto out_free_request_queue;
+		goto out_cleanup_tags;
 	}
+	ub->ub_queue = ub->ub_disk->queue;
+
+	ub->ub_queue->queuedata = ub;
 
 	blk_queue_logical_block_size(ub->ub_queue, bsize);
 	blk_queue_physical_block_size(ub->ub_queue, bsize);
@@ -1207,8 +1200,6 @@ static int ublk_add_dev(struct ublk_device *ub)
 
 	return 0;
 
-out_free_request_queue:
-	blk_mq_destroy_queue(ub->ub_queue);
 out_cleanup_tags:
 	blk_mq_free_tag_set(&ub->tag_set);
 out_deinit_queues:
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Revert "ublk_drv: fix request queue leak"
  2022-07-18  6:29 ` [PATCH 2/2] Revert "ublk_drv: fix request queue leak" Christoph Hellwig
@ 2022-07-19 12:37   ` Ming Lei
  2022-07-20  6:07     ` Christoph Hellwig
  0 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2022-07-19 12:37 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-block, ming.lei

Hi Christoph,

On Mon, Jul 18, 2022 at 08:29:28AM +0200, Christoph Hellwig wrote:
> This was just a rather broken way to paper over a core block layer bug
> that is now fixed.
> 
> This reverts commit cebbe577cb17ed9b04b50d9e6802a8bacffbadca.

This change will break START_DEV/STOP_DEV, which is supposed to run
multiple cycles after the device is added, especially this way can
help to implement error recovery from userside, such as one ubq_daemon
is crashed/hang, the device can be recovered by sending STOP_DEV/START_DEV
commands again after new ubq_daemon is setup.

So here we do need separated request_queue/disk, and the model is
similar with scsi's, in which disk rebind needs to be supported
and GD_OWNS_QUEUE can't be set.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Revert "ublk_drv: fix request queue leak"
  2022-07-19 12:37   ` Ming Lei
@ 2022-07-20  6:07     ` Christoph Hellwig
  2022-07-20  7:47       ` Ming Lei
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2022-07-20  6:07 UTC (permalink / raw)
  To: Ming Lei; +Cc: Christoph Hellwig, axboe, linux-block

On Tue, Jul 19, 2022 at 08:37:23PM +0800, Ming Lei wrote:
> This change will break START_DEV/STOP_DEV, which is supposed to run
> multiple cycles after the device is added, especially this way can
> help to implement error recovery from userside, such as one ubq_daemon
> is crashed/hang, the device can be recovered by sending STOP_DEV/START_DEV
> commands again after new ubq_daemon is setup.

What is broken in START_DEV/STOP_DEV?  Please explain the semantics you
want and what doesn't work.  FYI, there is nothing in the test suite the
complains.  And besides the obvious block layer bug that Jens found you
seemed to be perfectly happy with the semantics.

> So here we do need separated request_queue/disk, and the model is
> similar with scsi's, in which disk rebind needs to be supported
> and GD_OWNS_QUEUE can't be set.

SCSI needs it because it needs the request_queue to probe for what ULP
to bind to, and it allows to unbind the ULP.  None of that is the case
here.  And managing the lifetimes separately is a complete mess, so
don't do it.  Especially not in a virtual driver where you don't have
to cater to a long set protocol like SCSI.

> 
> Thanks,
> Ming
---end quoted text---

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Revert "ublk_drv: fix request queue leak"
  2022-07-20  6:07     ` Christoph Hellwig
@ 2022-07-20  7:47       ` Ming Lei
  2022-07-20  9:00         ` Christoph Hellwig
  0 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2022-07-20  7:47 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-block, ming.lei

On Wed, Jul 20, 2022 at 08:07:05AM +0200, Christoph Hellwig wrote:
> On Tue, Jul 19, 2022 at 08:37:23PM +0800, Ming Lei wrote:
> > This change will break START_DEV/STOP_DEV, which is supposed to run
> > multiple cycles after the device is added, especially this way can
> > help to implement error recovery from userside, such as one ubq_daemon
> > is crashed/hang, the device can be recovered by sending STOP_DEV/START_DEV
> > commands again after new ubq_daemon is setup.
> 
> What is broken in START_DEV/STOP_DEV?  Please explain the semantics you
> want and what doesn't work.  FYI, there is nothing in the test suite the
> complains.  And besides the obvious block layer bug that Jens found you
> seemed to be perfectly happy with the semantics.

START_DEV calls add_disk(), and STOP_DEV calls del_gendisk(), but if 
GD_OWNS_QUEUE is set, blk_mq_exit_queue() will be called in
del_gendisk(), then the following START_DEV will stuck.

> 
> > So here we do need separated request_queue/disk, and the model is
> > similar with scsi's, in which disk rebind needs to be supported
> > and GD_OWNS_QUEUE can't be set.
> 
> SCSI needs it because it needs the request_queue to probe for what ULP
> to bind to, and it allows to unbind the ULP.  None of that is the case
> here.  And managing the lifetimes separately is a complete mess, so
> don't do it.  Especially not in a virtual driver where you don't have
> to cater to a long set protocol like SCSI.

If blk_mq_exit_queue is called in del_gendisk() for scsi, how can
re-bind work as expected since it needs one completely workable
request queue instead of partial exited one?


Thanks,
Ming


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Revert "ublk_drv: fix request queue leak"
  2022-07-20  7:47       ` Ming Lei
@ 2022-07-20  9:00         ` Christoph Hellwig
  2022-07-20 10:16           ` Ming Lei
  2022-07-20 10:23           ` Ming Lei
  0 siblings, 2 replies; 16+ messages in thread
From: Christoph Hellwig @ 2022-07-20  9:00 UTC (permalink / raw)
  To: Ming Lei; +Cc: Christoph Hellwig, axboe, linux-block

On Wed, Jul 20, 2022 at 03:47:27PM +0800, Ming Lei wrote:
> > What is broken in START_DEV/STOP_DEV?  Please explain the semantics you
> > want and what doesn't work.  FYI, there is nothing in the test suite the
> > complains.  And besides the obvious block layer bug that Jens found you
> > seemed to be perfectly happy with the semantics.
> 
> START_DEV calls add_disk(), and STOP_DEV calls del_gendisk(), but if 
> GD_OWNS_QUEUE is set, blk_mq_exit_queue() will be called in
> del_gendisk(), then the following START_DEV will stuck.

Uh, yeah.  alloc_disk and add_disk are supposed to be paired and
not split over different ioctls.  The lifetime rules here are
rather broken.

> > > similar with scsi's, in which disk rebind needs to be supported
> > > and GD_OWNS_QUEUE can't be set.
> > 
> > SCSI needs it because it needs the request_queue to probe for what ULP
> > to bind to, and it allows to unbind the ULP.  None of that is the case
> > here.  And managing the lifetimes separately is a complete mess, so
> > don't do it.  Especially not in a virtual driver where you don't have
> > to cater to a long set protocol like SCSI.
> 
> If blk_mq_exit_queue is called in del_gendisk() for scsi, how can
> re-bind work as expected since it needs one completely workable
> request queue instead of partial exited one?

For !GD_OWNS_QUEUE blk_mq_exit_queue is not called from del_gendisk().

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Revert "ublk_drv: fix request queue leak"
  2022-07-20  9:00         ` Christoph Hellwig
@ 2022-07-20 10:16           ` Ming Lei
  2022-07-20 10:23           ` Ming Lei
  1 sibling, 0 replies; 16+ messages in thread
From: Ming Lei @ 2022-07-20 10:16 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-block, ming.lei

On Wed, Jul 20, 2022 at 11:00:40AM +0200, Christoph Hellwig wrote:
> On Wed, Jul 20, 2022 at 03:47:27PM +0800, Ming Lei wrote:
> > > What is broken in START_DEV/STOP_DEV?  Please explain the semantics you
> > > want and what doesn't work.  FYI, there is nothing in the test suite the
> > > complains.  And besides the obvious block layer bug that Jens found you
> > > seemed to be perfectly happy with the semantics.
> > 
> > START_DEV calls add_disk(), and STOP_DEV calls del_gendisk(), but if 
> > GD_OWNS_QUEUE is set, blk_mq_exit_queue() will be called in
> > del_gendisk(), then the following START_DEV will stuck.
> 
> Uh, yeah.  alloc_disk and add_disk are supposed to be paired and
> not split over different ioctls.  The lifetime rules here are
> rather broken.

Can you explain what this way breaks? And why can't one disk be
added/deleted multiple times?

> 
> > > > similar with scsi's, in which disk rebind needs to be supported
> > > > and GD_OWNS_QUEUE can't be set.
> > > 
> > > SCSI needs it because it needs the request_queue to probe for what ULP
> > > to bind to, and it allows to unbind the ULP.  None of that is the case
> > > here.  And managing the lifetimes separately is a complete mess, so
> > > don't do it.  Especially not in a virtual driver where you don't have
> > > to cater to a long set protocol like SCSI.
> > 
> > If blk_mq_exit_queue is called in del_gendisk() for scsi, how can
> > re-bind work as expected since it needs one completely workable
> > request queue instead of partial exited one?
> 
> For !GD_OWNS_QUEUE blk_mq_exit_queue is not called from del_gendisk().

That is why scsi can't set GD_OWNS_QUEUE, for any driver, if disk rebind
or similar behavior is needed, GD_OWNS_QUEUE can't be set. That is why
ublk_drv uses separated queue/disk.


thanks,
Ming


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Revert "ublk_drv: fix request queue leak"
  2022-07-20  9:00         ` Christoph Hellwig
  2022-07-20 10:16           ` Ming Lei
@ 2022-07-20 10:23           ` Ming Lei
  2022-07-20 13:08             ` Christoph Hellwig
  1 sibling, 1 reply; 16+ messages in thread
From: Ming Lei @ 2022-07-20 10:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-block, ming.lei

On Wed, Jul 20, 2022 at 11:00:40AM +0200, Christoph Hellwig wrote:
> On Wed, Jul 20, 2022 at 03:47:27PM +0800, Ming Lei wrote:
> > > What is broken in START_DEV/STOP_DEV?  Please explain the semantics you
> > > want and what doesn't work.  FYI, there is nothing in the test suite the
> > > complains.  And besides the obvious block layer bug that Jens found you
> > > seemed to be perfectly happy with the semantics.
> > 
> > START_DEV calls add_disk(), and STOP_DEV calls del_gendisk(), but if 
> > GD_OWNS_QUEUE is set, blk_mq_exit_queue() will be called in
> > del_gendisk(), then the following START_DEV will stuck.
> 
> Uh, yeah.  alloc_disk and add_disk are supposed to be paired and
> not split over different ioctls.  The lifetime rules here are
> rather broken.

Even though alloc_disk and add_disk is paired here, GD_OWNS_QUEUE still
can't be set because request queue has to be workable for the new alloc/
added disk, just like scsi.

So it is nothing to do with pair of alloc_disk/add_disk().

Not mention I don't see any thing wrong with adding/deleting disk
multiple times.

Thanks,
Ming


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Revert "ublk_drv: fix request queue leak"
  2022-07-20 10:23           ` Ming Lei
@ 2022-07-20 13:08             ` Christoph Hellwig
  2022-07-20 15:33               ` Ming Lei
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2022-07-20 13:08 UTC (permalink / raw)
  To: Ming Lei; +Cc: Christoph Hellwig, axboe, linux-block

On Wed, Jul 20, 2022 at 06:23:22PM +0800, Ming Lei wrote:
> Even though alloc_disk and add_disk is paired here, GD_OWNS_QUEUE still
> can't be set because request queue has to be workable for the new alloc/
> added disk, just like scsi.

How so?  dm has totally normall disk/request_queue lifetimes.  The only
caveat is that the blk-mq bits of the queue are added after the initial
non-mq disk allocation.  There is no newly added disk after the disk
and queue are torn down.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Revert "ublk_drv: fix request queue leak"
  2022-07-20 13:08             ` Christoph Hellwig
@ 2022-07-20 15:33               ` Ming Lei
  2022-07-21  5:12                 ` Christoph Hellwig
  0 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2022-07-20 15:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-block, ming.lei

On Wed, Jul 20, 2022 at 03:08:45PM +0200, Christoph Hellwig wrote:
> On Wed, Jul 20, 2022 at 06:23:22PM +0800, Ming Lei wrote:
> > Even though alloc_disk and add_disk is paired here, GD_OWNS_QUEUE still
> > can't be set because request queue has to be workable for the new alloc/
> > added disk, just like scsi.
> 
> How so?  dm has totally normall disk/request_queue lifetimes.  The only
> caveat is that the blk-mq bits of the queue are added after the initial
> non-mq disk allocation.  There is no newly added disk after the disk
> and queue are torn down.

I meant that request queue is supposed to be low level stuff for implementing
disk function, and request queue hasn't to be released and re-allocated
after each disk whole lifetime(alloc disk, add disk, del_gendisk, release disk).

IMO, the limit is just from GD_OWNS_QUEUE which moves releasing some queue
resource into del_gendisk or disk release(as the fixes you posted), and this
way is fragile, frankly speaking. In theory, allocating queue and releasing
queue should be completely symmetrical, but GD_OWNS_QUEUE does change this way.

And GD_OWNS_QUEUE can't be set for SCSI, so the two modes have to be
supported by block layer.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Revert "ublk_drv: fix request queue leak"
  2022-07-20 15:33               ` Ming Lei
@ 2022-07-21  5:12                 ` Christoph Hellwig
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2022-07-21  5:12 UTC (permalink / raw)
  To: Ming Lei; +Cc: Christoph Hellwig, axboe, linux-block

On Wed, Jul 20, 2022 at 11:33:24PM +0800, Ming Lei wrote:
> I meant that request queue is supposed to be low level stuff for implementing
> disk function, and request queue hasn't to be released and re-allocated
> after each disk whole lifetime(alloc disk, add disk, del_gendisk, release disk).

It doesn't have to.  But it is pretty damn convenient if we can.

What we can't easily do is to re-add a disk after del_gendisk was
called but before the final reference went away, which is what ublk
does do currently.  In other words it does not even follow the SCSI
model, which works but is somewhat cumbersome, but comes up with
yet another weird model that is probably going to fall apart pretty
quickly when doing STAT_DEV, STOP_DEV loops.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks
  2022-07-18  6:29 [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks Christoph Hellwig
  2022-07-18  6:29 ` [PATCH 2/2] Revert "ublk_drv: fix request queue leak" Christoph Hellwig
@ 2022-07-18  7:18 ` Ming Lei
  2022-07-18  8:33   ` Ming Lei
  2022-07-18 13:07   ` Christoph Hellwig
  1 sibling, 2 replies; 16+ messages in thread
From: Ming Lei @ 2022-07-18  7:18 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-block, ming.lei

On Mon, Jul 18, 2022 at 08:29:27AM +0200, Christoph Hellwig wrote:
> To undo the all initialization from blk_mq_init_allocated_queue in case
> of a probe failure where add_disk is never called we have to call
> blk_mq_exit_queue from put_disk.
> 
> We should be doing this in general, but can't do it for the normal
> teardown case (yet) as the tagset can be gone by the time the disk is
> released once it was added.  I hope to sort this out properly eventual
> but for now this isolated hack will do it.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/genhd.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/block/genhd.c b/block/genhd.c
> index 44dfcf67ed96a..ad8a3678d4480 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -1138,6 +1138,18 @@ static void disk_release(struct device *dev)
>  	might_sleep();
>  	WARN_ON_ONCE(disk_live(disk));
>  
> +	/*
> +	 * To undo the all initialization from blk_mq_init_allocated_queue in
> +	 * case of a probe failure where add_disk is never called we have to
> +	 * call blk_mq_exit_queue here. We can't do this for the more common
> +	 * teardown case (yet) as the tagset can be gone by the time the disk
> +	 * is released once it was added.
> +	 */
> +	if (queue_is_mq(disk->queue) &&
> +	    test_bit(GD_OWNS_QUEUE, &disk->state) &&
> +	    !test_bit(GD_ADDED, &disk->state))
> +		blk_mq_exit_queue(disk->queue);
> +

Request queue is allocated successfully, but disk allocation may fail,
so blk_mq_exit_queue still may be missed in this case.


thanks,
Ming


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks
  2022-07-18  7:18 ` [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks Ming Lei
@ 2022-07-18  8:33   ` Ming Lei
  2022-07-18 13:08     ` Christoph Hellwig
  2022-07-18 13:07   ` Christoph Hellwig
  1 sibling, 1 reply; 16+ messages in thread
From: Ming Lei @ 2022-07-18  8:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-block, ming.lei

On Mon, Jul 18, 2022 at 03:18:52PM +0800, Ming Lei wrote:
> On Mon, Jul 18, 2022 at 08:29:27AM +0200, Christoph Hellwig wrote:
> > To undo the all initialization from blk_mq_init_allocated_queue in case
> > of a probe failure where add_disk is never called we have to call
> > blk_mq_exit_queue from put_disk.
> > 
> > We should be doing this in general, but can't do it for the normal
> > teardown case (yet) as the tagset can be gone by the time the disk is
> > released once it was added.  I hope to sort this out properly eventual
> > but for now this isolated hack will do it.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  block/genhd.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/block/genhd.c b/block/genhd.c
> > index 44dfcf67ed96a..ad8a3678d4480 100644
> > --- a/block/genhd.c
> > +++ b/block/genhd.c
> > @@ -1138,6 +1138,18 @@ static void disk_release(struct device *dev)
> >  	might_sleep();
> >  	WARN_ON_ONCE(disk_live(disk));
> >  
> > +	/*
> > +	 * To undo the all initialization from blk_mq_init_allocated_queue in
> > +	 * case of a probe failure where add_disk is never called we have to
> > +	 * call blk_mq_exit_queue here. We can't do this for the more common
> > +	 * teardown case (yet) as the tagset can be gone by the time the disk
> > +	 * is released once it was added.
> > +	 */
> > +	if (queue_is_mq(disk->queue) &&
> > +	    test_bit(GD_OWNS_QUEUE, &disk->state) &&
> > +	    !test_bit(GD_ADDED, &disk->state))
> > +		blk_mq_exit_queue(disk->queue);
> > +
> 
> Request queue is allocated successfully, but disk allocation may fail,
> so blk_mq_exit_queue still may be missed in this case.

Or we do have request queue uses without disk attached, such as nvme
admin/connection queue.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks
  2022-07-18  8:33   ` Ming Lei
@ 2022-07-18 13:08     ` Christoph Hellwig
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2022-07-18 13:08 UTC (permalink / raw)
  To: Ming Lei; +Cc: Christoph Hellwig, axboe, linux-block

On Mon, Jul 18, 2022 at 04:33:33PM +0800, Ming Lei wrote:
> > Request queue is allocated successfully, but disk allocation may fail,
> > so blk_mq_exit_queue still may be missed in this case.
> 
> Or we do have request queue uses without disk attached, such as nvme
> admin/connection queue.

All these need to call blk_mq_destroy_queue anyway, and as far as I
can tell do that already.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks
  2022-07-18  7:18 ` [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks Ming Lei
  2022-07-18  8:33   ` Ming Lei
@ 2022-07-18 13:07   ` Christoph Hellwig
  2022-07-18 15:16     ` Ming Lei
  1 sibling, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2022-07-18 13:07 UTC (permalink / raw)
  To: Ming Lei; +Cc: Christoph Hellwig, axboe, linux-block

On Mon, Jul 18, 2022 at 03:18:52PM +0800, Ming Lei wrote:
> Request queue is allocated successfully, but disk allocation may fail,
> so blk_mq_exit_queue still may be missed in this case.

Yes.  That's a separate issue, though and solved by this one liner:

diff --git a/block/blk-mq.c b/block/blk-mq.c
index d716b7f3763f3..70177ee74295b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3960,7 +3960,7 @@ struct gendisk *__blk_mq_alloc_disk(struct blk_mq_tag_set *set, void *queuedata,
 
 	disk = __alloc_disk_node(q, set->numa_node, lkclass);
 	if (!disk) {
-		blk_put_queue(q);
+		blk_mq_destroy_queue(q);
 		return ERR_PTR(-ENOMEM);
 	}
 	set_bit(GD_OWNS_QUEUE, &disk->state);

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks
  2022-07-18 13:07   ` Christoph Hellwig
@ 2022-07-18 15:16     ` Ming Lei
  0 siblings, 0 replies; 16+ messages in thread
From: Ming Lei @ 2022-07-18 15:16 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-block, ming.lei

On Mon, Jul 18, 2022 at 03:07:25PM +0200, Christoph Hellwig wrote:
> On Mon, Jul 18, 2022 at 03:18:52PM +0800, Ming Lei wrote:
> > Request queue is allocated successfully, but disk allocation may fail,
> > so blk_mq_exit_queue still may be missed in this case.
> 
> Yes.  That's a separate issue, though and solved by this one liner:
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index d716b7f3763f3..70177ee74295b 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -3960,7 +3960,7 @@ struct gendisk *__blk_mq_alloc_disk(struct blk_mq_tag_set *set, void *queuedata,
>  
>  	disk = __alloc_disk_node(q, set->numa_node, lkclass);
>  	if (!disk) {
> -		blk_put_queue(q);
> +		blk_mq_destroy_queue(q);
>  		return ERR_PTR(-ENOMEM);

OK, but it is still tricky, since there isn't the required order between
calling blk_mq_free_tag_set and put_disk, but now put_disk has to be called
before calling blk_mq_free_tag_set, so you may have to audit drivers
first, and make sure that all put_disk is called before blk_mq_free_tag_set()
in driver's error handling code path.



Thanks,
Ming


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-07-21  5:12 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-07-18  6:29 [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks Christoph Hellwig
2022-07-18  6:29 ` [PATCH 2/2] Revert "ublk_drv: fix request queue leak" Christoph Hellwig
2022-07-19 12:37   ` Ming Lei
2022-07-20  6:07     ` Christoph Hellwig
2022-07-20  7:47       ` Ming Lei
2022-07-20  9:00         ` Christoph Hellwig
2022-07-20 10:16           ` Ming Lei
2022-07-20 10:23           ` Ming Lei
2022-07-20 13:08             ` Christoph Hellwig
2022-07-20 15:33               ` Ming Lei
2022-07-21  5:12                 ` Christoph Hellwig
2022-07-18  7:18 ` [PATCH 1/2] block: call blk_mq_exit_queue from disk_release for never added disks Ming Lei
2022-07-18  8:33   ` Ming Lei
2022-07-18 13:08     ` Christoph Hellwig
2022-07-18 13:07   ` Christoph Hellwig
2022-07-18 15:16     ` Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).