public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH -next] block: fix blktrace debugfs entries leak
@ 2023-05-11  6:56 Yu Kuai
  2023-05-11 15:28 ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Yu Kuai @ 2023-05-11  6:56 UTC (permalink / raw)
  To: hch, ming.lei, axboe
  Cc: linux-block, linux-kernel, yukuai3, yukuai1, yi.zhang, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Commit 99d055b4fd4b ("block: remove per-disk debugfs files in
blk_unregister_queue") moves blk_trace_shutdown() from
blk_release_queue() to blk_unregister_queue(), this is safe if blktrace
is created through sysfs, however, there are some regression in corner
cases:

1) for scsi, passthrough io can still be issued after del_gendisk, and
   blktrace debugfs entries will be removed immediately after
   del_gendisk(), therefor passthrough io can't be tracked and blktrace
   will complain:

   failed read of /sys/kernel/debug/block/sdb/trace0: 5/Input/output error

2) blktrace can still be enabled after del_gendisk() through ioctl if the
   disk is opened before del_gendisk(), and if blktrace is not shutdown
   through ioctl before closing the disk, debugfs entries will be
   leaked.

It seems 1) is not important, while 2) needs to be fixed apparently.

Fix this problem by shutdown blktrace in blk_free_queue(),
disk_release() is not used because scsi sg support blktrace without
gendisk, and this is safe because queue is not freed yet, and
blk_trace_shutdown() is reentrant.

Fixes: 99d055b4fd4b ("block: remove per-disk debugfs files in blk_unregister_queue")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 00c74330fa92..a0c949533a5d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -263,6 +263,10 @@ static void blk_free_queue_rcu(struct rcu_head *rcu_head)
 
 static void blk_free_queue(struct request_queue *q)
 {
+	mutex_lock(&q->debugfs_mutex);
+	blk_trace_shutdown(q);
+	mutex_unlock(&q->debugfs_mutex);
+
 	blk_free_queue_stats(q->stats);
 	if (queue_is_mq(q))
 		blk_mq_release(q);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH -next] block: fix blktrace debugfs entries leak
  2023-05-11  6:56 [PATCH -next] block: fix blktrace debugfs entries leak Yu Kuai
@ 2023-05-11 15:28 ` Christoph Hellwig
  2023-05-12  7:14   ` Yu Kuai
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2023-05-11 15:28 UTC (permalink / raw)
  To: Yu Kuai
  Cc: hch, ming.lei, axboe, linux-block, linux-kernel, yukuai3,
	yi.zhang, yangerkun

On Thu, May 11, 2023 at 02:56:33PM +0800, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
> 
> Commit 99d055b4fd4b ("block: remove per-disk debugfs files in
> blk_unregister_queue") moves blk_trace_shutdown() from
> blk_release_queue() to blk_unregister_queue(), this is safe if blktrace
> is created through sysfs, however, there are some regression in corner
> cases:
> 
> 1) for scsi, passthrough io can still be issued after del_gendisk, and
>    blktrace debugfs entries will be removed immediately after
>    del_gendisk(), therefor passthrough io can't be tracked and blktrace
>    will complain:
> 
>    failed read of /sys/kernel/debug/block/sdb/trace0: 5/Input/output error

But that is the right thing.  The only thing that has a name is the
gendisk and it is gone at this point.  Leaking the debugfs entries
that are named after, and ultimatively associated with the gendisk
(even if the code is still a bit confused about this) will create a lot
of trouble for us.

> 2) blktrace can still be enabled after del_gendisk() through ioctl if the
>    disk is opened before del_gendisk(), and if blktrace is not shutdown
>    through ioctl before closing the disk, debugfs entries will be
>    leaked.

Yes.

> It seems 1) is not important, while 2) needs to be fixed apparently.
> 
> Fix this problem by shutdown blktrace in blk_free_queue(),
> disk_release() is not used because scsi sg support blktrace without
> gendisk, and this is safe because queue is not freed yet, and
> blk_trace_shutdown() is reentrant.

I think disk_release is the right place for "normal" blktrace.  The
odd cdev based blktrace for /dev/sg will need separate handling.
To be honest I'm not even sure how /dev/sg based passthrough is
even supposed to work in practice, but I'll need to spend some more
time to familarize myself with it.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH -next] block: fix blktrace debugfs entries leak
  2023-05-11 15:28 ` Christoph Hellwig
@ 2023-05-12  7:14   ` Yu Kuai
  2023-05-30  2:07     ` Yu Kuai
  0 siblings, 1 reply; 6+ messages in thread
From: Yu Kuai @ 2023-05-12  7:14 UTC (permalink / raw)
  To: Christoph Hellwig, Yu Kuai
  Cc: ming.lei, axboe, linux-block, linux-kernel, yi.zhang, yangerkun,
	yukuai (C)

Hi,

在 2023/05/11 23:28, Christoph Hellwig 写道:
> On Thu, May 11, 2023 at 02:56:33PM +0800, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> Commit 99d055b4fd4b ("block: remove per-disk debugfs files in
>> blk_unregister_queue") moves blk_trace_shutdown() from
>> blk_release_queue() to blk_unregister_queue(), this is safe if blktrace
>> is created through sysfs, however, there are some regression in corner
>> cases:
>>
>> 1) for scsi, passthrough io can still be issued after del_gendisk, and
>>     blktrace debugfs entries will be removed immediately after
>>     del_gendisk(), therefor passthrough io can't be tracked and blktrace
>>     will complain:
>>
>>     failed read of /sys/kernel/debug/block/sdb/trace0: 5/Input/output error
> 
> But that is the right thing.  The only thing that has a name is the
> gendisk and it is gone at this point.  Leaking the debugfs entries
> that are named after, and ultimatively associated with the gendisk
> (even if the code is still a bit confused about this) will create a lot
> of trouble for us.
> 
>> 2) blktrace can still be enabled after del_gendisk() through ioctl if the
>>     disk is opened before del_gendisk(), and if blktrace is not shutdown
>>     through ioctl before closing the disk, debugfs entries will be
>>     leaked.
> 
> Yes.
> 
>> It seems 1) is not important, while 2) needs to be fixed apparently.
>>
>> Fix this problem by shutdown blktrace in blk_free_queue(),
>> disk_release() is not used because scsi sg support blktrace without
>> gendisk, and this is safe because queue is not freed yet, and
>> blk_trace_shutdown() is reentrant.
> 
> I think disk_release is the right place for "normal" blktrace.  The
> odd cdev based blktrace for /dev/sg will need separate handling.
> To be honest I'm not even sure how /dev/sg based passthrough is
> even supposed to work in practice, but I'll need to spend some more
> time to familarize myself with it.

I'm not sure how to specail hanlde /dev/sg* for now, however,
If we don't care about blktrace for passthrough io after del_gendisk(),
and /dev/sg* has separate handling, I think it's better just to check
QUEUE_FLAG_REGISTERED in blk_trace_setup(), and don't enable blktrace
in the first place.

Thanks,
Kuai


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH -next] block: fix blktrace debugfs entries leak
  2023-05-12  7:14   ` Yu Kuai
@ 2023-05-30  2:07     ` Yu Kuai
  2023-05-30 14:29       ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Yu Kuai @ 2023-05-30  2:07 UTC (permalink / raw)
  To: Yu Kuai, Christoph Hellwig
  Cc: ming.lei, axboe, linux-block, linux-kernel, yi.zhang, yangerkun,
	yukuai (C)

Hi, Christoph

在 2023/05/12 15:14, Yu Kuai 写道:
> Hi,
> 
> 在 2023/05/11 23:28, Christoph Hellwig 写道:
>> On Thu, May 11, 2023 at 02:56:33PM +0800, Yu Kuai wrote:
>>> From: Yu Kuai <yukuai3@huawei.com>
>>>
>>> Commit 99d055b4fd4b ("block: remove per-disk debugfs files in
>>> blk_unregister_queue") moves blk_trace_shutdown() from
>>> blk_release_queue() to blk_unregister_queue(), this is safe if blktrace
>>> is created through sysfs, however, there are some regression in corner
>>> cases:
>>>
>>> 1) for scsi, passthrough io can still be issued after del_gendisk, and
>>>     blktrace debugfs entries will be removed immediately after
>>>     del_gendisk(), therefor passthrough io can't be tracked and blktrace
>>>     will complain:
>>>
>>>     failed read of /sys/kernel/debug/block/sdb/trace0: 5/Input/output 
>>> error
>>
>> But that is the right thing.  The only thing that has a name is the
>> gendisk and it is gone at this point.  Leaking the debugfs entries
>> that are named after, and ultimatively associated with the gendisk
>> (even if the code is still a bit confused about this) will create a lot
>> of trouble for us.
>>
>>> 2) blktrace can still be enabled after del_gendisk() through ioctl if 
>>> the
>>>     disk is opened before del_gendisk(), and if blktrace is not shutdown
>>>     through ioctl before closing the disk, debugfs entries will be
>>>     leaked.
>>
>> Yes.
>>
>>> It seems 1) is not important, while 2) needs to be fixed apparently.
>>>
>>> Fix this problem by shutdown blktrace in blk_free_queue(),
>>> disk_release() is not used because scsi sg support blktrace without
>>> gendisk, and this is safe because queue is not freed yet, and
>>> blk_trace_shutdown() is reentrant.
>>
>> I think disk_release is the right place for "normal" blktrace.  The
>> odd cdev based blktrace for /dev/sg will need separate handling.
>> To be honest I'm not even sure how /dev/sg based passthrough is
>> even supposed to work in practice, but I'll need to spend some more
>> time to familarize myself with it.
> 
> I'm not sure how to specail hanlde /dev/sg* for now, however,
> If we don't care about blktrace for passthrough io after del_gendisk(),
> and /dev/sg* has separate handling, I think it's better just to check
> QUEUE_FLAG_REGISTERED in blk_trace_setup(), and don't enable blktrace
> in the first place.

Any suggestions about this problem? Should we use separate handling for
/dev/sd? Or just free blktrace in blk_free_queue().
> 
> Thanks,
> Kuai
> 
> .
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH -next] block: fix blktrace debugfs entries leak
  2023-05-30  2:07     ` Yu Kuai
@ 2023-05-30 14:29       ` Christoph Hellwig
  2023-05-31  7:42         ` Yu Kuai
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2023-05-30 14:29 UTC (permalink / raw)
  To: Yu Kuai
  Cc: Christoph Hellwig, ming.lei, axboe, linux-block, linux-kernel,
	yi.zhang, yangerkun, yukuai (C)

On Tue, May 30, 2023 at 10:07:54AM +0800, Yu Kuai wrote:
>> If we don't care about blktrace for passthrough io after del_gendisk(),
>> and /dev/sg* has separate handling, I think it's better just to check
>> QUEUE_FLAG_REGISTERED in blk_trace_setup(), and don't enable blktrace
>> in the first place.
>
> Any suggestions about this problem? Should we use separate handling for
> /dev/sd? Or just free blktrace in blk_free_queue().

I'd be fine with trying to either remove the /dev/sg blktrace handling
and / or splitting it up so that it doesn't interact with the main disk
based one.  I can look into this if you want, or leave it to you.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH -next] block: fix blktrace debugfs entries leak
  2023-05-30 14:29       ` Christoph Hellwig
@ 2023-05-31  7:42         ` Yu Kuai
  0 siblings, 0 replies; 6+ messages in thread
From: Yu Kuai @ 2023-05-31  7:42 UTC (permalink / raw)
  To: Christoph Hellwig, Yu Kuai
  Cc: ming.lei, axboe, linux-block, linux-kernel, yi.zhang, yangerkun,
	yukuai (C)

Hi,

在 2023/05/30 22:29, Christoph Hellwig 写道:
> On Tue, May 30, 2023 at 10:07:54AM +0800, Yu Kuai wrote:
>>> If we don't care about blktrace for passthrough io after del_gendisk(),
>>> and /dev/sg* has separate handling, I think it's better just to check
>>> QUEUE_FLAG_REGISTERED in blk_trace_setup(), and don't enable blktrace
>>> in the first place.
>>
>> Any suggestions about this problem? Should we use separate handling for
>> /dev/sd? Or just free blktrace in blk_free_queue().
> 
> I'd be fine with trying to either remove the /dev/sg blktrace handling
> and / or splitting it up so that it doesn't interact with the main disk
> based one.  I can look into this if you want, or leave it to you.
> 

Ok, I'll send a v2 to free blktrace in disk_release(), in the meantime
I'll take a look how to handle blktrace for /dev/sg.

Thanks,
Kuai
> .
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-05-31  7:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-11  6:56 [PATCH -next] block: fix blktrace debugfs entries leak Yu Kuai
2023-05-11 15:28 ` Christoph Hellwig
2023-05-12  7:14   ` Yu Kuai
2023-05-30  2:07     ` Yu Kuai
2023-05-30 14:29       ` Christoph Hellwig
2023-05-31  7:42         ` Yu Kuai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox