From: Ming Lei <ming.lei@redhat.com>
To: "yukuai (C)" <yukuai3@huawei.com>
Cc: Hannes Reinecke <hare@suse.de>, Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
Dan Williams <dan.j.williams@intel.com>,
Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Subject: Re: [PATCH] block: fix "Directory XXXXX with parent 'block' already present!"
Date: Fri, 22 Apr 2022 10:52:39 +0800
Message-ID: <YmIYdzOGRPFGMA4v@T590>
In-Reply-To: <9648097e-25a5-009e-c95f-6a76ea606f5b@huawei.com>
On Fri, Apr 22, 2022 at 09:23:40AM +0800, yukuai (C) wrote:
> On 2022/04/22 1:28, Hannes Reinecke wrote:
> > On 4/21/22 10:34, Ming Lei wrote:
> > > q->debugfs_dir is used by blk-mq debugfs and blktrace. The dentry is
> > > created when the disk is added and removed when the request queue is
> > > released.
> > >
> > > There is a small window between releasing the disk and releasing the
> > > request queue. During that window, a disk with the same name may be
> > > created and added, so debugfs_create_dir() may complain with
> > > "Directory XXXXX with parent 'block' already present!"
> > >
> > > Fix the issue by moving debugfs_create_dir() into blk_alloc_queue().
> > > The directory is named after q->id from the beginning, renamed to the
> > > disk name when the disk is added, and finally renamed back to q->id in
> > > disk_release().
> > >
> > > Reported-by: Dan Williams <dan.j.williams@intel.com>
> > > Cc: yukuai (C) <yukuai3@huawei.com>
> > > Cc: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > ---
> > > block/blk-core.c | 4 ++++
> > > block/blk-sysfs.c | 4 ++--
> > > block/genhd.c | 8 ++++++++
> > > 3 files changed, 14 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/block/blk-core.c b/block/blk-core.c
> > > index f305cb66c72a..245ec664753d 100644
> > > --- a/block/blk-core.c
> > > +++ b/block/blk-core.c
> > > @@ -438,6 +438,7 @@ struct request_queue *blk_alloc_queue(int
> > > node_id, bool alloc_srcu)
> > > {
> > > struct request_queue *q;
> > > int ret;
> > > + char q_name[16];
> > > q = kmem_cache_alloc_node(blk_get_queue_kmem_cache(alloc_srcu),
> > > GFP_KERNEL | __GFP_ZERO, node_id);
> > > @@ -495,6 +496,9 @@ struct request_queue *blk_alloc_queue(int
> > > node_id, bool alloc_srcu)
> > > blk_set_default_limits(&q->limits);
> > > q->nr_requests = BLKDEV_DEFAULT_RQ;
> > > + sprintf(q_name, "%d", q->id);
> > > + q->debugfs_dir = debugfs_create_dir(q_name, blk_debugfs_root);
> > > +
> > > return q;
> > > fail_stats:
> > > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> > > index 88bd41d4cb59..1f986c20a07b 100644
> > > --- a/block/blk-sysfs.c
> > > +++ b/block/blk-sysfs.c
> > > @@ -837,8 +837,8 @@ int blk_register_queue(struct gendisk *disk)
> > > }
> > > mutex_lock(&q->debugfs_mutex);
> > > - q->debugfs_dir = debugfs_create_dir(kobject_name(q->kobj.parent),
> > > - blk_debugfs_root);
> > > + q->debugfs_dir = debugfs_rename(blk_debugfs_root, q->debugfs_dir,
> > > + blk_debugfs_root, kobject_name(q->kobj.parent));
> > > mutex_unlock(&q->debugfs_mutex);
> > > if (queue_is_mq(q)) {
> > > diff --git a/block/genhd.c b/block/genhd.c
> > > index 36532b931841..08895f9f7087 100644
> > > --- a/block/genhd.c
> > > +++ b/block/genhd.c
> > > @@ -25,6 +25,7 @@
> > > #include <linux/pm_runtime.h>
> > > #include <linux/badblocks.h>
> > > #include <linux/part_stat.h>
> > > +#include <linux/debugfs.h>
> > > #include "blk-throttle.h"
> > > #include "blk.h"
> > > @@ -1160,6 +1161,7 @@ static void disk_release_mq(struct
> > > request_queue *q)
> > > static void disk_release(struct device *dev)
> > > {
> > > struct gendisk *disk = dev_to_disk(dev);
> > > + char q_name[16];
> > > might_sleep();
> > > WARN_ON_ONCE(disk_live(disk));
> > > @@ -1173,6 +1175,12 @@ static void disk_release(struct device *dev)
> > > kfree(disk->random);
> > > xa_destroy(&disk->part_tbl);
> > > + mutex_lock(&disk->queue->debugfs_mutex);
> > > + sprintf(q_name, "%d", disk->queue->id);
> > > + disk->queue->debugfs_dir = debugfs_rename(blk_debugfs_root,
> > > + disk->queue->debugfs_dir, blk_debugfs_root, q_name);
> > > + mutex_unlock(&disk->queue->debugfs_mutex);
> > > +
> > > disk->queue->disk = NULL;
> > > blk_put_queue(disk->queue);
> >
> > I don't think this is the right approach.
> > From my POV the underlying reason is an imbalance between
> > debugfs_create_dir() (which happens in blk_register_queue()) and
> > debugfs_remove_dir() (which happens in blk_release_queue())
> >
> > So there is a small race window between blk_unregister_queue() and
> > blk_release_queue(), during which the queue might be re-registered and
> > then traipses over the (still-existent) queue.
> >
> > So we should rather move the call to debugfs_remove_dir() into
> > blk_unregister_queue() to have them both symmetric.
> >
> > Basically the patch '[PATCH RESEND] blk-mq: fix possible creation
> > failure for 'debugfs_dir'' from yukuai ...
> Hi,
>
> I forgot to move 'q->rqos_debugfs_dir' which causes a UAF in
> block/002, and Ming was worried that:
>
> blktrace still may work for passthrough req trace after disk is
> deleted.
There are other issues in your patch:
- "debugfs directory deleted with blktrace active" could be triggered
in block/002.
- disk_release_mq() calls elevator_exit()/rq_qos_exit(), and both may
trigger a UAF if q->debugfs_dir is removed in blk_unregister_queue().
>
> I can shut down blktrace in blk_unregister_queue(), however I was
> worried that a concurrent blk_trace_setup() might re-enable it.
blktrace does work for tracing passthrough requests after the disk is
removed, and your patch makes that impossible.
blk_trace_shutdown() should be done after the disk is released.
Thanks,
Ming