From: Xiao Ni <xni@redhat.com>
To: linux-raid@vger.kernel.org
Cc: yukuai3@huawei.com, ncroxon@redhat.com, song@kernel.org,
yukuai1@huaweicloud.com
Subject: [PATCH 1/3] md: call del_gendisk in control path
Date: Wed, 11 Jun 2025 15:31:06 +0800 [thread overview]
Message-ID: <20250611073108.25463-2-xni@redhat.com> (raw)
In-Reply-To: <20250611073108.25463-1-xni@redhat.com>
Now del_gendisk and put_disk are called asynchronously in workqueue work.
The asynchronous way has a problem that the device node can still exist
after mdadm --stop command returns in a short window. So udev rule can
open this device node and create the struct mddev in kernel again. So put
del_gendisk in control path and still leave put_disk in md_kobj_release
to avoid uaf of gendisk.
Function del_gendisk can't be called with reconfig_mutex. If it's called
with reconfig mutex, a deadlock can happen. del_gendisk waits all sysfs
files access to finish and sysfs file access waits reconfig mutex. So
put del_gendisk after releasing reconfig mutex.
But there is still a window that sysfs can be accessed between mddev_unlock
and del_gendisk. So some actions (add disk, change level, .e.g) can happen
which lead unexpected results. MD_DELETED is used to resolve this problem.
MD_DELETED is set before releasing reconfig mutex and it should be checked
for these sysfs access which need reconfig mutex. For sysfs access which
don't need reconfig mutex, del_gendisk will wait them to finish.
But it doesn't need to do this in function mddev_lock_nointr. There are
ten places that call it.
* Five of them are in dm raid which we don't need to care. MD_DELETED is
only used for md raid.
* stop_sync_thread, md_do_sync and md_start_sync are related sync request,
and it needs to wait sync thread to finish before stopping an array.
* md_ioctl: md_open is called before md_ioctl, so ->openers is added. It
will fail to stop the array. So it doesn't need to check MD_DELETED here
* md_set_readonly:
It needs to call mddev_set_closing_and_sync_blockdev when setting readonly
or read_auto. So it will fail to stop the array too because MD_CLOSING is
already set.
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
---
drivers/md/md.c | 33 +++++++++++++++++++++++----------
drivers/md/md.h | 26 ++++++++++++++++++++++++--
2 files changed, 47 insertions(+), 12 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 0f03b21e66e4..7445e44eabff 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -636,9 +636,6 @@ static void __mddev_put(struct mddev *mddev)
mddev->ctime || mddev->hold_active)
return;
- /* Array is not configured at all, and not held active, so destroy it */
- set_bit(MD_DELETED, &mddev->flags);
-
/*
* Call queue_work inside the spinlock so that flush_workqueue() after
* mddev_find will succeed in waiting for the work to be done.
@@ -873,6 +870,16 @@ void mddev_unlock(struct mddev *mddev)
kobject_del(&rdev->kobj);
export_rdev(rdev, mddev);
}
+
+ /* Call del_gendisk after release reconfig_mutex to avoid
+ * deadlock (e.g. call del_gendisk under the lock and an
+ * access to sysfs files waits the lock)
+ * And MD_DELETED is only used for md raid which is set in
+ * do_md_stop. dm raid only uses md_stop to stop. So dm raid
+ * doesn't need to check MD_DELETED when getting reconfig lock
+ */
+ if (test_bit(MD_DELETED, &mddev->flags))
+ del_gendisk(mddev->gendisk);
}
EXPORT_SYMBOL_GPL(mddev_unlock);
@@ -5774,19 +5781,30 @@ md_attr_store(struct kobject *kobj, struct attribute *attr,
struct md_sysfs_entry *entry = container_of(attr, struct md_sysfs_entry, attr);
struct mddev *mddev = container_of(kobj, struct mddev, kobj);
ssize_t rv;
+ struct kernfs_node *kn = NULL;
if (!entry->store)
return -EIO;
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
+
+ if (entry->store == array_state_store && cmd_match(page, "clear"))
+ kn = sysfs_break_active_protection(kobj, attr);
+
spin_lock(&all_mddevs_lock);
if (!mddev_get(mddev)) {
spin_unlock(&all_mddevs_lock);
+ if (kn)
+ sysfs_unbreak_active_protection(kn);
return -EBUSY;
}
spin_unlock(&all_mddevs_lock);
rv = entry->store(mddev, page, length);
mddev_put(mddev);
+
+ if (kn)
+ sysfs_unbreak_active_protection(kn);
+
return rv;
}
@@ -5794,12 +5812,6 @@ static void md_kobj_release(struct kobject *ko)
{
struct mddev *mddev = container_of(ko, struct mddev, kobj);
- if (mddev->sysfs_state)
- sysfs_put(mddev->sysfs_state);
- if (mddev->sysfs_level)
- sysfs_put(mddev->sysfs_level);
-
- del_gendisk(mddev->gendisk);
put_disk(mddev->gendisk);
}
@@ -6646,8 +6658,9 @@ static int do_md_stop(struct mddev *mddev, int mode)
mddev->bitmap_info.offset = 0;
export_array(mddev);
-
md_clean(mddev);
+ set_bit(MD_DELETED, &mddev->flags);
+
if (mddev->hold_active == UNTIL_STOP)
mddev->hold_active = 0;
}
diff --git a/drivers/md/md.h b/drivers/md/md.h
index d45a9e6ead80..67b365621507 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -700,11 +700,26 @@ static inline bool reshape_interrupted(struct mddev *mddev)
static inline int __must_check mddev_lock(struct mddev *mddev)
{
- return mutex_lock_interruptible(&mddev->reconfig_mutex);
+ int ret;
+
+ ret = mutex_lock_interruptible(&mddev->reconfig_mutex);
+
+ /* MD_DELETED is set in do_md_stop with reconfig_mutex.
+ * So check it here.
+ */
+ if (!ret && test_bit(MD_DELETED, &mddev->flags)) {
+ ret = -ENODEV;
+ mutex_unlock(&mddev->reconfig_mutex);
+ }
+
+ return ret;
}
/* Sometimes we need to take the lock in a situation where
* failure due to interrupts is not acceptable.
+ * It doesn't need to check MD_DELETED here, the owner which
+ * holds the lock here can't be stopped. And all paths can't
+ * call this function after do_md_stop.
*/
static inline void mddev_lock_nointr(struct mddev *mddev)
{
@@ -713,7 +728,14 @@ static inline void mddev_lock_nointr(struct mddev *mddev)
static inline int mddev_trylock(struct mddev *mddev)
{
- return mutex_trylock(&mddev->reconfig_mutex);
+ int ret;
+
+ ret = mutex_trylock(&mddev->reconfig_mutex);
+ if (!ret && test_bit(MD_DELETED, &mddev->flags)) {
+ ret = -ENODEV;
+ mutex_unlock(&mddev->reconfig_mutex);
+ }
+ return ret;
}
extern void mddev_unlock(struct mddev *mddev);
--
2.32.0 (Apple Git-132)
next prev parent reply other threads:[~2025-06-11 7:31 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-11 7:31 [PATCH V6 0/3] md: call del_gendisk in sync way Xiao Ni
2025-06-11 7:31 ` Xiao Ni [this message]
2025-06-11 7:31 ` [PATCH 2/3] md: Don't clear MD_CLOSING until mddev is freed Xiao Ni
2025-06-11 7:31 ` [PATCH 3/3] md: remove/add redundancy group only in level change Xiao Ni
2025-06-14 6:46 ` [PATCH V6 0/3] md: call del_gendisk in sync way Yu Kuai
2025-06-14 8:18 ` Yu Kuai
2025-06-15 3:21 ` Xiao Ni
2025-06-16 1:17 ` Yu Kuai
2025-06-16 1:43 ` Xiao Ni
-- strict thread matches above, loose matches on Subject: below --
2025-06-10 7:20 [PATCH V5 " Xiao Ni
2025-06-10 7:20 ` [PATCH 1/3] md: call del_gendisk in control path Xiao Ni
2025-06-11 6:09 ` Yu Kuai
2025-06-04 9:07 [PATCH V4 0/3] md: call del_gendisk in sync way Xiao Ni
2025-06-04 9:07 ` [PATCH 1/3] md: call del_gendisk in control path Xiao Ni
2025-06-06 8:46 ` Yu Kuai
2025-06-06 8:48 ` Yu Kuai
2025-06-09 9:15 ` Xiao Ni
2025-06-03 5:20 [PATCH V3 0/2] md: call del_gendisk in sync way Xiao Ni
2025-06-03 5:20 ` [PATCH 1/3] md: call del_gendisk in control path Xiao Ni
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250611073108.25463-2-xni@redhat.com \
--to=xni@redhat.com \
--cc=linux-raid@vger.kernel.org \
--cc=ncroxon@redhat.com \
--cc=song@kernel.org \
--cc=yukuai1@huaweicloud.com \
--cc=yukuai3@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).