From: "Yu Kuai" <yukuai@fnnas.com>
To: <song@kernel.org>
Cc: <linan122@huawei.com>, <xni@redhat.com>,
<linux-raid@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<yukuai@fnnas.com>
Subject: Re: [PATCH] md: fix array_state=clear sysfs deadlock
Date: Tue, 7 Apr 2026 13:02:32 +0800
Message-ID: <2660f553-3635-4606-b90e-7398e5561341@fnnas.com>
In-Reply-To: <20260330055213.3976052-1-yukuai@fnnas.com>
On 2026/3/30 13:52, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
>
> When "clear" is written to array_state, md_attr_store() breaks sysfs
> active protection so the array can delete itself from its own sysfs
> store method.
>
> However, md_attr_store() currently drops the mddev reference before
> calling sysfs_unbreak_active_protection(). Once do_md_stop(..., 0)
> has made the mddev eligible for delayed deletion, the temporary
> kobject reference taken by sysfs_break_active_protection() can become
> the last kobject reference protecting the md kobject.
>
> That allows sysfs_unbreak_active_protection() to drop the last
> kobject reference from the current sysfs writer context. kobject
> teardown then recurses into kernfs removal while the current sysfs
> node is still being unwound, and lockdep reports recursive locking on
> kn->active with kernfs_drain() in the call chain.
>
> Reproducer on an existing level:
> 1. Create an md0 linear array and activate it:
>    mknod /dev/md0 b 9 0
>    echo none > /sys/block/md0/md/metadata_version
>    echo linear > /sys/block/md0/md/level
>    echo 1 > /sys/block/md0/md/raid_disks
>    echo "$(cat /sys/class/block/sdb/dev)" > /sys/block/md0/md/new_dev
>    echo "$(($(cat /sys/class/block/sdb/size) / 2))" > \
>      /sys/block/md0/md/dev-sdb/size
>    echo 0 > /sys/block/md0/md/dev-sdb/slot
>    echo active > /sys/block/md0/md/array_state
> 2. Wait briefly for the array to settle, then clear it:
>    sleep 2
>    echo clear > /sys/block/md0/md/array_state
>
> The warning looks like:
>
> WARNING: possible recursive locking detected
> bash/588 is trying to acquire lock:
>  (kn->active#65) at __kernfs_remove+0x157/0x1d0
> but task is already holding lock:
>  (kn->active#65) at sysfs_unbreak_active_protection+0x1f/0x40
> ...
> Call Trace:
>  kernfs_drain
>  __kernfs_remove
>  kernfs_remove_by_name_ns
>  sysfs_remove_group
>  sysfs_remove_groups
>  __kobject_del
>  kobject_put
>  md_attr_store
>  kernfs_fop_write_iter
>  vfs_write
>  ksys_write
>
> Restore active protection before mddev_put() so the extra sysfs
> kobject reference is dropped while the mddev is still held alive. The
> actual md kobject deletion is then deferred until after the sysfs
> write path has fully returned.
>
> Fixes: 9e59d609763f ("md: call del_gendisk in control path")
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
> drivers/md/md.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
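
For readers following the thread, the ordering issue described in the quoted commit message can be illustrated with a small userspace model. This is only a sketch, not the patch itself: struct md_model and every model_*() function are hypothetical names, reference counts are plain integers, and model_mddev_put() assumes the worst-case interleaving in which the deferred deletion runs immediately and drops the kobject's base reference from another context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct md_model {
	int kobj_refs;       /* references on the md kobject */
	bool writer_active;  /* sysfs writer holds kn->active */
	bool self_teardown;  /* last put came from the writer itself */
};

/* Drop one kobject reference and record who dropped the last one. */
static void model_kobject_put(struct md_model *m, bool from_writer)
{
	assert(m->kobj_refs > 0);
	if (--m->kobj_refs == 0 && from_writer && m->writer_active)
		m->self_teardown = true;
}

/* Models sysfs_break_active_protection(): extra ref, protection dropped. */
static void model_break(struct md_model *m)
{
	m->kobj_refs++;
	m->writer_active = false;
}

/* Models sysfs_unbreak_active_protection(): protection back, extra ref put. */
static void model_unbreak(struct md_model *m)
{
	m->writer_active = true;
	model_kobject_put(m, true);
}

/*
 * Models mddev_put() after a successful clear. Assumption: the deferred
 * deletion runs right away and drops the kobject's base reference from a
 * context other than the sysfs writer.
 */
static void model_mddev_put(struct md_model *m)
{
	model_kobject_put(m, false);
}

/* Returns true if the writer ends up tearing down its own sysfs node. */
bool simulate_clear(bool unbreak_first)
{
	struct md_model m = { .kobj_refs = 1, .writer_active = true };

	model_break(&m);
	/* array_state_store("clear") would run here. */
	if (unbreak_first) {
		model_unbreak(&m);    /* ordering described as the fix */
		model_mddev_put(&m);
	} else {
		model_mddev_put(&m);  /* ordering described as the bug */
		model_unbreak(&m);
	}
	return m.self_teardown;
}
```

In this model, simulate_clear(false) reports that the writer drops the last reference while holding active protection, which corresponds to the lockdep report above; simulate_clear(true) leaves the final put to the other context.
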
Applied to md-7.1
--
Thanks,
Kuai