* [PATCH 0/3] md: call del_gendisk in sync way @ 2025-05-07 2:14 Xiao Ni 2025-05-07 2:14 ` [PATCH 1/3] md: Don't clear MD_CLOSING until mddev is freed Xiao Ni ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Xiao Ni @ 2025-05-07 2:14 UTC (permalink / raw) To: linux-raid; +Cc: song, yukuai1, ncroxon Now del_gendisk is called in a queue work which has a small window that mdadm --stop command exits but the device node still exists. It causes trouble in regression tests. This patch set tries to resolve this problem. Xiao Ni (3): md: Don't clear MD_CLOSING until mddev is freed md: replace MD_DELETED with MD_CLOSING md: call del_gendisk in sync way drivers/md/md.c | 38 +++++++++++++++++++------------------- drivers/md/md.h | 2 -- 2 files changed, 19 insertions(+), 21 deletions(-) -- 2.32.0 (Apple Git-132) ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/3] md: Don't clear MD_CLOSING until mddev is freed 2025-05-07 2:14 [PATCH 0/3] md: call del_gendisk in sync way Xiao Ni @ 2025-05-07 2:14 ` Xiao Ni 2025-05-07 2:14 ` [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING Xiao Ni 2025-05-07 2:14 ` [PATCH 3/3] md: call del_gendisk in sync way Xiao Ni 2 siblings, 0 replies; 13+ messages in thread From: Xiao Ni @ 2025-05-07 2:14 UTC (permalink / raw) To: linux-raid; +Cc: song, yukuai1, ncroxon UNTIL_STOP is used to avoid mddev is freed on the last close before adding disks to mddev. And it should be cleared when stopping an array which is mentioned in commit efeb53c0e572 ("md: Allow md devices to be created by name."). So reset ->hold_active to 0 in md_clean. And MD_CLOSING should be kept until mddev is freed to avoid reopen. Signed-off-by: Xiao Ni <xni@redhat.com> --- drivers/md/md.c | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 9daa78c5fe33..9b9950ed6ee9 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -6360,15 +6360,10 @@ static void md_clean(struct mddev *mddev) mddev->persistent = 0; mddev->level = LEVEL_NONE; mddev->clevel[0] = 0; - /* - * Don't clear MD_CLOSING, or mddev can be opened again. - * 'hold_active != 0' means mddev is still in the creation - * process and will be used later. - */ - if (mddev->hold_active) - mddev->flags = 0; - else - mddev->flags &= BIT_ULL_MASK(MD_CLOSING); + /* if UNTIL_STOP is set, it's cleared here */ + mddev->hold_active = 0; + /* Don't clear MD_CLOSING, or mddev can be opened again. */ + mddev->flags &= BIT_ULL_MASK(MD_CLOSING); mddev->sb_flags = 0; mddev->ro = MD_RDWR; mddev->metadata_type[0] = 0; @@ -6595,8 +6590,6 @@ static int do_md_stop(struct mddev *mddev, int mode) export_array(mddev); md_clean(mddev); - if (mddev->hold_active == UNTIL_STOP) - mddev->hold_active = 0; } md_new_event(); sysfs_notify_dirent_safe(mddev->sysfs_state); -- 2.32.0 (Apple Git-132) ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING 2025-05-07 2:14 [PATCH 0/3] md: call del_gendisk in sync way Xiao Ni 2025-05-07 2:14 ` [PATCH 1/3] md: Don't clear MD_CLOSING until mddev is freed Xiao Ni @ 2025-05-07 2:14 ` Xiao Ni 2025-05-07 6:20 ` Yu Kuai 2025-05-07 2:14 ` [PATCH 3/3] md: call del_gendisk in sync way Xiao Ni 2 siblings, 1 reply; 13+ messages in thread From: Xiao Ni @ 2025-05-07 2:14 UTC (permalink / raw) To: linux-raid; +Cc: song, yukuai1, ncroxon Before commit 12a6caf27324 ("md: only delete entries from all_mddevs when the disk is freed") MD_CLOSING is cleared in ioctl path. Now MD_CLOSING will keep until mddev is freed. So MD_CLOSING can be used to check if the array is stopping. Signed-off-by: Xiao Ni <xni@redhat.com> --- drivers/md/md.c | 9 +++------ drivers/md/md.h | 2 -- 2 files changed, 3 insertions(+), 8 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 9b9950ed6ee9..c226747be9e3 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -599,7 +599,7 @@ static inline struct mddev *mddev_get(struct mddev *mddev) { lockdep_assert_held(&all_mddevs_lock); - if (test_bit(MD_DELETED, &mddev->flags)) + if (test_bit(MD_CLOSING, &mddev->flags)) return NULL; atomic_inc(&mddev->active); return mddev; @@ -613,9 +613,6 @@ static void __mddev_put(struct mddev *mddev) mddev->ctime || mddev->hold_active) return; - /* Array is not configured at all, and not held active, so destroy it */ - set_bit(MD_DELETED, &mddev->flags); - /* * Call queue_work inside the spinlock so that flush_workqueue() after * mddev_find will succeed in waiting for the work to be done. @@ -3312,7 +3309,7 @@ static bool md_rdev_overlaps(struct md_rdev *rdev) spin_lock(&all_mddevs_lock); list_for_each_entry(mddev, &all_mddevs, all_mddevs) { - if (test_bit(MD_DELETED, &mddev->flags)) + if (test_bit(MD_CLOSING, &mddev->flags)) continue; rdev_for_each(rdev2, mddev) { if (rdev != rdev2 && rdev->bdev == rdev2->bdev && @@ -8992,7 +8989,7 @@ void md_do_sync(struct md_thread *thread) goto skip; spin_lock(&all_mddevs_lock); list_for_each_entry(mddev2, &all_mddevs, all_mddevs) { - if (test_bit(MD_DELETED, &mddev2->flags)) + if (test_bit(MD_CLOSING, &mddev2->flags)) continue; if (mddev2 == mddev) continue; diff --git a/drivers/md/md.h b/drivers/md/md.h index 1cf00a04bcdd..a9dccb3d84ed 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -338,7 +338,6 @@ struct md_cluster_operations; * @MD_NOT_READY: do_md_run() is active, so 'array_state', ust not report that * array is ready yet. * @MD_BROKEN: This is used to stop writes and mark array as failed. - * @MD_DELETED: This device is being deleted * * change UNSUPPORTED_MDDEV_FLAGS for each array type if new flag is added */ @@ -353,7 +352,6 @@ enum mddev_flags { MD_HAS_MULTIPLE_PPLS, MD_NOT_READY, MD_BROKEN, - MD_DELETED, }; enum mddev_sb_flags { -- 2.32.0 (Apple Git-132) ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING 2025-05-07 2:14 ` [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING Xiao Ni @ 2025-05-07 6:20 ` Yu Kuai 2025-05-09 9:33 ` Xiao Ni 0 siblings, 1 reply; 13+ messages in thread From: Yu Kuai @ 2025-05-07 6:20 UTC (permalink / raw) To: Xiao Ni, linux-raid; +Cc: song, yukuai1, ncroxon, yukuai (C) Hi, 在 2025/05/07 10:14, Xiao Ni 写道: > Before commit 12a6caf27324 ("md: only delete entries from all_mddevs when > the disk is freed") MD_CLOSING is cleared in ioctl path. Now MD_CLOSING > will keep until mddev is freed. So MD_CLOSING can be used to check if the > array is stopping. > > Signed-off-by: Xiao Ni <xni@redhat.com> > --- > drivers/md/md.c | 9 +++------ > drivers/md/md.h | 2 -- > 2 files changed, 3 insertions(+), 8 deletions(-) > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index 9b9950ed6ee9..c226747be9e3 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -599,7 +599,7 @@ static inline struct mddev *mddev_get(struct mddev *mddev) > { > lockdep_assert_held(&all_mddevs_lock); > > - if (test_bit(MD_DELETED, &mddev->flags)) > + if (test_bit(MD_CLOSING, &mddev->flags)) > return NULL; > atomic_inc(&mddev->active); > return mddev; I noticed that MD_CLOSING can be set temporarily in following case: sysfs: if (st == readonly || st == read_auto || st == inactive || ┊ (err && st == clear)) clear_bit(MD_CLOSING, &mddev->flags); ioctl: if (cmd == STOP_ARRAY_RO || (err && cmd == STOP_ARRAY)) clear_bit(MD_CLOSING, &mddev->flags); And we should still allow mmdev_get() to pass in these cases. Thanks, Kuai > @@ -613,9 +613,6 @@ static void __mddev_put(struct mddev *mddev) > mddev->ctime || mddev->hold_active) > return; > > - /* Array is not configured at all, and not held active, so destroy it */ > - set_bit(MD_DELETED, &mddev->flags); > - > /* > * Call queue_work inside the spinlock so that flush_workqueue() after > * mddev_find will succeed in waiting for the work to be done. > @@ -3312,7 +3309,7 @@ static bool md_rdev_overlaps(struct md_rdev *rdev) > > spin_lock(&all_mddevs_lock); > list_for_each_entry(mddev, &all_mddevs, all_mddevs) { > - if (test_bit(MD_DELETED, &mddev->flags)) > + if (test_bit(MD_CLOSING, &mddev->flags)) > continue; > rdev_for_each(rdev2, mddev) { > if (rdev != rdev2 && rdev->bdev == rdev2->bdev && > @@ -8992,7 +8989,7 @@ void md_do_sync(struct md_thread *thread) > goto skip; > spin_lock(&all_mddevs_lock); > list_for_each_entry(mddev2, &all_mddevs, all_mddevs) { > - if (test_bit(MD_DELETED, &mddev2->flags)) > + if (test_bit(MD_CLOSING, &mddev2->flags)) > continue; > if (mddev2 == mddev) > continue; > diff --git a/drivers/md/md.h b/drivers/md/md.h > index 1cf00a04bcdd..a9dccb3d84ed 100644 > --- a/drivers/md/md.h > +++ b/drivers/md/md.h > @@ -338,7 +338,6 @@ struct md_cluster_operations; > * @MD_NOT_READY: do_md_run() is active, so 'array_state', ust not report that > * array is ready yet. > * @MD_BROKEN: This is used to stop writes and mark array as failed. > - * @MD_DELETED: This device is being deleted > * > * change UNSUPPORTED_MDDEV_FLAGS for each array type if new flag is added > */ > @@ -353,7 +352,6 @@ enum mddev_flags { > MD_HAS_MULTIPLE_PPLS, > MD_NOT_READY, > MD_BROKEN, > - MD_DELETED, > }; > > enum mddev_sb_flags { > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING 2025-05-07 6:20 ` Yu Kuai @ 2025-05-09 9:33 ` Xiao Ni 2025-05-09 10:08 ` Yu Kuai 0 siblings, 1 reply; 13+ messages in thread From: Xiao Ni @ 2025-05-09 9:33 UTC (permalink / raw) To: Yu Kuai; +Cc: linux-raid, song, ncroxon, yukuai (C) On Wed, May 7, 2025 at 2:21 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: > > Hi, > > 在 2025/05/07 10:14, Xiao Ni 写道: > > Before commit 12a6caf27324 ("md: only delete entries from all_mddevs when > > the disk is freed") MD_CLOSING is cleared in ioctl path. Now MD_CLOSING > > will keep until mddev is freed. So MD_CLOSING can be used to check if the > > array is stopping. > > > > Signed-off-by: Xiao Ni <xni@redhat.com> > > --- > > drivers/md/md.c | 9 +++------ > > drivers/md/md.h | 2 -- > > 2 files changed, 3 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/md/md.c b/drivers/md/md.c > > index 9b9950ed6ee9..c226747be9e3 100644 > > --- a/drivers/md/md.c > > +++ b/drivers/md/md.c > > @@ -599,7 +599,7 @@ static inline struct mddev *mddev_get(struct mddev *mddev) > > { > > lockdep_assert_held(&all_mddevs_lock); > > > > - if (test_bit(MD_DELETED, &mddev->flags)) > > + if (test_bit(MD_CLOSING, &mddev->flags)) > > return NULL; > > atomic_inc(&mddev->active); > > return mddev; > > I noticed that MD_CLOSING can be set temporarily in following case: > > sysfs: > if (st == readonly || st == read_auto || st == inactive || > ┊ (err && st == clear)) > clear_bit(MD_CLOSING, &mddev->flags); > > ioctl: > if (cmd == STOP_ARRAY_RO || (err && cmd == STOP_ARRAY)) > clear_bit(MD_CLOSING, &mddev->flags); > > > And we should still allow mmdev_get() to pass in these cases. The two places clear MD_CLOSING rather than setting MD_CLOSING. MD_CLOSING is set when we really want to stop the array (STOP_ARRAY cmd and clear>array_state_store). So the two places clear MD_CLOSING for other situations which look good to me. Regards Xiao > > Thanks, > Kuai > > > @@ -613,9 +613,6 @@ static void __mddev_put(struct mddev *mddev) > > mddev->ctime || mddev->hold_active) > > return; > > > > - /* Array is not configured at all, and not held active, so destroy it */ > > - set_bit(MD_DELETED, &mddev->flags); > > - > > /* > > * Call queue_work inside the spinlock so that flush_workqueue() after > > * mddev_find will succeed in waiting for the work to be done. > > @@ -3312,7 +3309,7 @@ static bool md_rdev_overlaps(struct md_rdev *rdev) > > > > spin_lock(&all_mddevs_lock); > > list_for_each_entry(mddev, &all_mddevs, all_mddevs) { > > - if (test_bit(MD_DELETED, &mddev->flags)) > > + if (test_bit(MD_CLOSING, &mddev->flags)) > > continue; > > rdev_for_each(rdev2, mddev) { > > if (rdev != rdev2 && rdev->bdev == rdev2->bdev && > > @@ -8992,7 +8989,7 @@ void md_do_sync(struct md_thread *thread) > > goto skip; > > spin_lock(&all_mddevs_lock); > > list_for_each_entry(mddev2, &all_mddevs, all_mddevs) { > > - if (test_bit(MD_DELETED, &mddev2->flags)) > > + if (test_bit(MD_CLOSING, &mddev2->flags)) > > continue; > > if (mddev2 == mddev) > > continue; > > diff --git a/drivers/md/md.h b/drivers/md/md.h > > index 1cf00a04bcdd..a9dccb3d84ed 100644 > > --- a/drivers/md/md.h > > +++ b/drivers/md/md.h > > @@ -338,7 +338,6 @@ struct md_cluster_operations; > > * @MD_NOT_READY: do_md_run() is active, so 'array_state', ust not report that > > * array is ready yet. > > * @MD_BROKEN: This is used to stop writes and mark array as failed. > > - * @MD_DELETED: This device is being deleted > > * > > * change UNSUPPORTED_MDDEV_FLAGS for each array type if new flag is added > > */ > > @@ -353,7 +352,6 @@ enum mddev_flags { > > MD_HAS_MULTIPLE_PPLS, > > MD_NOT_READY, > > MD_BROKEN, > > - MD_DELETED, > > }; > > > > enum mddev_sb_flags { > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING 2025-05-09 9:33 ` Xiao Ni @ 2025-05-09 10:08 ` Yu Kuai 2025-05-09 10:20 ` Xiao Ni 0 siblings, 1 reply; 13+ messages in thread From: Yu Kuai @ 2025-05-09 10:08 UTC (permalink / raw) To: Xiao Ni, Yu Kuai; +Cc: linux-raid, song, ncroxon, yukuai (C) Hi, 在 2025/05/09 17:33, Xiao Ni 写道: > The two places clear MD_CLOSING rather than setting MD_CLOSING. > MD_CLOSING is set when we really want to stop the array (STOP_ARRAY > cmd and clear>array_state_store). So the two places clear MD_CLOSING > for other situations which look good to me. No, MD_CLOSING can be set first in the two cases, do something and then be cleared, that's why I said temporarily. Thanks, Kuai ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING 2025-05-09 10:08 ` Yu Kuai @ 2025-05-09 10:20 ` Xiao Ni 2025-05-09 10:26 ` Yu Kuai 0 siblings, 1 reply; 13+ messages in thread From: Xiao Ni @ 2025-05-09 10:20 UTC (permalink / raw) To: Yu Kuai; +Cc: linux-raid, song, ncroxon, yukuai (C) On Fri, May 9, 2025 at 6:08 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: > > Hi, > > 在 2025/05/09 17:33, Xiao Ni 写道: > > The two places clear MD_CLOSING rather than setting MD_CLOSING. > > MD_CLOSING is set when we really want to stop the array (STOP_ARRAY > > cmd and clear>array_state_store). So the two places clear MD_CLOSING > > for other situations which look good to me. > > No, MD_CLOSING can be set first in the two cases, do something and then > be cleared, that's why I said temporarily. So you mean mddev_get should pass in this case (between setting MD_CLOSING and clearing MD_CLOSING)? It doesn't allow get mddev now without this patch. This should be right. Regardes Xiao > > Thanks, > Kuai > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING 2025-05-09 10:20 ` Xiao Ni @ 2025-05-09 10:26 ` Yu Kuai 2025-05-09 11:20 ` Xiao Ni 0 siblings, 1 reply; 13+ messages in thread From: Yu Kuai @ 2025-05-09 10:26 UTC (permalink / raw) To: Xiao Ni, Yu Kuai; +Cc: linux-raid, song, ncroxon, yukuai (C) Hi, 在 2025/05/09 18:20, Xiao Ni 写道: > On Fri, May 9, 2025 at 6:08 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: >> >> Hi, >> >> 在 2025/05/09 17:33, Xiao Ni 写道: >>> The two places clear MD_CLOSING rather than setting MD_CLOSING. >>> MD_CLOSING is set when we really want to stop the array (STOP_ARRAY >>> cmd and clear>array_state_store). So the two places clear MD_CLOSING >>> for other situations which look good to me. >> >> No, MD_CLOSING can be set first in the two cases, do something and then >> be cleared, that's why I said temporarily. > > So you mean mddev_get should pass in this case (between setting > MD_CLOSING and clearing MD_CLOSING)? It doesn't allow get mddev now > without this patch. This should be right. I don't understand what you mean "It doesn't allow get mddev now without this patch", for example, the ioctl STOP_ARRAY_RO will set MD_CLOSING temporarily, but never set MD_DELETED, mddev_get() can always pass. Thanks, Kuai > > Regardes > Xiao > >> >> Thanks, >> Kuai >> > > > . > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING 2025-05-09 10:26 ` Yu Kuai @ 2025-05-09 11:20 ` Xiao Ni 2025-05-12 1:39 ` Yu Kuai 0 siblings, 1 reply; 13+ messages in thread From: Xiao Ni @ 2025-05-09 11:20 UTC (permalink / raw) To: Yu Kuai; +Cc: linux-raid, song, ncroxon, yukuai (C) On Fri, May 9, 2025 at 6:26 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: > > Hi, > > 在 2025/05/09 18:20, Xiao Ni 写道: > > On Fri, May 9, 2025 at 6:08 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: > >> > >> Hi, > >> > >> 在 2025/05/09 17:33, Xiao Ni 写道: > >>> The two places clear MD_CLOSING rather than setting MD_CLOSING. > >>> MD_CLOSING is set when we really want to stop the array (STOP_ARRAY > >>> cmd and clear>array_state_store). So the two places clear MD_CLOSING > >>> for other situations which look good to me. > >> > >> No, MD_CLOSING can be set first in the two cases, do something and then > >> be cleared, that's why I said temporarily. > > > > So you mean mddev_get should pass in this case (between setting > > MD_CLOSING and clearing MD_CLOSING)? It doesn't allow get mddev now > > without this patch. This should be right. > > I don't understand what you mean "It doesn't allow get mddev now without > this patch", for example, the ioctl STOP_ARRAY_RO will set MD_CLOSING > temporarily, but never set MD_DELETED, mddev_get() can always pass. You're right. I thought of the md_open path and it checks MD_CLOSING. There is a short window that will fail to open sysfs files. I want to make things easier and clearer here: it can't get mddev when MD_CLOSING is set. Is there any problem if we use this rule? Regards Xiao > > Thanks, > Kuai > > > > > Regardes > > Xiao > > > >> > >> Thanks, > >> Kuai > >> > > > > > > . > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING 2025-05-09 11:20 ` Xiao Ni @ 2025-05-12 1:39 ` Yu Kuai 0 siblings, 0 replies; 13+ messages in thread From: Yu Kuai @ 2025-05-12 1:39 UTC (permalink / raw) To: Xiao Ni, Yu Kuai; +Cc: linux-raid, song, ncroxon, yukuai (C) Hi, 在 2025/05/09 19:20, Xiao Ni 写道: > On Fri, May 9, 2025 at 6:26 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: >> >> Hi, >> >> 在 2025/05/09 18:20, Xiao Ni 写道: >>> On Fri, May 9, 2025 at 6:08 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: >>>> >>>> Hi, >>>> >>>> 在 2025/05/09 17:33, Xiao Ni 写道: >>>>> The two places clear MD_CLOSING rather than setting MD_CLOSING. >>>>> MD_CLOSING is set when we really want to stop the array (STOP_ARRAY >>>>> cmd and clear>array_state_store). So the two places clear MD_CLOSING >>>>> for other situations which look good to me. >>>> >>>> No, MD_CLOSING can be set first in the two cases, do something and then >>>> be cleared, that's why I said temporarily. >>> >>> So you mean mddev_get should pass in this case (between setting >>> MD_CLOSING and clearing MD_CLOSING)? It doesn't allow get mddev now >>> without this patch. This should be right. >> >> I don't understand what you mean "It doesn't allow get mddev now without >> this patch", for example, the ioctl STOP_ARRAY_RO will set MD_CLOSING >> temporarily, but never set MD_DELETED, mddev_get() can always pass. > > You're right. I thought of the md_open path and it checks MD_CLOSING. > There is a short window that will fail to open sysfs files. I want to > make things easier and clearer here: it can't get mddev when > MD_CLOSING is set. Is there any problem if we use this rule? What I can think of is that during the window, user can't access this array through sysfs or /proc/mdstat after this patch. We'd better not introduce new restrictions for user, unless we're 100% sure they won't be affected. Thanks, Kuai > > Regards > Xiao >> >> Thanks, >> Kuai >> >>> >>> Regardes >>> Xiao >>> >>>> >>>> Thanks, >>>> Kuai >>>> >>> >>> >>> . >>> >> > > > . > ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 3/3] md: call del_gendisk in sync way 2025-05-07 2:14 [PATCH 0/3] md: call del_gendisk in sync way Xiao Ni 2025-05-07 2:14 ` [PATCH 1/3] md: Don't clear MD_CLOSING until mddev is freed Xiao Ni 2025-05-07 2:14 ` [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING Xiao Ni @ 2025-05-07 2:14 ` Xiao Ni 2025-05-07 6:28 ` Yu Kuai 2 siblings, 1 reply; 13+ messages in thread From: Xiao Ni @ 2025-05-07 2:14 UTC (permalink / raw) To: linux-raid; +Cc: song, yukuai1, ncroxon Now del_gendisk and put_disk are called asynchronously in workqueue work. The asynchronous way also has a problem that the device node can still exist after mdadm --stop command returns in a short window. So udev rule can open this device node and create the struct mddev in kernel again. So put del_gendisk in ioctl path and still leave put_disk in md_kobj_release to avoid uaf. Signed-off-by: Xiao Ni <xni@redhat.com> --- drivers/md/md.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index c226747be9e3..a82778f6c31f 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -5718,19 +5718,30 @@ md_attr_store(struct kobject *kobj, struct attribute *attr, struct md_sysfs_entry *entry = container_of(attr, struct md_sysfs_entry, attr); struct mddev *mddev = container_of(kobj, struct mddev, kobj); ssize_t rv; + struct kernfs_node *kn = NULL; if (!entry->store) return -EIO; if (!capable(CAP_SYS_ADMIN)) return -EACCES; + + if (entry->store == array_state_store && cmd_match(page, "clear")) + kn = sysfs_break_active_protection(kobj, attr); + spin_lock(&all_mddevs_lock); if (!mddev_get(mddev)) { + if (kn) + sysfs_unbreak_active_protection(kn); spin_unlock(&all_mddevs_lock); return -EBUSY; } spin_unlock(&all_mddevs_lock); rv = entry->store(mddev, page, length); mddev_put(mddev); + + if (kn) + sysfs_unbreak_active_protection(kn); + return rv; } @@ -5743,7 +5754,6 @@ static void md_kobj_release(struct kobject *ko) if (mddev->sysfs_level) sysfs_put(mddev->sysfs_level); - del_gendisk(mddev->gendisk); put_disk(mddev->gendisk); } @@ -6585,8 +6595,8 @@ static int do_md_stop(struct mddev *mddev, int mode) mddev->bitmap_info.offset = 0; export_array(mddev); - md_clean(mddev); + del_gendisk(mddev->gendisk); } md_new_event(); sysfs_notify_dirent_safe(mddev->sysfs_state); -- 2.32.0 (Apple Git-132) ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 3/3] md: call del_gendisk in sync way 2025-05-07 2:14 ` [PATCH 3/3] md: call del_gendisk in sync way Xiao Ni @ 2025-05-07 6:28 ` Yu Kuai 0 siblings, 0 replies; 13+ messages in thread From: Yu Kuai @ 2025-05-07 6:28 UTC (permalink / raw) To: Xiao Ni, linux-raid; +Cc: song, yukuai1, ncroxon, yukuai (C) Hi, 在 2025/05/07 10:14, Xiao Ni 写道: > Now del_gendisk and put_disk are called asynchronously in workqueue work. > The asynchronous way also has a problem that the device node can still > exist after mdadm --stop command returns in a short window. So udev rule > can open this device node and create the struct mddev in kernel again. > > So put del_gendisk in ioctl path and still leave put_disk in > md_kobj_release to avoid uaf. > > Signed-off-by: Xiao Ni <xni@redhat.com> > --- > drivers/md/md.c | 14 ++++++++++++-- > 1 file changed, 12 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index c226747be9e3..a82778f6c31f 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -5718,19 +5718,30 @@ md_attr_store(struct kobject *kobj, struct attribute *attr, > struct md_sysfs_entry *entry = container_of(attr, struct md_sysfs_entry, attr); > struct mddev *mddev = container_of(kobj, struct mddev, kobj); > ssize_t rv; > + struct kernfs_node *kn = NULL; > > if (!entry->store) > return -EIO; > if (!capable(CAP_SYS_ADMIN)) > return -EACCES; > + > + if (entry->store == array_state_store && cmd_match(page, "clear")) > + kn = sysfs_break_active_protection(kobj, attr); > + > spin_lock(&all_mddevs_lock); > if (!mddev_get(mddev)) { > + if (kn) > + sysfs_unbreak_active_protection(kn); > spin_unlock(&all_mddevs_lock); > return -EBUSY; > } > spin_unlock(&all_mddevs_lock); > rv = entry->store(mddev, page, length); > mddev_put(mddev); > + > + if (kn) > + sysfs_unbreak_active_protection(kn); > + > return rv; > } > > @@ -5743,7 +5754,6 @@ static void md_kobj_release(struct kobject *ko) > if (mddev->sysfs_level) > sysfs_put(mddev->sysfs_level); > > - del_gendisk(mddev->gendisk); > put_disk(mddev->gendisk); > } > > @@ -6585,8 +6595,8 @@ static int do_md_stop(struct mddev *mddev, int mode) > mddev->bitmap_info.offset = 0; > > export_array(mddev); > - > md_clean(mddev); > + del_gendisk(mddev->gendisk); > } Refer to the comments in mddev_unlock, there are some special hacks, to delay removing md_redundancy_group until releasing reconfig_mutex, and looks like del_gendisk here will break that. Thanks, Kuai > md_new_event(); > sysfs_notify_dirent_safe(mddev->sysfs_state); > ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH RFC 0/3] md: call del_gendisk in sync way @ 2025-04-28 8:26 Xiao Ni 2025-04-28 8:26 ` [PATCH 3/3] " Xiao Ni 0 siblings, 1 reply; 13+ messages in thread From: Xiao Ni @ 2025-04-28 8:26 UTC (permalink / raw) To: linux-raid; +Cc: song, yukuai1, ncroxon, mtkaczyk This patch clears some flags and counters related stopping an array and calls del_gendisk in sync way. This patch set can cause a deadlock when stopping imsm raid1, I'm investigating this deadlock problem to resovle this. Xiao Ni (3): md: replace MD_DELETED with MD_CLOSING md: replace ->openers with ->active md: call del_gendisk in sync way drivers/md/md.c | 67 ++++++++++++++++++++++--------------------------- drivers/md/md.h | 13 ---------- 2 files changed, 30 insertions(+), 50 deletions(-) -- 2.32.0 (Apple Git-132) ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 3/3] md: call del_gendisk in sync way 2025-04-28 8:26 [PATCH RFC 0/3] " Xiao Ni @ 2025-04-28 8:26 ` Xiao Ni 0 siblings, 0 replies; 13+ messages in thread From: Xiao Ni @ 2025-04-28 8:26 UTC (permalink / raw) To: linux-raid; +Cc: song, yukuai1, ncroxon, mtkaczyk Now del_gendisk and put_disk are called asynchronously in workqueue work. The asynchronous way also has a problem that the device node can still exist after mdadm --stop command returns in a short window. So udev rule can open this device node and create the struct mddev in kernel again. So put del_gendisk in ioctl path and still leave put_disk in md_kobj_release to avoid uaf. Signed-off-by: Xiao Ni <xni@redhat.com> --- drivers/md/md.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 4ca3d04ce13f..0c921072ffff 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -5724,11 +5724,16 @@ md_attr_store(struct kobject *kobj, struct attribute *attr, struct md_sysfs_entry *entry = container_of(attr, struct md_sysfs_entry, attr); struct mddev *mddev = container_of(kobj, struct mddev, kobj); ssize_t rv; + struct kernfs_node *kn = NULL; if (!entry->store) return -EIO; if (!capable(CAP_SYS_ADMIN)) return -EACCES; + + if (entry->store == array_state_store && cmd_match(page, "clear")) + kn = sysfs_break_active_protection(kobj, attr); + spin_lock(&all_mddevs_lock); if (!mddev_get(mddev)) { spin_unlock(&all_mddevs_lock); @@ -5737,6 +5742,10 @@ md_attr_store(struct kobject *kobj, struct attribute *attr, spin_unlock(&all_mddevs_lock); rv = entry->store(mddev, page, length); mddev_put(mddev); + + if (kn) + sysfs_unbreak_active_protection(kn); + return rv; } @@ -5749,7 +5758,6 @@ static void md_kobj_release(struct kobject *ko) if (mddev->sysfs_level) sysfs_put(mddev->sysfs_level); - del_gendisk(mddev->gendisk); put_disk(mddev->gendisk); } @@ -6591,8 +6599,8 @@ static int do_md_stop(struct mddev *mddev, int mode) mddev->bitmap_info.offset = 0; export_array(mddev); - md_clean(mddev); + del_gendisk(mddev->gendisk); } md_new_event(); sysfs_notify_dirent_safe(mddev->sysfs_state); -- 2.32.0 (Apple Git-132) ^ permalink raw reply related [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-05-12 1:39 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-05-07 2:14 [PATCH 0/3] md: call del_gendisk in sync way Xiao Ni 2025-05-07 2:14 ` [PATCH 1/3] md: Don't clear MD_CLOSING until mddev is freed Xiao Ni 2025-05-07 2:14 ` [PATCH 2/3] md: replace MD_DELETED with MD_CLOSING Xiao Ni 2025-05-07 6:20 ` Yu Kuai 2025-05-09 9:33 ` Xiao Ni 2025-05-09 10:08 ` Yu Kuai 2025-05-09 10:20 ` Xiao Ni 2025-05-09 10:26 ` Yu Kuai 2025-05-09 11:20 ` Xiao Ni 2025-05-12 1:39 ` Yu Kuai 2025-05-07 2:14 ` [PATCH 3/3] md: call del_gendisk in sync way Xiao Ni 2025-05-07 6:28 ` Yu Kuai -- strict thread matches above, loose matches on Subject: below -- 2025-04-28 8:26 [PATCH RFC 0/3] " Xiao Ni 2025-04-28 8:26 ` [PATCH 3/3] " Xiao Ni
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).