Linux RAID subsystem development
 help / color / mirror / Atom feed
* [PATCH v4 1/1] md: add legacy_async_del_gendisk mode
@ 2025-08-13  3:29 Xiao Ni
  2025-08-13  5:26 ` Paul Menzel
  2025-08-14  0:54 ` Yu Kuai
  0 siblings, 2 replies; 7+ messages in thread
From: Xiao Ni @ 2025-08-13  3:29 UTC (permalink / raw)
  To: yukuai1; +Cc: linux-raid, yukuai3, mpatocka, luca.boccassi

commit 9e59d609763f ("md: call del_gendisk in control path") changes the
async way to sync way of calling del_gendisk. But it breaks mdadm
--assemble command. The assemble command runs like this:
1. create the array
2. stop the array
3. access the sysfs files after stopping

The sync way calls del_gendisk in step 2, so all sysfs files are removed.
Now to avoid breaking mdadm assemble command, this patch adds the parameter
legacy_async_del_gendisk that can be used to choose which way. The default
is async way. In future, we plan to change default to sync way in kernel
7.0. Then users need to upgrade to mdadm 4.5+ which removes step 2.

Fixes: 9e59d609763f ("md: call del_gendisk in control path")
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Closes: https://lore.kernel.org/linux-raid/CAMw=ZnQ=ET2St-+hnhsuq34rRPnebqcXqP1QqaHW5Bh4aaaZ4g@mail.gmail.com/T/#t
Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
---
v2: minor changes on format and log content
v3: changes in commit message and log content
v4: choose to change to sync way as default first in commit message
 drivers/md/md.c | 56 ++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 42 insertions(+), 14 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index ac85ec73a409..772cffe02ff5 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -339,6 +339,7 @@ static int start_readonly;
  * so all the races disappear.
  */
 static bool create_on_open = true;
+static bool legacy_async_del_gendisk = true;
 
 /*
  * We have a system wide 'event count' that is incremented
@@ -877,15 +878,18 @@ void mddev_unlock(struct mddev *mddev)
 		export_rdev(rdev, mddev);
 	}
 
-	/* Call del_gendisk after release reconfig_mutex to avoid
-	 * deadlock (e.g. call del_gendisk under the lock and an
-	 * access to sysfs files waits the lock)
-	 * And MD_DELETED is only used for md raid which is set in
-	 * do_md_stop. dm raid only uses md_stop to stop. So dm raid
-	 * doesn't need to check MD_DELETED when getting reconfig lock
-	 */
-	if (test_bit(MD_DELETED, &mddev->flags))
-		del_gendisk(mddev->gendisk);
+	if (!legacy_async_del_gendisk) {
+		/*
+		 * Call del_gendisk after release reconfig_mutex to avoid
+		 * deadlock (e.g. call del_gendisk under the lock and an
+		 * access to sysfs files waits the lock)
+		 * And MD_DELETED is only used for md raid which is set in
+		 * do_md_stop. dm raid only uses md_stop to stop. So dm raid
+		 * doesn't need to check MD_DELETED when getting reconfig lock
+		 */
+		if (test_bit(MD_DELETED, &mddev->flags))
+			del_gendisk(mddev->gendisk);
+	}
 }
 EXPORT_SYMBOL_GPL(mddev_unlock);
 
@@ -5818,6 +5822,13 @@ static void md_kobj_release(struct kobject *ko)
 {
 	struct mddev *mddev = container_of(ko, struct mddev, kobj);
 
+	if (legacy_async_del_gendisk) {
+		if (mddev->sysfs_state)
+			sysfs_put(mddev->sysfs_state);
+		if (mddev->sysfs_level)
+			sysfs_put(mddev->sysfs_level);
+		del_gendisk(mddev->gendisk);
+	}
 	put_disk(mddev->gendisk);
 }
 
@@ -6021,6 +6032,9 @@ static int md_alloc_and_put(dev_t dev, char *name)
 {
 	struct mddev *mddev = md_alloc(dev, name);
 
+	if (legacy_async_del_gendisk)
+		pr_warn("md: async del_gendisk mode will be removed in future, please upgrade to mdadm-4.5+\n");
+
 	if (IS_ERR(mddev))
 		return PTR_ERR(mddev);
 	mddev_put(mddev);
@@ -6431,10 +6445,22 @@ static void md_clean(struct mddev *mddev)
 	mddev->persistent = 0;
 	mddev->level = LEVEL_NONE;
 	mddev->clevel[0] = 0;
-	/* if UNTIL_STOP is set, it's cleared here */
-	mddev->hold_active = 0;
-	/* Don't clear MD_CLOSING, or mddev can be opened again. */
-	mddev->flags &= BIT_ULL_MASK(MD_CLOSING);
+
+	/*
+	 * For legacy_async_del_gendisk mode, it can stop the array in the
+	 * middle of assembling it, then it still can access the array. So
+	 * it needs to clear MD_CLOSING. If not legacy_async_del_gendisk,
+	 * it can't open the array again after stopping it. So it doesn't
+	 * clear MD_CLOSING.
+	 */
+	if (legacy_async_del_gendisk && mddev->hold_active) {
+		clear_bit(MD_CLOSING, &mddev->flags);
+	} else {
+		/* if UNTIL_STOP is set, it's cleared here */
+		mddev->hold_active = 0;
+		/* Don't clear MD_CLOSING, or mddev can be opened again. */
+		mddev->flags &= BIT_ULL_MASK(MD_CLOSING);
+	}
 	mddev->sb_flags = 0;
 	mddev->ro = MD_RDWR;
 	mddev->metadata_type[0] = 0;
@@ -6658,7 +6684,8 @@ static int do_md_stop(struct mddev *mddev, int mode)
 
 		export_array(mddev);
 		md_clean(mddev);
-		set_bit(MD_DELETED, &mddev->flags);
+		if (!legacy_async_del_gendisk)
+			set_bit(MD_DELETED, &mddev->flags);
 	}
 	md_new_event();
 	sysfs_notify_dirent_safe(mddev->sysfs_state);
@@ -10392,6 +10419,7 @@ module_param_call(start_ro, set_ro, get_ro, NULL, S_IRUSR|S_IWUSR);
 module_param(start_dirty_degraded, int, S_IRUGO|S_IWUSR);
 module_param_call(new_array, add_named_array, NULL, NULL, S_IWUSR);
 module_param(create_on_open, bool, S_IRUSR|S_IWUSR);
+module_param(legacy_async_del_gendisk, bool, 0600);
 
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("MD RAID framework");
-- 
2.32.0 (Apple Git-132)


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 1/1] md: add legacy_async_del_gendisk mode
  2025-08-13  3:29 [PATCH v4 1/1] md: add legacy_async_del_gendisk mode Xiao Ni
@ 2025-08-13  5:26 ` Paul Menzel
  2025-08-13  6:34   ` Xiao Ni
  2025-08-14  0:54 ` Yu Kuai
  1 sibling, 1 reply; 7+ messages in thread
From: Paul Menzel @ 2025-08-13  5:26 UTC (permalink / raw)
  To: Xiao Ni; +Cc: linux-raid, yukuai1, yukuai3, mpatocka, luca.boccassi

Dear Xiao,


Thank you for sending the new version of the patch.

Am 13.08.25 um 05:29 schrieb Xiao Ni:
> commit 9e59d609763f ("md: call del_gendisk in control path") changes the
> async way to sync way of calling del_gendisk. But it breaks mdadm
> --assemble command. The assemble command runs like this:
> 1. create the array
> 2. stop the array
> 3. access the sysfs files after stopping
> 
> The sync way calls del_gendisk in step 2, so all sysfs files are removed.
> Now to avoid breaking mdadm assemble command, this patch adds the parameter
> legacy_async_del_gendisk that can be used to choose which way. The default
> is async way. In future, we plan to change default to sync way in kernel
> 7.0. Then users need to upgrade to mdadm 4.5+ which removes step 2.
> 
> Fixes: 9e59d609763f ("md: call del_gendisk in control path")
> Reported-by: Mikulas Patocka <mpatocka@redhat.com>
> Closes: https://lore.kernel.org/linux-raid/CAMw=ZnQ=ET2St-+hnhsuq34rRPnebqcXqP1QqaHW5Bh4aaaZ4g@mail.gmail.com/T/#t
> Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com>
> Signed-off-by: Xiao Ni <xni@redhat.com>
> ---
> v2: minor changes on format and log content
> v3: changes in commit message and log content
> v4: choose to change to sync way as default first in commit message
>   drivers/md/md.c | 56 ++++++++++++++++++++++++++++++++++++-------------
>   1 file changed, 42 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index ac85ec73a409..772cffe02ff5 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -339,6 +339,7 @@ static int start_readonly;
>    * so all the races disappear.
>    */
>   static bool create_on_open = true;
> +static bool legacy_async_del_gendisk = true;
>   
>   /*
>    * We have a system wide 'event count' that is incremented
> @@ -877,15 +878,18 @@ void mddev_unlock(struct mddev *mddev)
>   		export_rdev(rdev, mddev);
>   	}
>   
> -	/* Call del_gendisk after release reconfig_mutex to avoid
> -	 * deadlock (e.g. call del_gendisk under the lock and an
> -	 * access to sysfs files waits the lock)
> -	 * And MD_DELETED is only used for md raid which is set in
> -	 * do_md_stop. dm raid only uses md_stop to stop. So dm raid
> -	 * doesn't need to check MD_DELETED when getting reconfig lock
> -	 */
> -	if (test_bit(MD_DELETED, &mddev->flags))
> -		del_gendisk(mddev->gendisk);
> +	if (!legacy_async_del_gendisk) {
> +		/*
> +		 * Call del_gendisk after release reconfig_mutex to avoid
> +		 * deadlock (e.g. call del_gendisk under the lock and an
> +		 * access to sysfs files waits the lock)
> +		 * And MD_DELETED is only used for md raid which is set in
> +		 * do_md_stop. dm raid only uses md_stop to stop. So dm raid
> +		 * doesn't need to check MD_DELETED when getting reconfig lock
> +		 */
> +		if (test_bit(MD_DELETED, &mddev->flags))
> +			del_gendisk(mddev->gendisk);
> +	}
>   }
>   EXPORT_SYMBOL_GPL(mddev_unlock);
>   
> @@ -5818,6 +5822,13 @@ static void md_kobj_release(struct kobject *ko)
>   {
>   	struct mddev *mddev = container_of(ko, struct mddev, kobj);
>   
> +	if (legacy_async_del_gendisk) {
> +		if (mddev->sysfs_state)
> +			sysfs_put(mddev->sysfs_state);
> +		if (mddev->sysfs_level)
> +			sysfs_put(mddev->sysfs_level);
> +		del_gendisk(mddev->gendisk);
> +	}
>   	put_disk(mddev->gendisk);
>   }
>   
> @@ -6021,6 +6032,9 @@ static int md_alloc_and_put(dev_t dev, char *name)
>   {
>   	struct mddev *mddev = md_alloc(dev, name);
>   
> +	if (legacy_async_del_gendisk)
> +		pr_warn("md: async del_gendisk mode will be removed in future, please upgrade to mdadm-4.5+\n");

Should this log message also be updated?

Problematic async del_gendisk mode will default to disable in Linux 7.0. 
Please upgrade to mdadm 4.5+.

> +
>   	if (IS_ERR(mddev))
>   		return PTR_ERR(mddev);
>   	mddev_put(mddev);
> @@ -6431,10 +6445,22 @@ static void md_clean(struct mddev *mddev)
>   	mddev->persistent = 0;
>   	mddev->level = LEVEL_NONE;
>   	mddev->clevel[0] = 0;
> -	/* if UNTIL_STOP is set, it's cleared here */
> -	mddev->hold_active = 0;
> -	/* Don't clear MD_CLOSING, or mddev can be opened again. */
> -	mddev->flags &= BIT_ULL_MASK(MD_CLOSING);
> +
> +	/*
> +	 * For legacy_async_del_gendisk mode, it can stop the array in the
> +	 * middle of assembling it, then it still can access the array. So
> +	 * it needs to clear MD_CLOSING. If not legacy_async_del_gendisk,
> +	 * it can't open the array again after stopping it. So it doesn't
> +	 * clear MD_CLOSING.
> +	 */
> +	if (legacy_async_del_gendisk && mddev->hold_active) {
> +		clear_bit(MD_CLOSING, &mddev->flags);
> +	} else {
> +		/* if UNTIL_STOP is set, it's cleared here */
> +		mddev->hold_active = 0;
> +		/* Don't clear MD_CLOSING, or mddev can be opened again. */
> +		mddev->flags &= BIT_ULL_MASK(MD_CLOSING);
> +	}
>   	mddev->sb_flags = 0;
>   	mddev->ro = MD_RDWR;
>   	mddev->metadata_type[0] = 0;
> @@ -6658,7 +6684,8 @@ static int do_md_stop(struct mddev *mddev, int mode)
>   
>   		export_array(mddev);
>   		md_clean(mddev);
> -		set_bit(MD_DELETED, &mddev->flags);
> +		if (!legacy_async_del_gendisk)
> +			set_bit(MD_DELETED, &mddev->flags);
>   	}
>   	md_new_event();
>   	sysfs_notify_dirent_safe(mddev->sysfs_state);
> @@ -10392,6 +10419,7 @@ module_param_call(start_ro, set_ro, get_ro, NULL, S_IRUSR|S_IWUSR);
>   module_param(start_dirty_degraded, int, S_IRUGO|S_IWUSR);
>   module_param_call(new_array, add_named_array, NULL, NULL, S_IWUSR);
>   module_param(create_on_open, bool, S_IRUSR|S_IWUSR);
> +module_param(legacy_async_del_gendisk, bool, 0600);
>   
>   MODULE_LICENSE("GPL");
>   MODULE_DESCRIPTION("MD RAID framework");

Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 1/1] md: add legacy_async_del_gendisk mode
  2025-08-13  5:26 ` Paul Menzel
@ 2025-08-13  6:34   ` Xiao Ni
  0 siblings, 0 replies; 7+ messages in thread
From: Xiao Ni @ 2025-08-13  6:34 UTC (permalink / raw)
  To: Paul Menzel; +Cc: linux-raid, yukuai1, yukuai3, mpatocka, luca.boccassi

On Wed, Aug 13, 2025 at 1:27 PM Paul Menzel <pmenzel@molgen.mpg.de> wrote:
>
> Dear Xiao,
>
>
> Thank you for sending the new version of the patch.
>
> Am 13.08.25 um 05:29 schrieb Xiao Ni:
> > commit 9e59d609763f ("md: call del_gendisk in control path") changes the
> > async way to sync way of calling del_gendisk. But it breaks mdadm
> > --assemble command. The assemble command runs like this:
> > 1. create the array
> > 2. stop the array
> > 3. access the sysfs files after stopping
> >
> > The sync way calls del_gendisk in step 2, so all sysfs files are removed.
> > Now to avoid breaking mdadm assemble command, this patch adds the parameter
> > legacy_async_del_gendisk that can be used to choose which way. The default
> > is async way. In future, we plan to change default to sync way in kernel
> > 7.0. Then users need to upgrade to mdadm 4.5+ which removes step 2.
> >
> > Fixes: 9e59d609763f ("md: call del_gendisk in control path")
> > Reported-by: Mikulas Patocka <mpatocka@redhat.com>
> > Closes: https://lore.kernel.org/linux-raid/CAMw=ZnQ=ET2St-+hnhsuq34rRPnebqcXqP1QqaHW5Bh4aaaZ4g@mail.gmail.com/T/#t
> > Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com>
> > Signed-off-by: Xiao Ni <xni@redhat.com>
> > ---
> > v2: minor changes on format and log content
> > v3: changes in commit message and log content
> > v4: choose to change to sync way as default first in commit message
> >   drivers/md/md.c | 56 ++++++++++++++++++++++++++++++++++++-------------
> >   1 file changed, 42 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index ac85ec73a409..772cffe02ff5 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -339,6 +339,7 @@ static int start_readonly;
> >    * so all the races disappear.
> >    */
> >   static bool create_on_open = true;
> > +static bool legacy_async_del_gendisk = true;
> >
> >   /*
> >    * We have a system wide 'event count' that is incremented
> > @@ -877,15 +878,18 @@ void mddev_unlock(struct mddev *mddev)
> >               export_rdev(rdev, mddev);
> >       }
> >
> > -     /* Call del_gendisk after release reconfig_mutex to avoid
> > -      * deadlock (e.g. call del_gendisk under the lock and an
> > -      * access to sysfs files waits the lock)
> > -      * And MD_DELETED is only used for md raid which is set in
> > -      * do_md_stop. dm raid only uses md_stop to stop. So dm raid
> > -      * doesn't need to check MD_DELETED when getting reconfig lock
> > -      */
> > -     if (test_bit(MD_DELETED, &mddev->flags))
> > -             del_gendisk(mddev->gendisk);
> > +     if (!legacy_async_del_gendisk) {
> > +             /*
> > +              * Call del_gendisk after release reconfig_mutex to avoid
> > +              * deadlock (e.g. call del_gendisk under the lock and an
> > +              * access to sysfs files waits the lock)
> > +              * And MD_DELETED is only used for md raid which is set in
> > +              * do_md_stop. dm raid only uses md_stop to stop. So dm raid
> > +              * doesn't need to check MD_DELETED when getting reconfig lock
> > +              */
> > +             if (test_bit(MD_DELETED, &mddev->flags))
> > +                     del_gendisk(mddev->gendisk);
> > +     }
> >   }
> >   EXPORT_SYMBOL_GPL(mddev_unlock);
> >
> > @@ -5818,6 +5822,13 @@ static void md_kobj_release(struct kobject *ko)
> >   {
> >       struct mddev *mddev = container_of(ko, struct mddev, kobj);
> >
> > +     if (legacy_async_del_gendisk) {
> > +             if (mddev->sysfs_state)
> > +                     sysfs_put(mddev->sysfs_state);
> > +             if (mddev->sysfs_level)
> > +                     sysfs_put(mddev->sysfs_level);
> > +             del_gendisk(mddev->gendisk);
> > +     }
> >       put_disk(mddev->gendisk);
> >   }
> >
> > @@ -6021,6 +6032,9 @@ static int md_alloc_and_put(dev_t dev, char *name)
> >   {
> >       struct mddev *mddev = md_alloc(dev, name);
> >
> > +     if (legacy_async_del_gendisk)
> > +             pr_warn("md: async del_gendisk mode will be removed in future, please upgrade to mdadm-4.5+\n");
>
> Should this log message also be updated?
>
> Problematic async del_gendisk mode will default to disable in Linux 7.0.
> Please upgrade to mdadm 4.5+.

Hi Paul

At first, I actually did so. But after thinking for a while, I didn't
do it. I don't want to give a specific version in the log, because we
don't know which version to do it. And it's true, we plan to remove it
in future. If you don't strongly reject it, I will not send a new
version.

>
> > +
> >       if (IS_ERR(mddev))
> >               return PTR_ERR(mddev);
> >       mddev_put(mddev);
> > @@ -6431,10 +6445,22 @@ static void md_clean(struct mddev *mddev)
> >       mddev->persistent = 0;
> >       mddev->level = LEVEL_NONE;
> >       mddev->clevel[0] = 0;
> > -     /* if UNTIL_STOP is set, it's cleared here */
> > -     mddev->hold_active = 0;
> > -     /* Don't clear MD_CLOSING, or mddev can be opened again. */
> > -     mddev->flags &= BIT_ULL_MASK(MD_CLOSING);
> > +
> > +     /*
> > +      * For legacy_async_del_gendisk mode, it can stop the array in the
> > +      * middle of assembling it, then it still can access the array. So
> > +      * it needs to clear MD_CLOSING. If not legacy_async_del_gendisk,
> > +      * it can't open the array again after stopping it. So it doesn't
> > +      * clear MD_CLOSING.
> > +      */
> > +     if (legacy_async_del_gendisk && mddev->hold_active) {
> > +             clear_bit(MD_CLOSING, &mddev->flags);
> > +     } else {
> > +             /* if UNTIL_STOP is set, it's cleared here */
> > +             mddev->hold_active = 0;
> > +             /* Don't clear MD_CLOSING, or mddev can be opened again. */
> > +             mddev->flags &= BIT_ULL_MASK(MD_CLOSING);
> > +     }
> >       mddev->sb_flags = 0;
> >       mddev->ro = MD_RDWR;
> >       mddev->metadata_type[0] = 0;
> > @@ -6658,7 +6684,8 @@ static int do_md_stop(struct mddev *mddev, int mode)
> >
> >               export_array(mddev);
> >               md_clean(mddev);
> > -             set_bit(MD_DELETED, &mddev->flags);
> > +             if (!legacy_async_del_gendisk)
> > +                     set_bit(MD_DELETED, &mddev->flags);
> >       }
> >       md_new_event();
> >       sysfs_notify_dirent_safe(mddev->sysfs_state);
> > @@ -10392,6 +10419,7 @@ module_param_call(start_ro, set_ro, get_ro, NULL, S_IRUSR|S_IWUSR);
> >   module_param(start_dirty_degraded, int, S_IRUGO|S_IWUSR);
> >   module_param_call(new_array, add_named_array, NULL, NULL, S_IWUSR);
> >   module_param(create_on_open, bool, S_IRUSR|S_IWUSR);
> > +module_param(legacy_async_del_gendisk, bool, 0600);
> >
> >   MODULE_LICENSE("GPL");
> >   MODULE_DESCRIPTION("MD RAID framework");
>
> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>

thanks for this.
>
>
> Kind regards,
>
> Paul
>

Best Regards
Xiao


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 1/1] md: add legacy_async_del_gendisk mode
  2025-08-13  3:29 [PATCH v4 1/1] md: add legacy_async_del_gendisk mode Xiao Ni
  2025-08-13  5:26 ` Paul Menzel
@ 2025-08-14  0:54 ` Yu Kuai
  2025-08-18  9:33   ` Luca Boccassi
  1 sibling, 1 reply; 7+ messages in thread
From: Yu Kuai @ 2025-08-14  0:54 UTC (permalink / raw)
  To: Xiao Ni, yukuai1; +Cc: linux-raid, mpatocka, luca.boccassi, yukuai (C)


在 2025/08/13 11:29, Xiao Ni 写道:
> commit 9e59d609763f ("md: call del_gendisk in control path") changes the
> async way to sync way of calling del_gendisk. But it breaks mdadm
> --assemble command. The assemble command runs like this:
> 1. create the array
> 2. stop the array
> 3. access the sysfs files after stopping
> 
> The sync way calls del_gendisk in step 2, so all sysfs files are removed.
> Now to avoid breaking mdadm assemble command, this patch adds the parameter
> legacy_async_del_gendisk that can be used to choose which way. The default
> is async way. In future, we plan to change default to sync way in kernel
> 7.0. Then users need to upgrade to mdadm 4.5+ which removes step 2.
> 
> Fixes: 9e59d609763f ("md: call del_gendisk in control path")
> Reported-by: Mikulas Patocka <mpatocka@redhat.com>
> Closes: https://lore.kernel.org/linux-raid/CAMw=ZnQ=ET2St-+hnhsuq34rRPnebqcXqP1QqaHW5Bh4aaaZ4g@mail.gmail.com/T/#t
> Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com>
> Signed-off-by: Xiao Ni <xni@redhat.com>
> ---
> v2: minor changes on format and log content
> v3: changes in commit message and log content
> v4: choose to change to sync way as default first in commit message
>   drivers/md/md.c | 56 ++++++++++++++++++++++++++++++++++++-------------
>   1 file changed, 42 insertions(+), 14 deletions(-)
> 

Aplied to md-6.17
Thanks

> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index ac85ec73a409..772cffe02ff5 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -339,6 +339,7 @@ static int start_readonly;
>    * so all the races disappear.
>    */
>   static bool create_on_open = true;
> +static bool legacy_async_del_gendisk = true;
>   
>   /*
>    * We have a system wide 'event count' that is incremented
> @@ -877,15 +878,18 @@ void mddev_unlock(struct mddev *mddev)
>   		export_rdev(rdev, mddev);
>   	}
>   
> -	/* Call del_gendisk after release reconfig_mutex to avoid
> -	 * deadlock (e.g. call del_gendisk under the lock and an
> -	 * access to sysfs files waits the lock)
> -	 * And MD_DELETED is only used for md raid which is set in
> -	 * do_md_stop. dm raid only uses md_stop to stop. So dm raid
> -	 * doesn't need to check MD_DELETED when getting reconfig lock
> -	 */
> -	if (test_bit(MD_DELETED, &mddev->flags))
> -		del_gendisk(mddev->gendisk);
> +	if (!legacy_async_del_gendisk) {
> +		/*
> +		 * Call del_gendisk after release reconfig_mutex to avoid
> +		 * deadlock (e.g. call del_gendisk under the lock and an
> +		 * access to sysfs files waits the lock)
> +		 * And MD_DELETED is only used for md raid which is set in
> +		 * do_md_stop. dm raid only uses md_stop to stop. So dm raid
> +		 * doesn't need to check MD_DELETED when getting reconfig lock
> +		 */
> +		if (test_bit(MD_DELETED, &mddev->flags))
> +			del_gendisk(mddev->gendisk);
> +	}
>   }
>   EXPORT_SYMBOL_GPL(mddev_unlock);
>   
> @@ -5818,6 +5822,13 @@ static void md_kobj_release(struct kobject *ko)
>   {
>   	struct mddev *mddev = container_of(ko, struct mddev, kobj);
>   
> +	if (legacy_async_del_gendisk) {
> +		if (mddev->sysfs_state)
> +			sysfs_put(mddev->sysfs_state);
> +		if (mddev->sysfs_level)
> +			sysfs_put(mddev->sysfs_level);
> +		del_gendisk(mddev->gendisk);
> +	}
>   	put_disk(mddev->gendisk);
>   }
>   
> @@ -6021,6 +6032,9 @@ static int md_alloc_and_put(dev_t dev, char *name)
>   {
>   	struct mddev *mddev = md_alloc(dev, name);
>   
> +	if (legacy_async_del_gendisk)
> +		pr_warn("md: async del_gendisk mode will be removed in future, please upgrade to mdadm-4.5+\n");
> +
>   	if (IS_ERR(mddev))
>   		return PTR_ERR(mddev);
>   	mddev_put(mddev);
> @@ -6431,10 +6445,22 @@ static void md_clean(struct mddev *mddev)
>   	mddev->persistent = 0;
>   	mddev->level = LEVEL_NONE;
>   	mddev->clevel[0] = 0;
> -	/* if UNTIL_STOP is set, it's cleared here */
> -	mddev->hold_active = 0;
> -	/* Don't clear MD_CLOSING, or mddev can be opened again. */
> -	mddev->flags &= BIT_ULL_MASK(MD_CLOSING);
> +
> +	/*
> +	 * For legacy_async_del_gendisk mode, it can stop the array in the
> +	 * middle of assembling it, then it still can access the array. So
> +	 * it needs to clear MD_CLOSING. If not legacy_async_del_gendisk,
> +	 * it can't open the array again after stopping it. So it doesn't
> +	 * clear MD_CLOSING.
> +	 */
> +	if (legacy_async_del_gendisk && mddev->hold_active) {
> +		clear_bit(MD_CLOSING, &mddev->flags);
> +	} else {
> +		/* if UNTIL_STOP is set, it's cleared here */
> +		mddev->hold_active = 0;
> +		/* Don't clear MD_CLOSING, or mddev can be opened again. */
> +		mddev->flags &= BIT_ULL_MASK(MD_CLOSING);
> +	}
>   	mddev->sb_flags = 0;
>   	mddev->ro = MD_RDWR;
>   	mddev->metadata_type[0] = 0;
> @@ -6658,7 +6684,8 @@ static int do_md_stop(struct mddev *mddev, int mode)
>   
>   		export_array(mddev);
>   		md_clean(mddev);
> -		set_bit(MD_DELETED, &mddev->flags);
> +		if (!legacy_async_del_gendisk)
> +			set_bit(MD_DELETED, &mddev->flags);
>   	}
>   	md_new_event();
>   	sysfs_notify_dirent_safe(mddev->sysfs_state);
> @@ -10392,6 +10419,7 @@ module_param_call(start_ro, set_ro, get_ro, NULL, S_IRUSR|S_IWUSR);
>   module_param(start_dirty_degraded, int, S_IRUGO|S_IWUSR);
>   module_param_call(new_array, add_named_array, NULL, NULL, S_IWUSR);
>   module_param(create_on_open, bool, S_IRUSR|S_IWUSR);
> +module_param(legacy_async_del_gendisk, bool, 0600);
>   
>   MODULE_LICENSE("GPL");
>   MODULE_DESCRIPTION("MD RAID framework");
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 1/1] md: add legacy_async_del_gendisk mode
  2025-08-14  0:54 ` Yu Kuai
@ 2025-08-18  9:33   ` Luca Boccassi
  2025-08-18 15:57     ` Yu Kuai
  0 siblings, 1 reply; 7+ messages in thread
From: Luca Boccassi @ 2025-08-18  9:33 UTC (permalink / raw)
  To: Yu Kuai; +Cc: Xiao Ni, linux-raid, mpatocka, yukuai (C)

On Thu, 14 Aug 2025 at 01:54, Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
>
> 在 2025/08/13 11:29, Xiao Ni 写道:
> > commit 9e59d609763f ("md: call del_gendisk in control path") changes the
> > async way to sync way of calling del_gendisk. But it breaks mdadm
> > --assemble command. The assemble command runs like this:
> > 1. create the array
> > 2. stop the array
> > 3. access the sysfs files after stopping
> >
> > The sync way calls del_gendisk in step 2, so all sysfs files are removed.
> > Now to avoid breaking mdadm assemble command, this patch adds the parameter
> > legacy_async_del_gendisk that can be used to choose which way. The default
> > is async way. In future, we plan to change default to sync way in kernel
> > 7.0. Then users need to upgrade to mdadm 4.5+ which removes step 2.
> >
> > Fixes: 9e59d609763f ("md: call del_gendisk in control path")
> > Reported-by: Mikulas Patocka <mpatocka@redhat.com>
> > Closes: https://lore.kernel.org/linux-raid/CAMw=ZnQ=ET2St-+hnhsuq34rRPnebqcXqP1QqaHW5Bh4aaaZ4g@mail.gmail.com/T/#t
> > Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com>
> > Signed-off-by: Xiao Ni <xni@redhat.com>
> > ---
> > v2: minor changes on format and log content
> > v3: changes in commit message and log content
> > v4: choose to change to sync way as default first in commit message
> >   drivers/md/md.c | 56 ++++++++++++++++++++++++++++++++++++-------------
> >   1 file changed, 42 insertions(+), 14 deletions(-)
> >
>
> Aplied to md-6.17
> Thanks

Hi,

I noticed this bugfix is not in 6.17~rc2 released yesterday, will it
be in rc3? Thanks

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 1/1] md: add legacy_async_del_gendisk mode
  2025-08-18  9:33   ` Luca Boccassi
@ 2025-08-18 15:57     ` Yu Kuai
  2025-08-18 15:58       ` Luca Boccassi
  0 siblings, 1 reply; 7+ messages in thread
From: Yu Kuai @ 2025-08-18 15:57 UTC (permalink / raw)
  To: Luca Boccassi, Yu Kuai; +Cc: Xiao Ni, linux-raid, mpatocka, yukuai (C)

Hi,

在 2025/8/18 17:33, Luca Boccassi 写道:
> On Thu, 14 Aug 2025 at 01:54, Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> 在 2025/08/13 11:29, Xiao Ni 写道:
>>> commit 9e59d609763f ("md: call del_gendisk in control path") changes the
>>> async way to sync way of calling del_gendisk. But it breaks mdadm
>>> --assemble command. The assemble command runs like this:
>>> 1. create the array
>>> 2. stop the array
>>> 3. access the sysfs files after stopping
>>>
>>> The sync way calls del_gendisk in step 2, so all sysfs files are removed.
>>> Now to avoid breaking mdadm assemble command, this patch adds the parameter
>>> legacy_async_del_gendisk that can be used to choose which way. The default
>>> is async way. In future, we plan to change default to sync way in kernel
>>> 7.0. Then users need to upgrade to mdadm 4.5+ which removes step 2.
>>>
>>> Fixes: 9e59d609763f ("md: call del_gendisk in control path")
>>> Reported-by: Mikulas Patocka <mpatocka@redhat.com>
>>> Closes: https://lore.kernel.org/linux-raid/CAMw=ZnQ=ET2St-+hnhsuq34rRPnebqcXqP1QqaHW5Bh4aaaZ4g@mail.gmail.com/T/#t
>>> Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com>
>>> Signed-off-by: Xiao Ni <xni@redhat.com>
>>> ---
>>> v2: minor changes on format and log content
>>> v3: changes in commit message and log content
>>> v4: choose to change to sync way as default first in commit message
>>>    drivers/md/md.c | 56 ++++++++++++++++++++++++++++++++++++-------------
>>>    1 file changed, 42 insertions(+), 14 deletions(-)
>>>
>> Aplied to md-6.17
>> Thanks
> Hi,
>
> I noticed this bugfix is not in 6.17~rc2 released yesterday, will it
> be in rc3? Thanks

Yes. You should see this in linux-next soon, and later in rc3.

Thanks,
Kuai


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 1/1] md: add legacy_async_del_gendisk mode
  2025-08-18 15:57     ` Yu Kuai
@ 2025-08-18 15:58       ` Luca Boccassi
  0 siblings, 0 replies; 7+ messages in thread
From: Luca Boccassi @ 2025-08-18 15:58 UTC (permalink / raw)
  To: yukuai; +Cc: Yu Kuai, Xiao Ni, linux-raid, mpatocka, yukuai (C)

On Mon, 18 Aug 2025 at 16:57, Yu Kuai <yukuai@kernel.org> wrote:
>
> Hi,
>
> 在 2025/8/18 17:33, Luca Boccassi 写道:
> > On Thu, 14 Aug 2025 at 01:54, Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >>
> >> 在 2025/08/13 11:29, Xiao Ni 写道:
> >>> commit 9e59d609763f ("md: call del_gendisk in control path") changes the
> >>> async way to sync way of calling del_gendisk. But it breaks mdadm
> >>> --assemble command. The assemble command runs like this:
> >>> 1. create the array
> >>> 2. stop the array
> >>> 3. access the sysfs files after stopping
> >>>
> >>> The sync way calls del_gendisk in step 2, so all sysfs files are removed.
> >>> Now to avoid breaking mdadm assemble command, this patch adds the parameter
> >>> legacy_async_del_gendisk that can be used to choose which way. The default
> >>> is async way. In future, we plan to change default to sync way in kernel
> >>> 7.0. Then users need to upgrade to mdadm 4.5+ which removes step 2.
> >>>
> >>> Fixes: 9e59d609763f ("md: call del_gendisk in control path")
> >>> Reported-by: Mikulas Patocka <mpatocka@redhat.com>
> >>> Closes: https://lore.kernel.org/linux-raid/CAMw=ZnQ=ET2St-+hnhsuq34rRPnebqcXqP1QqaHW5Bh4aaaZ4g@mail.gmail.com/T/#t
> >>> Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com>
> >>> Signed-off-by: Xiao Ni <xni@redhat.com>
> >>> ---
> >>> v2: minor changes on format and log content
> >>> v3: changes in commit message and log content
> >>> v4: choose to change to sync way as default first in commit message
> >>>    drivers/md/md.c | 56 ++++++++++++++++++++++++++++++++++++-------------
> >>>    1 file changed, 42 insertions(+), 14 deletions(-)
> >>>
> >> Aplied to md-6.17
> >> Thanks
> > Hi,
> >
> > I noticed this bugfix is not in 6.17~rc2 released yesterday, will it
> > be in rc3? Thanks
>
> Yes. You should see this in linux-next soon, and later in rc3.
>
> Thanks,
> Kuai

That's great, thank you

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-08-18 15:58 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-13  3:29 [PATCH v4 1/1] md: add legacy_async_del_gendisk mode Xiao Ni
2025-08-13  5:26 ` Paul Menzel
2025-08-13  6:34   ` Xiao Ni
2025-08-14  0:54 ` Yu Kuai
2025-08-18  9:33   ` Luca Boccassi
2025-08-18 15:57     ` Yu Kuai
2025-08-18 15:58       ` Luca Boccassi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox