* fix ->run failure handling
@ 2024-06-04 17:25 Christoph Hellwig
2024-06-04 17:25 ` [PATCH 1/2] md/raid0: don't free conf on raid0_run failure Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Christoph Hellwig @ 2024-06-04 17:25 UTC (permalink / raw)
To: Song Liu, Yu Kuai; +Cc: linux-raid
Hi Song, hi Yu,
this series fixes problems I found when testing how md handles
trying to creat an array with disks that support incompatible
protection information or a mix of protection information and
not protection information.
Note that this only fixes the oops due to the double frees. After the
attempt to at the incompatible device the /dev/md0 node is still around
with zero blocks and linked to the the previous devices in sysfs which
is a bit confusing, but unrelated to the resource cleanup touched
here.
raid10 and raid5 also free conf on failure in their ->run handle,
but set mddev->private to NULL which prevents the double free as well
(but requires more code)
Diffstat:
raid0.c | 21 +++++----------------
raid1.c | 14 +++-----------
2 files changed, 8 insertions(+), 27 deletions(-)
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] md/raid0: don't free conf on raid0_run failure
2024-06-04 17:25 fix ->run failure handling Christoph Hellwig
@ 2024-06-04 17:25 ` Christoph Hellwig
2024-06-07 6:08 ` Yu Kuai
2024-06-04 17:25 ` [PATCH 2/2] md/raid1: " Christoph Hellwig
2024-06-10 20:58 ` fix ->run failure handling Song Liu
2 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2024-06-04 17:25 UTC (permalink / raw)
To: Song Liu, Yu Kuai; +Cc: linux-raid
The core md code calls the ->free method which already frees conf.
Fixes: 0c031fd37f69 ("md: Move alloc/free acct bioset in to personality")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/md/raid0.c | 21 +++++----------------
1 file changed, 5 insertions(+), 16 deletions(-)
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index c5d4aeb68404c9..81c01347cd24e6 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -365,18 +365,13 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
return array_sectors;
}
-static void free_conf(struct mddev *mddev, struct r0conf *conf)
-{
- kfree(conf->strip_zone);
- kfree(conf->devlist);
- kfree(conf);
-}
-
static void raid0_free(struct mddev *mddev, void *priv)
{
struct r0conf *conf = priv;
- free_conf(mddev, conf);
+ kfree(conf->strip_zone);
+ kfree(conf->devlist);
+ kfree(conf);
}
static int raid0_set_limits(struct mddev *mddev)
@@ -415,7 +410,7 @@ static int raid0_run(struct mddev *mddev)
if (!mddev_is_dm(mddev)) {
ret = raid0_set_limits(mddev);
if (ret)
- goto out_free_conf;
+ return ret;
}
/* calculate array device size */
@@ -427,13 +422,7 @@ static int raid0_run(struct mddev *mddev)
dump_zones(mddev);
- ret = md_integrity_register(mddev);
- if (ret)
- goto out_free_conf;
- return 0;
-out_free_conf:
- free_conf(mddev, conf);
- return ret;
+ return md_integrity_register(mddev);
}
/*
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/2] md/raid1: don't free conf on raid0_run failure
2024-06-04 17:25 fix ->run failure handling Christoph Hellwig
2024-06-04 17:25 ` [PATCH 1/2] md/raid0: don't free conf on raid0_run failure Christoph Hellwig
@ 2024-06-04 17:25 ` Christoph Hellwig
2024-06-10 20:58 ` fix ->run failure handling Song Liu
2 siblings, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2024-06-04 17:25 UTC (permalink / raw)
To: Song Liu, Yu Kuai; +Cc: linux-raid
The core md code calls the ->free method which already frees conf.
Fixes: 07f1a6850c5d ("md/raid1: fail run raid1 array when active disk less than one")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/md/raid1.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 7b8a71ca66dde0..1f321826ef02ba 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -3204,7 +3204,6 @@ static int raid1_set_limits(struct mddev *mddev)
return queue_limits_set(mddev->gendisk->queue, &lim);
}
-static void raid1_free(struct mddev *mddev, void *priv);
static int raid1_run(struct mddev *mddev)
{
struct r1conf *conf;
@@ -3238,7 +3237,7 @@ static int raid1_run(struct mddev *mddev)
if (!mddev_is_dm(mddev)) {
ret = raid1_set_limits(mddev);
if (ret)
- goto abort;
+ return ret;
}
mddev->degraded = 0;
@@ -3252,8 +3251,7 @@ static int raid1_run(struct mddev *mddev)
*/
if (conf->raid_disks - mddev->degraded < 1) {
md_unregister_thread(mddev, &conf->thread);
- ret = -EINVAL;
- goto abort;
+ return -EINVAL;
}
if (conf->raid_disks - mddev->degraded == 1)
@@ -3277,14 +3275,8 @@ static int raid1_run(struct mddev *mddev)
md_set_array_sectors(mddev, raid1_size(mddev, 0, 0));
ret = md_integrity_register(mddev);
- if (ret) {
+ if (ret)
md_unregister_thread(mddev, &mddev->thread);
- goto abort;
- }
- return 0;
-
-abort:
- raid1_free(mddev, conf);
return ret;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] md/raid0: don't free conf on raid0_run failure
2024-06-04 17:25 ` [PATCH 1/2] md/raid0: don't free conf on raid0_run failure Christoph Hellwig
@ 2024-06-07 6:08 ` Yu Kuai
2024-06-07 6:17 ` Yu Kuai
0 siblings, 1 reply; 9+ messages in thread
From: Yu Kuai @ 2024-06-07 6:08 UTC (permalink / raw)
To: Christoph Hellwig, Song Liu; +Cc: linux-raid, yukuai (C)
在 2024/06/05 1:25, Christoph Hellwig 写道:
> The core md code calls the ->free method which already frees conf.
>
> Fixes: 0c031fd37f69 ("md: Move alloc/free acct bioset in to personality")
> Signed-off-by: Christoph Hellwig <hch@lst.de>
LGTM
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
> ---
> drivers/md/raid0.c | 21 +++++----------------
> 1 file changed, 5 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index c5d4aeb68404c9..81c01347cd24e6 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -365,18 +365,13 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
> return array_sectors;
> }
>
> -static void free_conf(struct mddev *mddev, struct r0conf *conf)
> -{
> - kfree(conf->strip_zone);
> - kfree(conf->devlist);
> - kfree(conf);
> -}
> -
> static void raid0_free(struct mddev *mddev, void *priv)
> {
> struct r0conf *conf = priv;
>
> - free_conf(mddev, conf);
> + kfree(conf->strip_zone);
> + kfree(conf->devlist);
> + kfree(conf);
> }
>
> static int raid0_set_limits(struct mddev *mddev)
> @@ -415,7 +410,7 @@ static int raid0_run(struct mddev *mddev)
> if (!mddev_is_dm(mddev)) {
> ret = raid0_set_limits(mddev);
> if (ret)
> - goto out_free_conf;
> + return ret;
> }
>
> /* calculate array device size */
> @@ -427,13 +422,7 @@ static int raid0_run(struct mddev *mddev)
>
> dump_zones(mddev);
>
> - ret = md_integrity_register(mddev);
> - if (ret)
> - goto out_free_conf;
> - return 0;
> -out_free_conf:
> - free_conf(mddev, conf);
> - return ret;
> + return md_integrity_register(mddev);
> }
>
> /*
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] md/raid0: don't free conf on raid0_run failure
2024-06-07 6:08 ` Yu Kuai
@ 2024-06-07 6:17 ` Yu Kuai
2024-06-07 6:19 ` Christoph Hellwig
0 siblings, 1 reply; 9+ messages in thread
From: Yu Kuai @ 2024-06-07 6:17 UTC (permalink / raw)
To: Yu Kuai, Christoph Hellwig, Song Liu; +Cc: linux-raid, yukuai (C)
Hi,
在 2024/06/07 14:08, Yu Kuai 写道:
> 在 2024/06/05 1:25, Christoph Hellwig 写道:
>> The core md code calls the ->free method which already frees conf.
>>
>> Fixes: 0c031fd37f69 ("md: Move alloc/free acct bioset in to personality")
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> LGTM
> Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Sorry that I just noticed that the sysfs api level_store() calls the
pers->run() as well, and it didn't handle failure by pers->free().
It's weried that the api will return success in this case, and after
this patch, it will require __md_stop() to free the conf.
Can you also fix the level_store()? By checking pers->run() and if it
failed, call pers->free() and return error.
Thanks,
Kuai
>
>> ---
>> drivers/md/raid0.c | 21 +++++----------------
>> 1 file changed, 5 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
>> index c5d4aeb68404c9..81c01347cd24e6 100644
>> --- a/drivers/md/raid0.c
>> +++ b/drivers/md/raid0.c
>> @@ -365,18 +365,13 @@ static sector_t raid0_size(struct mddev *mddev,
>> sector_t sectors, int raid_disks
>> return array_sectors;
>> }
>> -static void free_conf(struct mddev *mddev, struct r0conf *conf)
>> -{
>> - kfree(conf->strip_zone);
>> - kfree(conf->devlist);
>> - kfree(conf);
>> -}
>> -
>> static void raid0_free(struct mddev *mddev, void *priv)
>> {
>> struct r0conf *conf = priv;
>> - free_conf(mddev, conf);
>> + kfree(conf->strip_zone);
>> + kfree(conf->devlist);
>> + kfree(conf);
>> }
>> static int raid0_set_limits(struct mddev *mddev)
>> @@ -415,7 +410,7 @@ static int raid0_run(struct mddev *mddev)
>> if (!mddev_is_dm(mddev)) {
>> ret = raid0_set_limits(mddev);
>> if (ret)
>> - goto out_free_conf;
>> + return ret;
>> }
>> /* calculate array device size */
>> @@ -427,13 +422,7 @@ static int raid0_run(struct mddev *mddev)
>> dump_zones(mddev);
>> - ret = md_integrity_register(mddev);
>> - if (ret)
>> - goto out_free_conf;
>> - return 0;
>> -out_free_conf:
>> - free_conf(mddev, conf);
>> - return ret;
>> + return md_integrity_register(mddev);
>> }
>> /*
>>
>
> .
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] md/raid0: don't free conf on raid0_run failure
2024-06-07 6:17 ` Yu Kuai
@ 2024-06-07 6:19 ` Christoph Hellwig
0 siblings, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2024-06-07 6:19 UTC (permalink / raw)
To: Yu Kuai; +Cc: Christoph Hellwig, Song Liu, linux-raid, yukuai (C)
On Fri, Jun 07, 2024 at 02:17:14PM +0800, Yu Kuai wrote:
> Sorry that I just noticed that the sysfs api level_store() calls the
> pers->run() as well, and it didn't handle failure by pers->free().
> It's weried that the api will return success in this case, and after
> this patch, it will require __md_stop() to free the conf.
>
> Can you also fix the level_store()? By checking pers->run() and if it
> failed, call pers->free() and return error.
I can look into it. But if we don't have consistent callers anyway
I'd be tempted to not call ->free on ->run failure, as that is a
rather unusual convention.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: fix ->run failure handling
2024-06-04 17:25 fix ->run failure handling Christoph Hellwig
2024-06-04 17:25 ` [PATCH 1/2] md/raid0: don't free conf on raid0_run failure Christoph Hellwig
2024-06-04 17:25 ` [PATCH 2/2] md/raid1: " Christoph Hellwig
@ 2024-06-10 20:58 ` Song Liu
2024-06-11 6:13 ` Christoph Hellwig
2 siblings, 1 reply; 9+ messages in thread
From: Song Liu @ 2024-06-10 20:58 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Yu Kuai, linux-raid
Hi Christoph,
On Tue, Jun 4, 2024 at 10:26 AM Christoph Hellwig <hch@lst.de> wrote:
>
> Hi Song, hi Yu,
>
> this series fixes problems I found when testing how md handles
> trying to creat an array with disks that support incompatible
> protection information or a mix of protection information and
> not protection information.
>
> Note that this only fixes the oops due to the double frees. After the
> attempt to at the incompatible device the /dev/md0 node is still around
> with zero blocks and linked to the the previous devices in sysfs which
> is a bit confusing, but unrelated to the resource cleanup touched
> here.
>
> raid10 and raid5 also free conf on failure in their ->run handle,
> but set mddev->private to NULL which prevents the double free as well
> (but requires more code)
>
> Diffstat:
> raid0.c | 21 +++++----------------
> raid1.c | 14 +++-----------
> 2 files changed, 8 insertions(+), 27 deletions(-)
Thanks for the patchset. I applied it to the md-6.11 branch.
Song
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: fix ->run failure handling
2024-06-10 20:58 ` fix ->run failure handling Song Liu
@ 2024-06-11 6:13 ` Christoph Hellwig
2024-06-12 16:36 ` Song Liu
0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2024-06-11 6:13 UTC (permalink / raw)
To: Song Liu; +Cc: Christoph Hellwig, Yu Kuai, linux-raid
On Mon, Jun 10, 2024 at 01:58:17PM -0700, Song Liu wrote:
> > raid10 and raid5 also free conf on failure in their ->run handle,
> > but set mddev->private to NULL which prevents the double free as well
> > (but requires more code)
> >
> > Diffstat:
> > raid0.c | 21 +++++----------------
> > raid1.c | 14 +++-----------
> > 2 files changed, 8 insertions(+), 27 deletions(-)
>
> Thanks for the patchset. I applied it to the md-6.11 branch.
I was planning to respin this to also handle failures in the takeover
path.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: fix ->run failure handling
2024-06-11 6:13 ` Christoph Hellwig
@ 2024-06-12 16:36 ` Song Liu
0 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2024-06-12 16:36 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Yu Kuai, linux-raid
Hi Christoph,
On Mon, Jun 10, 2024 at 11:13 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Jun 10, 2024 at 01:58:17PM -0700, Song Liu wrote:
> > > raid10 and raid5 also free conf on failure in their ->run handle,
> > > but set mddev->private to NULL which prevents the double free as well
> > > (but requires more code)
> > >
> > > Diffstat:
> > > raid0.c | 21 +++++----------------
> > > raid1.c | 14 +++-----------
> > > 2 files changed, 8 insertions(+), 27 deletions(-)
> >
> > Thanks for the patchset. I applied it to the md-6.11 branch.
>
> I was planning to respin this to also handle failures in the takeover
> path.
How about we do that in a follow up patch?
Thanks for fixing these issues!
Song
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-06-12 16:36 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-04 17:25 fix ->run failure handling Christoph Hellwig
2024-06-04 17:25 ` [PATCH 1/2] md/raid0: don't free conf on raid0_run failure Christoph Hellwig
2024-06-07 6:08 ` Yu Kuai
2024-06-07 6:17 ` Yu Kuai
2024-06-07 6:19 ` Christoph Hellwig
2024-06-04 17:25 ` [PATCH 2/2] md/raid1: " Christoph Hellwig
2024-06-10 20:58 ` fix ->run failure handling Song Liu
2024-06-11 6:13 ` Christoph Hellwig
2024-06-12 16:36 ` Song Liu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).