* fix ->run failure handling
@ 2024-06-04 17:25 Christoph Hellwig
2024-06-04 17:25 ` [PATCH 1/2] md/raid0: don't free conf on raid0_run failure Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Christoph Hellwig @ 2024-06-04 17:25 UTC (permalink / raw)
To: Song Liu, Yu Kuai; +Cc: linux-raid
Hi Song, hi Yu,
this series fixes problems I found when testing how md handles
trying to creat an array with disks that support incompatible
protection information or a mix of protection information and
not protection information.
Note that this only fixes the oops due to the double frees. After the
attempt to at the incompatible device the /dev/md0 node is still around
with zero blocks and linked to the the previous devices in sysfs which
is a bit confusing, but unrelated to the resource cleanup touched
here.
raid10 and raid5 also free conf on failure in their ->run handle,
but set mddev->private to NULL which prevents the double free as well
(but requires more code)
Diffstat:
raid0.c | 21 +++++----------------
raid1.c | 14 +++-----------
2 files changed, 8 insertions(+), 27 deletions(-)
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH 1/2] md/raid0: don't free conf on raid0_run failure 2024-06-04 17:25 fix ->run failure handling Christoph Hellwig @ 2024-06-04 17:25 ` Christoph Hellwig 2024-06-07 6:08 ` Yu Kuai 2024-06-04 17:25 ` [PATCH 2/2] md/raid1: " Christoph Hellwig 2024-06-10 20:58 ` fix ->run failure handling Song Liu 2 siblings, 1 reply; 9+ messages in thread From: Christoph Hellwig @ 2024-06-04 17:25 UTC (permalink / raw) To: Song Liu, Yu Kuai; +Cc: linux-raid The core md code calls the ->free method which already frees conf. Fixes: 0c031fd37f69 ("md: Move alloc/free acct bioset in to personality") Signed-off-by: Christoph Hellwig <hch@lst.de> --- drivers/md/raid0.c | 21 +++++---------------- 1 file changed, 5 insertions(+), 16 deletions(-) diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c index c5d4aeb68404c9..81c01347cd24e6 100644 --- a/drivers/md/raid0.c +++ b/drivers/md/raid0.c @@ -365,18 +365,13 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks return array_sectors; } -static void free_conf(struct mddev *mddev, struct r0conf *conf) -{ - kfree(conf->strip_zone); - kfree(conf->devlist); - kfree(conf); -} - static void raid0_free(struct mddev *mddev, void *priv) { struct r0conf *conf = priv; - free_conf(mddev, conf); + kfree(conf->strip_zone); + kfree(conf->devlist); + kfree(conf); } static int raid0_set_limits(struct mddev *mddev) @@ -415,7 +410,7 @@ static int raid0_run(struct mddev *mddev) if (!mddev_is_dm(mddev)) { ret = raid0_set_limits(mddev); if (ret) - goto out_free_conf; + return ret; } /* calculate array device size */ @@ -427,13 +422,7 @@ static int raid0_run(struct mddev *mddev) dump_zones(mddev); - ret = md_integrity_register(mddev); - if (ret) - goto out_free_conf; - return 0; -out_free_conf: - free_conf(mddev, conf); - return ret; + return md_integrity_register(mddev); } /* -- 2.43.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] md/raid0: don't free conf on raid0_run failure 2024-06-04 17:25 ` [PATCH 1/2] md/raid0: don't free conf on raid0_run failure Christoph Hellwig @ 2024-06-07 6:08 ` Yu Kuai 2024-06-07 6:17 ` Yu Kuai 0 siblings, 1 reply; 9+ messages in thread From: Yu Kuai @ 2024-06-07 6:08 UTC (permalink / raw) To: Christoph Hellwig, Song Liu; +Cc: linux-raid, yukuai (C) 在 2024/06/05 1:25, Christoph Hellwig 写道: > The core md code calls the ->free method which already frees conf. > > Fixes: 0c031fd37f69 ("md: Move alloc/free acct bioset in to personality") > Signed-off-by: Christoph Hellwig <hch@lst.de> LGTM Reviewed-by: Yu Kuai <yukuai3@huawei.com> > --- > drivers/md/raid0.c | 21 +++++---------------- > 1 file changed, 5 insertions(+), 16 deletions(-) > > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c > index c5d4aeb68404c9..81c01347cd24e6 100644 > --- a/drivers/md/raid0.c > +++ b/drivers/md/raid0.c > @@ -365,18 +365,13 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks > return array_sectors; > } > > -static void free_conf(struct mddev *mddev, struct r0conf *conf) > -{ > - kfree(conf->strip_zone); > - kfree(conf->devlist); > - kfree(conf); > -} > - > static void raid0_free(struct mddev *mddev, void *priv) > { > struct r0conf *conf = priv; > > - free_conf(mddev, conf); > + kfree(conf->strip_zone); > + kfree(conf->devlist); > + kfree(conf); > } > > static int raid0_set_limits(struct mddev *mddev) > @@ -415,7 +410,7 @@ static int raid0_run(struct mddev *mddev) > if (!mddev_is_dm(mddev)) { > ret = raid0_set_limits(mddev); > if (ret) > - goto out_free_conf; > + return ret; > } > > /* calculate array device size */ > @@ -427,13 +422,7 @@ static int raid0_run(struct mddev *mddev) > > dump_zones(mddev); > > - ret = md_integrity_register(mddev); > - if (ret) > - goto out_free_conf; > - return 0; > -out_free_conf: > - free_conf(mddev, conf); > - return ret; > + return md_integrity_register(mddev); > } > > /* > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] md/raid0: don't free conf on raid0_run failure 2024-06-07 6:08 ` Yu Kuai @ 2024-06-07 6:17 ` Yu Kuai 2024-06-07 6:19 ` Christoph Hellwig 0 siblings, 1 reply; 9+ messages in thread From: Yu Kuai @ 2024-06-07 6:17 UTC (permalink / raw) To: Yu Kuai, Christoph Hellwig, Song Liu; +Cc: linux-raid, yukuai (C) Hi, 在 2024/06/07 14:08, Yu Kuai 写道: > 在 2024/06/05 1:25, Christoph Hellwig 写道: >> The core md code calls the ->free method which already frees conf. >> >> Fixes: 0c031fd37f69 ("md: Move alloc/free acct bioset in to personality") >> Signed-off-by: Christoph Hellwig <hch@lst.de> > > LGTM > Reviewed-by: Yu Kuai <yukuai3@huawei.com> Sorry that I just noticed that the sysfs api level_store() calls the pers->run() as well, and it didn't handle failure by pers->free(). It's weried that the api will return success in this case, and after this patch, it will require __md_stop() to free the conf. Can you also fix the level_store()? By checking pers->run() and if it failed, call pers->free() and return error. Thanks, Kuai > >> --- >> drivers/md/raid0.c | 21 +++++---------------- >> 1 file changed, 5 insertions(+), 16 deletions(-) >> >> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c >> index c5d4aeb68404c9..81c01347cd24e6 100644 >> --- a/drivers/md/raid0.c >> +++ b/drivers/md/raid0.c >> @@ -365,18 +365,13 @@ static sector_t raid0_size(struct mddev *mddev, >> sector_t sectors, int raid_disks >> return array_sectors; >> } >> -static void free_conf(struct mddev *mddev, struct r0conf *conf) >> -{ >> - kfree(conf->strip_zone); >> - kfree(conf->devlist); >> - kfree(conf); >> -} >> - >> static void raid0_free(struct mddev *mddev, void *priv) >> { >> struct r0conf *conf = priv; >> - free_conf(mddev, conf); >> + kfree(conf->strip_zone); >> + kfree(conf->devlist); >> + kfree(conf); >> } >> static int raid0_set_limits(struct mddev *mddev) >> @@ -415,7 +410,7 @@ static int raid0_run(struct mddev *mddev) >> if (!mddev_is_dm(mddev)) { >> ret = raid0_set_limits(mddev); >> if (ret) >> - goto out_free_conf; >> + return ret; >> } >> /* calculate array device size */ >> @@ -427,13 +422,7 @@ static int raid0_run(struct mddev *mddev) >> dump_zones(mddev); >> - ret = md_integrity_register(mddev); >> - if (ret) >> - goto out_free_conf; >> - return 0; >> -out_free_conf: >> - free_conf(mddev, conf); >> - return ret; >> + return md_integrity_register(mddev); >> } >> /* >> > > . > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] md/raid0: don't free conf on raid0_run failure 2024-06-07 6:17 ` Yu Kuai @ 2024-06-07 6:19 ` Christoph Hellwig 0 siblings, 0 replies; 9+ messages in thread From: Christoph Hellwig @ 2024-06-07 6:19 UTC (permalink / raw) To: Yu Kuai; +Cc: Christoph Hellwig, Song Liu, linux-raid, yukuai (C) On Fri, Jun 07, 2024 at 02:17:14PM +0800, Yu Kuai wrote: > Sorry that I just noticed that the sysfs api level_store() calls the > pers->run() as well, and it didn't handle failure by pers->free(). > It's weried that the api will return success in this case, and after > this patch, it will require __md_stop() to free the conf. > > Can you also fix the level_store()? By checking pers->run() and if it > failed, call pers->free() and return error. I can look into it. But if we don't have consistent callers anyway I'd be tempted to not call ->free on ->run failure, as that is a rather unusual convention. ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/2] md/raid1: don't free conf on raid0_run failure 2024-06-04 17:25 fix ->run failure handling Christoph Hellwig 2024-06-04 17:25 ` [PATCH 1/2] md/raid0: don't free conf on raid0_run failure Christoph Hellwig @ 2024-06-04 17:25 ` Christoph Hellwig 2024-06-10 20:58 ` fix ->run failure handling Song Liu 2 siblings, 0 replies; 9+ messages in thread From: Christoph Hellwig @ 2024-06-04 17:25 UTC (permalink / raw) To: Song Liu, Yu Kuai; +Cc: linux-raid The core md code calls the ->free method which already frees conf. Fixes: 07f1a6850c5d ("md/raid1: fail run raid1 array when active disk less than one") Signed-off-by: Christoph Hellwig <hch@lst.de> --- drivers/md/raid1.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-) diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 7b8a71ca66dde0..1f321826ef02ba 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -3204,7 +3204,6 @@ static int raid1_set_limits(struct mddev *mddev) return queue_limits_set(mddev->gendisk->queue, &lim); } -static void raid1_free(struct mddev *mddev, void *priv); static int raid1_run(struct mddev *mddev) { struct r1conf *conf; @@ -3238,7 +3237,7 @@ static int raid1_run(struct mddev *mddev) if (!mddev_is_dm(mddev)) { ret = raid1_set_limits(mddev); if (ret) - goto abort; + return ret; } mddev->degraded = 0; @@ -3252,8 +3251,7 @@ static int raid1_run(struct mddev *mddev) */ if (conf->raid_disks - mddev->degraded < 1) { md_unregister_thread(mddev, &conf->thread); - ret = -EINVAL; - goto abort; + return -EINVAL; } if (conf->raid_disks - mddev->degraded == 1) @@ -3277,14 +3275,8 @@ static int raid1_run(struct mddev *mddev) md_set_array_sectors(mddev, raid1_size(mddev, 0, 0)); ret = md_integrity_register(mddev); - if (ret) { + if (ret) md_unregister_thread(mddev, &mddev->thread); - goto abort; - } - return 0; - -abort: - raid1_free(mddev, conf); return ret; } -- 2.43.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: fix ->run failure handling 2024-06-04 17:25 fix ->run failure handling Christoph Hellwig 2024-06-04 17:25 ` [PATCH 1/2] md/raid0: don't free conf on raid0_run failure Christoph Hellwig 2024-06-04 17:25 ` [PATCH 2/2] md/raid1: " Christoph Hellwig @ 2024-06-10 20:58 ` Song Liu 2024-06-11 6:13 ` Christoph Hellwig 2 siblings, 1 reply; 9+ messages in thread From: Song Liu @ 2024-06-10 20:58 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Yu Kuai, linux-raid Hi Christoph, On Tue, Jun 4, 2024 at 10:26 AM Christoph Hellwig <hch@lst.de> wrote: > > Hi Song, hi Yu, > > this series fixes problems I found when testing how md handles > trying to creat an array with disks that support incompatible > protection information or a mix of protection information and > not protection information. > > Note that this only fixes the oops due to the double frees. After the > attempt to at the incompatible device the /dev/md0 node is still around > with zero blocks and linked to the the previous devices in sysfs which > is a bit confusing, but unrelated to the resource cleanup touched > here. > > raid10 and raid5 also free conf on failure in their ->run handle, > but set mddev->private to NULL which prevents the double free as well > (but requires more code) > > Diffstat: > raid0.c | 21 +++++---------------- > raid1.c | 14 +++----------- > 2 files changed, 8 insertions(+), 27 deletions(-) Thanks for the patchset. I applied it to the md-6.11 branch. Song ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: fix ->run failure handling 2024-06-10 20:58 ` fix ->run failure handling Song Liu @ 2024-06-11 6:13 ` Christoph Hellwig 2024-06-12 16:36 ` Song Liu 0 siblings, 1 reply; 9+ messages in thread From: Christoph Hellwig @ 2024-06-11 6:13 UTC (permalink / raw) To: Song Liu; +Cc: Christoph Hellwig, Yu Kuai, linux-raid On Mon, Jun 10, 2024 at 01:58:17PM -0700, Song Liu wrote: > > raid10 and raid5 also free conf on failure in their ->run handle, > > but set mddev->private to NULL which prevents the double free as well > > (but requires more code) > > > > Diffstat: > > raid0.c | 21 +++++---------------- > > raid1.c | 14 +++----------- > > 2 files changed, 8 insertions(+), 27 deletions(-) > > Thanks for the patchset. I applied it to the md-6.11 branch. I was planning to respin this to also handle failures in the takeover path. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: fix ->run failure handling 2024-06-11 6:13 ` Christoph Hellwig @ 2024-06-12 16:36 ` Song Liu 0 siblings, 0 replies; 9+ messages in thread From: Song Liu @ 2024-06-12 16:36 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Yu Kuai, linux-raid Hi Christoph, On Mon, Jun 10, 2024 at 11:13 PM Christoph Hellwig <hch@lst.de> wrote: > > On Mon, Jun 10, 2024 at 01:58:17PM -0700, Song Liu wrote: > > > raid10 and raid5 also free conf on failure in their ->run handle, > > > but set mddev->private to NULL which prevents the double free as well > > > (but requires more code) > > > > > > Diffstat: > > > raid0.c | 21 +++++---------------- > > > raid1.c | 14 +++----------- > > > 2 files changed, 8 insertions(+), 27 deletions(-) > > > > Thanks for the patchset. I applied it to the md-6.11 branch. > > I was planning to respin this to also handle failures in the takeover > path. How about we do that in a follow up patch? Thanks for fixing these issues! Song ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-06-12 16:36 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-06-04 17:25 fix ->run failure handling Christoph Hellwig 2024-06-04 17:25 ` [PATCH 1/2] md/raid0: don't free conf on raid0_run failure Christoph Hellwig 2024-06-07 6:08 ` Yu Kuai 2024-06-07 6:17 ` Yu Kuai 2024-06-07 6:19 ` Christoph Hellwig 2024-06-04 17:25 ` [PATCH 2/2] md/raid1: " Christoph Hellwig 2024-06-10 20:58 ` fix ->run failure handling Song Liu 2024-06-11 6:13 ` Christoph Hellwig 2024-06-12 16:36 ` Song Liu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).