linux-raid.vger.kernel.org archive mirror
* [PATCH v6 0/3] md/raid1: call free_r1bio() before allow_barrier()
@ 2023-08-14 13:53 Xueshi Hu
  2023-08-14 13:53 ` [PATCH v6 1/3] " Xueshi Hu
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Xueshi Hu @ 2023-08-14 13:53 UTC (permalink / raw)
  To: yukuai3, song, dan.j.williams, neilb, akpm, neilb; +Cc: linux-raid, Xueshi Hu

Because a raid1 reshape changes r1conf::raid_disks and replaces the
mempool, it requires that no r1bio be in flight while it runs. However,
the current callers of allow_barrier() let the reshape proceed even
when old r1bios have not yet been freed.

-> v2:
	- fix the problem one by one instead of calling
	blk_mq_freeze_queue() as suggested by Yu Kuai
-> v3:
	- add freeze_array_totally() to replace freeze_array() instead
	  of giving up in raid1_reshape()
	- add a missed fix in raid_end_bio_io()
	- add a small check at the start of raid1_reshape()
-> v4:
	- add a Fixes tag and revise the commit message
	- drop patch 1 as there is an ongoing systematic fix for the bug
	- drop patch 3 as it is unrelated; it will be sent separately
-> v5:
	- split the patch into three parts, each fixing one bug
-> v6:
	- drop the Fixes tag in patch 1


Xueshi Hu (3):
  md/raid1: call free_r1bio() before allow_barrier()
  md/raid1: free the r1bio firstly before waiting for blocked rdev
  md/raid1: keep the barrier held until handle_read_error() finished

 drivers/md/raid1.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

-- 
2.40.1


* [PATCH v6 1/3] md/raid1: call free_r1bio() before allow_barrier()
  2023-08-14 13:53 [PATCH v6 0/3] md/raid1: call free_r1bio() before allow_barrier() Xueshi Hu
@ 2023-08-14 13:53 ` Xueshi Hu
  2023-08-14 13:53 ` [PATCH v6 2/3] md/raid1: free the r1bio firstly before waiting for blocked rdev Xueshi Hu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Xueshi Hu @ 2023-08-14 13:53 UTC (permalink / raw)
  To: yukuai3, song, dan.j.williams, neilb, akpm, neilb; +Cc: linux-raid, Xueshi Hu

After allow_barrier(), a concurrent raid1_reshape() may replace the old
mempool and change r1conf::raid_disks, both of which are needed to free
the r1bio. Call free_r1bio() before allow_barrier() so that the kernel
can free the r1bio safely.

Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com>
---
 drivers/md/raid1.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index dd25832eb045..dbbee0c14a5b 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -313,6 +313,7 @@ static void raid_end_bio_io(struct r1bio *r1_bio)
 {
 	struct bio *bio = r1_bio->master_bio;
 	struct r1conf *conf = r1_bio->mddev->private;
+	sector_t sector = r1_bio->sector;
 
 	/* if nobody has done the final endio yet, do it now */
 	if (!test_and_set_bit(R1BIO_Returned, &r1_bio->state)) {
@@ -323,13 +324,13 @@ static void raid_end_bio_io(struct r1bio *r1_bio)
 
 		call_bio_endio(r1_bio);
 	}
+
+	free_r1bio(r1_bio);
 	/*
 	 * Wake up any possible resync thread that waits for the device
 	 * to go idle.  All I/Os, even write-behind writes, are done.
 	 */
-	allow_barrier(conf, r1_bio->sector);
-
-	free_r1bio(r1_bio);
+	allow_barrier(conf, sector);
 }
 
 /*
-- 
2.40.1
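
The reordering in this patch can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: toy_conf,
toy_r1bio and every toy_* helper are hypothetical stand-ins, not the
kernel's r1conf, r1bio or barrier code, and the two plain counters
simplify the real per-bucket barrier accounting. The model reports
whether a reshape could start (no barrier holder left) while an r1bio
is still allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct r1conf; not the kernel structure. */
struct toy_conf {
	int nr_pending;   /* requests that currently hold the barrier */
	int live_r1bios;  /* r1bios allocated and not yet freed */
};

/* Hypothetical stand-in for struct r1bio. */
struct toy_r1bio {
	struct toy_conf *conf;
	long sector;
};

static struct toy_r1bio *toy_alloc_r1bio(struct toy_conf *conf, long sector)
{
	struct toy_r1bio *r = malloc(sizeof(*r));

	if (!r)
		abort();
	r->conf = conf;
	r->sector = sector;
	conf->live_r1bios++;
	return r;
}

static void toy_free_r1bio(struct toy_r1bio *r)
{
	r->conf->live_r1bios--;
	free(r);
}

/* True if a reshape could start now while an r1bio is still live. */
static bool toy_reshape_window(const struct toy_conf *conf)
{
	return conf->nr_pending == 0 && conf->live_r1bios > 0;
}

/* Old order: drop the barrier, then free. Returns true if a window opened. */
static bool toy_end_io_old(struct toy_conf *conf, struct toy_r1bio *r)
{
	bool window;

	conf->nr_pending--;               /* models allow_barrier() */
	window = toy_reshape_window(conf);
	toy_free_r1bio(r);
	return window;
}

/* Patched order: copy the sector, free, then drop the barrier. */
static bool toy_end_io_new(struct toy_conf *conf, struct toy_r1bio *r,
			   long *sector_out)
{
	long sector = r->sector;          /* r is gone after the free below */
	bool window;

	toy_free_r1bio(r);
	window = toy_reshape_window(conf);
	conf->nr_pending--;               /* models allow_barrier(conf, sector) */
	*sector_out = sector;
	return window;
}
```

With one pending request and one live r1bio, toy_end_io_old() reports a
window and toy_end_io_new() does not. The local copy of the sector
plays the same role as the new `sector` variable in raid_end_bio_io():
the value is still needed after the r1bio has been freed.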


* [PATCH v6 2/3] md/raid1: free the r1bio firstly before waiting for blocked rdev
  2023-08-14 13:53 [PATCH v6 0/3] md/raid1: call free_r1bio() before allow_barrier() Xueshi Hu
  2023-08-14 13:53 ` [PATCH v6 1/3] " Xueshi Hu
@ 2023-08-14 13:53 ` Xueshi Hu
  2023-08-14 13:53 ` [PATCH v6 3/3] md/raid1: keep the barrier held until handle_read_error() finished Xueshi Hu
  2023-08-14 16:40 ` [PATCH v6 0/3] md/raid1: call free_r1bio() before allow_barrier() Song Liu
  3 siblings, 0 replies; 5+ messages in thread
From: Xueshi Hu @ 2023-08-14 13:53 UTC (permalink / raw)
  To: yukuai3, song, dan.j.williams, neilb, akpm, neilb; +Cc: linux-raid, Xueshi Hu

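A raid1 reshape changes the mempool and r1conf::raid_disks, both of
which are needed to free an r1bio. Because allow_barrier() makes a
concurrent raid1_reshape() possible, free the in-flight r1bio before
calling allow_barrier() and waiting for the blocked rdev. Move the
retry_write label above alloc_r1bio() so that the retry allocates a
fresh r1bio.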

Fixes: 6bfe0b499082 ("md: support blocking writes to an array on device failure")
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com>
---
 drivers/md/raid1.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index dbbee0c14a5b..5f17f30a00a9 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1374,6 +1374,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
 		return;
 	}
 
+ retry_write:
 	r1_bio = alloc_r1bio(mddev, bio);
 	r1_bio->sectors = max_write_sectors;
 
@@ -1389,7 +1390,6 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
 	 */
 
 	disks = conf->raid_disks * 2;
- retry_write:
 	blocked_rdev = NULL;
 	rcu_read_lock();
 	max_sectors = r1_bio->sectors;
@@ -1469,7 +1469,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
 		for (j = 0; j < i; j++)
 			if (r1_bio->bios[j])
 				rdev_dec_pending(conf->mirrors[j].rdev, mddev);
-		r1_bio->state = 0;
+		free_r1bio(r1_bio);
 		allow_barrier(conf, bio->bi_iter.bi_sector);
 
 		if (bio->bi_opf & REQ_NOWAIT) {
-- 
2.40.1
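
The control flow after this patch can be sketched in user space as
follows. Everything here is an assumption made for illustration:
toy_conf, toy_r1bio, toy_alloc() and toy_write_request() are
hypothetical, the barrier is a single counter, and the "reshape" is
simulated inline at the point where the barrier is dropped. The sketch
only shows that, because the retry label sits above the allocation, the
r1bio used for submission is sized from the geometry in effect after
the wait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's r1conf / r1bio. */
struct toy_conf {
	int raid_disks;
	int nr_pending;   /* requests that currently hold the barrier */
};

struct toy_r1bio {
	int nslots;       /* sized from raid_disks at allocation time */
};

static struct toy_r1bio *toy_alloc(const struct toy_conf *conf)
{
	struct toy_r1bio *r = malloc(sizeof(*r));

	if (!r)
		abort();
	r->nslots = conf->raid_disks * 2;
	return r;
}

/*
 * Models the patched write path. blocked_times is how often a blocked
 * rdev is found; new_disks is the geometry a simulated reshape installs
 * while the barrier is dropped. Returns true if the r1bio that is
 * finally used matches the current geometry.
 */
static bool toy_write_request(struct toy_conf *conf, int blocked_times,
			      int new_disks)
{
	struct toy_r1bio *r;
	bool ok;

	conf->nr_pending++;                       /* models wait_barrier() */
retry_write:
	r = toy_alloc(conf);                      /* re-done on every retry */
	if (blocked_times > 0) {
		blocked_times--;
		free(r);                          /* free first ... */
		conf->nr_pending--;               /* ... then allow_barrier() */
		if (conf->nr_pending == 0)
			conf->raid_disks = new_disks; /* simulated reshape */
		conf->nr_pending++;               /* models wait_barrier() */
		goto retry_write;
	}
	ok = (r->nslots == conf->raid_disks * 2);
	free(r);
	conf->nr_pending--;
	return ok;
}
```

Calling toy_write_request() with one blocked round and a different
new_disks value still returns true, since no r1bio survives the window
in which the simulated reshape runs.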


* [PATCH v6 3/3] md/raid1: keep the barrier held until handle_read_error() finished
  2023-08-14 13:53 [PATCH v6 0/3] md/raid1: call free_r1bio() before allow_barrier() Xueshi Hu
  2023-08-14 13:53 ` [PATCH v6 1/3] " Xueshi Hu
  2023-08-14 13:53 ` [PATCH v6 2/3] md/raid1: free the r1bio firstly before waiting for blocked rdev Xueshi Hu
@ 2023-08-14 13:53 ` Xueshi Hu
  2023-08-14 16:40 ` [PATCH v6 0/3] md/raid1: call free_r1bio() before allow_barrier() Song Liu
  3 siblings, 0 replies; 5+ messages in thread
From: Xueshi Hu @ 2023-08-14 13:53 UTC (permalink / raw)
  To: yukuai3, song, dan.j.williams, neilb, akpm, neilb; +Cc: linux-raid, Xueshi Hu

handle_read_error() calls allow_barrier() to balance the earlier
barrier acquisition. However, allow_barrier() should be called at the
end of the function, after raid1_read_request() returns, so that a
concurrent raid reshape cannot run while the r1bio is still in use.

Fixes: 689389a06ce7 ("md/raid1: simplify handle_read_error().")
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com>
---
 drivers/md/raid1.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 5f17f30a00a9..5a5eb5f1a224 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2499,6 +2499,7 @@ static void handle_read_error(struct r1conf *conf, struct r1bio *r1_bio)
 	struct mddev *mddev = conf->mddev;
 	struct bio *bio;
 	struct md_rdev *rdev;
+	sector_t sector;
 
 	clear_bit(R1BIO_ReadError, &r1_bio->state);
 	/* we got a read error. Maybe the drive is bad.  Maybe just
@@ -2528,12 +2529,13 @@ static void handle_read_error(struct r1conf *conf, struct r1bio *r1_bio)
 	}
 
 	rdev_dec_pending(rdev, conf->mddev);
-	allow_barrier(conf, r1_bio->sector);
+	sector = r1_bio->sector;
 	bio = r1_bio->master_bio;
 
 	/* Reuse the old r1_bio so that the IO_BLOCKED settings are preserved */
 	r1_bio->state = 0;
 	raid1_read_request(mddev, bio, r1_bio->sectors, r1_bio);
+	allow_barrier(conf, sector);
 }
 
 static void raid1d(struct md_thread *thread)
-- 
2.40.1
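
A minimal user-space sketch of the ordering in this patch, under loud
assumptions: toy_conf, toy_r1bio, toy_resubmit() and
toy_handle_read_error() are hypothetical, the barrier is one counter
that tracks only the handler's own reference, and the resubmitted read
is modelled as completing at once and releasing the r1bio. That last
assumption is what makes the saved sector necessary in the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's r1conf / r1bio. */
struct toy_conf {
	int nr_pending;   /* barrier references currently held */
};

struct toy_r1bio {
	long sector;
	int state;
};

/*
 * Models resubmitting the read with the old r1bio. Reports whether the
 * caller still held the barrier, then releases r as if the read had
 * already completed.
 */
static bool toy_resubmit(const struct toy_conf *conf, struct toy_r1bio *r)
{
	bool held = conf->nr_pending > 0;

	free(r);
	return held;
}

/*
 * Patched order: copy the sector, resubmit, drop the barrier last.
 * Returns true if the barrier was held for the whole resubmit.
 */
static bool toy_handle_read_error(struct toy_conf *conf,
				  struct toy_r1bio *r, long *sector_out)
{
	long sector = r->sector;   /* r must not be touched after resubmit */
	bool held;

	r->state = 0;
	held = toy_resubmit(conf, r);
	conf->nr_pending--;        /* models allow_barrier(conf, sector) */
	*sector_out = sector;
	return held;
}
```

In this model the barrier count reaches zero only after the resubmit
has finished, so a reshape waiting for the count to drain cannot start
in between.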


* Re: [PATCH v6 0/3] md/raid1: call free_r1bio() before allow_barrier()
  2023-08-14 13:53 [PATCH v6 0/3] md/raid1: call free_r1bio() before allow_barrier() Xueshi Hu
                   ` (2 preceding siblings ...)
  2023-08-14 13:53 ` [PATCH v6 3/3] md/raid1: keep the barrier held until handle_read_error() finished Xueshi Hu
@ 2023-08-14 16:40 ` Song Liu
  3 siblings, 0 replies; 5+ messages in thread
From: Song Liu @ 2023-08-14 16:40 UTC (permalink / raw)
  To: Xueshi Hu; +Cc: yukuai3, dan.j.williams, neilb, akpm, neilb, linux-raid

On Mon, Aug 14, 2023 at 9:54 PM Xueshi Hu <xueshi.hu@smartx.com> wrote:
>
> Because a raid1 reshape changes r1conf::raid_disks and replaces the
> mempool, it requires that no r1bio be in flight while it runs. However,
> the current callers of allow_barrier() let the reshape proceed even
> when old r1bios have not yet been freed.

Applied v6 to md-next after updating some of the commit logs.

Thanks,
Song

