[PATCH 0/3] Misc changes for md again

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/3] Misc changes for md again
@ 2021-10-17 13:50 Guoqing Jiang
  2021-10-17 13:50 ` [PATCH 1/3] md/bitmap: don't set max_write_behind if there is no write mostly device Guoqing Jiang
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Guoqing Jiang @ 2021-10-17 13:50 UTC (permalink / raw)
  To: song; +Cc: linux-raid

Hello,

This patchset addresses previous comment, please review.

Thanks,
Guoqing

Guoqing Jiang (3):
  md/bitmap: don't set max_write_behind if there is no write mostly
    device
  md/raid10: add 'read_err' to raid10_read_request
  md/raid10: factor out a get_error_dev helper

 drivers/md/md-bitmap.c | 18 +++++++++++++++
 drivers/md/raid10.c    | 52 ++++++++++++++++++++++--------------------
 2 files changed, 45 insertions(+), 25 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] md/bitmap: don't set max_write_behind if there is no write mostly device
  2021-10-17 13:50 [PATCH 0/3] Misc changes for md again Guoqing Jiang
@ 2021-10-17 13:50 ` Guoqing Jiang
  2021-10-19  6:29   ` Song Liu
  2021-10-17 13:50 ` [PATCH 2/3] md/raid10: add 'read_err' to raid10_read_request Guoqing Jiang
  2021-10-17 13:50 ` [PATCH 3/3] md/raid10: factor out a get_error_dev helper Guoqing Jiang
  2 siblings, 1 reply; 7+ messages in thread
From: Guoqing Jiang @ 2021-10-17 13:50 UTC (permalink / raw)
  To: song; +Cc: linux-raid

We shouldn't set it since write behind IO should only happen to write
mostly device.

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
 drivers/md/md-bitmap.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index e29c6298ef5c..9424879d8d7e 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -2469,11 +2469,29 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
 {
 	unsigned long backlog;
 	unsigned long old_mwb = mddev->bitmap_info.max_write_behind;
+	struct md_rdev *rdev;
+	bool has_write_mostly = false;
 	int rv = kstrtoul(buf, 10, &backlog);
 	if (rv)
 		return rv;
 	if (backlog > COUNTER_MAX)
 		return -EINVAL;
+
+	/*
+	 * Without write mostly device, it doesn't make sense to set
+	 * backlog for max_write_behind.
+	 */
+	rdev_for_each(rdev, mddev)
+		if (test_bit(WriteMostly, &rdev->flags)) {
+			has_write_mostly = true;
+			break;
+		}
+	if (!has_write_mostly) {
+		pr_warn_ratelimited("%s: can't set backlog, no write mostly"
+				    " device available\n", mdname(mddev));
+		return -EINVAL;
+	}
+
 	mddev->bitmap_info.max_write_behind = backlog;
 	if (!backlog && mddev->serial_info_pool) {
 		/* serial_info_pool is not needed if backlog is zero */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] md/raid10: add 'read_err' to raid10_read_request
  2021-10-17 13:50 [PATCH 0/3] Misc changes for md again Guoqing Jiang
  2021-10-17 13:50 ` [PATCH 1/3] md/bitmap: don't set max_write_behind if there is no write mostly device Guoqing Jiang
@ 2021-10-17 13:50 ` Guoqing Jiang
  2021-10-17 13:50 ` [PATCH 3/3] md/raid10: factor out a get_error_dev helper Guoqing Jiang
  2 siblings, 0 replies; 7+ messages in thread
From: Guoqing Jiang @ 2021-10-17 13:50 UTC (permalink / raw)
  To: song; +Cc: linux-raid

Add the paramenter since the err retry logic is only meaningful when
the caller is handle_read_error.

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
 drivers/md/raid10.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index dde98f65bd04..49f3187b2d46 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1116,7 +1116,7 @@ static void regular_request_wait(struct mddev *mddev, struct r10conf *conf,
 }
 
 static void raid10_read_request(struct mddev *mddev, struct bio *bio,
-				struct r10bio *r10_bio)
+				struct r10bio *r10_bio, bool handle_read_err)
 {
 	struct r10conf *conf = mddev->private;
 	struct bio *read_bio;
@@ -1129,7 +1129,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
 	struct md_rdev *err_rdev = NULL;
 	gfp_t gfp = GFP_NOIO;
 
-	if (slot >= 0 && r10_bio->devs[slot].rdev) {
+	if (handle_read_err && slot >= 0 && r10_bio->devs[slot].rdev) {
 		/*
 		 * This is an error retry, but we cannot
 		 * safely dereference the rdev in the r10_bio,
@@ -1519,7 +1519,7 @@ static void __make_request(struct mddev *mddev, struct bio *bio, int sectors)
 			conf->geo.raid_disks);
 
 	if (bio_data_dir(bio) == READ)
-		raid10_read_request(mddev, bio, r10_bio);
+		raid10_read_request(mddev, bio, r10_bio, false);
 	else
 		raid10_write_request(mddev, bio, r10_bio);
 }
@@ -2918,7 +2918,7 @@ static void handle_read_error(struct mddev *mddev, struct r10bio *r10_bio)
 	rdev_dec_pending(rdev, mddev);
 	allow_barrier(conf);
 	r10_bio->state = 0;
-	raid10_read_request(mddev, r10_bio->master_bio, r10_bio);
+	raid10_read_request(mddev, r10_bio->master_bio, r10_bio, true);
 }
 
 static void handle_write_completed(struct r10conf *conf, struct r10bio *r10_bio)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] md/raid10: factor out a get_error_dev helper
  2021-10-17 13:50 [PATCH 0/3] Misc changes for md again Guoqing Jiang
  2021-10-17 13:50 ` [PATCH 1/3] md/bitmap: don't set max_write_behind if there is no write mostly device Guoqing Jiang
  2021-10-17 13:50 ` [PATCH 2/3] md/raid10: add 'read_err' to raid10_read_request Guoqing Jiang
@ 2021-10-17 13:50 ` Guoqing Jiang
  2021-10-19  6:59   ` Song Liu
  2 siblings, 1 reply; 7+ messages in thread
From: Guoqing Jiang @ 2021-10-17 13:50 UTC (permalink / raw)
  To: song; +Cc: linux-raid

Add a helper to find error_dev in case handle_read_err is true.

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
 drivers/md/raid10.c | 44 +++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 49f3187b2d46..35d842ec377b 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1115,6 +1115,27 @@ static void regular_request_wait(struct mddev *mddev, struct r10conf *conf,
 	}
 }
 
+static void get_error_dev(struct mddev *mddev, struct r10conf *conf, int slot,
+			  struct md_rdev **err_rdev, struct r10bio *r10_bio)
+{
+	/*
+	 * This is an error retry, but we cannot
+	 * safely dereference the rdev in the r10_bio,
+	 * we must use the one in conf.
+	 * If it has already been disconnected (unlikely)
+	 * we lose the device name in error messages.
+	 */
+	int disk;
+
+	rcu_read_lock();
+	disk = r10_bio->devs[slot].devnum;
+	*err_rdev = rcu_dereference(conf->mirrors[disk].rdev);
+	if (!*err_rdev)
+		/* This never gets dereferenced */
+		*err_rdev = r10_bio->devs[slot].rdev;
+	rcu_read_unlock();
+}
+
 static void raid10_read_request(struct mddev *mddev, struct bio *bio,
 				struct r10bio *r10_bio, bool handle_read_err)
 {
@@ -1130,37 +1151,18 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
 	gfp_t gfp = GFP_NOIO;
 
 	if (handle_read_err && slot >= 0 && r10_bio->devs[slot].rdev) {
-		/*
-		 * This is an error retry, but we cannot
-		 * safely dereference the rdev in the r10_bio,
-		 * we must use the one in conf.
-		 * If it has already been disconnected (unlikely)
-		 * we lose the device name in error messages.
-		 */
-		int disk;
 		/*
 		 * As we are blocking raid10, it is a little safer to
 		 * use __GFP_HIGH.
 		 */
 		gfp = GFP_NOIO | __GFP_HIGH;
-
-		rcu_read_lock();
-		disk = r10_bio->devs[slot].devnum;
-		err_rdev = rcu_dereference(conf->mirrors[disk].rdev);
-		if (err_rdev)
-			bdevname(err_rdev->bdev, b);
-		else {
-			strcpy(b, "???");
-			/* This never gets dereferenced */
-			err_rdev = r10_bio->devs[slot].rdev;
-		}
-		rcu_read_unlock();
+		get_error_dev(mddev, conf, slot, &err_rdev, r10_bio);
 	}
-
 	regular_request_wait(mddev, conf, bio, r10_bio->sectors);
 	rdev = read_balance(conf, r10_bio, &max_sectors);
 	if (!rdev) {
 		if (err_rdev) {
+			bdevname(err_rdev->bdev, b);
 			pr_crit_ratelimited("md/raid10:%s: %s: unrecoverable I/O read error for block %llu\n",
 					    mdname(mddev), b,
 					    (unsigned long long)r10_bio->sector);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] md/bitmap: don't set max_write_behind if there is no write mostly device
  2021-10-17 13:50 ` [PATCH 1/3] md/bitmap: don't set max_write_behind if there is no write mostly device Guoqing Jiang
@ 2021-10-19  6:29   ` Song Liu
  0 siblings, 0 replies; 7+ messages in thread
From: Song Liu @ 2021-10-19  6:29 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: linux-raid

On Sun, Oct 17, 2021 at 6:50 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
> We shouldn't set it since write behind IO should only happen to write
> mostly device.
>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>

Applied to md-next with a couple minor changes.

> ---
>  drivers/md/md-bitmap.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
> index e29c6298ef5c..9424879d8d7e 100644
> --- a/drivers/md/md-bitmap.c
> +++ b/drivers/md/md-bitmap.c
> @@ -2469,11 +2469,29 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
>  {
>         unsigned long backlog;
>         unsigned long old_mwb = mddev->bitmap_info.max_write_behind;
> +       struct md_rdev *rdev;
> +       bool has_write_mostly = false;
>         int rv = kstrtoul(buf, 10, &backlog);
>         if (rv)
>                 return rv;
>         if (backlog > COUNTER_MAX)
>                 return -EINVAL;
> +
> +       /*
> +        * Without write mostly device, it doesn't make sense to set
> +        * backlog for max_write_behind.
> +        */
> +       rdev_for_each(rdev, mddev)
> +               if (test_bit(WriteMostly, &rdev->flags)) {
> +                       has_write_mostly = true;
> +                       break;
> +               }

Added curly brackets for multi-line block.

> +       if (!has_write_mostly) {
> +               pr_warn_ratelimited("%s: can't set backlog, no write mostly"
> +                                   " device available\n", mdname(mddev));

Merged the two strings to a single line. checkpatch.pl should complain
about splitting a print string.

> +               return -EINVAL;
> +       }
> +
>         mddev->bitmap_info.max_write_behind = backlog;
>         if (!backlog && mddev->serial_info_pool) {
>                 /* serial_info_pool is not needed if backlog is zero */
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] md/raid10: factor out a get_error_dev helper
  2021-10-17 13:50 ` [PATCH 3/3] md/raid10: factor out a get_error_dev helper Guoqing Jiang
@ 2021-10-19  6:59   ` Song Liu
  2021-10-19  8:00     ` Guoqing Jiang
  0 siblings, 1 reply; 7+ messages in thread
From: Song Liu @ 2021-10-19  6:59 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: linux-raid

On Sun, Oct 17, 2021 at 6:50 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
> Add a helper to find error_dev in case handle_read_err is true.
>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>

For 2/3 and 3/3, I was thinking about something like below (only
compile tested).
Would this work?

Thanks,
Song

diff --git i/drivers/md/raid10.c w/drivers/md/raid10.c
index dde98f65bd04f..c2387f55343dd 100644
--- i/drivers/md/raid10.c
+++ w/drivers/md/raid10.c
@@ -1116,7 +1116,7 @@ static void regular_request_wait(struct mddev
*mddev, struct r10conf *conf,
 }

 static void raid10_read_request(struct mddev *mddev, struct bio *bio,
-                               struct r10bio *r10_bio)
+                               struct r10bio *r10_bio, struct md_rdev
*err_rdev)
 {
        struct r10conf *conf = mddev->private;
        struct bio *read_bio;
@@ -1126,36 +1126,17 @@ static void raid10_read_request(struct mddev
*mddev, struct bio *bio,
        struct md_rdev *rdev;
        char b[BDEVNAME_SIZE];
        int slot = r10_bio->read_slot;
-       struct md_rdev *err_rdev = NULL;
        gfp_t gfp = GFP_NOIO;

-       if (slot >= 0 && r10_bio->devs[slot].rdev) {
-               /*
-                * This is an error retry, but we cannot
-                * safely dereference the rdev in the r10_bio,
-                * we must use the one in conf.
-                * If it has already been disconnected (unlikely)
-                * we lose the device name in error messages.
-                */
-               int disk;
-               /*
-                * As we are blocking raid10, it is a little safer to
-                * use __GFP_HIGH.
-                */
+       /*
+        * As we are blocking raid10, it is a little safer to
+        * use __GFP_HIGH.
+        */
+       if (err_rdev) {
                gfp = GFP_NOIO | __GFP_HIGH;
-
-               rcu_read_lock();
-               disk = r10_bio->devs[slot].devnum;
-               err_rdev = rcu_dereference(conf->mirrors[disk].rdev);
-               if (err_rdev)
-                       bdevname(err_rdev->bdev, b);
-               else {
-                       strcpy(b, "???");
-                       /* This never gets dereferenced */
-                       err_rdev = r10_bio->devs[slot].rdev;
-               }
-               rcu_read_unlock();
-       }
+               bdevname(err_rdev->bdev, b);
+       } else
+               strcpy(b, "???");

        regular_request_wait(mddev, conf, bio, r10_bio->sectors);
        rdev = read_balance(conf, r10_bio, &max_sectors);
@@ -1519,7 +1500,7 @@ static void __make_request(struct mddev *mddev,
struct bio *bio, int sectors)
                        conf->geo.raid_disks);

        if (bio_data_dir(bio) == READ)
-               raid10_read_request(mddev, bio, r10_bio);
+               raid10_read_request(mddev, bio, r10_bio, NULL);
        else
                raid10_write_request(mddev, bio, r10_bio);
 }
@@ -2887,6 +2868,31 @@ static int narrow_write_error(struct r10bio
*r10_bio, int i)
        return ok;
 }

+static struct md_rdev *get_error_dev(struct mddev *mddev, struct
r10conf *conf, int slot,
+                                    struct r10bio *r10_bio)
+{
+       struct md_rdev *err_rdev = NULL;
+
+       /*
+        * This is an error retry, but we cannot
+        * safely dereference the rdev in the r10_bio,
+        * we must use the one in conf.
+        * If it has already been disconnected (unlikely)
+        * we lose the device name in error messages.
+        */
+       int disk;
+
+       rcu_read_lock();
+       disk = r10_bio->devs[slot].devnum;
+       err_rdev = rcu_dereference(conf->mirrors[disk].rdev);
+       if (!err_rdev) {
+               /* This never gets dereferenced */
+               err_rdev = r10_bio->devs[slot].rdev;
+       }
+       rcu_read_unlock();
+       return err_rdev;
+}
+
 static void handle_read_error(struct mddev *mddev, struct r10bio *r10_bio)
 {
        int slot = r10_bio->read_slot;
@@ -2918,7 +2924,8 @@ static void handle_read_error(struct mddev
*mddev, struct r10bio *r10_bio)
        rdev_dec_pending(rdev, mddev);
        allow_barrier(conf);
        r10_bio->state = 0;
-       raid10_read_request(mddev, r10_bio->master_bio, r10_bio);
+       raid10_read_request(mddev, r10_bio->master_bio, r10_bio,
+                           get_error_dev(mddev, conf, slot, r10_bio));
 }

 static void handle_write_completed(struct r10conf *conf, struct
r10bio *r10_bio)

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] md/raid10: factor out a get_error_dev helper
  2021-10-19  6:59   ` Song Liu
@ 2021-10-19  8:00     ` Guoqing Jiang
  0 siblings, 0 replies; 7+ messages in thread
From: Guoqing Jiang @ 2021-10-19  8:00 UTC (permalink / raw)
  To: Song Liu; +Cc: linux-raid



On 10/19/21 2:59 PM, Song Liu wrote:
> On Sun, Oct 17, 2021 at 6:50 AM Guoqing Jiang<guoqing.jiang@linux.dev>  wrote:
>> Add a helper to find error_dev in case handle_read_err is true.
>>
>> Signed-off-by: Guoqing Jiang<guoqing.jiang@linux.dev>
> For 2/3 and 3/3, I was thinking about something like below (only
> compile tested).
> Would this work?

Thanks for clarification, I guess it works.

> Thanks,
> Song
>
> diff --git i/drivers/md/raid10.c w/drivers/md/raid10.c
> index dde98f65bd04f..c2387f55343dd 100644
> --- i/drivers/md/raid10.c
> +++ w/drivers/md/raid10.c
> @@ -1116,7 +1116,7 @@ static void regular_request_wait(struct mddev
> *mddev, struct r10conf *conf,
>   }
>
>   static void raid10_read_request(struct mddev *mddev, struct bio *bio,
> -                               struct r10bio *r10_bio)
> +                               struct r10bio *r10_bio, struct md_rdev
> *err_rdev)
>   {
>          struct r10conf *conf = mddev->private;
>          struct bio *read_bio;
> @@ -1126,36 +1126,17 @@ static void raid10_read_request(struct mddev
> *mddev, struct bio *bio,
>          struct md_rdev *rdev;
>          char b[BDEVNAME_SIZE];
>          int slot = r10_bio->read_slot;
> -       struct md_rdev *err_rdev = NULL;
>          gfp_t gfp = GFP_NOIO;
>
> -       if (slot >= 0 && r10_bio->devs[slot].rdev) {
> -               /*
> -                * This is an error retry, but we cannot
> -                * safely dereference the rdev in the r10_bio,
> -                * we must use the one in conf.
> -                * If it has already been disconnected (unlikely)
> -                * we lose the device name in error messages.
> -                */
> -               int disk;
> -               /*
> -                * As we are blocking raid10, it is a little safer to
> -                * use __GFP_HIGH.
> -                */
> +       /*
> +        * As we are blocking raid10, it is a little safer to
> +        * use __GFP_HIGH.
> +        */
> +       if (err_rdev) {
>                  gfp = GFP_NOIO | __GFP_HIGH;
> -
> -               rcu_read_lock();
> -               disk = r10_bio->devs[slot].devnum;
> -               err_rdev = rcu_dereference(conf->mirrors[disk].rdev);
> -               if (err_rdev)
> -                       bdevname(err_rdev->bdev, b);
> -               else {
> -                       strcpy(b, "???");
> -                       /* This never gets dereferenced */
> -                       err_rdev = r10_bio->devs[slot].rdev;
> -               }
> -               rcu_read_unlock();
> -       }
> +               bdevname(err_rdev->bdev, b);
> +       } else
> +               strcpy(b, "???");
>
>          regular_request_wait(mddev, conf, bio, r10_bio->sectors);
>          rdev = read_balance(conf, r10_bio, &max_sectors);
> @@ -1519,7 +1500,7 @@ static void __make_request(struct mddev *mddev,
> struct bio *bio, int sectors)
>                          conf->geo.raid_disks);
>
>          if (bio_data_dir(bio) == READ)
> -               raid10_read_request(mddev, bio, r10_bio);
> +               raid10_read_request(mddev, bio, r10_bio, NULL);
>          else
>                  raid10_write_request(mddev, bio, r10_bio);
>   }
> @@ -2887,6 +2868,31 @@ static int narrow_write_error(struct r10bio
> *r10_bio, int i)
>          return ok;
>   }
>
> +static struct md_rdev *get_error_dev(struct mddev *mddev, struct
> r10conf *conf, int slot,
> +                                    struct r10bio *r10_bio)
> +{
> +       struct md_rdev *err_rdev = NULL;
> +

We need the original check ("slot >= 0 && r10_bio->devs[slot].rdev ")
in the function I think. Will take a closer look and send a new version.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-10-19  8:01 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-10-17 13:50 [PATCH 0/3] Misc changes for md again Guoqing Jiang
2021-10-17 13:50 ` [PATCH 1/3] md/bitmap: don't set max_write_behind if there is no write mostly device Guoqing Jiang
2021-10-19  6:29   ` Song Liu
2021-10-17 13:50 ` [PATCH 2/3] md/raid10: add 'read_err' to raid10_read_request Guoqing Jiang
2021-10-17 13:50 ` [PATCH 3/3] md/raid10: factor out a get_error_dev helper Guoqing Jiang
2021-10-19  6:59   ` Song Liu
2021-10-19  8:00     ` Guoqing Jiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).