* [PATCH v2 1/3] md/raid10: check replacement and rdev to prevent submit the same io twice
2023-07-01 8:05 [PATCH v2 0/3] raid10 bugfix linan666
@ 2023-07-01 8:05 ` linan666
2023-07-01 8:05 ` [PATCH v2 2/3] md/raid10: factor out dereference_rdev_and_rrdev() linan666
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: linan666 @ 2023-07-01 8:05 UTC (permalink / raw)
To: song, guoqing.jiang, xni, colyli
Cc: linux-raid, linux-kernel, linan122, yukuai3, yi.zhang, houtao1,
yangerkun
From: Li Nan <linan122@huawei.com>
After commit 4ca40c2ce099 ("md/raid10: Allow replacement device to be
replace old drive."), 'rdev' and 'replacement' can point to the same
device. wait_blocked_dev() and raid10_write_request() already check for
this. Add the same check to raid10_handle_discard() so that the same
discard is not submitted to one device twice.
Signed-off-by: Li Nan <linan122@huawei.com>
---
drivers/md/raid10.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index fabc340aae4f..3e6a09aaaba6 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1811,6 +1811,8 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
r10_bio->devs[disk].bio = NULL;
r10_bio->devs[disk].repl_bio = NULL;
+ if (rdev == rrdev)
+ rrdev = NULL;
if (rdev && (test_bit(Faulty, &rdev->flags)))
rdev = NULL;
if (rrdev && (test_bit(Faulty, &rrdev->flags)))
--
2.39.2
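Editor's note: the effect of the 'rdev == rrdev' check can be sketched in
userspace. This is only an illustration under stated assumptions, not the
kernel code: struct fake_rdev, struct fake_mirror and discard_one_slot() are
hypothetical stand-ins for struct md_rdev, struct raid10_info and the
per-disk part of raid10_handle_discard(), and the 'dedup' flag exists only
to compare behaviour with and without the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct md_rdev and struct raid10_info. */
struct fake_rdev {
	bool faulty;
	int discards;	/* how many discards this device received */
};

struct fake_mirror {
	struct fake_rdev *rdev;
	struct fake_rdev *replacement;
};

/*
 * Issue one discard to every usable device of a mirror slot and return
 * the number of submissions. 'dedup' toggles the check added by the patch.
 */
static int discard_one_slot(struct fake_mirror *m, bool dedup)
{
	struct fake_rdev *rdev, *rrdev;
	int submitted = 0;

	assert(m != NULL);
	rdev = m->rdev;
	rrdev = m->replacement;

	/* Both pointers may name the same device: keep only one of them. */
	if (dedup && rdev == rrdev)
		rrdev = NULL;
	if (rdev && rdev->faulty)
		rdev = NULL;
	if (rrdev && rrdev->faulty)
		rrdev = NULL;

	if (rdev) {
		rdev->discards++;
		submitted++;
	}
	if (rrdev) {
		rrdev->discards++;
		submitted++;
	}
	return submitted;
}
```

With a slot whose rdev and replacement both point to one device, the sketch
submits two discards to that device when 'dedup' is false and one when it is
true.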
* [PATCH v2 2/3] md/raid10: factor out dereference_rdev_and_rrdev()
2023-07-01 8:05 [PATCH v2 0/3] raid10 bugfix linan666
2023-07-01 8:05 ` [PATCH v2 1/3] md/raid10: check replacement and rdev to prevent submit the same io twice linan666
@ 2023-07-01 8:05 ` linan666
2023-07-01 8:05 ` [PATCH v2 3/3] md/raid10: use dereference_rdev_and_rrdev() to get devices linan666
2023-07-07 9:14 ` [PATCH v2 0/3] raid10 bugfix Song Liu
3 siblings, 0 replies; 5+ messages in thread
From: linan666 @ 2023-07-01 8:05 UTC (permalink / raw)
To: song, guoqing.jiang, xni, colyli
Cc: linux-raid, linux-kernel, linan122, yukuai3, yi.zhang, houtao1,
yangerkun
From: Li Nan <linan122@huawei.com>
Factor out a helper that gets 'rdev' and 'replacement' from conf->mirrors.
This makes the code cleaner and prepares for fixing the io loss that can
occur while 'replacement' replaces 'rdev'.
There is no functional change.
Signed-off-by: Li Nan <linan122@huawei.com>
---
drivers/md/raid10.c | 29 ++++++++++++++++++++---------
1 file changed, 20 insertions(+), 9 deletions(-)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 3e6a09aaaba6..a6c3806be903 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1346,6 +1346,25 @@ static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio,
}
}
+static struct md_rdev *dereference_rdev_and_rrdev(struct raid10_info *mirror,
+ struct md_rdev **prrdev)
+{
+ struct md_rdev *rdev, *rrdev;
+
+ rrdev = rcu_dereference(mirror->replacement);
+ /*
+ * Read replacement first to prevent reading both rdev and
+ * replacement as NULL during replacement replace rdev.
+ */
+ smp_mb();
+ rdev = rcu_dereference(mirror->rdev);
+ if (rdev == rrdev)
+ rrdev = NULL;
+
+ *prrdev = rrdev;
+ return rdev;
+}
+
static void wait_blocked_dev(struct mddev *mddev, struct r10bio *r10_bio)
{
int i;
@@ -1489,15 +1508,7 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
int d = r10_bio->devs[i].devnum;
struct md_rdev *rdev, *rrdev;
- rrdev = rcu_dereference(conf->mirrors[d].replacement);
- /*
- * Read replacement first to prevent reading both rdev and
- * replacement as NULL during replacement replace rdev.
- */
- smp_mb();
- rdev = rcu_dereference(conf->mirrors[d].rdev);
- if (rdev == rrdev)
- rrdev = NULL;
+ rdev = dereference_rdev_and_rrdev(&conf->mirrors[d], &rrdev);
if (rdev && (test_bit(Faulty, &rdev->flags)))
rdev = NULL;
if (rrdev && (test_bit(Faulty, &rrdev->flags)))
--
2.39.2
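Editor's note: the calling convention of the new helper (return 'rdev',
hand back 'rrdev' through an output pointer) can be tried out in userspace.
A minimal sketch, assuming plain pointer loads in place of rcu_dereference()
and smp_mb(); struct fake_rdev, struct fake_mirror and get_rdev_and_rrdev()
are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct md_rdev and struct raid10_info. */
struct fake_rdev {
	int id;
};

struct fake_mirror {
	struct fake_rdev *rdev;
	struct fake_rdev *replacement;
};

/*
 * Userspace model of the helper's shape: returns rdev and stores the
 * replacement in *prrdev, or NULL there if both name the same device.
 */
static struct fake_rdev *get_rdev_and_rrdev(struct fake_mirror *mirror,
					    struct fake_rdev **prrdev)
{
	struct fake_rdev *rdev, *rrdev;

	assert(mirror != NULL && prrdev != NULL);

	/* The kernel uses rcu_dereference() here, followed by smp_mb(). */
	rrdev = mirror->replacement;
	rdev = mirror->rdev;
	if (rdev == rrdev)
		rrdev = NULL;

	*prrdev = rrdev;
	return rdev;
}
```

A caller then needs a single line per slot, in the same form as the
converted raid10_write_request() loop:
rdev = get_rdev_and_rrdev(&mirror, &rrdev);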
* [PATCH v2 3/3] md/raid10: use dereference_rdev_and_rrdev() to get devices
2023-07-01 8:05 [PATCH v2 0/3] raid10 bugfix linan666
2023-07-01 8:05 ` [PATCH v2 1/3] md/raid10: check replacement and rdev to prevent submit the same io twice linan666
2023-07-01 8:05 ` [PATCH v2 2/3] md/raid10: factor out dereference_rdev_and_rrdev() linan666
@ 2023-07-01 8:05 ` linan666
2023-07-07 9:14 ` [PATCH v2 0/3] raid10 bugfix Song Liu
3 siblings, 0 replies; 5+ messages in thread
From: linan666 @ 2023-07-01 8:05 UTC (permalink / raw)
To: song, guoqing.jiang, xni, colyli
Cc: linux-raid, linux-kernel, linan122, yukuai3, yi.zhang, houtao1,
yangerkun
From: Li Nan <linan122@huawei.com>
Commit 2ae6aaf76912 ("md/raid10: fix io loss while replacement replace
rdev") reads replacement first to prevent io loss. However, the same
issue exists in wait_blocked_dev() and raid10_handle_discard(). Fix it by
using dereference_rdev_and_rrdev() to get the devices.
Fixes: d30588b2731f ("md/raid10: improve raid10 discard request")
Fixes: f2e7e269a752 ("md/raid10: pull the code that wait for blocked dev into one function")
Signed-off-by: Li Nan <linan122@huawei.com>
---
drivers/md/raid10.c | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index a6c3806be903..cbaec6fce1d7 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1375,11 +1375,9 @@ static void wait_blocked_dev(struct mddev *mddev, struct r10bio *r10_bio)
blocked_rdev = NULL;
rcu_read_lock();
for (i = 0; i < conf->copies; i++) {
- struct md_rdev *rdev = rcu_dereference(conf->mirrors[i].rdev);
- struct md_rdev *rrdev = rcu_dereference(
- conf->mirrors[i].replacement);
- if (rdev == rrdev)
- rrdev = NULL;
+ struct md_rdev *rdev, *rrdev;
+
+ rdev = dereference_rdev_and_rrdev(&conf->mirrors[i], &rrdev);
if (rdev && unlikely(test_bit(Blocked, &rdev->flags))) {
atomic_inc(&rdev->nr_pending);
blocked_rdev = rdev;
@@ -1815,15 +1813,12 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
*/
rcu_read_lock();
for (disk = 0; disk < geo->raid_disks; disk++) {
- struct md_rdev *rdev = rcu_dereference(conf->mirrors[disk].rdev);
- struct md_rdev *rrdev = rcu_dereference(
- conf->mirrors[disk].replacement);
+ struct md_rdev *rdev, *rrdev;
+ rdev = dereference_rdev_and_rrdev(&conf->mirrors[disk], &rrdev);
r10_bio->devs[disk].bio = NULL;
r10_bio->devs[disk].repl_bio = NULL;
- if (rdev == rrdev)
- rrdev = NULL;
if (rdev && (test_bit(Faulty, &rdev->flags)))
rdev = NULL;
if (rrdev && (test_bit(Faulty, &rrdev->flags)))
--
2.39.2
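Editor's note: why the read order matters can be checked by enumerating
interleavings. The sketch below rests on two assumptions: that the writer
first copies 'replacement' into an already-cleared 'rdev' and then clears
'replacement', and that execution is sequentially consistent (the kernel
needs smp_mb() to obtain this ordering on weakly ordered CPUs). struct slot
and reader_can_see_neither() are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one mirror slot; strings stand in for devices. */
struct slot {
	const char *rdev;
	const char *replacement;
};

/*
 * Try every interleaving of two writer steps (rdev = replacement, then
 * replacement = NULL) with two reader loads. Return true if some
 * interleaving lets the reader observe both pointers as NULL.
 */
static bool reader_can_see_neither(bool replacement_first)
{
	/* i and j are the time slots, out of four, used by the reader. */
	for (int i = 0; i < 4; i++) {
		for (int j = i + 1; j < 4; j++) {
			struct slot s = { NULL, "R" };	/* rdev just cleared */
			const char *seen_rdev = NULL, *seen_repl = NULL;
			int wstep = 0, rstep = 0;

			for (int t = 0; t < 4; t++) {
				if (t == i || t == j) {
					bool read_repl =
						(rstep == 0) == replacement_first;

					if (read_repl)
						seen_repl = s.replacement;
					else
						seen_rdev = s.rdev;
					rstep++;
				} else {
					if (wstep == 0)
						s.rdev = s.replacement;
					else
						s.replacement = NULL;
					wstep++;
				}
			}
			assert(wstep == 2 && rstep == 2);
			if (!seen_rdev && !seen_repl)
				return true;
		}
	}
	return false;
}
```

In this model, loading 'rdev' first admits a schedule in which both loads
return NULL, which is the io loss case. Loading 'replacement' first does
not: a NULL 'replacement' implies the writer has already stored the device
into 'rdev'.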
* Re: [PATCH v2 0/3] raid10 bugfix
2023-07-01 8:05 [PATCH v2 0/3] raid10 bugfix linan666
` (2 preceding siblings ...)
2023-07-01 8:05 ` [PATCH v2 3/3] md/raid10: use dereference_rdev_and_rrdev() to get devices linan666
@ 2023-07-07 9:14 ` Song Liu
3 siblings, 0 replies; 5+ messages in thread
From: Song Liu @ 2023-07-07 9:14 UTC (permalink / raw)
To: linan666
Cc: guoqing.jiang, xni, colyli, linux-raid, linux-kernel, linan122,
yukuai3, yi.zhang, houtao1, yangerkun
On Sat, Jul 1, 2023 at 4:06 PM <linan666@huaweicloud.com> wrote:
>
> From: Li Nan <linan122@huawei.com>
>
> Changes in v2:
> - patch 2/3: rename the function to dereference_rdev_and_rrdev() and
> return rdev to reduce the number of output arguments.
>
> Li Nan (3):
> md/raid10: check replacement and rdev to prevent submit the same io
> twice
> md/raid10: factor out dereference_rdev_and_rrdev()
> md/raid10: use dereference_rdev_and_rrdev() to get devices
Applied to md-next. Thanks!
Song
>
> drivers/md/raid10.c | 42 +++++++++++++++++++++++++-----------------
> 1 file changed, 25 insertions(+), 17 deletions(-)
>
> --
> 2.39.2
>