From: Guoqing Jiang <guoqing.jiang@linux.dev>
To: Xiao Ni <xni@redhat.com>, Song Liu <song@kernel.org>
Cc: linux-raid <linux-raid@vger.kernel.org>,
Heinz Mauelshagen <heinzm@redhat.com>,
Nigel Croxon <ncroxon@redhat.com>
Subject: Re: The read data is wrong from raid5 when recovery happens
Date: Fri, 26 May 2023 11:09:24 +0800
Message-ID: <ebe7fa31-2e9a-74da-bbbd-3d5238590a7c@linux.dev>
In-Reply-To: <CALTww28aV5CGXQAu46Rkc=fG1jK=ARzCT8VGoVyje8kQdqEXMg@mail.gmail.com>
On 5/26/23 09:49, Xiao Ni wrote:
> Hi all
>
> We found a problem recently. The read data is wrong when recovery happens.
> Now we've found it's introduced by patch 10764815f (md: add io accounting
> for raid0 and raid5). I can reproduce this 100%. This problem exists in
> upstream. The test steps are like this:
>
> 1. mdadm -CR $devname -l5 -n4 /dev/sd[b-e] --force --assume-clean
> 2. mkfs.ext4 -F $devname
> 3. mount $devname $mount_point
> 4. mdadm --incremental --fail sdd
> 5. dd if=/dev/zero of=/tmp/pythontest/file1 bs=1M count=100000
> status=progress
> 6. mdadm /dev/md126 --add /dev/sdd
> 7. create 31 processes that writes and reads. It compares the content with
> md5sum. The test will go on until the recovery stops
> 8. wait for about 10 minutes, we can see some processes report checksum is
> wrong. But if it re-read the data again, the checksum will be good.
>
> I tried to narrow this problem like this:
>
> - md_account_bio(mddev, &bi);
> + if (rw == WRITE)
> + md_account_bio(mddev, &bi);
> If it only do account for write requests, the problem can disappear.
>
> - if (rw == READ && mddev->degraded == 0 &&
> - mddev->reshape_position == MaxSector) {
> - bi = chunk_aligned_read(mddev, bi);
> - if (!bi)
> - return true;
> - }
> + //if (rw == READ && mddev->degraded == 0 &&
> + // mddev->reshape_position == MaxSector) {
> + // bi = chunk_aligned_read(mddev, bi);
> + // if (!bi)
> + // return true;
> + //}
>
> if (unlikely(bio_op(bi) == REQ_OP_DISCARD)) {
> make_discard_request(mddev, bi);
> @@ -6180,7 +6180,8 @@ static bool raid5_make_request(struct mddev *mddev,
> struct bio * bi)
> md_write_end(mddev);
> return true;
> }
> - md_account_bio(mddev, &bi);
> + if (rw == READ)
> + md_account_bio(mddev, &bi);
>
> I comment the chunk_aligned_read out and only account for read requests,
> this problem can be reproduced.

After a quick look, raid5_read_one_chunk clones the bio by itself, so there
is no need to do it again for the chunk-aligned read case. Could you please
try this?

--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6120,6 +6120,7 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
 	const int rw = bio_data_dir(bi);
 	enum stripe_result res;
 	int s, stripe_cnt;
+	bool account_bio = true;
 
 	if (unlikely(bi->bi_opf & REQ_PREFLUSH)) {
 		int ret = log_handle_flush_request(conf, bi);
@@ -6148,6 +6149,7 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
 	if (rw == READ && mddev->degraded == 0 &&
 	    mddev->reshape_position == MaxSector) {
 		bi = chunk_aligned_read(mddev, bi);
+		account_bio = false;
 		if (!bi)
 			return true;
 	}
@@ -6180,7 +6182,8 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
 		md_write_end(mddev);
 		return true;
 	}
-	md_account_bio(mddev, &bi);
+	if (account_bio)
+		md_account_bio(mddev, &bi);

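
For anyone who wants to follow the control flow without a kernel tree at
hand, below is a rough userspace sketch of what the patch above does with
the flag. Everything named toy_* is made up for illustration and is not a
kernel API. The sketch assumes that the aligned-read path does its own
clone and accounting, and it only counts accounting calls; it does not
model recovery or the data corruption itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, illustration only. */
struct toy_bio {
	bool is_read;        /* read or write request */
	bool chunk_aligned;  /* true if the read fits inside one chunk */
	int  account_calls;  /* how many times this bio was accounted */
};

struct toy_mddev {
	int  degraded;       /* number of missing members */
	bool reshaping;      /* stands in for reshape_position != MaxSector */
};

/* Stands in for md_account_bio(): here it only bumps a counter. */
static void toy_account_bio(struct toy_bio *bi)
{
	assert(bi != NULL);
	bi->account_calls++;
}

/*
 * Stands in for chunk_aligned_read(): assumed to clone and account the
 * bio on its own. Returns NULL when it has taken over the whole request,
 * otherwise hands the bio back to the caller.
 */
static struct toy_bio *toy_chunk_aligned_read(struct toy_bio *bi)
{
	if (bi->chunk_aligned) {
		toy_account_bio(bi);
		return NULL;
	}
	return bi;
}

/* Mirrors the flag handling of the proposed patch, nothing more. */
static bool toy_make_request(struct toy_mddev *mddev, struct toy_bio *bi)
{
	bool account_bio = true;

	if (bi->is_read && mddev->degraded == 0 && !mddev->reshaping) {
		bi = toy_chunk_aligned_read(bi);
		/* Cleared before the NULL check, exactly as in the patch. */
		account_bio = false;
		if (!bi)
			return true;
	}

	if (account_bio)
		toy_account_bio(bi);
	return true;
}
```

In this model a chunk-aligned read on a healthy array is accounted once,
by the aligned path only, and writes or reads on a degraded array are
accounted once by the common path. A read that enters the aligned path but
is handed back is not accounted at all, because the flag is cleared before
the NULL check.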
Thanks,
Guoqing