From: Brassow Jonathan <jbrassow@redhat.com>
To: Eivind Sarto <eivindsarto@gmail.com>
Cc: NeilBrown <neilb@suse.de>, linux-raid@vger.kernel.org
Subject: Re: raid1 data corruption during resync
Date: Tue, 2 Sep 2014 14:24:08 -0500
Message-ID: <EAC50175-42D9-43BA-ADE2-1A50BD0581CB@redhat.com>
In-Reply-To: <20A5228D-DD63-4A6C-B2C6-B0C38996E636@gmail.com>
On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote:
> I am seeing occasional data corruption during raid1 resync.
> Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug.
> Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this.
> It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it.
> Once I validate that the fix indeed works, I will post a proper patch.
> Do you have any feedback?
>
> --- drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700
> +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700
> @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf
>           * handling.
>           */
>          wait_event_lock_irq(conf->wait_barrier,
> -                            !conf->array_frozen &&
> +                            !conf->array_frozen && !conf->nr_pending &&
>                              conf->barrier < RESYNC_DEPTH &&
>                              (conf->start_next_window >=
>                               conf->next_resync + RESYNC_SECTORS),
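For anyone following along, the patched wait condition can be modelled in userspace roughly as below. This is only a sketch for reasoning about the predicate, not kernel code: struct barrier_model and barrier_may_proceed() are hypothetical names, and the two constants are assumed stand-ins for the definitions in drivers/md/raid1.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-ins for the constants defined in drivers/md/raid1.c. */
#define RESYNC_DEPTH   32
#define RESYNC_SECTORS 128

/* Hypothetical userspace mirror of the r1conf fields the condition reads. */
struct barrier_model {
    int array_frozen;
    int nr_pending;
    int barrier;
    uint64_t start_next_window;
    uint64_t next_resync;
};

/*
 * True when the wait condition from the proposed patch would let
 * raise_barrier() stop waiting. The !nr_pending term is the one the
 * patch adds; the other terms are already in the existing condition.
 */
static bool barrier_may_proceed(const struct barrier_model *c)
{
    return !c->array_frozen && !c->nr_pending &&
           c->barrier < RESYNC_DEPTH &&
           c->start_next_window >= c->next_resync + RESYNC_SECTORS;
}
```

With the extra term, a single pending request is enough to keep the resync thread waiting, whatever the window positions are.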
This patch does not work; at least, it doesn't fix the issues I'm seeing. My system hangs (in various places, such as the resync thread) after commit 79ef3a8. While testing this patch, I also added some code to dm-raid.c that prints some of the variables when I hit a problem. With the patch applied, the variables read:
Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: start_next_window = 12288
Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: current_window_requests = -465257
Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: next_window_requests = -11562
Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_pending = 0
Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_waiting = 0
Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_queued = 0
Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: barrier = 1
Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: array_frozen = 0
The negative current_window_requests and next_window_requests values look pretty bizarre to me and suggest the request accounting is badly off.
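To make the concern concrete: assuming these fields count in-flight requests (incremented when a request starts, decremented when it completes), none of them should ever drop below zero. A minimal userspace sketch of that sanity check follows. struct r1_counters and count_negative_counters() are hypothetical names; only the field names are taken from the debug output above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical snapshot of the counters printed in the debug output. */
struct r1_counters {
    long current_window_requests;
    long next_window_requests;
    long nr_pending;
    long nr_waiting;
    long nr_queued;
    long barrier;
};

/* Prints every counter that is negative and returns how many were found. */
static int count_negative_counters(const struct r1_counters *c)
{
    const struct {
        const char *name;
        long value;
    } fields[] = {
        { "current_window_requests", c->current_window_requests },
        { "next_window_requests",    c->next_window_requests },
        { "nr_pending",              c->nr_pending },
        { "nr_waiting",              c->nr_waiting },
        { "nr_queued",               c->nr_queued },
        { "barrier",                 c->barrier },
    };
    int bad = 0;

    for (size_t i = 0; i < sizeof(fields) / sizeof(fields[0]); i++) {
        if (fields[i].value < 0) {
            printf("%s = %ld is negative\n", fields[i].name, fields[i].value);
            bad++;
        }
    }
    return bad;
}
```

A check like this only flags the symptom. A negative count would point to a decrement that has no matching increment, or to a request being charged to one window and released from the other.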
brassow