From: NeilBrown <neilb@suse.de>
To: Brassow Jonathan <jbrassow@redhat.com>
Cc: Eivind Sarto <eivindsarto@gmail.com>, linux-raid@vger.kernel.org
Subject: Re: raid1 data corruption during resync
Date: Wed, 3 Sep 2014 11:31:51 +1000 [thread overview]
Message-ID: <20140903113151.4099292b@notabene.brown> (raw)
In-Reply-To: <F9B1D613-79EF-488E-8D41-4C1BDBC6D4BE@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 7491 bytes --]
On Tue, 2 Sep 2014 20:18:06 -0500 Brassow Jonathan <jbrassow@redhat.com>
wrote:
>
> On Sep 2, 2014, at 6:55 PM, NeilBrown wrote:
>
> > On Tue, 2 Sep 2014 15:07:26 -0700 Eivind Sarto <eivindsarto@gmail.com> wrote:
> >
> >>
> >> On Sep 2, 2014, at 12:24 PM, Brassow Jonathan <jbrassow@redhat.com> wrote:
> >>
> >>>
> >>> On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote:
> >>>
> >>>> I am seeing occasional data corruption during raid1 resync.
> >>>> Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug.
> >>>> Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this.
> >>>> It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it.
> >>>> Once I validate that the fix indeed works, I will post a proper patch.
> >>>> Do you have any feedback?
> >>>>
> >>>> — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700
> >>>> +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700
> >>>> @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf
> >>>> * handling.
> >>>> */
> >>>> wait_event_lock_irq(conf->wait_barrier,
> >>>> - !conf->array_frozen &&
> >>>> + !conf->array_frozen && !conf->nr_pending &&
> >>>> conf->barrier < RESYNC_DEPTH &&
> >>>> (conf->start_next_window >=
> >>>> conf->next_resync + RESYNC_SECTORS),
> >>>
> >>> This patch does not work - at least, it doesn't fix the issues I'm seeing. My system hangs (in various places, like the resync thread) after commit 79ef3a8. When testing this patch, I also added some code to dm-raid.c to allow me to print-out some of the variables when I encounter a problem. After applying this patch and printing the variables, I see:
> >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: start_next_window = 12288
> >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: current_window_requests = -46
> >>> 5257
> >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: next_window_requests = -11562
> >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_pending = 0
> >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_waiting = 0
> >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_queued = 0
> >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: barrier = 1
> >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: array_frozen = 0
> >>>
> >>> Some of those values look pretty bizarre to me and suggest the accounting is pretty messed up.
> >>>
> >>> brassow
> >>>
> >>
> >> After reviewing commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 I notice that wait_barrier() will now only exclude writes. User reads are not excluded even if the fall within the resync window.
> >> The old implementation used to exclude both reads and writes while resync-IO is active.
> >> Could this be a cause of data corruption?
> >>
> >
> > Could be.
> > From read_balance:
> >
> > if (conf->mddev->recovery_cp < MaxSector &&
> > (this_sector + sectors >= conf->next_resync))
> > choose_first = 1;
> > else
> > choose_first = 0;
> >
> > This used to be safe because a read immediately before next_resync would wait
> > until all resync requests completed. But now that read requests don't block
> > it isn't safe.
> > Probably best to make this:
> >
> > choose_first = (conf->mddev->recovery_cp < this_sector + sectors);
> >
> > Can you test that?
>
> Unfortunately, I doubt very much that that will solve the issue that I am seeing. (Although, I'd welcome this fix also.)
>
> I have predominantly been testing RHEL kernels, but I've seen this with the latest upstream kernel as well. When doing fault testing, like failing a drive while the resync is happening, the processes can become hung. Here's one example, but they vary:
>
> INFO: task mdX_resync:11334 blocked for more than 120 seconds.
> Not tainted 2.6.32-497.el6.x86_64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> mdX_resync D 0000000000000000 0 11334 2 0x00000080
> ffff8801fada5be0 0000000000000046 ffff88021a3bcc00 0000000300000001
> ffff880218b92200 ffff880218668f50 0000000000000282 0000000000000003
> 0000000000000001 0000000000000282 ffff88021bfb45f8 ffff8801fada5fd8
> Call Trace:
> [<ffffffff8140e439>] ? md_wakeup_thread+0x39/0x70
> [<ffffffffa02b81dd>] sync_request+0x26d/0xb90 [raid1]
> [<ffffffff81064b90>] ? default_wake_function+0x0/0x20
> [<ffffffff810577e9>] ? __wake_up_common+0x59/0x90
> [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
> [<ffffffff8105bd23>] ? __wake_up+0x53/0x70
> [<ffffffff81414a47>] md_do_sync+0x6c7/0xd20
> [<ffffffff814154f5>] md_thread+0x115/0x150
> [<ffffffff814153e0>] ? md_thread+0x0/0x150
> [<ffffffff8109e66e>] kthread+0x9e/0xc0
> [<ffffffff8100c20a>] child_rip+0xa/0x20
> [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
> [<ffffffff8100c200>] ? child_rip+0x0/0x20
>
>
> and...
>
>
> INFO: task kjournald:10732 blocked for more than 120 seconds.
> Not tainted 2.6.32-497.el6.x86_64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kjournald D 0000000000000008 0 10732 2 0x00000080
> ffff8801d71238b0 0000000000000046 0000000000000000 0000000300000001
> ffff880218b92200 ffff880218668f50 000003783fd7424b 0000000000000003
> 0000000000000001 000000010035a163 ffff880219135af8 ffff8801d7123fd8
> Call Trace:
> [<ffffffffa02b463f>] wait_barrier+0xcf/0x280 [raid1]
> [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
> [<ffffffffa02b71bf>] make_request+0x8f/0xe40 [raid1]
> [<ffffffff81126af5>] ? mempool_alloc_slab+0x15/0x20
> [<ffffffff81126c93>] ? mempool_alloc+0x63/0x140
> [<ffffffff811c9d6b>] ? bio_alloc_bioset+0x5b/0xf0
> [<ffffffffa02cf018>] raid_map+0x18/0x20 [dm_raid]
>
> Bisecting has revealed that the commit that is the subject of this thread is the root of the problem. It does take a long time to reproduce the issue with the scripts that I have, but if there is info I can get you, let me know.
>
> brassow
Thanks for the details Jon. I don't suppose you have a stack trace from an
upstream kernel? Not that I doubt you have one, and I suspect it is very
similar. But I've learnt not to trust "suspect" and "presume" and "no doubt".
I'll have a look anyway.
Thanks,
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
next prev parent reply other threads:[~2014-09-03 1:31 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-08-29 19:29 raid1 data corruption during resync Eivind Sarto
2014-09-02 14:10 ` Brassow Jonathan
2014-09-02 16:43 ` Eivind Sarto
2014-09-02 17:04 ` Eivind Sarto
2014-09-02 16:59 ` Brassow Jonathan
2014-09-02 19:24 ` Brassow Jonathan
2014-09-02 22:07 ` Eivind Sarto
2014-09-02 22:14 ` Brassow Jonathan
2014-09-02 23:55 ` NeilBrown
2014-09-03 0:48 ` Eivind Sarto
2014-09-03 1:18 ` Brassow Jonathan
2014-09-03 1:31 ` NeilBrown [this message]
2014-09-03 1:45 ` Brassow Jonathan
2014-09-03 21:39 ` Brassow Jonathan
2014-09-04 5:28 ` NeilBrown
[not found] <D4FE2D75-4208-48C9-A4D0-432F092E5AE9@redhat.com>
2014-09-08 15:52 ` Fwd: " Brassow Jonathan
2014-09-09 1:08 ` NeilBrown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140903113151.4099292b@notabene.brown \
--to=neilb@suse.de \
--cc=eivindsarto@gmail.com \
--cc=jbrassow@redhat.com \
--cc=linux-raid@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).