From: NeilBrown <neilb@suse.de>
To: Brassow Jonathan <jbrassow@redhat.com>
Cc: Eivind Sarto <eivindsarto@gmail.com>,
linux-raid@vger.kernel.org, majianpeng <majianpeng@gmail.com>
Subject: Re: [PATCH 0/5] Fixes for RAID1 resync
Date: Thu, 18 Sep 2014 17:48:46 +1000
Message-ID: <20140918174846.6a445eaf@notabene.brown>
In-Reply-To: <2C41CCF8-8B5C-486F-AE43-42D10EBAA0A5@redhat.com>
On Tue, 16 Sep 2014 11:31:26 -0500 Brassow Jonathan <jbrassow@redhat.com>
wrote:
>
> On Sep 14, 2014, at 10:30 PM, NeilBrown wrote:
>
> > On Thu, 11 Sep 2014 12:12:01 -0500 Brassow Jonathan <jbrassow@redhat.com>
> > wrote:
> >
> >>
> >> On Sep 10, 2014, at 10:45 PM, Brassow Jonathan wrote:
> >>
> >>>
> >>> On Sep 10, 2014, at 1:20 AM, NeilBrown wrote:
> >>>
> >>>>
> >>>> Jon: could you test with these patches on top of what you
> >>>> have just in case something happens to fix the problem without
> >>>> me realising it?
> >>>
> >>> I'm on it. The test is running. I'll know later tomorrow.
> >>>
> >>> brassow
> >>
> >> The test is still failing from here. I grabbed 3.17.0-rc4, added the 5 patches, and got the attached backtraces when testing. As I said, the hangs are not exactly the same. This set shows the mdX_raid1 thread in the middle of handling a read failure.
> >
> > Thanks.
> > mdX_raid1 is blocked in freeze_array.
> > That could be caused by conf->nr_pending not aligning properly with
> > conf->nr_queued.
> >
> > Both normal IO and resync IO can be retried with reschedule_retry()
> > and so be counted into ->nr_queued, but only normal IO gets counted in
> > ->nr_pending.
> >
> > Previously we could only possibly have one or the other, and when handling
> > a read failure it could only be normal IO. But now that the two types can
> > interleave, we can have both normal and resync IO requests queued, so we need
> > to count them both in nr_pending.
> >
> > So the following patch might help.
> >
> > How complicated are your test scripts? Could you send them to me so I can
> > try too?
> >
> > Thanks,
> > NeilBrown
> >
> > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > index 888dbdfb6986..6a9c73435eb8 100644
> > --- a/drivers/md/raid1.c
> > +++ b/drivers/md/raid1.c
> > @@ -856,6 +856,7 @@ static void raise_barrier(struct r1conf *conf, sector_t sector_nr)
> > conf->next_resync + RESYNC_SECTORS),
> > conf->resync_lock);
> >
> > + conf->nr_pending++;
> > spin_unlock_irq(&conf->resync_lock);
> > }
> >
> > @@ -865,6 +866,7 @@ static void lower_barrier(struct r1conf *conf)
> > BUG_ON(conf->barrier <= 0);
> > spin_lock_irqsave(&conf->resync_lock, flags);
> > conf->barrier--;
> > + conf->nr_pending--;
> > spin_unlock_irqrestore(&conf->resync_lock, flags);
> > wake_up(&conf->wait_barrier);
> > }
>
> No luck, it is failing faster than before.
>
> I haven't looked into this myself, but the dm-raid1.c code makes use of dm-region-hash.c which coordinates recovery and nominal I/O in a way that allows them to both occur in a simple, non-overlapping way. I'm not sure it would make sense to use that instead of this new approach. I have no idea how much effort that would be, but I could have someone look into it at some point if you think it might be interesting.
>
Hi Jon,
I can see the appeal of using known-working code, but there is every chance
that we would break it when plugging it into md ;-)

I've found another bug.... it is a very subtle one and it has been around
since before the patch you bisected to, so it probably isn't your bug.
It also only affects arrays with bad-blocks listed. The patch is below
but I very much doubt testing will show any change...

I'll keep looking..... oh, found one. This one looks more convincing.
If memory is short, make_request() will allocate an r1bio from the mempool
rather than from the slab. That r1bio won't have just been zeroed.
This is mostly OK as we initialise all the fields that aren't left in
a clean state ... except ->start_next_window.
We initialise that for write requests, but not for read.
So when we use a mempool-allocated r1bio that was previously used for
a write and had ->start_next_window set, and is now used for a read,
then things will go wrong.

So this patch definitely is worth testing.
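
To make the mempool problem concrete, here is a small user-space model.
It is not kernel code and not part of the patch. Everything in it is an
assumption made for illustration: struct toy_r1bio, the one-slot reserve
standing in for the mempool, and the toy_* helpers are all hypothetical
names. It only tries to show how a recycled, non-zeroed object can carry
->start_next_window from a write into a later read unless the read path
clears it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct r1bio: only the fields this model needs. */
struct toy_r1bio {
	unsigned long long sector;
	unsigned long long start_next_window;
	int read_disk;
};

/* One-slot reserve standing in for the mempool (an assumption of this model). */
static struct toy_r1bio *reserve;

/* When memory is "short" and the reserve holds an object, hand it back
 * exactly as it was left.  Otherwise return freshly zeroed memory. */
static struct toy_r1bio *toy_alloc(int memory_is_short)
{
	struct toy_r1bio *b;

	if (memory_is_short && reserve) {
		b = reserve;
		reserve = NULL;
		return b;		/* not zeroed: old field values survive */
	}
	b = calloc(1, sizeof(*b));
	assert(b != NULL);
	return b;
}

/* Refill the reserve if it is empty, without clearing the object. */
static void toy_free(struct toy_r1bio *b)
{
	if (!reserve)
		reserve = b;
	else
		free(b);
}

static void toy_setup_write(struct toy_r1bio *b, unsigned long long sector,
			    unsigned long long window)
{
	b->sector = sector;
	b->read_disk = -1;
	b->start_next_window = window;	/* the write path sets this */
}

static void toy_setup_read(struct toy_r1bio *b, unsigned long long sector,
			   int rdisk, int patched)
{
	b->sector = sector;
	b->read_disk = rdisk;
	if (patched)
		b->start_next_window = 0;	/* models the one-line fix */
}

/* Use an object for a write, recycle it for a read under memory pressure,
 * and report what the read sees in start_next_window. */
unsigned long long stale_window_after_reuse(int patched)
{
	struct toy_r1bio *w, *r;
	unsigned long long seen;

	w = toy_alloc(0);
	toy_setup_write(w, 1024, 4096);
	toy_free(w);			/* parked in the reserve, still dirty */

	r = toy_alloc(1);		/* memory is short: recycled object */
	toy_setup_read(r, 2048, 0, patched);
	seen = r->start_next_window;

	free(r);
	return seen;
}
```

In this model stale_window_after_reuse(0) returns the 4096 left over from
the write, while stale_window_after_reuse(1) returns 0.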
Thanks for your continued patience in testing!!!
Thanks,
NeilBrown
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index a95f9e179e6f..7187d9b8431f 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1185,6 +1185,7 @@ read_again:
 				   atomic_read(&bitmap->behind_writes) == 0);
 		}
 		r1_bio->read_disk = rdisk;
+		r1_bio->start_next_window = 0;
 		read_bio = bio_clone_mddev(bio, GFP_NOIO, mddev);
 		bio_trim(read_bio, r1_bio->sector - bio->bi_iter.bi_sector,
@@ -1444,6 +1445,7 @@ read_again:
 		r1_bio->state = 0;
 		r1_bio->mddev = mddev;
 		r1_bio->sector = bio->bi_iter.bi_sector + sectors_handled;
+		start_next_window = wait_barrier(conf, bio);
 		goto retry_write;
 	}
Thread overview: 13+ messages
2014-09-10 6:20 [PATCH 0/5] Fixes for RAID1 resync NeilBrown
2014-09-10 6:20 ` [PATCH 2/5] md/raid1: clean up request counts properly in close_sync() NeilBrown
2014-09-10 6:20 ` [PATCH 3/5] md/raid1: make sure resync waits for conflicting writes to complete NeilBrown
2014-09-10 6:20 ` [PATCH 1/5] md/raid1: be more cautious where we read-balance during resync NeilBrown
2014-09-10 6:20 ` [PATCH 4/5] md/raid1: Don't use next_resync to determine how far resync has progressed NeilBrown
2014-09-10 6:20 ` [PATCH 5/5] md/raid1: update next_resync under resync_lock NeilBrown
2014-09-11 3:45 ` [PATCH 0/5] Fixes for RAID1 resync Brassow Jonathan
2014-09-11 17:12 ` Brassow Jonathan
2014-09-15 3:30 ` NeilBrown
2014-09-16 16:31 ` Brassow Jonathan
2014-09-18 7:48 ` NeilBrown [this message]
2014-09-24 4:25 ` Brassow Jonathan
2014-09-24 4:49 ` NeilBrown