linux-raid.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.de>
To: Brassow Jonathan <jbrassow@redhat.com>
Cc: Eivind Sarto <eivindsarto@gmail.com>,
	linux-raid@vger.kernel.org, majianpeng <majianpeng@gmail.com>
Subject: Re: [PATCH 0/5] Fixes for RAID1 resync
Date: Mon, 15 Sep 2014 13:30:06 +1000
Message-ID: <20140915133006.14e57085@notabene.brown>
In-Reply-To: <8697EC47-F648-4E66-B37C-4A2DC3030696@redhat.com>


On Thu, 11 Sep 2014 12:12:01 -0500 Brassow Jonathan <jbrassow@redhat.com>
wrote:

> 
> On Sep 10, 2014, at 10:45 PM, Brassow Jonathan wrote:
> 
> > 
> > On Sep 10, 2014, at 1:20 AM, NeilBrown wrote:
> > 
> >> 
> >> Jon: could you test with these patches on top of what you
> >> have just in case something happens to fix the problem without
> >> me realising it?
> > 
> > I'm on it.  The test is running.  I'll know later tomorrow.
> > 
> > brassow
> 
> The test is still failing from here.  I grabbed 3.17.0-rc4, added the 5 patches, and got the attached backtraces when testing.  As I said, the hangs are not exactly the same.  This set shows the mdX_raid1 thread in the middle of handling a read failure.

Thanks.
mdX_raid1 is blocked in freeze_array.
That could be caused by conf->nr_pending not aligning properly with
conf->nr_queued.

Both normal IO and resync IO can be retried with reschedule_retry()
and so be counted into ->nr_queued, but only normal IO gets counted in
->nr_pending.

Previously we could only possibly have one or the other, and when handling
a read failure it could only be normal IO.  But now that the two types can
interleave, we can have both normal and resync IO requests queued, so we need
to count them both in nr_pending.
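
To make the counting concrete, here is a minimal user-space sketch (not
kernel code).  It assumes that freeze_array(conf, extra) waits until
nr_pending == nr_queued + extra; that condition is not quoted in this
message, so treat it as an assumption.  The toy_* names are hypothetical
and exist only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two counters in struct r1conf. */
struct toy_conf {
	int nr_pending;	/* requests counted as in flight */
	int nr_queued;	/* requests parked on the retry list */
};

/*
 * Assumed freeze condition: every pending request is either queued for
 * retry or is one of the `extra` requests the caller is handling.
 */
static bool toy_can_freeze(const struct toy_conf *conf, int extra)
{
	return conf->nr_pending == conf->nr_queued + extra;
}

/*
 * Models reschedule_retry(): any request, normal or resync, is counted
 * in nr_queued.
 */
static void toy_reschedule_retry(struct toy_conf *conf)
{
	conf->nr_queued++;
}

/*
 * Models counting a request in nr_pending (normal IO only, unless
 * resync IO is counted there as well).
 */
static void toy_count_pending(struct toy_conf *conf)
{
	conf->nr_pending++;
}
```

With one failed normal read being handled (nr_pending = 1, nr_queued = 0,
extra = 1) the check passes.  If a resync request is then queued for retry
without being counted in nr_pending, the check becomes 1 == 2 and never
passes; counting it in nr_pending as well restores the balance (2 == 2).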

So the following patch might help.

How complicated are your test scripts?  Could you send them to me so I can
try too?

Thanks,
NeilBrown

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 888dbdfb6986..6a9c73435eb8 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -856,6 +856,7 @@ static void raise_barrier(struct r1conf *conf, sector_t sector_nr)
 			     conf->next_resync + RESYNC_SECTORS),
 			    conf->resync_lock);
 
+	conf->nr_pending++;
 	spin_unlock_irq(&conf->resync_lock);
 }
 
@@ -865,6 +866,7 @@ static void lower_barrier(struct r1conf *conf)
 	BUG_ON(conf->barrier <= 0);
 	spin_lock_irqsave(&conf->resync_lock, flags);
 	conf->barrier--;
+	conf->nr_pending--;
 	spin_unlock_irqrestore(&conf->resync_lock, flags);
 	wake_up(&conf->wait_barrier);
 }

Thread overview: 13+ messages
2014-09-10  6:20 [PATCH 0/5] Fixes for RAID1 resync NeilBrown
2014-09-10  6:20 ` [PATCH 2/5] md/raid1: clean up request counts properly in close_sync() NeilBrown
2014-09-10  6:20 ` [PATCH 4/5] md/raid1: Don't use next_resync to determine how far resync has progressed NeilBrown
2014-09-10  6:20 ` [PATCH 1/5] md/raid1: be more cautious where we read-balance during resync NeilBrown
2014-09-10  6:20 ` [PATCH 3/5] md/raid1: make sure resync waits for conflicting writes to complete NeilBrown
2014-09-10  6:20 ` [PATCH 5/5] md/raid1: update next_resync under resync_lock NeilBrown
2014-09-11  3:45 ` [PATCH 0/5] Fixes for RAID1 resync Brassow Jonathan
2014-09-11 17:12   ` Brassow Jonathan
2014-09-15  3:30     ` NeilBrown [this message]
2014-09-16 16:31       ` Brassow Jonathan
2014-09-18  7:48         ` NeilBrown
2014-09-24  4:25           ` Brassow Jonathan
2014-09-24  4:49             ` NeilBrown
