linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.de>
To: Alexander Lyakas <alex.bolshoy@gmail.com>
Cc: linux-raid <linux-raid@vger.kernel.org>,
	Shyam Kaushik <shyam@zadarastorage.com>,
	yair@zadarastorage.com
Subject: Re: BUG - raid 1 deadlock on handle_read_error / wait_barrier
Date: Tue, 4 Jun 2013 11:49:24 +1000	[thread overview]
Message-ID: <20130604114924.37e4573c@notabene.brown> (raw)
In-Reply-To: <CAGRgLy4zTwopPS-cVwT1RKBr8pE7f3n-X3YLyFpktooQtTYGuQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2948 bytes --]

On Sun, 2 Jun 2013 15:43:41 +0300 Alexander Lyakas <alex.bolshoy@gmail.com>
wrote:

> Hello Neil,
> I believe I have found what is causing the deadlock. It happens in two flavors:
> 
> 1)
> # raid1d() is called, and conf->pending_bio_list is non-empty at this point
> # raid1d() calls md_check_recovery(), which eventually calls
> raid1_add_disk(), which calls raise_barrier()
> # Now raise_barrier will wait for conf->nr_pending to become 0, but it
> cannot become 0, because there are bios sitting in
> conf->pending_bio_list, which nobody will flush, because raid1d is the
> one supposed to call flush_pending_writes(), either directly or via
> handle_read_error. But it is stuck in raise_barrier.
> 
> 2)
> # raid1_add_disk() calls raise_barrier(), and waits for
> conf->nr_pending to become 0, as before
> # new WRITE comes and calls wait_barrier(), but this thread has a
> non-empty current->bio_list
> # In this case, the code allows the WRITE to go through
> wait_barrier(), and trigger WRITEs to mirror legs, but these WRITEs
> again end up in conf->pending_bio_list (either via raid1_unplug or
> directly). But nobody will flush conf->pending_bio_list, because
> raid1d is stuck in raise_barrier.
> 
> Previously, for example in kernel 3.2, raid1_add_disk did not call
> raise_barrier, so this problem did not happen.
> 
> Attached is a reproduction with some prints that I added to
> raise_barrier and wait_barrier (their code also attached). It
> demonstrates case 2. It shows that once raise_barrier got called,
> conf->nr_pending drops down, until it equals the number of
> wait_barrier calls, that slipped through because of non-empty
> current->bio_list. And at this point, this array hangs.
> 
> Can you please comment on how to fix this problem. It looks like a
> real deadlock.
> We can perhaps call md_check_recovery() after flush_pending_writes(),
> but this only makes the window smaller, not closes it entirely. But it
> looks like we really should not be calling raise_barrier from raid1d.
> 
> Thanks,
> Alex.

Hi Alex,
 thanks for the analysis.

Does the following patch fix it?  It makes raise_barrier  more  like
freeze_array().
If not, could you try making the same change to the first
wait_event_lock_irq in raise_barrier?

Thanks.
NeilBrown



diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 328fa2d..d34f892 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -828,9 +828,9 @@ static void raise_barrier(struct r1conf *conf)
 	conf->barrier++;
 
 	/* Now wait for all pending IO to complete */
-	wait_event_lock_irq(conf->wait_barrier,
-			    !conf->nr_pending && conf->barrier < RESYNC_DEPTH,
-			    conf->resync_lock);
+	wait_event_lock_irq_cmd(conf->wait_barrier,
+				!conf->nr_pending && conf->barrier < RESYNC_DEPTH,
+				conf->resync_lock, flush_pending_writes(conf));
 
 	spin_unlock_irq(&conf->resync_lock);
 }

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

  reply	other threads:[~2013-06-04  1:49 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-21 22:58 BUG - raid 1 deadlock on handle_read_error / wait_barrier Tregaron Bayly
2013-02-22  3:44 ` Joe Lawrence
2013-02-22 11:52 ` majianpeng
2013-02-22 16:03   ` Tregaron Bayly
2013-02-22 18:14     ` Joe Lawrence
2013-02-24 22:43 ` NeilBrown
2013-02-25  0:04   ` NeilBrown
2013-02-25 16:11     ` Tregaron Bayly
2013-02-25 22:54       ` NeilBrown
2013-02-26 14:09       ` Joe Lawrence
2013-05-16 14:07         ` Alexander Lyakas
2013-05-20  7:17           ` NeilBrown
2013-05-30 14:30             ` Alexander Lyakas
2013-06-02 12:43               ` Alexander Lyakas
2013-06-04  1:49                 ` NeilBrown [this message]
2013-06-04  9:52                   ` Alexander Lyakas
2013-06-06 15:00                   ` Tregaron Bayly
2013-06-08  9:45                     ` Alexander Lyakas
2013-06-12  0:42                       ` NeilBrown
2013-06-12  1:30                     ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130604114924.37e4573c@notabene.brown \
    --to=neilb@suse.de \
    --cc=alex.bolshoy@gmail.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=shyam@zadarastorage.com \
    --cc=yair@zadarastorage.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).