From: NeilBrown <neilb@suse.de>
To: Chris Friesen <chris.friesen@genband.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: hung in raise_barrier() in raid1.c  -- any ideas?
Date: Fri, 21 Sep 2012 12:16:58 +1000
Message-ID: <20120921121658.4fc4e5aa@notabene.brown>
In-Reply-To: <505BA133.3000005@genband.com>


On Thu, 20 Sep 2012 17:05:23 -0600 Chris Friesen <chris.friesen@genband.com>
wrote:

> On 09/20/2012 03:27 PM, NeilBrown wrote:
> > On Thu, 20 Sep 2012 11:55:02 -0600 Chris Friesen <chris.friesen@genband.com>
> > wrote:
> >
> >> On 09/20/2012 10:52 AM, Chris Friesen wrote:
> >>>
> >>> Hi,
> >>>
> >>> I've got a fairly beefy (32 cpus, 64GB ram, isci-based SAS disks,
> >>> etc.) embedded system running 2.6.27.
> >>>
> >>> We're seeing issues where disk operations suddenly seem to stall.  In
> >>> the most recent case we had the hung-task watchdog indicate that
> >>> md1_resync was stuck for more than 120sec in raise_barrier().
> >>>
> >>> There are a bunch of "normal" tasks also stuck in wait_barrier(), so
> >>> based on that I assume we're stuck in the second call to
> >>> wait_event_lock_irq().
> >>>
> >>> Has anyone seen anything like this?  Could commit 73d5c38 be related?
> >>> What about 1d9d524?
> >>
> >> Could d6b42dc be related?
> >
> > That last one seems more likely.  Does the scenario fit your config,
> > i.e. is your RAID1 being used under LVM?
> >
> > If it does, then I would say it is very likely this issue.
> 
> 
> Yes, we're using it under LVM.  I've added some instrumentation to tell 
> if we're hitting that case.  The current->bio_list handling is a bit 
> different in 2.6.27 but I think I figured out the equivalent to the patch.
> 
> Interesting that it took this long to fix that issue.
> 
> 
> >> Also, what's the meaning of RESYNC_DEPTH?
> >
> > The maximum number of resync requests that can be concurrently active.
> 
> And each request would resync a single block?

Each request will resync up to RESYNC_BLOCK_SIZE bytes - i.e. up to 64K.

NeilBrown
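
The sizing described in the reply above can be sketched in a few lines of
C.  This is only an illustration, not code from raid1.c: the 64K request
size comes from the reply, but the value 32 for RESYNC_DEPTH is an
assumption that should be checked against the tree in use, and the three
helper functions are hypothetical names invented for this sketch.

```c
#include <stddef.h>

/* 64 KiB per resync request, as stated in the reply above. */
#define RESYNC_BLOCK_SIZE ((size_t)64 * 1024)

/* ASSUMPTION: illustrative value only; check raid1.c in your own tree. */
#define RESYNC_DEPTH 32

/* Hypothetical helper: upper bound on resync data in flight, in bytes. */
static inline size_t max_resync_bytes_in_flight(void)
{
	return (size_t)RESYNC_DEPTH * RESYNC_BLOCK_SIZE;
}

/*
 * Hypothetical helper: with 'active' resync requests already in flight,
 * may one more be started without exceeding RESYNC_DEPTH?
 */
static inline int resync_may_start(unsigned int active)
{
	return active < RESYNC_DEPTH;
}

/*
 * Hypothetical helper: fewest requests needed to resync 'bytes' bytes,
 * given that each request covers at most RESYNC_BLOCK_SIZE bytes.
 */
static inline size_t min_resync_requests(size_t bytes)
{
	return (bytes + RESYNC_BLOCK_SIZE - 1) / RESYNC_BLOCK_SIZE;
}
```

Under the assumed depth of 32, at most 32 * 64K = 2M of resync data would
be outstanding at once.  Since each request covers "up to" 64K, the request
count from min_resync_requests() is a lower bound rather than an exact
figure.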


