From: Neil Brown <neilb@suse.de>
To: ravichandra <vmynidi@caviumnetworks.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: Problem regarding RAID10 on kernel 2.6.31
Date: Mon, 9 Aug 2010 18:10:42 +1000
Message-ID: <20100809181042.2dd7f7ca@notabene>
In-Reply-To: <1281339596.18581.34.camel@venkata-pc.in.caveonetworks.com>

On Mon, 09 Aug 2010 13:09:56 +0530
ravichandra <vmynidi@caviumnetworks.com> wrote:

> Hi,
>    Thanks. The patch you sent is working. There is no hang
> after the patch is applied. Can you elaborate on the problem that was
> there earlier?
> 

It's .... complicated.

An important fact is that generic_make_request queues recursive requests
rather than issuing them immediately.  This avoids excessive stack usage with
stacked block devices.
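
As a rough user-space illustration of that queueing (a toy sketch, not the
kernel code): toy_bio, toy_submit and toy_handle below are hypothetical
names invented for this example. A nested toy_submit call only appends to a
list, and the outermost call drains that list in a loop, so stack depth
stays constant however many layers resubmit.

```c
#include <stddef.h>

#define MAX_EVENTS 16

/* Hypothetical stand-in for a bio; this is not the kernel's struct bio. */
struct toy_bio {
    int id;
    int splits;              /* non-zero: the handler submits two children */
    struct toy_bio *next;
};

enum { EV_ENTER = 1, EV_RETURN = 2 };

/* Event log so the ordering can be inspected afterwards. */
static int event_kind[MAX_EVENTS];
static int event_id[MAX_EVENTS];
static int n_events;

static struct toy_bio *queue_head, *queue_tail;
static int submitting;       /* non-zero while a toy_submit() loop is active */

static void log_event(int kind, int id)
{
    if (n_events < MAX_EVENTS) {
        event_kind[n_events] = kind;
        event_id[n_events] = id;
        n_events++;
    }
}

static void toy_submit(struct toy_bio *bio);

static struct toy_bio child_a = { 2, 0, NULL };
static struct toy_bio child_b = { 3, 0, NULL };

/* Toy "driver": a splitting bio submits two children, loosely like a read
 * that crosses a chunk boundary. */
static void toy_handle(struct toy_bio *bio)
{
    log_event(EV_ENTER, bio->id);
    if (bio->splits) {
        toy_submit(&child_a);    /* only queued here, not issued */
        toy_submit(&child_b);    /* only queued here, not issued */
    }
    log_event(EV_RETURN, bio->id);
}

static void toy_submit(struct toy_bio *bio)
{
    if (submitting) {
        /* Recursive call: append to the queue and return immediately. */
        bio->next = NULL;
        if (queue_tail)
            queue_tail->next = bio;
        else
            queue_head = bio;
        queue_tail = bio;
        return;
    }
    submitting = 1;
    while (bio) {
        toy_handle(bio);
        /* Pop the next queued request, if any. */
        bio = queue_head;
        if (bio) {
            queue_head = bio->next;
            if (!queue_head)
                queue_tail = NULL;
        }
    }
    submitting = 0;
}

/* Submit one splitting parent (id 1); returns the number of logged events. */
static int toy_run_split_read(void)
{
    static struct toy_bio parent = { 1, 1, NULL };

    n_events = 0;
    toy_submit(&parent);
    return n_events;
}
```

After toy_run_split_read(), the log shows the parent handler returning
before either child handler is entered, which is the ordering that matters
for what follows.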

So in the case where a read crosses a chunk boundary, raid10:make_request
issues two separate generic_make_request calls to two different devices, each
preceded by a wait_barrier call (which is cancelled with allow_barrier() when
the request completes).
The first is queued and will not be issued until the second is also queued and
the raid10:make_request call completes.

The wait_barrier call increments nr_pending.
If the resync/recovery thread tries to 'raise_barrier' between these calls,
it will find nr_pending set and will wait with ->barrier incremented, so when
the next wait_barrier is attempted, it will block - forever: the first request
is still queued, so it can never complete and drop nr_pending.

If generic_make_request didn't queue things, the first request would
complete, nr_pending would decrement, resync would proceed with a single
request, then the second wait_barrier would complete and the second request
could be submitted.
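
The two cases can be walked through with a single-threaded toy model of the
counters. This is only a sketch under simplifying assumptions: toy_conf and
the toy_* helpers are hypothetical, "blocking" is modelled as a function
returning false, and the real code sleeps under a spinlock instead.

```c
#include <stdbool.h>

/* Toy copy of two counters named in this thread; not the kernel struct. */
struct toy_conf {
    int barrier;      /* raised by resync/recovery */
    int nr_pending;   /* normal requests in flight */
};

/* Returns true if a wait_barrier caller could proceed right now;
 * false means it would sleep until ->barrier drops to zero. */
static bool toy_wait_barrier_try(struct toy_conf *c)
{
    if (c->barrier)
        return false;
    c->nr_pending++;
    return true;
}

/* Request completion: undo the wait_barrier accounting. */
static void toy_allow_barrier(struct toy_conf *c)
{
    c->nr_pending--;
}

/* Resync announces its barrier, then has to wait for nr_pending == 0. */
static void toy_raise_barrier_begin(struct toy_conf *c)
{
    c->barrier++;
}

static bool toy_resync_can_run(const struct toy_conf *c)
{
    return c->barrier > 0 && c->nr_pending == 0;
}

static void toy_lower_barrier(struct toy_conf *c)
{
    c->barrier--;
}

/* Replays the sequence described above. requests_are_queued == true models
 * generic_make_request deferring the first request until make_request
 * returns; false models the counterfactual where it is issued at once.
 * Returns true if the second wait_barrier would block with nothing left
 * that could ever unblock it. */
static bool toy_split_read_deadlocks(bool requests_are_queued)
{
    struct toy_conf c = { 0, 0 };

    (void)toy_wait_barrier_try(&c);    /* first half: nr_pending = 1 */
    toy_raise_barrier_begin(&c);       /* resync slips in: barrier = 1 */

    if (!requests_are_queued)
        toy_allow_barrier(&c);         /* first request issued, completes */

    if (toy_resync_can_run(&c))
        toy_lower_barrier(&c);         /* resync does one request, then drops */

    /* Second half of the split read. */
    return !toy_wait_barrier_try(&c);
}
```

With queueing, toy_split_read_deadlocks(true) reports a stuck state:
nr_pending stays at 1 and ->barrier stays at 1. Without queueing,
toy_split_read_deadlocks(false) lets the second half through.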

The fix was to elevate conf->nr_waiting for the duration of both submissions
so raise_barrier holds off setting ->barrier until both submissions are
complete.
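
A toy model of the fixed sequence, again only a sketch: the toy_* names are
hypothetical, and it assumes (as the description above implies) that
raise_barrier does not increment ->barrier while nr_waiting is non-zero.
Locking and wakeups are left out.

```c
#include <stdbool.h>

/* Toy copy of the three counters; not the kernel struct. */
struct toy_conf {
    int barrier;
    int nr_pending;
    int nr_waiting;
};

/* Assumed first step of raise_barrier: hold off while anyone is waiting.
 * Returns true if ->barrier was incremented. */
static bool toy_raise_barrier_try(struct toy_conf *c)
{
    if (c->nr_waiting)
        return false;
    c->barrier++;
    return true;
}

/* Returns true if a wait_barrier caller could proceed right now. */
static bool toy_wait_barrier_try(struct toy_conf *c)
{
    if (c->barrier)
        return false;
    c->nr_pending++;
    return true;
}

/* Replays the split read with nr_waiting held across both submissions and
 * a resync attempt landing in the window between them.
 * Returns true if the second wait_barrier would have blocked. */
static bool toy_fixed_split_read_blocks(struct toy_conf *c)
{
    bool raised, second_ok;

    c->nr_waiting++;                       /* held for both submissions */

    (void)toy_wait_barrier_try(c);         /* first half */
    raised = toy_raise_barrier_try(c);     /* resync attempt: refused */
    second_ok = toy_wait_barrier_try(c);   /* second half proceeds */

    c->nr_waiting--;                       /* resync may raise_barrier now */

    return raised || !second_ok;
}
```

Starting from an all-zero toy_conf, the function returns false and leaves
nr_pending at 2 with ->barrier still 0. Both requests are then in flight and
will drop nr_pending as they complete, so a later raise_barrier can make
progress.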

Hope that makes sense.

NeilBrown



Thread overview: 6+ messages
2010-08-06  9:41 Problem regarding RAID10 on kernel 2.6.31 ravichandra
2010-08-06 10:14 ` Neil Brown
2010-08-09  7:39   ` ravichandra
2010-08-09  8:10     ` Neil Brown [this message]
2010-10-18 21:23       ` Hari Subramanian
2010-10-18 22:38         ` Neil Brown
