From: NeilBrown <neilb@suse.de>
To: Jonathan Brassow <jbrassow@redhat.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: [PATCH] MD: Quickly return errors if too many devices have failed.
Date: Mon, 18 Mar 2013 10:49:05 +1100	[thread overview]
Message-ID: <20130318104905.4a70bc00@notabene.brown> (raw)
In-Reply-To: <1363195764.24906.14.camel@f16>

On Wed, 13 Mar 2013 12:29:24 -0500 Jonathan Brassow <jbrassow@redhat.com>
wrote:

> Neil,
> 
> I've noticed that when too many devices fail in a RAID array,
> additional I/O will hang, yielding an endless supply of:
> Mar 12 11:52:53 bp-01 kernel: Buffer I/O error on device md1, logical block 3
> Mar 12 11:52:53 bp-01 kernel: lost page write due to I/O error on md1
> Mar 12 11:52:53 bp-01 kernel: sector=800 i=3           (null)           (null)  
>          (null)           (null) 1

This is the third report in as many weeks that mentions that WARN_ON.
The first two had quite different causes.
I think this one is the same as the first one, which means it would be fixed
by  
      md/raid5: schedule_construction should abort if nothing to do.

which is commit 29d90fa2adbdd9f in linux-next.

> Mar 12 11:52:53 bp-01 kernel: ------------[ cut here ]------------
> Mar 12 11:52:53 bp-01 kernel: WARNING: at drivers/md/raid5.c:354 init_stripe+0x2d4/0x370 [raid456]()

> 
> Are other people seeing this, or is this an artifact of the way I am killing
> devices ('echo offline > /sys/block/$dev/device/state')?

That is a perfectly good way to kill a device.

> 
> I would prefer to get immediate errors if nothing can be done to satisfy the
> request and I've been thinking of something like the attached patch.  The
> patch below is incomplete.  It does not take into account any reshaping that
> is going on, nor does it try to figure out if a mirror set in RAID10 has died;
> but I hope it gets the basic idea across.
> 
> Is this a good way to handle this situation, or am I missing something?

I think we do get immediate errors (once all bugs are fixed).
Your patch does extra work for every request which is only of value if the
array has failed - and it really doesn't make sense to optimise for a failed
array.
The current approach is to just try to satisfy a request and once we find
that we need to do something that is impossible - return an error at that
point.  I think that is best.
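
To illustrate the trade-off, here is a minimal user-space sketch. It is a toy
model only, not the md code: struct toy_array, toy_count_failed(),
toy_request_precheck() and toy_request_lazy() are all hypothetical names made
up for this example. The pre-check variant scans the device list on every
request; the lazy variant only does so once a request actually hits a failed
device.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEVS 8

/* Hypothetical toy model of an array; NOT the kernel's struct mddev. */
struct toy_array {
	size_t ndevs;               /* number of member devices */
	size_t max_degraded;        /* failures the layout can tolerate */
	bool failed[TOY_MAX_DEVS];  /* per-device failure flag */
	unsigned long scans;        /* full device scans performed so far */
};

/* Count failed devices; each call is one full scan of the device list. */
static size_t toy_count_failed(struct toy_array *a)
{
	size_t i, n = 0;

	a->scans++;
	for (i = 0; i < a->ndevs; i++)
		if (a->failed[i])
			n++;
	return n;
}

/* Up-front check: every request pays for a scan, healthy array or not. */
int toy_request_precheck(struct toy_array *a, size_t dev)
{
	if (dev >= a->ndevs)
		return -EINVAL;
	if (toy_count_failed(a) > a->max_degraded)
		return -EIO;
	return 0;
}

/*
 * Lazy check: just try to satisfy the request.  Only when the target
 * device turns out to be failed do we ask whether recovery is possible.
 * In this toy a request to a healthy device succeeds even if the array
 * as a whole has failed; real md behaviour differs in detail.
 */
int toy_request_lazy(struct toy_array *a, size_t dev)
{
	if (dev >= a->ndevs)
		return -EINVAL;
	if (!a->failed[dev])
		return 0;               /* fast path, no scan */
	if (toy_count_failed(a) > a->max_degraded)
		return -EIO;            /* impossible: fail at this point */
	return 0;                       /* recoverable from the other devices */
}
```

On a healthy array the lazy variant never touches the scan counter, which is
the point: the extra cost is only paid once something has already gone wrong.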

Can you try the commit I identified and see if it makes the problem go away?

Thanks,
NeilBrown

