Bug#624343: linux-image-2.6.38-2-amd64: frequent message "bio too big device md0 (248 > 240)" in kern.log

From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: Ben Hutchings <ben@decadent.org.uk>, 624343@bugs.debian.org
Cc: Jameson Graef Rollins <jrollins@finestructure.net>,
	NeilBrown <neilb@suse.de>,
	linux-raid@vger.kernel.org
Subject: Bug#624343: linux-image-2.6.38-2-amd64: frequent message "bio too big device md0 (248 > 240)" in kern.log
Date: Sun, 01 May 2011 20:42:52 -0400
Message-ID: <4DBDFE0C.3000304@fifthhorseman.net>
In-Reply-To: <1304294457.2833.111.camel@localhost>

On 05/01/2011 08:00 PM, Ben Hutchings wrote:
> On Sun, 2011-05-01 at 15:06 -0700, Jameson Graef Rollins wrote:
>> Hi, Ben.  Can you explain why this is not expected to work?  Which part
>> exactly is not expected to work and why?
> 
> Adding another type of disk controller (USB storage versus whatever the
> SSD interface is) to a RAID that is already in use.
> 
 [...]
> The normal state of a RAID set is that all disks are online.  You have
> deliberately turned this on its head; the normal state of your RAID set
> is that one disk is missing.  This is such a basic principle that most
> documentation won't mention it.

This is somewhat worrisome to me.  Consider a fileserver with
non-hotswap disks.  One disk fails in the morning, but the machine is in
production use, and the admin's goals are:

 * minimize downtime,
 * reboot only during off-hours, and
 * minimize the amount of time that the array spends de-synced.

A responsible admin might reasonably expect to attach a disk via a
well-tested USB or ieee1394 adapter, bring the array back into sync, and
announce to the rest of the organization that there will be a scheduled
reboot later in the evening.

Then, at the scheduled reboot, move the disk from the USB/ieee1394
adapter to the direct ATA interface on the machine.

If this sequence of operations is likely (or even possible) to cause
data loss, it should be spelled out in BIG RED LETTERS someplace.  I
don't think any of the above steps seem unreasonable, and the goals the
admin is attempting to meet are certainly commonplace.

> The error is that you changed the I/O capabilities of the RAID while it
> was already in use.  But what I was describing as 'correct' was that an
> error code was returned, rather than the error condition only being
> logged.  If the error condition is not properly propagated then it could
> lead to data loss.

How is an admin to know which I/O capabilities to check before adding a
device to a RAID array?  When is it acceptable to mix I/O capabilities?
Can a RAID array which is not currently being used as a backing store
for a filesystem be assembled of unlike disks?  What if it is then
(later) used as a backing store for a filesystem?
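
As an illustration only, here is a minimal sketch of one such check.  It
assumes the capability at issue is the per-device request-size limit that
Linux exposes in sysfs as queue/max_sectors_kb, and that the numbers in
the "bio too big" message are counts of 512-byte sectors (so 240 sectors
is 120 KiB).  The function names are hypothetical, and the sketch only
flags a mismatch; it does not say whether mixing such devices is safe.

```python
from pathlib import Path

SECTOR_BYTES = 512  # assumption: bio sizes are counted in 512-byte sectors


def read_max_sectors_kb(dev, sysfs_root="/sys/block"):
    """Read queue/max_sectors_kb for a block device such as 'sdb' or 'md0'."""
    path = Path(sysfs_root, dev, "queue", "max_sectors_kb")
    return int(path.read_text().strip())


def kb_to_sectors(kb):
    """Convert a limit in KiB to a count of 512-byte sectors."""
    return kb * 1024 // SECTOR_BYTES


def smaller_limit_members(array_kb, candidates_kb):
    """Return names of candidate devices whose limit is below the array's.

    array_kb      -- the array's current max_sectors_kb
    candidates_kb -- mapping of device name to its max_sectors_kb
    """
    return sorted(dev for dev, kb in candidates_kb.items() if kb < array_kb)
```

On a live system one might call read_max_sectors_kb("md0") and compare it
against the value for each disk about to be added; any device reported by
smaller_limit_members() would lower the limit of an array that is already
in use.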

One of the advantages people tout for in-kernel software raid (over many
H/W RAID implementations) is the ability to mix disks, so that you're
not reliant on a single vendor during a failure.  If this advantage
doesn't extend across certain classes of disk, it would be good to be
unambiguous about what can be mixed and what cannot.

Regards,

	--dkg
