linux-raid.vger.kernel.org archive mirror
From: Chris Mason <chris.mason@fusionio.com>
To: NeilBrown <neilb@suse.de>
Cc: Kerin Millar <kerframil@gmail.com>,
	"linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: raid10 make_request failure during iozone benchmark upon btrfs
Date: Tue, 3 Jul 2012 11:08:41 -0400
Message-ID: <20120703150841.GE14928@shiny>
In-Reply-To: <20120703124727.6e2232d1@notabene.brown>

On Mon, Jul 02, 2012 at 08:47:27PM -0600, NeilBrown wrote:
> Thanks.  Looks like it is a btrfs bug - so a big "hello" to linux-btrfs :-)
> 
> The symptom is that iozone on btrfs on md/raid10 can result in
> 
> [  919.893454] md/raid10:md0: make_request bug: can't convert block across chunks or bigger than 256k 6653500160 256
> [  919.893465] btrfs: bdev /dev/mapper/vg0-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
> 
> 
> i.e. RAID10 has a 256K chunk size, but is getting 256K requests which overlap
> two chunks - the last half of one chunk and the first half of the next.
> That isn't allowed and raid10_mergeable_bvec, called by bio_add_page, should
> prevent it.
> 
> However btrfs_map_bio() sets ->bi_sector to a new value without verifying
> that the resulting bio is still acceptable - which it isn't.
> 
> The core problem is that you cannot build a bio for one location, then use it
> freely at another location.
> md/raid1 handles this by checking each addition to a bio against all the
> possible locations that it might be read from or written to.  Maybe btrfs
> could do the same.
> Alternatively we could work with Kent Overstreet (of bcache fame) to remove the
> restriction that the fs must make the bio compatible with the device -
> instead requiring the device to split bios when needed, and making it easy to
> do that (currently it is not easy).
> And there are probably other alternatives.
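
To make the arithmetic behind that log line concrete, here is a small
Python sketch (not kernel code). It assumes the last two numbers in the
message are the starting sector and the request size in KiB, and that
sectors are 512 bytes; the function name crosses_chunk is made up for
illustration.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed block-layer unit)

def crosses_chunk(start_sector, size_kib, chunk_kib):
    """Return True if the request touches more than one chunk."""
    chunk_sectors = chunk_kib * 1024 // SECTOR_SIZE
    nr_sectors = size_kib * 1024 // SECTOR_SIZE
    first_chunk = start_sector // chunk_sectors
    last_chunk = (start_sector + nr_sectors - 1) // chunk_sectors
    return first_chunk != last_chunk

# Numbers taken from the log line quoted above (interpretation assumed).
start = 6653500160
print(start % 512)                     # offset into the 512-sector chunk: 256
print(crosses_chunk(start, 256, 256))  # True
```

With these assumptions the request starts 256 sectors into a 512-sector
chunk, which matches the "last half of one chunk and the first half of
the next" description.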

In this case btrfs should really break the bio down to smaller chunks
and hand feed the lower layers.  There are corners where we think the
device can go a certain size and then later on figure out we were just
too optimistic.  So we should deal with it by breaking the bio up and
then lowering our max.
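
A rough sketch of what "breaking the bio up" means in terms of sector
ranges, in Python rather than kernel C. This is only an illustration of
splitting at chunk boundaries, not the btrfs implementation; the helper
name split_at_chunks and the (start, length) tuples are hypothetical.

```python
def split_at_chunks(start_sector, nr_sectors, chunk_sectors):
    """Split a sector range into pieces that never cross a chunk boundary.

    Returns a list of (start_sector, length_in_sectors) tuples.
    """
    pieces = []
    while nr_sectors > 0:
        # Sectors left before the next chunk boundary.
        room = chunk_sectors - (start_sector % chunk_sectors)
        length = min(room, nr_sectors)
        pieces.append((start_sector, length))
        start_sector += length
        nr_sectors -= length
    return pieces

# A 256 KiB (512-sector) request starting half way into a 512-sector chunk.
print(split_at_chunks(6653500160, 512, 512))
# → [(6653500160, 256), (6653500416, 256)]
```

Each resulting piece fits inside a single chunk, so it could be handed
to the lower layer on its own.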

-chris

