linux-raid.vger.kernel.org archive mirror
From: chris <tknchris@gmail.com>
To: Xen-Devel List <xen-devel@lists.xensource.com>
Cc: linux-raid@vger.kernel.org, Neil Brown <neilb@suse.de>
Subject: Re: Weird Issue with raid 5+0
Date: Mon, 8 Mar 2010 10:35:57 -0500	[thread overview]
Message-ID: <31e44a111003080735t4ddf7c63uaa517ad6522cca67@mail.gmail.com> (raw)
In-Reply-To: <20100308165021.6529fe6d@notabene.brown>

I'm forwarding this to xen-devel because it appears to be a bug in the dom0 kernel.

I recently experienced a strange issue with software raid1+0 under Xen
on a new machine. I was getting corruption in my guest volumes and
tons of kernel messages such as:

[305044.571962] raid0_make_request bug: can't convert block across chunks or bigger than 64k 14147455 4
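
If I'm reading that printk right, the three numbers are the chunk size in
KB, the starting sector of the bio, and the bio size in KB. The check that
prints it looks roughly like the sketch below -- a paraphrase of the
2.6-era code in drivers/md/raid0.c, not the literal source:

/* A bio that crosses a chunk boundary has to be split, but bio_split()
 * only copes with single-segment bios ...                               */
if (unlikely(chunk_sects < (bio->bi_sector & (chunk_sects - 1))
                           + (bio->bi_size >> 9))) {
        if (bio->bi_vcnt != 1 || bio->bi_idx != 0)
                goto bad_map;   /* ... so this path prints the message above */
        /* otherwise: bio_split() into one bio per chunk and resubmit */
}

So any multi-segment bio that straddles a 64k chunk ends up at bad_map.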

The full thread is located at http://marc.info/?t=126672694700001&r=1&w=2
Detailed output at http://pastebin.com/f6a52db74

After speaking with the linux-raid mailing list, it appears this is due to
a bug which has since been fixed, but the fix is not included in the dom0
kernel. I'm not sure which sources the 2.6.26-2-xen-amd64 kernel is based
on, but since xenlinux is still at 2.6.18 I assume the bug would still
exist there.

My questions for xen-devel are:

Can you tell me if there is any dom0 kernel where this issue is fixed?
Is there anything I can do to help get this resolved? Testing? Patching?

- chris

On Mon, Mar 8, 2010 at 12:50 AM, Neil Brown <neilb@suse.de> wrote:
> On Sun, 21 Feb 2010 19:16:40 +1100
> Neil Brown <neilb@suse.de> wrote:
>
>> On Sun, 21 Feb 2010 02:26:42 -0500
>> chris <tknchris@gmail.com> wrote:
>>
>> > That is exactly what I didn't want to hear :( I am running
>> > 2.6.26-2-xen-amd64. Are you sure it's a kernel problem and nothing to
>> > do with my chunk/block sizes? If this is a bug, what versions are
>> > affected? I'll build a new domU kernel and see if I can get it working
>> > there.
>> >
>> > - chris
>>
>> I'm absolutely sure it is a kernel bug.
>
> And I think I now know what the bug is.
>
> A patch was recently posted to dm-devel which I think addresses exactly this
> problem.
>
> I reproduce it below.
>
> NeilBrown
>
> -------------------
> If the lower device exposes a merge_bvec_fn,
> dm_set_device_limits() restricts max_sectors
> to PAGE_SIZE "just to be safe".
>
> This is not sufficient, however.
>
> If someone uses bio_add_page() to add 8 disjunct 512 byte partial
> pages to a bio, it would succeed, but could still cross a border
> of whatever restrictions are below us (e.g. raid10 stripe boundary).
> An attempted bio_split() would not succeed, because bi_vcnt is 8.
>
> One example that triggered this frequently is the xen io layer.
>
> raid10_make_request bug: can't convert block across chunks or bigger than 64k 209265151 1
>
> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
>
>
> ---
>  drivers/md/dm-table.c |   12 ++++++++++--
>  1 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index 4b22feb..c686ff4 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -515,14 +515,22 @@ int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev,
>
>        /*
>         * Check if merge fn is supported.
> -        * If not we'll force DM to use PAGE_SIZE or
> +        * If not we'll force DM to use single bio_vec of PAGE_SIZE or
>         * smaller I/O, just to be safe.
>         */
>
> -       if (q->merge_bvec_fn && !ti->type->merge)
> +       if (q->merge_bvec_fn && !ti->type->merge) {
>                limits->max_sectors =
>                        min_not_zero(limits->max_sectors,
>                                     (unsigned int) (PAGE_SIZE >> 9));
> +               /* Restricting max_sectors is not enough.
> +                * If someone uses bio_add_page to add 8 disjunct 512 byte
> +                * partial pages to a bio, it would succeed,
> +                * but could still cross a border of whatever restrictions
> +                * are below us (e.g. raid0 stripe boundary).  An attempted
> +                * bio_split() would not succeed, because bi_vcnt is 8. */
> +               limits->max_segments = 1;
> +       }
>        return 0;
>  }
>  EXPORT_SYMBOL_GPL(dm_set_device_limits);
> --
> 1.6.3.3
>
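
To make the failure mode concrete, here is a rough, hypothetical sketch of
how such a bio can be built with the 2.6-era block layer API -- the device,
pages and completion handling are placeholders, not code from xen or dm:

/* Illustration only: eight 512-byte segments from eight different pages. */
struct block_device *dm_bdev;   /* placeholder: the dm device             */
struct page *pages[8];          /* placeholder: e.g. scattered pages handed
                                 * to blkback by the xen I/O layer; assume
                                 * they are set up elsewhere               */
struct bio *bio;
int i;

bio = bio_alloc(GFP_NOIO, 8);           /* room for 8 bio_vecs            */
bio->bi_bdev   = dm_bdev;
bio->bi_sector = 14147455;              /* the sector from my log above:
                                         * 1 sector short of a 64k chunk
                                         * boundary                        */
/* bi_end_io / bi_private setup omitted for brevity */

for (i = 0; i < 8; i++)
        bio_add_page(bio, pages[i], 512, 0);    /* each call succeeds: the
                                                 * total stays within
                                                 * max_sectors             */

/* bio->bi_size is now 4096 bytes (8 sectors), i.e. exactly the
 * PAGE_SIZE >> 9 cap that dm_set_device_limits() puts on max_sectors, so
 * nothing above rejects it.  But bio->bi_vcnt == 8, and the I/O crosses
 * into the next 64k chunk, so raid0/raid10 can neither pass it through
 * nor bio_split() it -- hence the "can't convert block" error.  With the
 * patch, max_segments == 1 makes such a bio impossible to build against
 * the dm device in the first place. */
submit_bio(READ, bio);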


Thread overview: 17+ messages
2010-02-21  4:33 Weird Issue with raid 5+0 chris
2010-02-21  5:48 ` Neil Brown
2010-02-21  7:26   ` chris
2010-02-21  8:16     ` Neil Brown
2010-02-21  8:21       ` Neil Brown
2010-02-21  9:17         ` chris
2010-02-21 10:35           ` chris
2010-03-08  5:50       ` Neil Brown
2010-03-08  6:16         ` chris
2010-03-08  7:05           ` Neil Brown
2010-03-08 15:35         ` chris [this message]
2010-03-08 17:29           ` [Xen-devel] " Konrad Rzeszutek Wilk
2010-03-09 19:42             ` Neil Brown
2010-03-08 23:26           ` Jeremy Fitzhardinge
2010-03-09  0:48             ` chris
2010-03-09  1:14               ` Jeremy Fitzhardinge
2010-03-08 20:14         ` Bill Davidsen
