From: Jes Sorensen <Jes.Sorensen@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Pete Ashdown <pashdown@xmission.com>,
	kvm@vger.kernel.org, Aaron Toponce <atoponce@xmission.com>
Subject: Re: kvm + raid1 showstopper bug
Date: Tue, 21 Feb 2012 13:40:28 +0100
Message-ID: <4F4390BC.5060809@redhat.com>
In-Reply-To: <CAJSP0QUBnnmQepv8AZuXdg+mEXY6ckPMeUTuW_drDD3yLEtouw@mail.gmail.com>

On 02/18/12 14:25, Stefan Hajnoczi wrote:
> On Fri, Feb 17, 2012 at 3:31 PM, Pete Ashdown <pashdown@xmission.com> wrote:
>> On 02/17/2012 04:30 AM, Stefan Hajnoczi wrote:
>>> On Fri, Feb 17, 2012 at 4:57 AM, Pete Ashdown <pashdown@xmission.com> wrote:
>>>> I've been waiting for some response from the Ubuntu team regarding a bug on
>>>> launchpad, but it appears that it isn't being taken seriously:
>>>>
>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/745785
>>> This looks interesting.  Let me try to summarize, please point out if
>>> I get something wrong:
>>>
>>> You have software RAID1 on the host, your disk images live on this
>>> device.  Whenever checkarray runs on the host you find that VMs become
>>> unresponsive.  Guests print warnings that a task is blocked for more
>>> than 120 seconds.  Guests become unresponsive on the network.
>> In my case, it is drbd+RAID10, but the bug still applies.  It isn't
>> whenever checkarray runs, but whenever checkarray decides to do a resync,
>> it will block all IO somewhere before the end of the resync.  Then yes, it
>> isn't long before the guests start to fail due to their inability to
>> read/write.
> I have not attempted to reproduce this yet but have taken a look at
> drivers/md/raid10.c resync code.  md resync uses a similar mechanism
> for RAID1 and RAID10.  While a block is being synced the entire device
> will force regular I/O requests to wait.  There are tunables which let
> you rate-limit resyncing, I think this can solve your problem.
> Perhaps the resync is too aggressive and is impacting regular I/O so
> much that the guest is warning about it.  See Documentation/md.txt for
> sync_speed_max and other sysfs attributes.
> 
> The bug report suggests qemu-kvm itself is operating fine because the
> guest is still executing and VNC/monitor are alive.  After a while the
> guest warns about the stuck I/O.
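
For anyone who wants to try the rate-limiting suggestion quoted above, here is
a minimal sketch. It assumes the per-array sysfs attributes described in
Documentation/md.txt (sync_speed_min and sync_speed_max under
/sys/block/<array>/md/, values in KiB/s). The function names, the sysfs_root
parameter and the array name "md0" are hypothetical and only there for
illustration; writing the attribute needs root, and whether a lower cap
actually avoids the hang is untested.

```python
from pathlib import Path


def md_attr(array, name, sysfs_root="/sys"):
    """Return the path of an md sysfs attribute, e.g. /sys/block/md0/md/sync_speed_max."""
    return Path(sysfs_root) / "block" / array / "md" / name


def read_sync_limits(array, sysfs_root="/sys"):
    """Read the current resync speed limits (KiB/s) for one md array.

    The kernel reports values such as "200000 (system)" or "5000 (local)",
    so only the leading number is parsed.
    """
    limits = {}
    for name in ("sync_speed_min", "sync_speed_max"):
        fields = md_attr(array, name, sysfs_root).read_text().split()
        limits[name] = int(fields[0])
    return limits


def set_sync_speed_max(array, kib_per_sec, sysfs_root="/sys"):
    """Cap the resync rate of one md array at kib_per_sec KiB/s."""
    if kib_per_sec <= 0:
        raise ValueError("sync_speed_max must be a positive number of KiB/s")
    md_attr(array, "sync_speed_max", sysfs_root).write_text(f"{kib_per_sec}\n")
```

On a host, calling read_sync_limits("md0") before and after
set_sync_speed_max("md0", 5000) should show whether the cap was accepted. The
sysfs_root parameter exists only so the helpers can be exercised against a
scratch directory instead of the live /sys tree.
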

It could be a bug in the raid1/raid10 code that is triggered by the
way qemu accesses files on it (O_DIRECT or something). However, there have
been a *lot* of fixes to the raid code since 2.6.38 (which is the kernel
version I saw referenced in the launchpad link). Please try to
reproduce it with a more up-to-date kernel.

I have no idea what drbd8 is and what relation it has to the kernel or
kvm here.

Cheers,
Jes
