From: Stefan Hajnoczi <stefanha@gmail.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, ghammer@redhat.com,
Jason Wang <jasowang@redhat.com>,
linux-kernel@vger.kernel.org,
Markus Armbruster <armbru@redhat.com>,
Alejandro Comisario <alejandro.comisario@mercadolibre.com>
Subject: Re: [Qemu-devel] Massive read only kvm guests when backing file was missing
Date: Thu, 27 Mar 2014 09:53:47 +0100
Message-ID: <20140327085347.GB9580@stefanha-thinkpad.redhat.com>
In-Reply-To: <20140327081040.GA21756@redhat.com>
On Thu, Mar 27, 2014 at 10:10:40AM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 27, 2014 at 08:36:57AM +0100, Markus Armbruster wrote:
> > "Michael S. Tsirkin" <mst@redhat.com> writes:
> >
> > > On Wed, Mar 26, 2014 at 11:08:03PM -0300, Alejandro Comisario wrote:
> > >> Hi List!
> > >> Hope someone can help me. We had a big issue in our cloud the other
> > >> day: a couple of our openstack regions (2000+ kvm guests with qcow2)
> > >> went to a read-only filesystem on the guest side because the backing
> > >> files directory (the openstack _base directory) was compromised and
> > >> the data was lost. When we realized the data was lost, it took us 5
> > >> mins to restore the backup of the backing files, but by that time all
> > >> the kvm guests had received some kind of IO error from the hypervisor
> > >> layer and had gone read-only on the root filesystem.
> > >>
> > >> My question is: is there a way to hold the IO operations against
> > >> the backing files (I thought that would be 99% READ operations) for
> > >> a little longer (I'm asking because I don't quite understand what
> > >> the process is and when it raises the error) in a case where the
> > >> backing files are missing (no IO possible) but recoverable within minutes?
> > >>
> > >> Any tip on how to achieve this, if possible, or information about how
> > >> backing files work on kvm, would be amazing.
> > >> Waiting for feedback!
> > >>
> > >> kindest regards.
> > >> Alejandro Comisario
> > >
> > >
> > > I'm guessing this is what happened: guests timed out meanwhile.
> > > You can increase the timeout within the guest:
> > > echo 600 > /sys/block/sda/device/timeout
> > > to timeout after 10 minutes.
> > >
> > > If you have installed qemu guest agent on your system, you can do this
> > > from the host. Unfortunately, by default its memory can be pushed out to
> > > swap, and then, on a disk error, access there might fail :(
> > > Maybe we should consider mlock on all its memory, at least as an option.
> > >
> > > You could pause your guests, restart them after the issue is resolved,
> > > and we could I guess add functionality to pause VM on disk errors
> > > automatically.
> > > Stefan?
> >
> > Would -drive rerror=stop do?
>
> I think it will. It's a pity it doesn't appear in --help output -
> would make it easier to find.
It is documented on the man page. I'll send a patch to document it in
the --help output too.
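
As a rough sketch of what that looks like on the command line, the helper
below builds a -drive option string with rerror=stop. It also sets
werror=stop so that write errors are handled the same way; that part is an
addition for illustration and was not discussed in this thread. The image
path, the if=virtio interface, and the function name drive_opts are
placeholders, not taken from the reporter's setup.

```shell
#!/bin/sh
# Build a -drive option that pauses the VM on I/O errors instead of
# reporting them to the guest. drive_opts is a hypothetical helper name.
drive_opts() {
    image=$1
    # rerror=stop: pause the guest on read errors.
    # werror=stop: same for write errors (an addition, not from the thread).
    printf 'file=%s,if=virtio,rerror=stop,werror=stop\n' "$image"
}

# Example with a placeholder path; print the command rather than run it.
echo qemu-system-x86_64 -drive "$(drive_opts /var/lib/images/guest.qcow2)"
```

Once the backing files are available again, a guest stopped this way can be
resumed with the monitor command "cont".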
But there's still a problem because the guest can have a shorter timeout
or the image may be NFS mounted on the host. In that case the guest may
give up on the request before the host. Then there is nothing QEMU can
do to avoid an error being returned to the application or the guest file
system going into read-only mode.
So make sure the timeout inside the guest is high.
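
A minimal sketch of raising that timeout on every SCSI disk in the guest,
building on the echo command quoted above. The function name
set_disk_timeouts and its optional sysfs-root argument are hypothetical; the
second argument exists only so the function can be tried against a scratch
directory. It assumes the disks appear as sd* and expose device/timeout.

```shell
#!/bin/sh
# Set the SCSI command timeout (in seconds) for every sd* disk.
# set_disk_timeouts is a hypothetical helper; the sysfs root defaults
# to /sys and is a parameter only to allow a dry run elsewhere.
set_disk_timeouts() {
    timeout=$1
    sysroot=${2:-/sys}
    for f in "$sysroot"/block/sd*/device/timeout; do
        # Skip the unexpanded pattern when no matching disk exists.
        [ -e "$f" ] || continue
        echo "$timeout" > "$f"
    done
}

# Run as root inside the guest, e.g. for a 10 minute timeout:
# set_disk_timeouts 600
```

A value written this way does not survive a reboot, so it has to be
reapplied at boot time.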
Stefan