Re: Thoughts on VM fence infrastructure

From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Felipe Franciosi <felipe@nutanix.com>
Cc: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>,
	Aditya Ramesh <aramesh@nutanix.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>
Subject: Re: Thoughts on VM fence infrastructure
Date: Tue, 1 Oct 2019 12:10:31 +0100
Message-ID: <20191001111031.GH26133@redhat.com>
In-Reply-To: <E1A62EE3-9EC0-49DC-A871-C6424F5FD807@nutanix.com>

On Tue, Oct 01, 2019 at 10:46:24AM +0000, Felipe Franciosi wrote:
> Hi Daniel!
> 
> 
> > On Oct 1, 2019, at 11:31 AM, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > On Tue, Oct 01, 2019 at 09:56:17AM +0000, Felipe Franciosi wrote:
> 
> (Apologies for the mangled URL, nothing I can do about that.) :(
> 
> There are several points which favour adding this to Qemu:
> - Not all environments use systemd.

Sure, if you want to cope with that you can just use the HW watchdog
directly instead of via systemd. 
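
As a rough illustration of driving the watchdog without systemd, here is a
minimal Python sketch. It assumes the standard Linux /dev/watchdog device
node, where any write restarts the countdown. The names pet_until_unhealthy,
disarm and is_healthy are hypothetical, and the timeout is assumed to be
whatever the driver was already configured with.

```python
import time
from typing import BinaryIO, Callable

WATCHDOG_PATH = "/dev/watchdog"  # standard Linux watchdog device node


def pet_until_unhealthy(dev: BinaryIO,
                        is_healthy: Callable[[], bool],
                        interval_s: float = 1.0,
                        sleep: Callable[[float], None] = time.sleep) -> int:
    """Pet the watchdog while is_healthy() holds; return the number of pets.

    is_healthy is a hypothetical host-level check (e.g. "storage is
    reachable"). Once it fails we stop writing, so the hardware timer is
    left to expire and reset the host.
    """
    pets = 0
    while is_healthy():
        dev.write(b"\0")  # any write restarts the countdown
        dev.flush()
        pets += 1
        sleep(interval_s)
    return pets


def disarm(dev: BinaryIO) -> None:
    """Orderly shutdown: 'V' is the "magic close" byte for drivers that honour it."""
    dev.write(b"V")
    dev.flush()
```

A caller would open the device unbuffered, e.g.
open(WATCHDOG_PATH, "wb", buffering=0), and pass it in as dev. Whether
closing the device without the magic byte keeps the timer armed depends on
the driver and its "nowayout" setting, so treat that part as an assumption.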

> - HW watchdogs always reboot the host, which is too drastic.
> - You may not want to protect all VMs in the same way.

Same points repeated below, so I'll respond there....

> > IMHO doing this at the host OS level is going to be more reliable in
> > terms of detecting the problem in the first place, as well as more
> > reliable in taking the action - it's very difficult for a hardware CPU
> > reset to fail to work.
> 
> Absolutely, but it's a very drastic measure that:
> - May be unnecessary.

Of course, the inability to predict future consequences is what
forces us into assuming the worst case & taking actions to
mitigate that. It will definitely result in unnecessary killing
of hosts, but that is what gives you the safety guarantees you
can't otherwise achieve.

I gave the example elsewhere that even if you kill QEMU, the kernel
can have pending I/O associated with QEMU that can be sent if the
host later recovers.

> - Will fence everything even if perhaps only some VMs need protection.

I don't believe it's viable to offer real protection to only
a subset of VMs, principally because the kernel is doing I/O work
on behalf of the VM, so to protect just 1 VM you must fence the
kernel.

> What are your thoughts on this 3-level approach?
> 1) Qemu tries to log() + abort() (deadline)

Just abort()'ing isn't going to be a viable strategy with QEMU's move
towards a multi-process architecture. This introduces the problem that
the "main" QEMU process has to enumerate all the helpers it is dealing
with and kill them all off in some way. This is non-trivial especially
if some of the helpers are running under different privilege levels.
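
To make the privilege problem concrete, here is a small Python sketch of a
"main" process trying to SIGKILL a list of helper PIDs. The function name
kill_helpers and the injectable kill parameter are hypothetical; the point
is only that os.kill() raises PermissionError for a process the caller is
not allowed to signal, so such helpers survive.

```python
import os
import signal
from typing import Callable, Iterable, List


def kill_helpers(pids: Iterable[int],
                 kill: Callable[[int, int], None] = os.kill) -> List[int]:
    """Send SIGKILL to each helper; return the PIDs we were not allowed to signal."""
    survivors = []
    for pid in pids:
        try:
            kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            continue  # helper already exited, nothing left to fence
        except PermissionError:
            # Helper runs under a different privilege level: an unprivileged
            # main process cannot signal it, so it keeps running.
            survivors.append(pid)
    return survivors
```

Any non-empty return value here means the self-fence was incomplete, and
this sketch does not even address how the PID list is kept accurate.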

You could declare that multi-process QEMU is out of scope, but I think
QEMU self-fencing would need to offer compelling benefits over host OS
self-fencing to justify that exception. Personally I'm not seeing it.

> 2) Kernel sends SIGKILL (harddeadline)

This copes slightly better with multiple processes, in that it
isn't restricted by the privileges of the main QEMU vs the helpers,
and could perhaps take advantage of cgroups.
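
A userspace approximation of the cgroup idea, as a hedged Python sketch: it
assumes all processes belonging to one VM were placed in a single cgroup
directory, and reads that directory's cgroup.procs file to find them. The
function names and the example path are hypothetical, and this is run by a
privileged agent rather than by the kernel itself.

```python
import os
import signal
from pathlib import Path
from typing import Callable, List


def cgroup_pids(cgroup_dir: str) -> List[int]:
    """Return the PIDs listed in <cgroup_dir>/cgroup.procs (one per line)."""
    text = (Path(cgroup_dir) / "cgroup.procs").read_text()
    return [int(tok) for tok in text.split()]


def kill_cgroup(cgroup_dir: str,
                kill: Callable[[int, int], None] = os.kill) -> List[int]:
    """SIGKILL every process currently in the cgroup; return the PIDs targeted."""
    pids = cgroup_pids(cgroup_dir)
    for pid in pids:
        try:
            kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # exited between the read and the kill
    return pids
```

For example, kill_cgroup("/sys/fs/cgroup/machine/vm1") with a made-up path.
Processes can fork between the read and the kill, so a real implementation
would need to freeze the cgroup or loop until it is empty.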

> 3) HW watchdog kicks in (harderdeadline)
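
For reference, the three-level escalation in the quoted list can be written
down as a small decision function. This is only a sketch of the proposal as
quoted: the parameter names mirror "deadline", "harddeadline" and
"harderdeadline", while FenceLevel, fence_level and the notion of "seconds
since the last successful health check" are assumptions made for
illustration.

```python
from enum import Enum


class FenceLevel(Enum):
    NONE = 0            # still within the first deadline
    QEMU_ABORT = 1      # 1) QEMU tries to log() + abort()
    KERNEL_SIGKILL = 2  # 2) kernel sends SIGKILL
    HW_WATCHDOG = 3     # 3) HW watchdog kicks in


def fence_level(elapsed_s: float,
                deadline_s: float,
                harddeadline_s: float,
                harderdeadline_s: float) -> FenceLevel:
    """Return the strongest fencing level whose deadline has passed."""
    if not (deadline_s < harddeadline_s < harderdeadline_s):
        raise ValueError("deadlines must be strictly increasing")
    if elapsed_s >= harderdeadline_s:
        return FenceLevel.HW_WATCHDOG
    if elapsed_s >= harddeadline_s:
        return FenceLevel.KERNEL_SIGKILL
    if elapsed_s >= deadline_s:
        return FenceLevel.QEMU_ABORT
    return FenceLevel.NONE
```

Each level only matters if the one before it failed to act in time, which
is why the deadlines have to be strictly increasing.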

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Thread overview: 19+ messages
2019-09-30 10:30 Thoughts on VM fence infrastructure Felipe Franciosi
2019-09-30 14:29 ` Dr. David Alan Gilbert
2019-09-30 15:46   ` Felipe Franciosi
2019-09-30 16:03     ` Dr. David Alan Gilbert
2019-09-30 16:59       ` Felipe Franciosi
2019-09-30 17:11         ` Dr. David Alan Gilbert
2019-09-30 17:33           ` Felipe Franciosi
2019-09-30 17:59             ` Dr. David Alan Gilbert
2019-09-30 19:23               ` Felipe Franciosi
2019-10-01  8:23                 ` Dr. David Alan Gilbert
2019-10-01  9:56                   ` Felipe Franciosi
2019-10-01 10:05                     ` Dr. David Alan Gilbert
2019-10-01 10:31                     ` Daniel P. Berrangé
2019-10-01 10:46                       ` Felipe Franciosi
2019-10-01 11:10                         ` Daniel P. Berrangé [this message]
2019-10-01 11:38                           ` Felipe Franciosi
2019-10-01 10:49                 ` Daniel P. Berrangé
2019-09-30 19:45               ` Rafael David Tinoco
2019-09-30 20:24                 ` Felipe Franciosi
