From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Felipe Franciosi <felipe@nutanix.com>
Cc: Aditya Ramesh <aramesh@nutanix.com>, qemu-devel <qemu-devel@nongnu.org>
Subject: Re: Thoughts on VM fence infrastructure
Date: Mon, 30 Sep 2019 18:59:14 +0100
Message-ID: <20190930175914.GM2759@work-vm>
In-Reply-To: <CA2CBDDF-99ED-4693-8622-89D4F2E71DE9@nutanix.com>
* Felipe Franciosi (felipe@nutanix.com) wrote:
>
>
> > On Sep 30, 2019, at 6:11 PM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >
> > * Felipe Franciosi (felipe@nutanix.com) wrote:
> >>
> >>
> >>> On Sep 30, 2019, at 5:03 PM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >>>
> >>> * Felipe Franciosi (felipe@nutanix.com) wrote:
> >>>> Hi David,
> >>>>
> >>>>> On Sep 30, 2019, at 3:29 PM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >>>>>
> >>>>> * Felipe Franciosi (felipe@nutanix.com) wrote:
> >>>>>> Heyall,
> >>>>>>
> >>>>>> We have a use case where a host should self-fence (and all VMs should
> >>>>>> die) if it doesn't hear back from a heartbeat within a certain time
> >>>>>> period. Lots of ideas were floated around where libvirt could take
> >>>>>> care of killing VMs or a separate service could do it. The concern
> >>>>>> with those is that various failures could lead to _those_ services
> >>>>>> being unavailable and the fencing wouldn't be enforced as it should.
> >>>>>>
> >>>>>> Ultimately, it feels like Qemu should be responsible for this
> >>>>>> heartbeat and exit (or execute a custom callback) on timeout.
> >>>>>
> >>>>> It doesn't feel like doing it inside qemu would be any safer; something
> >>>>> outside QEMU can forcibly emit a kill -9 and qemu *will* stop.
> >>>>
> >>>> The argument above is that we would have to rely on this external
> >>>> service being functional. Consider the case where the host is
> >>>> dysfunctional, with this service perhaps crashed and a corrupt
> >>>> filesystem preventing it from restarting. The VMs would never die.
> >>>
> >>> Yeh that could fail.
> >>>
> >>>> It feels like a Qemu timer-driven heartbeat check that calls abort() /
> >>>> exit() would be more reliable. Thoughts?
> >>>
> >>> OK, yes; perhaps using a timer_create and telling it to send a fatal
> >>> signal is pretty solid; it would take the kernel to do that once it's
> >>> set.
> >>
> >> I'm confused about why the kernel needs to be involved. If this is a
> >> timer off the Qemu main loop, it can just check on the heartbeat
> >> condition (which should be customisable) and call abort() if that's
> >> not satisfied. If you agree on that I'd like to talk about how that
> >> check could be made customisable.
> >
> > There are times when the main loop can get blocked even though the CPU
> > threads can be running and can in some configurations perform IO
> > even without the main loop (I think!).
>
> Ah, that's a very good point. Indeed, you can perform IO in those
> cases, especially when using vhost devices.
>
> > By setting a timer in the kernel that sends a signal to qemu, the kernel
> > will send that signal however broken qemu is.
>
> Got you now. That's probably better. Do you reckon a signal is
> preferable over SIGEV_THREAD?
Not sure; the safest option is probably getting the kernel to SIGKILL
it, but that's a complete nightmare to debug: your process just goes
*pop* with no apparent reason why.

I've not used SIGEV_THREAD - it looks promising though.
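
A minimal sketch of the two options discussed here, using the POSIX
timer API. It is not QEMU code; the function names (fence_arm,
fence_timer_signal, fence_timer_thread, fence_expired) are made up for
illustration. The first variant asks the kernel to deliver a signal on
expiry, the second uses SIGEV_THREAD so that a reason can be logged
before aborting.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: (re)arm a one-shot timer to fire in `secs` seconds.
 * Note that an it_value of zero disarms the timer instead of firing it. */
int fence_arm(timer_t t, time_t secs)
{
    struct itimerspec its;

    memset(&its, 0, sizeof(its));
    its.it_value.tv_sec = secs;          /* it_interval stays 0: one-shot */
    return timer_settime(t, 0, &its, NULL);
}

/* Variant 1: the kernel sends `signo` to the process on expiry.
 * With a fatal signal, no userspace code has to run for the process to die. */
int fence_timer_signal(timer_t *t, int signo, time_t secs)
{
    struct sigevent sev;

    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = signo;
    if (timer_create(CLOCK_MONOTONIC, &sev, t) != 0) {
        return -1;
    }
    if (fence_arm(*t, secs) != 0) {
        timer_delete(*t);
        return -1;
    }
    return 0;
}

/* Expiry callback for variant 2: leave a trace, then die. */
static void fence_expired(union sigval sv)
{
    (void)sv;
    fprintf(stderr, "fence: heartbeat timeout, aborting\n");
    abort();
}

/* Variant 2: run fence_expired() in a new thread on expiry. */
int fence_timer_thread(timer_t *t, time_t secs)
{
    struct sigevent sev;

    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_THREAD;
    sev.sigev_notify_function = fence_expired;
    if (timer_create(CLOCK_MONOTONIC, &sev, t) != 0) {
        return -1;
    }
    if (fence_arm(*t, secs) != 0) {
        timer_delete(*t);
        return -1;
    }
    return 0;
}
```

Something like fence_timer_signal(&t, SIGKILL, 30) would give the
"goes *pop*" behaviour. The SIGEV_THREAD variant is easier to debug, but
the callback still has to run inside the process, so it presumably
relies on the process being healthy enough to start and schedule that
thread. On older glibc this needs -lrt (and -pthread) at link time.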
> I'm still wondering how to make this customisable so that different
> types of heartbeat could be implemented (preferably without creating
> external dependencies per discussion above). Thoughts welcome.
Yes, you need something to enable it, and some safe way to retrigger
the timer. A QMP command marked as 'oob' might be the right way, since
another QMP command can't block it.
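
As a rough sketch of the "retrigger" half only: whatever receives the
heartbeat (for example an out-of-band QMP handler, whose wiring is not
shown and is not an existing QEMU command) could push the deadline out
with a single timer_settime() call. The name fence_timer_kick is
hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>
#include <time.h>

/* Hypothetical heartbeat hook: push the fence deadline `timeout_secs`
 * seconds into the future. Re-arming an already armed timer replaces the
 * old expiry, so each heartbeat restarts the countdown. */
int fence_timer_kick(timer_t t, time_t timeout_secs)
{
    struct itimerspec its;

    if (timeout_secs <= 0) {
        /* A zero it_value would disarm the timer, silently disabling
         * the fence, so refuse it here. */
        return -1;
    }
    memset(&its, 0, sizeof(its));
    its.it_value.tv_sec = timeout_secs;  /* relative, one-shot */
    return timer_settime(t, 0, &its, NULL);
}
```

The call does no allocation and takes no locks of its own, which is the
kind of property a handler that must not be blocked would want; whether
that is sufficient for QEMU's oob rules is an assumption here.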
Dave
> F.
>
> >
> >>
> >>>
> >>> IMHO the safer way is to kick the host off the network by reprogramming
> >>> switches; so even if the qemu is actually alive it can't get anywhere.
> >>>
> >>> Dave
> >>
> >> Naturally some off-host STONITH is preferable, but that's not always
> >> available. A self-fencing mechanism right at the heart of the emulator
> >> can do the job without external hardware dependencies.
> >
> > Dave
> >
> >> Cheers,
> >> Felipe
> >>
> >>>
> >>>
> >>>> Felipe
> >>>>
> >>>>>
> >>>>>> Does something already exist for this purpose which could be used?
> >>>>>> Would a generic Qemu-fencing infrastructure be something of interest?
> >>>>> Dave
> >>>>>
> >>>>>
> >>>>>> Cheers,
> >>>>>> F.
> >>>>>>
> >>>>> --
> >>>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >>>>
> >>> --
> >>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >>
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK