From: Luiz Capitulino <lcapitulino@redhat.com>
To: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: amit.shah@redhat.com, jcody@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command
Date: Fri, 6 Jan 2012 17:04:39 -0200 [thread overview]
Message-ID: <20120106170439.02292f6f@doriath> (raw)
In-Reply-To: <4F06190D.3030905@linux.vnet.ibm.com>
On Thu, 05 Jan 2012 15:41:33 -0600
Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> On 01/05/2012 02:25 PM, Luiz Capitulino wrote:
> > On Thu, 05 Jan 2012 09:10:50 -0600
> > Michael Roth<mdroth@linux.vnet.ibm.com> wrote:
> >
> >> On 01/05/2012 08:42 AM, Luiz Capitulino wrote:
> >>> On Thu, 5 Jan 2012 12:59:27 +0000
> >>> "Daniel P. Berrange"<berrange@redhat.com> wrote:
> >>>
> >>>> On Thu, Jan 05, 2012 at 10:37:14AM -0200, Luiz Capitulino wrote:
> >>>>> On Thu, 5 Jan 2012 10:16:30 +0000
> >>>>> "Daniel P. Berrange"<berrange@redhat.com> wrote:
> >>>>>
> >>>>>> On Wed, Jan 04, 2012 at 05:45:11PM -0200, Luiz Capitulino wrote:
> >>>>>>> This version drops modes 'sleep' and 'hybrid' because they don't work
> >>>>>>> properly due to issues in qemu. Only the 'hibernate' mode is supported
> >>>>>>> for now.
> >>>>>>
> >>>>>> IMHO this is short-sighted. When the bugs QEMU in are fixed so that
> >>>>>> these modes work, you have needlessly put users in the situation where
> >>>>>> they have to now upgrade the guest agent everywhere to take advantage
> >>>>>> of the bugfix.
> >>>>>
> >>>>> That was my thinking until v4. But after discussing with Michael the issues
> >>>>> we have with S3 I concluded that it doesn't make sense to offer an API to
> >>>>> something that doesn't work, this will just generate bug reports. Also,
> >>>>> updating to get new features is normal and expected.
> >>>>
> >>>> This is assuming that users will always upgrade their VMs& hosts in
> >>>> lock step, which I rather doubt they will in practice. eg imagine a
> >>>> deployment might have a mixture of hosts, running QEMU 1.1 (broken S3)
> >>>> and QEMU 1.2 (working S3). If they build VM disk images they will likely
> >>>> use the QEMU GA from 1.2 for all their builds, even if many of them
> >>>> will only run on QEMU 1.1 hosts. So you'll end up having 'sleep' and
> >>>> 'hybrid' commands available in the guest agent, even though the host
> >>>> QEMU doesn't work properly.
> >>>>
> >>>> So you *will* ultimately need to make sure that QEMU GA from 1.2, has
> >>>> sensible behaviour when run on a QEMU 1.1 host. If you don't address
> >>>> this during 1.1, you may well find yourself in an un-winnable situation
> >>>> for 1.2, where it is impossible to provide good behaviour on old hosts.
> >>>>
> >>>> So IMHO we are better off in the long run, if we include all commands
> >>>> right now, even though some don't work yet, and work to ensure we have
> >>>> good error reporting behaviour for those that don't work.
> >>>
> >>> Yes, I agree. As a side note: if we add error reporting it will only work
> >>> on 1.1 and later. Ie, the problem you describe above will still happen
> >>> with 1.0.
> >>>
> >>> But what you're suggesting seems to be the right thing to do. Do you agree
> >>> Michael?
> >>
> >> Agree, but unless we add an RPC that QEMU uses to advertise
> >> capabilities, I'm really not sure it's possible to detect whether or not
> >> the host will support it.
> >
> > You mean an RPC to advertise if 'sleep' is supported? I think this is best done
> > by making guest-suspend return an error as suggested by Daniel, otherwise a
> > client that doesn't query for capabilities might run in trouble.
>
> Agreed, but what I mean is that if the user executes the suspend using
> on up-level agent running on a down-level 1.0 host, the agent will still
> see s3 advertised and issue the buggy suspend. That's why I suggested
> the host->agent capabilities reporting as a possible (but somewhat ugly)
> way to just simply tell the agent it can handle it (and, lacking that,
> assume that it can't).
That makes sense.
>
> >
> > There's an important detail though: we need to make qemu not advertise S3 for
> > this to work. However, we might be able to fix S3 for 1.1 (and bugs, like the
> > S4 ones, can't be detected, limiting the scope of the 'unsupported' error).
> >
> > So, we could merge all modes and commit to get S3 fixed for 1.1 :)
>
> No disagreement there, if we can commit to making qemu-ga/qemu 1.1
> releases interoperable in this manner (whether by fixing s3 or not
> advertising it), I think that approach is perfectly fine, ideal even.
> Doing a 1.1 release where qemu and qemu-ga are not interoperable (qemu
> missing s3 support, qemu-ga using s3) was my main objection.
I see.
> But there is a 2nd topic here I'm trying to mull over: what is qemu-ga's
> support policy for down-level hosts? backward-compatible? incompatible?
That's a good question, I think we should be backward-compatible, but I think
that's not going to be trivial.
> The above approach to this problem suggests the latter (qemu-ga 1.1 has
> RPCs that will knowingly break 1.0 qemu instances). We could solve this
> by introducing the capabilities negotiation I mentioned early. It
> actually wouldn't need to be anything other than qemu telling qemu-ga
> what qemu-ga version-level it supports. By default we assume 1.0, and
> limit qemu-ga to that until qemu-ga is told otherwise (so, no
> sleep/hybrid suspend modes). For new RPCs we may be able to handle this
> version automatically, since we include qemu version levels for the RPCs
> in the schema. For functionality within an RPC (like sleep/hybrid
> suspend modes) we could use conditional code.
>
> If we take that approach (maintaining backward-compatibility), we'd need
> to introduce that code in the agent now, and require qemu/libvirt
> execute the guest-set-support-level RPC or whatever to access these 1.1
> features.
What does guest-set-support-level do? It enables all 1.1 post features?
A different approach would be to add a new field in the command dict in
the schema file, say 'broken-in-qemu-version', and change qemu-ga to check
that field in its main loop before executing a command. If
'broken-in-qemu-version' <= qemu version qemu-ga returns an not supported
error.
For commands like the guest-suspend which is partially supported, we'd have
to do a manual check for the qemu version as you suggested above.
That's just an idea though, I'm not sure what's the best way to do this.
>
> Technically, there's a required RPC qemu-ga clients need to execute
> already: guest-sync. It's required because we have no way to reliably
> detect EOF over virtio-serial, and thus an agent may send stale data to
> a newly-connected qemu-ga client, so the client needs to do the
> guest-sync command to find the expected response and re-sync the
> streams. We could roll the guest-set-support-level functionality into
> that. Basically just add another field.
>
> >
> >> And if we can't detect that reliably, we're
> >> better off leaving it out for now, because sleeping guests is not
> >> obscure functionality, and accidentally nuking guests when a user sleeps
> >> them (presumably because they want to retain their working state) is
> >> much worse than telling a user to upgrade their agent, or not supported
> >> or whatever.
> >>
> >>>
> >>>> As an example, if S3 is broken in current QEMU, then we should not be
> >>>> advertizing S3 to the guest OS. This would allow 'pm-is-supported --suspend'
> >>>> to return false, at which point the guest agent can send back a nice error
> >>>> message 'Suspend is not supported on this host', instead of just having the
> >>>> guest try to suspend& hang or worse.
> >>>
> >>
> >
>
next prev parent reply other threads:[~2012-01-06 19:04 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-01-04 19:45 [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command Luiz Capitulino
2012-01-04 19:45 ` [Qemu-devel] [PATCH 1/2] qemu-ga: set O_NONBLOCK for serial channels Luiz Capitulino
2012-01-04 19:55 ` Michael Roth
2012-01-04 19:45 ` [Qemu-devel] [PATCH 2/2] qemu-ga: Add the guest-suspend command Luiz Capitulino
2012-01-04 20:00 ` Michael Roth
2012-01-04 20:03 ` Eric Blake
2012-01-05 12:29 ` Luiz Capitulino
2012-01-05 12:46 ` Daniel P. Berrange
2012-01-05 12:58 ` Luiz Capitulino
2012-01-05 10:16 ` [Qemu-devel] [PATCH v4 0/2]: " Daniel P. Berrange
2012-01-05 12:37 ` Luiz Capitulino
2012-01-05 12:59 ` Daniel P. Berrange
2012-01-05 14:42 ` Luiz Capitulino
2012-01-05 15:10 ` Michael Roth
2012-01-05 20:25 ` Luiz Capitulino
2012-01-05 21:41 ` Michael Roth
2012-01-06 19:04 ` Luiz Capitulino [this message]
2012-01-06 21:03 ` Michael Roth
2012-01-05 15:04 ` Michael Roth
2012-01-05 15:11 ` Daniel P. Berrange
2012-01-05 15:18 ` Michael Roth
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120106170439.02292f6f@doriath \
--to=lcapitulino@redhat.com \
--cc=amit.shah@redhat.com \
--cc=jcody@redhat.com \
--cc=mdroth@linux.vnet.ibm.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).