Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command


From: Luiz Capitulino <lcapitulino@redhat.com>
To: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: amit.shah@redhat.com, jcody@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command
Date: Fri, 6 Jan 2012 17:04:39 -0200
Message-ID: <20120106170439.02292f6f@doriath>
In-Reply-To: <4F06190D.3030905@linux.vnet.ibm.com>

On Thu, 05 Jan 2012 15:41:33 -0600
Michael Roth <mdroth@linux.vnet.ibm.com> wrote:

> On 01/05/2012 02:25 PM, Luiz Capitulino wrote:
> > On Thu, 05 Jan 2012 09:10:50 -0600
> > Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> >
> >> On 01/05/2012 08:42 AM, Luiz Capitulino wrote:
> >>> On Thu, 5 Jan 2012 12:59:27 +0000
> >>> "Daniel P. Berrange" <berrange@redhat.com> wrote:
> >>>
> >>>> On Thu, Jan 05, 2012 at 10:37:14AM -0200, Luiz Capitulino wrote:
> >>>>> On Thu, 5 Jan 2012 10:16:30 +0000
> >>>>> "Daniel P. Berrange" <berrange@redhat.com> wrote:
> >>>>>
> >>>>>> On Wed, Jan 04, 2012 at 05:45:11PM -0200, Luiz Capitulino wrote:
> >>>>>>> This version drops modes 'sleep' and 'hybrid' because they don't work
> >>>>>>> properly due to issues in qemu. Only the 'hibernate' mode is supported
> >>>>>>> for now.
> >>>>>>
> >>>>>> IMHO this is short-sighted. When the bugs in QEMU are fixed so that
> >>>>>> these modes work, you have needlessly put users in the situation where
> >>>>>> they have to now upgrade the guest agent everywhere to take advantage
> >>>>>> of the bugfix.
> >>>>>
> >>>>> That was my thinking until v4. But after discussing with Michael the issues
> >>>>> we have with S3 I concluded that it doesn't make sense to offer an API for
> >>>>> something that doesn't work; this will just generate bug reports. Also,
> >>>>> updating to get new features is normal and expected.
> >>>>
> >>>> This is assuming that users will always upgrade their VMs & hosts in
> >>>> lock step, which I rather doubt they will in practice. E.g. imagine a
> >>>> deployment might have a mixture of hosts, running QEMU 1.1 (broken S3)
> >>>> and QEMU 1.2 (working S3). If they build VM disk images they will likely
> >>>> use the QEMU GA from 1.2 for all their builds, even if many of them
> >>>> will only run on QEMU 1.1 hosts. So you'll end up having 'sleep' and
> >>>> 'hybrid' commands available in the guest agent, even though the host
> >>>> QEMU doesn't work properly.
> >>>>
> >>>> So you *will* ultimately need to make sure that QEMU GA from 1.2, has
> >>>> sensible behaviour when run on a QEMU 1.1 host.  If you don't address
> >>>> this during 1.1, you may well find yourself in an un-winnable situation
> >>>> for 1.2, where it is impossible to provide good behaviour on old hosts.
> >>>>
> >>>> So IMHO we are better off in the long run, if we include all commands
> >>>> right now, even though some don't work yet, and work to ensure we have
> >>>> good error reporting behaviour for those that don't work.
> >>>
> >>> Yes, I agree. As a side note: if we add error reporting it will only work
> >>> on 1.1 and later.  I.e., the problem you describe above will still happen
> >>> with 1.0.
> >>>
> >>> But what you're suggesting seems to be the right thing to do. Do you agree
> >>> Michael?
> >>
> >> Agree, but unless we add an RPC that QEMU uses to advertise
> >> capabilities, I'm really not sure it's possible to detect whether or not
> >> the host will support it.
> >
> > You mean an RPC to advertise if 'sleep' is supported? I think this is best done
> > by making guest-suspend return an error as suggested by Daniel, otherwise a
> > client that doesn't query for capabilities might run into trouble.
> 
> Agreed, but what I mean is that if the user executes the suspend using
> an up-level agent running on a down-level 1.0 host, the agent will still
> see s3 advertised and issue the buggy suspend. That's why I suggested 
> the host->agent capabilities reporting as a possible (but somewhat ugly) 
> way to just simply tell the agent it can handle it (and, lacking that, 
> assume that it can't).

That makes sense.

> 
> >
> > There's an important detail though: we need to make qemu not advertise S3 for
> > this to work. However, we might be able to fix S3 for 1.1 (and bugs, like the
> > S4 ones, can't be detected, limiting the scope of the 'unsupported' error).
> >
> > So, we could merge all modes and commit to get S3 fixed for 1.1 :)
> 
> No disagreement there, if we can commit to making qemu-ga/qemu 1.1 
> releases interoperable in this manner (whether by fixing s3 or not 
> advertising it), I think that approach is perfectly fine, ideal even. 
> Doing a 1.1 release where qemu and qemu-ga are not interoperable (qemu 
> missing s3 support, qemu-ga using s3) was my main objection.

I see.

> But there is a 2nd topic here I'm trying to mull over: what is qemu-ga's 
> support policy for down-level hosts? backward-compatible? incompatible?

That's a good question. I think we should be backward-compatible, but
that's not going to be trivial.

> The above approach to this problem suggests the latter (qemu-ga 1.1 has 
> RPCs that will knowingly break 1.0 qemu instances). We could solve this 
> by introducing the capabilities negotiation I mentioned earlier. It
> actually wouldn't need to be anything other than qemu telling qemu-ga 
> what qemu-ga version-level it supports. By default we assume 1.0, and 
> limit qemu-ga to that until qemu-ga is told otherwise (so, no 
> sleep/hybrid suspend modes). For new RPCs we may be able to handle this 
> version automatically, since we include qemu version levels for the RPCs 
> in the schema. For functionality within an RPC (like sleep/hybrid 
> suspend modes) we could use conditional code.
> 
> If we take that approach (maintaining backward-compatibility), we'd need 
> to introduce that code in the agent now, and require qemu/libvirt 
> execute the guest-set-support-level RPC or whatever to access these 1.1 
> features.

What does guest-set-support-level do? Does it enable all post-1.1 features?

A different approach would be to add a new field in the command dict in
the schema file, say 'broken-in-qemu-version', and change qemu-ga to check
that field in its main loop before executing a command. If the qemu version
is <= 'broken-in-qemu-version', qemu-ga returns a not-supported error.

For commands like guest-suspend, which is only partially supported, we'd have
to do a manual check for the qemu version as you suggested above.

That's just an idea though, I'm not sure what the best way to do this is.
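
A rough sketch of that check, in Python rather than qemu-ga's C, purely for
illustration. It assumes the agent already knows the host qemu version (how
it learns that is the open question in this thread) and that a command is
refused when the host version is at or below the one recorded in the field.
The schema entries, the 'guest-example-broken' command, the error strings and
the helper names are all made up for the example.

```python
from typing import Optional, Tuple

Version = Tuple[int, int]

# Hypothetical schema entries. 'broken-in-qemu-version' is the proposed field;
# 'guest-example-broken' is an invented command used only to exercise it.
SCHEMA = {
    "guest-ping": {},
    "guest-example-broken": {"broken-in-qemu-version": "1.0"},
    "guest-suspend": {},
}

# guest-suspend is only partially broken, so its modes get a manual check.
# The versions here are assumptions for the example.
BROKEN_SUSPEND_MODES = {"sleep": (1, 0), "hybrid": (1, 0)}


def parse_version(text: str) -> Version:
    """Turn 'MAJOR.MINOR' into a comparable (major, minor) tuple."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def check_command(name: str, args: dict, qemu_version: Version) -> Optional[str]:
    """Return an error name if the command must be refused, else None."""
    entry = SCHEMA.get(name)
    if entry is None:
        return "CommandNotFound"

    # Generic check driven by the schema field.
    broken = entry.get("broken-in-qemu-version")
    if broken is not None and qemu_version <= parse_version(broken):
        return "Unsupported"

    # Manual, per-mode check for the partially supported command.
    if name == "guest-suspend":
        broken_mode = BROKEN_SUSPEND_MODES.get(args.get("mode"))
        if broken_mode is not None and qemu_version <= broken_mode:
            return "Unsupported"

    return None
```

With this, check_command("guest-suspend", {"mode": "hibernate"}, (1, 0)) lets
the command through, while the same call with mode "sleep" is refused on a
1.0 host and allowed on a 1.1 host.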

> 
> Technically, there's a required RPC qemu-ga clients need to execute 
> already: guest-sync. It's required because we have no way to reliably 
> detect EOF over virtio-serial, and thus an agent may send stale data to 
> a newly-connected qemu-ga client, so the client needs to do the 
> guest-sync command to find the expected response and re-sync the 
> streams. We could roll the guest-set-support-level functionality into 
> that. Basically just add another field.
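
One way to picture rolling the support level into guest-sync, as a minimal
Python sketch and not actual qemu-ga code. The 'support-level' argument, the
default of 1.0, the error layout and the Agent class are assumptions made for
illustration; only the existing behaviour of guest-sync echoing back the
client's id is taken as given.

```python
import json

# Assumed default: until told otherwise, behave like a 1.0-level agent.
DEFAULT_SUPPORT_LEVEL = (1, 0)
# Suspend modes assumed to need a 1.1-level host.
MODES_NEEDING_1_1 = {"sleep", "hybrid"}


class Agent:
    def __init__(self):
        self.support_level = DEFAULT_SUPPORT_LEVEL

    def handle(self, line):
        """Process one JSON request line and return one JSON response line."""
        request = json.loads(line)
        command = request.get("execute")
        args = request.get("arguments", {})

        if command == "guest-sync":
            # 'support-level' is the hypothetical extra field. A client that
            # omits it resets the agent to the conservative default.
            level = args.get("support-level")
            if level is None:
                self.support_level = DEFAULT_SUPPORT_LEVEL
            else:
                major, minor = level.split(".")[:2]
                self.support_level = (int(major), int(minor))
            return json.dumps({"return": args["id"]})

        if command == "guest-suspend":
            needs_1_1 = args.get("mode") in MODES_NEEDING_1_1
            if needs_1_1 and self.support_level < (1, 1):
                return json.dumps({"error": {"class": "Unsupported",
                                             "desc": "mode needs a 1.1 host"}})
            # A real agent would suspend the guest here.
            return json.dumps({"return": {}})

        return json.dumps({"error": {"class": "CommandNotFound",
                                     "desc": "unknown command"}})
```

Resetting the level on every guest-sync means a newly connected client that
knows nothing about the field gets the 1.0 behaviour, even if a previous
client had raised the level.
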
> 
> >
> >> And if we can't detect that reliably, we're
> >> better off leaving it out for now, because sleeping guests is not
> >> obscure functionality, and accidentally nuking guests when a user sleeps
> >> them (presumably because they want to retain their working state) is
> >> much worse than telling a user to upgrade their agent, or not supported
> >> or whatever.
> >>
> >>>
> >>>> As an example, if S3 is broken in current QEMU, then we should not be
> >>>> advertising S3 to the guest OS. This would allow 'pm-is-supported --suspend'
> >>>> to return false, at which point the guest agent can send back a nice error
> >>>> message 'Suspend is not supported on this host', instead of just having the
> >>>> guest try to suspend & hang or worse.
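
The guest-side check Daniel describes could look roughly like the Python
sketch below (illustrative only, not the agent's C code). It relies on
pm-is-supported exiting with status 0 when a mode is supported. The mapping
from the agent's mode names to pm-utils flags, the function names and the
exact error text are assumptions.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Assumed mapping from guest-suspend mode names to pm-is-supported flags.
PM_FLAGS = {
    "hibernate": "--hibernate",
    "sleep": "--suspend",
    "hybrid": "--suspend-hybrid",
}


def run_exit_status(argv: Sequence[str]) -> int:
    """Run a command quietly and return its exit status (127 if missing)."""
    try:
        result = subprocess.run(argv, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except OSError:
        return 127
    return result.returncode


def check_suspend_mode(mode: str,
                       run: Callable[[Sequence[str]], int] = run_exit_status
                       ) -> Optional[str]:
    """Return an error message if the mode cannot be used, else None."""
    flag = PM_FLAGS.get(mode)
    if flag is None:
        return "Unknown suspend mode '%s'" % mode
    if run(["pm-is-supported", flag]) != 0:
        return "Suspend mode '%s' is not supported on this host" % mode
    return None
```

The runner is passed in as a parameter only so the check can be exercised
without pm-utils installed. Note this only helps if qemu stops advertising S3
when it is broken, which is the point made in the quoted paragraph.
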
> >>>
> >>
> >
>

