From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4F05BC19.2010608@linux.vnet.ibm.com>
Date: Thu, 05 Jan 2012 09:04:57 -0600
From: Michael Roth
MIME-Version: 1.0
References: <1325706313-21936-1-git-send-email-lcapitulino@redhat.com>
 <20120105101630.GC31797@redhat.com>
 <20120105103714.60b0cefd@doriath>
 <20120105125927.GL31797@redhat.com>
In-Reply-To: <20120105125927.GL31797@redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command
To: "Daniel P. Berrange"
Cc: amit.shah@redhat.com, jcody@redhat.com, qemu-devel@nongnu.org,
 Luiz Capitulino

On 01/05/2012 06:59 AM, Daniel P.
Berrange wrote:
> On Thu, Jan 05, 2012 at 10:37:14AM -0200, Luiz Capitulino wrote:
>> On Thu, 5 Jan 2012 10:16:30 +0000
>> "Daniel P. Berrange" wrote:
>>
>>> On Wed, Jan 04, 2012 at 05:45:11PM -0200, Luiz Capitulino wrote:
>>>> This version drops modes 'sleep' and 'hybrid' because they don't work
>>>> properly due to issues in qemu. Only the 'hibernate' mode is supported
>>>> for now.
>>>
>>> IMHO this is short-sighted. When the bugs in QEMU are fixed so that
>>> these modes work, you have needlessly put users in the situation where
>>> they have to now upgrade the guest agent everywhere to take advantage
>>> of the bugfix.
>>
>> That was my thinking until v4. But after discussing with Michael the issues
>> we have with S3 I concluded that it doesn't make sense to offer an API to
>> something that doesn't work, this will just generate bug reports. Also,
>> updating to get new features is normal and expected.
>
> This is assuming that users will always upgrade their VMs & hosts in
> lock step, which I rather doubt they will in practice. eg imagine a
> deployment might have a mixture of hosts, running QEMU 1.1 (broken S3)
> and QEMU 1.2 (working S3). If they build VM disk images they will likely
> use the QEMU GA from 1.2 for all their builds, even if many of them
> will only run on QEMU 1.1 hosts. So you'll end up having 'sleep' and
> 'hybrid' commands available in the guest agent, even though the host
> QEMU doesn't work properly.
>
> So you *will* ultimately need to make sure that QEMU GA from 1.2, has
> sensible behaviour when run on a QEMU 1.1 host. If you don't address
> this during 1.1, you may well find yourself in an un-winnable situation
> for 1.2, where it is impossible to provide good behaviour on old hosts.
>
> So IMHO we are better off in the long run, if we include all commands
> right now, even though some don't work yet, and work to ensure we have
> good error reporting behaviour for those that don't work.
>
> As an example, if S3 is broken in current QEMU, then we should not be
> advertizing S3 to the guest OS. This would allow 'pm-is-supported --suspend'
> to return false, at which point the guest agent can send back a nice error
> message 'Suspend is not supported on this host', instead of just having the
> guest try to suspend & hang or worse.

This still requires that we're in lockstep with host QEMU (ideally that
would be the case via push-deployment of the agent, exactly because of
issues like this. Or at least, it'd make the upgrade process painless).
And outside of that, I really cannot think of any nice way to check,
from the agent, that a host has the required functionality for
{this,an} RPC. Not unless we forced a bi-directional capabilities
negotiation sequence, and I don't like the idea of injecting this kind
of data into a guest. libvirt could maybe filter the modes based on
QEMU version, but that's not the only consumer of the agent.

Really I think this is a case study for why push-deployment of agents
is the way to go. QEMU could query qemu-ga directly and generate an
'agent update available' event that users/frontends can use to prompt
an update to the latest version. Then all the upgrade inertia involved
with saving code/features for subsequent agent versions is mostly gone,
and we can "do the right thing" with regard to broken functionality :)

Unfortunately that option isn't available yet. But it just seems wrong
to introduce something we know is broken, to the extent that even those
involved with its development aren't currently capable of testing it
fully. I mean, we know what the user expectations are, and it's not
that, unfortunately for us :(

I'd be more open to it if the bug weren't so bad, but nuking your
guest's working state every time you make the mistake of hitting the
pretty "sleep" button in virt-manager or whatever is pretty bad.

>
> Daniel
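
For reference, the guest-side check Daniel describes could be sketched
roughly as below. This is only an illustration in Python, not the
agent's actual C code: the mode-to-flag mapping, the function names and
the exact error wording are assumptions made for the example. It also
only helps if the host stops advertising S3 when it is broken, which is
the lockstep problem discussed above.

```python
import subprocess

# Assumed mapping from the agent's mode names to pm-is-supported flags.
MODE_FLAGS = {
    "hibernate": "--hibernate",
    "sleep": "--suspend",
    "hybrid": "--suspend-hybrid",
}


def pm_is_supported(flag):
    """Return True if `pm-is-supported <flag>` exits with status 0."""
    try:
        result = subprocess.run(["pm-is-supported", flag], check=False)
    except OSError:
        # pm-utils is missing or not executable: report "not supported".
        return False
    return result.returncode == 0


def check_suspend_mode(mode, probe=pm_is_supported):
    """Return (ok, message) for a requested suspend mode.

    `probe` is injectable so the logic can be exercised without pm-utils.
    """
    flag = MODE_FLAGS.get(mode)
    if flag is None:
        return False, "Unknown suspend mode '%s'" % mode
    if not probe(flag):
        return False, "Suspend mode '%s' is not supported on this host" % mode
    return True, ""
```

The agent would call check_suspend_mode() before attempting to suspend
and return the message as the RPC error instead of letting the guest
try and hang.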