From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:49629)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Rip6D-0000W3-6u for qemu-devel@nongnu.org;
	Thu, 05 Jan 2012 10:19:26 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1Rip68-00066r-JY for qemu-devel@nongnu.org;
	Thu, 05 Jan 2012 10:19:25 -0500
Received: from e35.co.us.ibm.com ([32.97.110.153]:39069)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Rip68-00066n-8y for qemu-devel@nongnu.org;
	Thu, 05 Jan 2012 10:19:20 -0500
Received: from /spool/local by e35.co.us.ibm.com
	with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted
	for from ; Thu, 5 Jan 2012 08:19:18 -0700
Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169])
	by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0)
	with ESMTP id q05FIm4a094256 for ; Thu, 5 Jan 2012 08:18:48 -0700
Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1])
	by d03av03.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout)
	with ESMTP id q05FIi5p023108 for ; Thu, 5 Jan 2012 08:18:44 -0700
Message-ID: <4F05BF50.1010908@linux.vnet.ibm.com>
Date: Thu, 05 Jan 2012 09:18:40 -0600
From: Michael Roth
MIME-Version: 1.0
References: <1325706313-21936-1-git-send-email-lcapitulino@redhat.com>
	<20120105101630.GC31797@redhat.com>
	<20120105103714.60b0cefd@doriath>
	<20120105125927.GL31797@redhat.com>
	<4F05BC19.2010608@linux.vnet.ibm.com>
	<20120105151113.GA7466@redhat.com>
In-Reply-To: <20120105151113.GA7466@redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command
To: "Daniel P. Berrange"
Cc: amit.shah@redhat.com, jcody@redhat.com, qemu-devel@nongnu.org,
	Luiz Capitulino

On 01/05/2012 09:11 AM, Daniel P.
Berrange wrote:
> On Thu, Jan 05, 2012 at 09:04:57AM -0600, Michael Roth wrote:
>> On 01/05/2012 06:59 AM, Daniel P. Berrange wrote:
>>> On Thu, Jan 05, 2012 at 10:37:14AM -0200, Luiz Capitulino wrote:
>>>> On Thu, 5 Jan 2012 10:16:30 +0000
>>>> "Daniel P. Berrange" wrote:
>>>>
>>>>> On Wed, Jan 04, 2012 at 05:45:11PM -0200, Luiz Capitulino wrote:
>>>>>> This version drops modes 'sleep' and 'hybrid' because they don't work
>>>>>> properly due to issues in qemu. Only the 'hibernate' mode is supported
>>>>>> for now.
>>>>>
>>>>> IMHO this is short-sighted. When the bugs in QEMU are fixed so that
>>>>> these modes work, you have needlessly put users in the situation where
>>>>> they have to now upgrade the guest agent everywhere to take advantage
>>>>> of the bugfix.
>>>>
>>>> That was my thinking until v4. But after discussing with Michael the issues
>>>> we have with S3 I concluded that it doesn't make sense to offer an API to
>>>> something that doesn't work, this will just generate bug reports. Also,
>>>> updating to get new features is normal and expected.
>>>
>>> This is assuming that users will always upgrade their VMs & hosts in
>>> lock step, which I rather doubt they will in practice. eg imagine a
>>> deployment might have a mixture of hosts, running QEMU 1.1 (broken S3)
>>> and QEMU 1.2 (working S3). If they build VM disk images they will likely
>>> use the QEMU GA from 1.2 for all their builds, even if many of them
>>> will only run on QEMU 1.1 hosts. So you'll end up having 'sleep' and
>>> 'hybrid' commands available in the guest agent, even though the host
>>> QEMU doesn't work properly.
>>>
>>> So you *will* ultimately need to make sure that QEMU GA from 1.2, has
>>> sensible behaviour when run on a QEMU 1.1 host. If you don't address
>>> this during 1.1, you may well find yourself in an un-winnable situation
>>> for 1.2, where it is impossible to provide good behaviour on old hosts.
>>>
>>> So IMHO we are better off in the long run, if we include all commands
>>> right now, even though some don't work yet, and work to ensure we have
>>> good error reporting behaviour for those that don't work.
>>>
>>> As an example, if S3 is broken in current QEMU, then we should not be
>>> advertising S3 to the guest OS. This would allow 'pm-is-supported --suspend'
>>> to return false, at which point the guest agent can send back a nice error
>>> message 'Suspend is not supported on this host', instead of just having the
>>> guest try to suspend & hang or worse.
>>
>> This still requires we're lockstep with host QEMU (ideally that
>> would be the case via push-deployment of the agent, exactly because
>> of issues like this. Or at least, it'd make the upgrade process
>> painless). And outside of that, I really cannot think of any nice
>> way to check, from the agent, that a host has required functionality
>> for {this,an} RPC. Not unless we forced a bi-directional
>> capabilities negotiation sequence, and I don't like the idea of
>> injecting this kind of data into a guest. libvirt could maybe filter
>> the modes based on QEMU version, but that's not the only consumer of
>> the agent.
>
> Err, the scenario I just described does not require lockstep
> upgrade. Newer QEMU GA agent should be able to run on historical
> QEMU hosts just fine. I'm also not trying to suggest we need a

Bad terminology on my part. What I mean is that if qemu-ga's error
reporting requires a newer QEMU, we will still execute the sleep on
buggy hosts unless the host level is adequate.

> general bi-directional capabilities negotiation here either.
> The key is that in this particular case, QEMU should only
> expose S3 to the guest if it is actually capable of working.
> Then, the pm-is-supported command will 'just work'. No
> host<->guest agent negotiation is required.
>
> Daniel
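
For reference, the guest-side check Daniel describes (ask pm-is-supported
before suspending, and report an error instead of hanging) could be sketched
roughly as below. This is only an illustration in Python, not the qemu-ga
implementation: the function name check_suspend_mode, the mapping of the
'hibernate'/'sleep'/'hybrid' modes to pm-is-supported flags, the injectable
run parameter, and the error strings are all assumptions made for the example.

```python
import subprocess

# Assumed mapping from the guest-suspend modes discussed in this thread to
# pm-is-supported (pm-utils) flags; not taken from the actual patch.
PM_FLAGS = {
    "hibernate": "--hibernate",
    "sleep": "--suspend",
    "hybrid": "--suspend-hybrid",
}


def check_suspend_mode(mode, run=subprocess.call):
    """Return None if the guest reports `mode` as supported, else an error string.

    `run` is a hypothetical hook taking an argv list and returning an exit
    status; it defaults to subprocess.call and exists so the check can be
    exercised without pm-utils installed.
    """
    flag = PM_FLAGS.get(mode)
    if flag is None:
        return "Unknown suspend mode '%s'" % mode
    try:
        # pm-is-supported exits with status 0 when the mode is supported.
        status = run(["pm-is-supported", flag])
    except OSError:
        # pm-is-supported is missing or not executable in this guest.
        return "Cannot determine whether '%s' is supported" % mode
    if status != 0:
        return "Suspend mode '%s' is not supported on this host" % mode
    return None
```

With a check like this the agent only reports what the guest OS itself
believes, so it depends on the host exposing S3 only when S3 works, which is
the point made above. It does not detect a buggy host that still advertises S3.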