From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:55018) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RrtPI-00059B-AF for qemu-devel@nongnu.org; Mon, 30 Jan 2012 10:44:40 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1RrtP9-00072J-Vx for qemu-devel@nongnu.org; Mon, 30 Jan 2012 10:44:36 -0500 Received: from e1.ny.us.ibm.com ([32.97.182.141]:45637) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RrtP9-000723-Me for qemu-devel@nongnu.org; Mon, 30 Jan 2012 10:44:27 -0500 Received: from /spool/local by e1.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 30 Jan 2012 10:44:23 -0500 Received: from d01relay03.pok.ibm.com (d01relay03.pok.ibm.com [9.56.227.235]) by d01dlp02.pok.ibm.com (Postfix) with ESMTP id A0F0E6E80BA for ; Mon, 30 Jan 2012 10:44:00 -0500 (EST) Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d01relay03.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q0UFht41260628 for ; Mon, 30 Jan 2012 10:43:57 -0500 Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q0UFhs2c028777 for ; Mon, 30 Jan 2012 08:43:54 -0700 Message-ID: <4F26BAAE.8060101@linux.vnet.ibm.com> Date: Mon, 30 Jan 2012 09:43:42 -0600 From: Michael Roth MIME-Version: 1.0 References: <20120126144632.GM21211@redhat.com> <4F216EAB.2000707@redhat.com> <20120126173533.0079dc98@doriath.home> <4F21DA3D.9020203@codemonkey.ws> <20120130105747.1cba8bbb@doriath> <4F26A130.10000@codemonkey.ws> <20120130124453.35fc1894@doriath> In-Reply-To: <20120130124453.35fc1894@doriath> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Luiz Capitulino Cc: libvir-list@redhat.com, Michal Privoznik , Eric Blake , QEMU Developers On 01/30/2012 08:44 AM, Luiz Capitulino wrote: > On Mon, 30 Jan 2012 07:54:56 -0600 > Anthony Liguori wrote: > >> On 01/30/2012 06:57 AM, Luiz Capitulino wrote: >>> On Thu, 26 Jan 2012 16:57:01 -0600 >>> Anthony Liguori wrote: >>> >>>> On 01/26/2012 01:35 PM, Luiz Capitulino wrote: >>>>> On Thu, 26 Jan 2012 08:18:03 -0700 >>>>> Eric Blake wrote: >>>>> >>>>>> [adding qemu-devel] >>>>>> >>>>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >>>>>>>> One thing, that you'll probably notice is this >>>>>>>> 'set-support-level' command. Basically, it tells GA what qemu version >>>>>>>> is it running on. Ideally, this should be done as soon as >>>>>>>> GA starts up. However, that cannot be determined from outside >>>>>>>> world as GA doesn't emit any events yet. >>>>>>>> Ideally^2 this command should be left out as it should be qemu >>>>>>>> who tells its own agent this kind of information. >>>>>>>> Anyway, I was going to call this command in qemuProcess{Startup, >>>>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >>>>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*. >>>>>>>> >>>>>>>> So I am setting this just before 'guest-suspend' command, as >>>>>>>> there is one more thing about GA. It is unable to remember anything >>>>>>>> upon its restart (GA process). Which has BTW show flaw >>>>>>>> in our current code with FS freeze& thaw. If we freeze guest >>>>>>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as >>>>>>>> GA thinks FS are not frozen. But that's a different cup of tea. >>>>>>>> >>>>>>>> Because of what written above, we need to call set-level >>>>>>>> on every suspend. >>>>>>> >>>>>>> >>>>>>> IMHO all this says that the 'set-level' command is a conceptually >>>>>>> unfixably broken design& should be killed in QEMU before it turns >>>>>>> into an even bigger mess. >>>>> >>>>> Can you elaborate on this? Michal and I talked on irc about making the >>>>> compatibility level persistent, would that help? >>>>> >>>>>>> Once we're in a situation where we need to call 'set-level' prior >>>>>>> to every single invocation, you might as well just allow the QEMU >>>>>>> version number to be passed in directly as an arg to the command >>>>>>> you are running directly thus avoiding this horrificness. >>>>>> >>>>>> Qemu folks, would you care to chime in on this? >>>>>> >>>>>> Exactly how is the set-level command supposed to work? As I understand >>>>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is >>>>>> being run by qemu 1.0, then we want to ensure that any guest agent >>>>>> command supported by qemu-ga 1.1 but requiring features of qemu not >>>>>> present in qemu 1.0 will be properly rejected. >>>>> >>>>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by >>>>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This >>>>> way the set-support-level command allows you to specify that qemu 2.0 features >>>>> are supported. >>>> >>>> Version numbers are meaningless. What happens when a bunch of features get >>>> backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of >>>> 2.0? >>>> >>>> The feature negotiation mechanism we have in QMP is the existence of a command. >>>> If we're in a position where we're trying to disable part of a command, it >>>> simply means that we should have multiple commands such that we can just remove >>>> the disabled part entirely. >>> >>> You may have a point that we shouldn't be using the version number for that, >>> but just switching to multiple commands doesn't solve the fundamental problem. >>> >>> The fundamental problem is that, S3 in current (and old) qemu has two known bugs: >>> >>> 1. The screen is left black after S3 (it's a bug in seabios) >>> 2. QEMU resumes the guest immediately (Gerd posted patches to address this) >>> >>> We're going to address both issues in 1.1. However, if qemu-ga is installed in >>> an old qemu and S3 is used, the bugs will be triggered. >> >> It's a management tool problem. >> >> Before a management tool issues a command, it should query the existence of the >> command to determine whether this version of QEMU has that capability. If the >> tool needs to use two commands, it should query the existence of both of them. >> >> In this case, the management tool needs a qemu-ga command *and* a QEMU command >> (to resume from suspend) so it should query both of them. >> >> Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 >> worked in QEMU as expected. > > That's right, it's a coincidence, but I don't see why this wouldn't work. > > I think we should do the following then: > > 1. Drop the set-support-level command > 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, > guest-suspend-disk > 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing > the guest-suspend-ram command > > Michal, Michael, do you agree? Yup yup, looks sound to me. > >> Alternatively, if there really was no reason to have a resume-from-suspend >> command, this would be the point where we would add a capabilities command >> adding the "working-s3" capability. >> >> But with capabilities, this is a direct QEMU->management tool interaction, not a >> proxy through the guest agent. >> >> We shouldn't trust the guest agent and we certainly don't want to rely on the >> guest agent to avoid sending an improper command to QEMU! That would be a >> security issue. > > Fair enough, although that makes me wonder if the planned feature of pulling > qemu-ga's commands into the QMP namespace won't have similar security issues. > >> >>> >>> We need a way for qemu-ga to query qemu about the existence of a working S3 >>> support. The set-support-level solves that. >> >> qemu-ga is not an entry point for QEMU features. It's strictly a mechanism to >> ask the guest to do something. If we need to interact with QEMU directly to >> query a capability and/or presence of a command, then we should talk to QEMU >> directly. >> >> To put it another way, a management tool MUST deal with the fact that when >> issuing the suspend-to-ram command, a guest may ignore it or attempt to do >> something malicious. >> >> Regards, >> >> Anthony Liguori >> >> >>> Another option would be to disable (or enable) S3 by default in qemu-ga, and let >>> the admin enable (or disable it) according to S3 support being fixed in qemu. >>> >> > >