From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:40142) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RrsTo-0004Vl-UJ for qemu-devel@nongnu.org; Mon, 30 Jan 2012 09:45:14 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1RrsTk-0005M7-B5 for qemu-devel@nongnu.org; Mon, 30 Jan 2012 09:45:12 -0500 Received: from mx1.redhat.com ([209.132.183.28]:46224) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RrsTk-0005LU-3O for qemu-devel@nongnu.org; Mon, 30 Jan 2012 09:45:08 -0500 Date: Mon, 30 Jan 2012 12:44:53 -0200 From: Luiz Capitulino Message-ID: <20120130124453.35fc1894@doriath> In-Reply-To: <4F26A130.10000@codemonkey.ws> References: <20120126144632.GM21211@redhat.com> <4F216EAB.2000707@redhat.com> <20120126173533.0079dc98@doriath.home> <4F21DA3D.9020203@codemonkey.ws> <20120130105747.1cba8bbb@doriath> <4F26A130.10000@codemonkey.ws> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Anthony Liguori Cc: libvir-list@redhat.com, Michal Privoznik , Eric Blake , QEMU Developers , mdroth@linux.vnet.ibm.com On Mon, 30 Jan 2012 07:54:56 -0600 Anthony Liguori wrote: > On 01/30/2012 06:57 AM, Luiz Capitulino wrote: > > On Thu, 26 Jan 2012 16:57:01 -0600 > > Anthony Liguori wrote: > > > >> On 01/26/2012 01:35 PM, Luiz Capitulino wrote: > >>> On Thu, 26 Jan 2012 08:18:03 -0700 > >>> Eric Blake wrote: > >>> > >>>> [adding qemu-devel] > >>>> > >>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: > >>>>>> One thing, that you'll probably notice is this > >>>>>> 'set-support-level' command. Basically, it tells GA what qemu version > >>>>>> is it running on. Ideally, this should be done as soon as > >>>>>> GA starts up. However, that cannot be determined from outside > >>>>>> world as GA doesn't emit any events yet. > >>>>>> Ideally^2 this command should be left out as it should be qemu > >>>>>> who tells its own agent this kind of information. > >>>>>> Anyway, I was going to call this command in qemuProcess{Startup, > >>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs > >>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*. > >>>>>> > >>>>>> So I am setting this just before 'guest-suspend' command, as > >>>>>> there is one more thing about GA. It is unable to remember anything > >>>>>> upon its restart (GA process). Which has BTW show flaw > >>>>>> in our current code with FS freeze& thaw. If we freeze guest > >>>>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as > >>>>>> GA thinks FS are not frozen. But that's a different cup of tea. > >>>>>> > >>>>>> Because of what written above, we need to call set-level > >>>>>> on every suspend. > >>>>> > >>>>> > >>>>> IMHO all this says that the 'set-level' command is a conceptually > >>>>> unfixably broken design& should be killed in QEMU before it turns > >>>>> into an even bigger mess. > >>> > >>> Can you elaborate on this? Michal and I talked on irc about making the > >>> compatibility level persistent, would that help? > >>> > >>>>> Once we're in a situation where we need to call 'set-level' prior > >>>>> to every single invocation, you might as well just allow the QEMU > >>>>> version number to be passed in directly as an arg to the command > >>>>> you are running directly thus avoiding this horrificness. > >>>> > >>>> Qemu folks, would you care to chime in on this? > >>>> > >>>> Exactly how is the set-level command supposed to work? As I understand > >>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is > >>>> being run by qemu 1.0, then we want to ensure that any guest agent > >>>> command supported by qemu-ga 1.1 but requiring features of qemu not > >>>> present in qemu 1.0 will be properly rejected. > >>> > >>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by > >>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This > >>> way the set-support-level command allows you to specify that qemu 2.0 features > >>> are supported. > >> > >> Version numbers are meaningless. What happens when a bunch of features get > >> backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of > >> 2.0? > >> > >> The feature negotiation mechanism we have in QMP is the existence of a command. > >> If we're in a position where we're trying to disable part of a command, it > >> simply means that we should have multiple commands such that we can just remove > >> the disabled part entirely. > > > > You may have a point that we shouldn't be using the version number for that, > > but just switching to multiple commands doesn't solve the fundamental problem. > > > > The fundamental problem is that, S3 in current (and old) qemu has two known bugs: > > > > 1. The screen is left black after S3 (it's a bug in seabios) > > 2. QEMU resumes the guest immediately (Gerd posted patches to address this) > > > > We're going to address both issues in 1.1. However, if qemu-ga is installed in > > an old qemu and S3 is used, the bugs will be triggered. > > It's a management tool problem. > > Before a management tool issues a command, it should query the existence of the > command to determine whether this version of QEMU has that capability. If the > tool needs to use two commands, it should query the existence of both of them. > > In this case, the management tool needs a qemu-ga command *and* a QEMU command > (to resume from suspend) so it should query both of them. > > Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 > worked in QEMU as expected. That's right, it's a coincidence, but I don't see why this wouldn't work. I think we should do the following then: 1. Drop the set-support-level command 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, guest-suspend-disk 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing the guest-suspend-ram command Michal, Michael, do you agree? > Alternatively, if there really was no reason to have a resume-from-suspend > command, this would be the point where we would add a capabilities command > adding the "working-s3" capability. > > But with capabilities, this is a direct QEMU->management tool interaction, not a > proxy through the guest agent. > > We shouldn't trust the guest agent and we certainly don't want to rely on the > guest agent to avoid sending an improper command to QEMU! That would be a > security issue. Fair enough, although that makes me wonder if the planned feature of pulling qemu-ga's commands into the QMP namespace won't have similar security issues. > > > > > We need a way for qemu-ga to query qemu about the existence of a working S3 > > support. The set-support-level solves that. > > qemu-ga is not an entry point for QEMU features. It's strictly a mechanism to > ask the guest to do something. If we need to interact with QEMU directly to > query a capability and/or presence of a command, then we should talk to QEMU > directly. > > To put it another way, a management tool MUST deal with the fact that when > issuing the suspend-to-ram command, a guest may ignore it or attempt to do > something malicious. > > Regards, > > Anthony Liguori > > > > Another option would be to disable (or enable) S3 by default in qemu-ga, and let > > the admin enable (or disable it) according to S3 support being fixed in qemu. > > >