From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=52237 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RqVi8-0002GT-GT for qemu-devel@nongnu.org; Thu, 26 Jan 2012 15:14:21 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1RqVi2-0001bE-O0 for qemu-devel@nongnu.org; Thu, 26 Jan 2012 15:14:17 -0500 Received: from mx1.redhat.com ([209.132.183.28]:9216) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RqVi2-0001b3-Fs for qemu-devel@nongnu.org; Thu, 26 Jan 2012 15:14:14 -0500 Date: Thu, 26 Jan 2012 18:13:55 -0200 From: Luiz Capitulino Message-ID: <20120126181355.3df32a72@doriath.home> In-Reply-To: <4F21AC59.3000303@redhat.com> References: <20120126144632.GM21211@redhat.com> <4F216EAB.2000707@redhat.com> <20120126173533.0079dc98@doriath.home> <4F21AC59.3000303@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Michal Privoznik Cc: libvir-list@redhat.com, Eric Blake , QEMU Developers , mdroth@linux.vnet.ibm.com On Thu, 26 Jan 2012 20:41:13 +0100 Michal Privoznik wrote: > On 26.01.2012 20:35, Luiz Capitulino wrote: > > On Thu, 26 Jan 2012 08:18:03 -0700 > > Eric Blake wrote: > > > >> [adding qemu-devel] > >> > >> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: > >>>> One thing, that you'll probably notice is this > >>>> 'set-support-level' command. Basically, it tells GA what qemu version > >>>> is it running on. Ideally, this should be done as soon as > >>>> GA starts up. However, that cannot be determined from outside > >>>> world as GA doesn't emit any events yet. > >>>> Ideally^2 this command should be left out as it should be qemu > >>>> who tells its own agent this kind of information. > >>>> Anyway, I was going to call this command in qemuProcess{Startup, > >>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs > >>>> so guest can boot and start GA, but that implies returning from qemuProcess*. > >>>> > >>>> So I am setting this just before 'guest-suspend' command, as > >>>> there is one more thing about GA. It is unable to remember anything > >>>> upon its restart (GA process). Which has BTW show flaw > >>>> in our current code with FS freeze & thaw. If we freeze guest > >>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as > >>>> GA thinks FS are not frozen. But that's a different cup of tea. > >>>> > >>>> Because of what written above, we need to call set-level > >>>> on every suspend. > >>> > >>> > >>> IMHO all this says that the 'set-level' command is a conceptually > >>> unfixably broken design & should be killed in QEMU before it turns > >>> into an even bigger mess. > > > > Can you elaborate on this? Michal and I talked on irc about making the > > compatibility level persistent, would that help? > > > >>> Once we're in a situation where we need to call 'set-level' prior > >>> to every single invocation, you might as well just allow the QEMU > >>> version number to be passed in directly as an arg to the command > >>> you are running directly thus avoiding this horrificness. > >> > >> Qemu folks, would you care to chime in on this? > >> > >> Exactly how is the set-level command supposed to work? As I understand > >> it, the goal is that if the guest has qemu-ga 1.1 installed, but is > >> being run by qemu 1.0, then we want to ensure that any guest agent > >> command supported by qemu-ga 1.1 but requiring features of qemu not > >> present in qemu 1.0 will be properly rejected. > > > > Not exactly, the default support of qemu-ga is qemu 1.0. This means that by > > default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This > > way the set-support-level command allows you to specify that qemu 2.0 features > > are supported. > > > > Note that this is only about specific features that depend on host support, > > like S3 suspend which is known to be buggy in current and old qemu. > > > >> But whose job is it to tell the guest agent what version of qemu is > >> running? Based on the above conversation, it looks like the current > >> qemu implementation does not do any handshaking on its own when the > >> guest agent first comes alive, which means that you are forcing the work > >> on the management app (libvirt). And this is inherently racy - if the > >> guest is allowed to restart its qemu-ga process at will, and each > >> restart of that guest process triggers a need to redo the handshake, > >> then libvirt can never reliably know what version the agent is running at. > > > > Making the set-support-level persistent would solve it, wouldn't it? > > Yes and no. We still need an event when GA come to live. Because if > anybody tries to write something for GA which is not running (and for > purpose of this scenario assume it never will), like 'set-support-level' > and wait for answer (which will never come) he will be blocked > indefinitely. However, if he writes it after 1st event come, everything > is OK. What if the event never reach libvirt? This problem is a lot more general and is not related to the set-support-level command. Maybe adding shutdown & start events can serve as good hints, but they won't fix the problem. IMHO, the best way to solve this is to issue the guest-sync command with a timeout. If you get no answer, then try again later.