Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests

From: Michael Roth <mdroth@linux.vnet.ibm.com>
To: Luiz Capitulino <lcapitulino@redhat.com>
Cc: libvir-list@redhat.com, Michal Privoznik <mprivozn@redhat.com>,
	Eric Blake <eblake@redhat.com>,
	QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests
Date: Thu, 26 Jan 2012 16:51:31 -0600
Message-ID: <4F21D8F3.8000505@linux.vnet.ibm.com>
In-Reply-To: <20120126181355.3df32a72@doriath.home>

On 01/26/2012 02:13 PM, Luiz Capitulino wrote:
> On Thu, 26 Jan 2012 20:41:13 +0100
> Michal Privoznik <mprivozn@redhat.com> wrote:
>
>> On 26.01.2012 20:35, Luiz Capitulino wrote:
>>> On Thu, 26 Jan 2012 08:18:03 -0700
>>> Eric Blake <eblake@redhat.com> wrote:
>>>
>>>> [adding qemu-devel]
>>>>
>>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
>>>>>> One thing, that you'll probably notice is this
>>>>>> 'set-support-level' command. Basically, it tells GA what qemu version
>>>>>> it is running on. Ideally, this should be done as soon as
>>>>>> GA starts up. However, that cannot be determined from outside
>>>>>> world as GA doesn't emit any events yet.
>>>>>> Ideally^2 this command should be left out as it should be qemu
>>>>>> who tells its own agent this kind of information.
>>>>>> Anyway, I was going to call this command in qemuProcess{Startup,
>>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs
>>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*.
>>>>>>
>>>>>> So I am setting this just before 'guest-suspend' command, as
>>>>>> there is one more thing about GA. It is unable to remember anything
>>>>>> upon its restart (GA process). Which has, BTW, exposed a flaw
>>>>>> in our current code with FS freeze & thaw. If we freeze the guest
>>>>>> FS, and somebody restarts GA, a simple FS thaw will not succeed as
>>>>>> GA thinks the FS are not frozen. But that's a different cup of tea.
>>>>>>
>>>>>> Because of what is written above, we need to call set-level
>>>>>> on every suspend.
>>>>>
>>>>>
>>>>> IMHO all this says that the 'set-level' command is a conceptually
>>>>> unfixably broken design & should be killed in QEMU before it turns
>>>>> into an even bigger mess.
>>>
>>> Can you elaborate on this? Michal and I talked on irc about making the
>>> compatibility level persistent, would that help?
>>>
>>>>> Once we're in a situation where we need to call 'set-level' prior
>>>>> to every single invocation, you might as well just allow the QEMU
>>>>> version number to be passed in directly as an arg to the command
>>>>> you are running directly thus avoiding this horrificness.
>>>>
>>>> Qemu folks, would you care to chime in on this?
>>>>
>>>> Exactly how is the set-level command supposed to work?  As I understand
>>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is
>>>> being run by qemu 1.0, then we want to ensure that any guest agent
>>>> command supported by qemu-ga 1.1 but requiring features of qemu not
>>>> present in qemu 1.0 will be properly rejected.
>>>
>>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by
>>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This
>>> way the set-support-level command allows you to specify that qemu 2.0 features
>>> are supported.
>>>
>>> Note that this is only about specific features that depend on host support,
>>> like S3 suspend which is known to be buggy in current and old qemu.
>>>
>>>> But whose job is it to tell the guest agent what version of qemu is
>>>> running?  Based on the above conversation, it looks like the current
>>>> qemu implementation does not do any handshaking on its own when the
>>>> guest agent first comes alive, which means that you are forcing the work
>>>> on the management app (libvirt).  And this is inherently racy - if the
>>>> guest is allowed to restart its qemu-ga process at will, and each
>>>> restart of that guest process triggers a need to redo the handshake,
>>>> then libvirt can never reliably know what version the agent is running at.
>>>
>>> Making the set-support-level persistent would solve it, wouldn't it?
>>
>> Yes and no. We still need an event when GA comes to life. Because if
>> anybody tries to write something for a GA which is not running (and for
>> the purpose of this scenario assume it never will), like 'set-support-level',
>> and waits for an answer (which will never come), he will be blocked
>> indefinitely. However, if he writes it after the 1st event comes, everything
>> is OK.
>
> What if the event never reaches libvirt?
>
> This problem is a lot more general and is not related to the
> set-support-level command. Maybe adding shutdown & start events can serve as
> good hints, but they won't fix the problem.

Yah, start up events are a good indicator to issue the guest-sync 
sequence (we had them at one point, and planned to re-add them for QMP 
integration, and since libvirt is taking on this role for now it might 
make sense to re-add it now), but once that sequence is issued the agent
can still be manually stopped, or the guest-sync sequence itself can
time out.

And there's no way to reliably send a stop indicator. We might be able to
capture shutdown events, but not consistently enough that we can base the
protocol on them (the agent might get pkill -9'd, for instance, and
virtio-serial doesn't currently plumb a guest-side disconnect to the
chardev front-end, so you'd never know).

So, the only indication you'll ever get that your "session" ended is 
either a timeout, or, if we add it, a start up event. In either case the 
response is to issue the reset sequence.

The way it would need to work with resets is that every time a command
times out you:

1) Report the timeout error to the libvirt client/management app. Set
guest_agent_available = 0, so that further libvirt calls that depend
on it return "not currently available", or something to that effect.
2) Issue guest-sync with a new unique session id.
3) Read a JSON object/response.
   - If you time out, goto 2.
   - If the response doesn't have the session id you're expecting,
repeat 3) (it may be a response to a previous guest-sync RPC that
you timed out on prematurely, but you can't just wait indefinitely,
since the one you're expecting may never arrive).
4) Set guest_agent_available = 1 and proceed with normal operation until
the next timeout (or start up event, if we add one).
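
A rough sketch of steps 2) to 4) above, in Python, just to make the
control flow concrete. This is not libvirt or qemu-ga code. The
`channel` object (with `send(str)` and `recv(timeout)`), the
`AgentTimeout` exception, the counter-based session ids and the
`max_attempts` bound are all assumptions made for illustration; the
sequence as described retries without a bound. Only the guest-sync
request/response shape ({"execute": "guest-sync", "arguments": {"id":
N}} answered by {"return": N}) comes from the agent protocol.

```python
import itertools
import json


class AgentTimeout(Exception):
    """Hypothetical: raised by channel.recv() when no reply arrives in time."""


# Assumption: a process-local counter is "unique enough" for this sketch.
_session_ids = itertools.count(1)


def resync(channel, timeout=5.0, max_attempts=3):
    """Run the guest-sync reset sequence.

    Returns True once the agent echoes the current session id, or False
    after max_attempts timed-out attempts. The caller is expected to have
    cleared its guest_agent_available flag (step 1) before calling this,
    and to set it again when True is returned (step 4).
    """
    for _ in range(max_attempts):
        # Step 2: issue guest-sync with a new unique session id.
        session_id = next(_session_ids)
        request = {"execute": "guest-sync", "arguments": {"id": session_id}}
        channel.send(json.dumps(request))

        # Step 3: read responses until we see our id or time out.
        while True:
            try:
                reply = json.loads(channel.recv(timeout))
            except AgentTimeout:
                break  # timed out: go back to step 2 with a fresh id
            if reply.get("return") == session_id:
                return True  # step 4: agent is in sync
            # Otherwise this is a stale reply to an earlier guest-sync
            # that we gave up on; discard it and read again.
    return False
```

Each read carries its own timeout, so a stream of stale replies followed
by silence still falls back to step 2 rather than blocking forever.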


>
> IMHO, the best way to solve this is to issue the guest-sync command with
> a timeout. If you get no answer, then try again later.
>
