From: Michael Roth <mdroth@linux.vnet.ibm.com>
To: Luiz Capitulino <lcapitulino@redhat.com>
Cc: libvir-list@redhat.com, Michal Privoznik <mprivozn@redhat.com>,
Eric Blake <eblake@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests
Date: Thu, 26 Jan 2012 16:51:31 -0600 [thread overview]
Message-ID: <4F21D8F3.8000505@linux.vnet.ibm.com> (raw)
In-Reply-To: <20120126181355.3df32a72@doriath.home>
On 01/26/2012 02:13 PM, Luiz Capitulino wrote:
> On Thu, 26 Jan 2012 20:41:13 +0100
> Michal Privoznik<mprivozn@redhat.com> wrote:
>
>> On 26.01.2012 20:35, Luiz Capitulino wrote:
>>> On Thu, 26 Jan 2012 08:18:03 -0700
>>> Eric Blake<eblake@redhat.com> wrote:
>>>
>>>> [adding qemu-devel]
>>>>
>>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
>>>>>> One thing, that you'll probably notice is this
>>>>>> 'set-support-level' command. Basically, it tells GA what qemu version
>>>>>> is it running on. Ideally, this should be done as soon as
>>>>>> GA starts up. However, that cannot be determined from outside
>>>>>> world as GA doesn't emit any events yet.
>>>>>> Ideally^2 this command should be left out as it should be qemu
>>>>>> who tells its own agent this kind of information.
>>>>>> Anyway, I was going to call this command in qemuProcess{Startup,
>>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs
>>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*.
>>>>>>
>>>>>> So I am setting this just before 'guest-suspend' command, as
>>>>>> there is one more thing about GA. It is unable to remember anything
>>>>>> upon its restart (GA process). Which has BTW show flaw
>>>>>> in our current code with FS freeze& thaw. If we freeze guest
>>>>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as
>>>>>> GA thinks FS are not frozen. But that's a different cup of tea.
>>>>>>
>>>>>> Because of what written above, we need to call set-level
>>>>>> on every suspend.
>>>>>
>>>>>
>>>>> IMHO all this says that the 'set-level' command is a conceptually
>>>>> unfixably broken design& should be killed in QEMU before it turns
>>>>> into an even bigger mess.
>>>
>>> Can you elaborate on this? Michal and I talked on irc about making the
>>> compatibility level persistent, would that help?
>>>
>>>>> Once we're in a situation where we need to call 'set-level' prior
>>>>> to every single invocation, you might as well just allow the QEMU
>>>>> version number to be passed in directly as an arg to the command
>>>>> you are running directly thus avoiding this horrificness.
>>>>
>>>> Qemu folks, would you care to chime in on this?
>>>>
>>>> Exactly how is the set-level command supposed to work? As I understand
>>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is
>>>> being run by qemu 1.0, then we want to ensure that any guest agent
>>>> command supported by qemu-ga 1.1 but requiring features of qemu not
>>>> present in qemu 1.0 will be properly rejected.
>>>
>>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by
>>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This
>>> way the set-support-level command allows you to specify that qemu 2.0 features
>>> are supported.
>>>
>>> Note that this is only about specific features that depend on host support,
>>> like S3 suspend which is known to be buggy in current and old qemu.
>>>
>>>> But whose job is it to tell the guest agent what version of qemu is
>>>> running? Based on the above conversation, it looks like the current
>>>> qemu implementation does not do any handshaking on its own when the
>>>> guest agent first comes alive, which means that you are forcing the work
>>>> on the management app (libvirt). And this is inherently racy - if the
>>>> guest is allowed to restart its qemu-ga process at will, and each
>>>> restart of that guest process triggers a need to redo the handshake,
>>>> then libvirt can never reliably know what version the agent is running at.
>>>
>>> Making the set-support-level persistent would solve it, wouldn't it?
>>
>> Yes and no. We still need an event when GA come to live. Because if
>> anybody tries to write something for GA which is not running (and for
>> purpose of this scenario assume it never will), like 'set-support-level'
>> and wait for answer (which will never come) he will be blocked
>> indefinitely. However, if he writes it after 1st event come, everything
>> is OK.
>
> What if the event never reach libvirt?
>
> This problem is a lot more general and is not related to the
> set-support-level command. Maybe adding shutdown& start events can serve as
> good hints, but they won't fix the problem.
Yah, start up events are a good indicator to issue the guest-sync
sequence (we had them at one point, and planned to re-add them for QMP
integration, and since libvirt is taking on this role for now it might
make sense to re-add it now), but once that sequence is issued the agent
can still be manually stopped, or the guest-sync sequence itself can
timeout.
And there's no way to reliably send a stop indicator, maybe to capture
shutdown events, but not consistently enough that we can base the
protocol on it (agent might get pkill -9'd for instance, and
virtio-serial doesn't currently plumb a guest-side disconnect to the
chardev front-end, so you'd never know).
So, the only indication you'll ever get that your "session" ended is
either a timeout, or, if we add it, a start up event. In either case the
response is to issue the reset sequence.
The way it would need to work with resets is everytime a command times
out you:
1) report the timeout error to libvirt client/management app. set
guest-agent_available = 0, such that further libvirt calls that depend
on it would return "not currently available", or something to that effect.
2) issue guest-sync with new unique session id
3) read a json object/response.
- if you time out, goto 2
- if your response doesn't have the session id you're expecting,
repeat 3) (since it may be a response to a previous guest-sync RPC that
you timed out on prematurely, but you can't just wait indefinitely,
since it may never arrive)
4) set guest_agent_available = 1, proceed with normal operation till the
next timeout (or start up event, if we add one).
>
> IMHO, the best way to solve this is to issue the guest-sync command with
> a timeout. If you get no answer, then try again later.
>
next prev parent reply other threads:[~2012-01-26 22:52 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <cover.1327585806.git.mprivozn@redhat.com>
[not found] ` <20120126144632.GM21211@redhat.com>
2012-01-26 15:18 ` [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests Eric Blake
2012-01-26 19:35 ` Luiz Capitulino
2012-01-26 19:41 ` Michal Privoznik
2012-01-26 20:13 ` Luiz Capitulino
2012-01-26 22:51 ` Michael Roth [this message]
2012-01-26 22:57 ` Anthony Liguori
2012-01-30 12:57 ` Luiz Capitulino
2012-01-30 13:54 ` Anthony Liguori
2012-01-30 14:44 ` Luiz Capitulino
2012-01-30 15:43 ` Michael Roth
2012-01-30 15:58 ` Eric Blake
2012-01-30 17:07 ` Michael Roth
2012-01-30 18:30 ` Luiz Capitulino
2012-01-30 16:08 ` Michal Privoznik
2012-01-30 18:36 ` Luiz Capitulino
2012-01-30 15:03 ` Michael Roth
2012-01-26 22:54 ` Anthony Liguori
2012-01-27 0:01 ` Michael Roth
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F21D8F3.8000505@linux.vnet.ibm.com \
--to=mdroth@linux.vnet.ibm.com \
--cc=eblake@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=libvir-list@redhat.com \
--cc=mprivozn@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.