From: Michael Roth <mdroth@linux.vnet.ibm.com>
To: Luiz Capitulino <lcapitulino@redhat.com>
Cc: libvir-list@redhat.com, Michal Privoznik <mprivozn@redhat.com>,
Eric Blake <eblake@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests
Date: Thu, 26 Jan 2012 16:51:31 -0600
Message-ID: <4F21D8F3.8000505@linux.vnet.ibm.com>
In-Reply-To: <20120126181355.3df32a72@doriath.home>

On 01/26/2012 02:13 PM, Luiz Capitulino wrote:
> On Thu, 26 Jan 2012 20:41:13 +0100
> Michal Privoznik <mprivozn@redhat.com> wrote:
>
>> On 26.01.2012 20:35, Luiz Capitulino wrote:
>>> On Thu, 26 Jan 2012 08:18:03 -0700
>>> Eric Blake <eblake@redhat.com> wrote:
>>>
>>>> [adding qemu-devel]
>>>>
>>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
>>>>>> One thing that you'll probably notice is this
>>>>>> 'set-support-level' command. Basically, it tells GA what qemu version
>>>>>> it is running on. Ideally, this should be done as soon as
>>>>>> GA starts up. However, that cannot be determined from outside
>>>>>> world as GA doesn't emit any events yet.
>>>>>> Ideally^2 this command should be left out as it should be qemu
>>>>>> who tells its own agent this kind of information.
>>>>>> Anyway, I was going to call this command in qemuProcess{Startup,
>>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs
>>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*.
>>>>>>
>>>>>> So I am setting this just before 'guest-suspend' command, as
>>>>>> there is one more thing about GA. It is unable to remember anything
>>>>>> upon its restart (GA process). Which, BTW, has shown a flaw
>>>>>> in our current code with FS freeze & thaw. If we freeze the guest
>>>>>> FS and somebody restarts GA, a simple FS thaw will not succeed, as
>>>>>> GA thinks the FS are not frozen. But that's a different cup of tea.
>>>>>>
>>>>>> Because of what is written above, we need to call set-level
>>>>>> on every suspend.
>>>>>
>>>>>
>>>>> IMHO all this says that the 'set-level' command is a conceptually
>>>>> unfixably broken design & should be killed in QEMU before it turns
>>>>> into an even bigger mess.
>>>
>>> Can you elaborate on this? Michal and I talked on irc about making the
>>> compatibility level persistent, would that help?
>>>
>>>>> Once we're in a situation where we need to call 'set-level' prior
>>>>> to every single invocation, you might as well just allow the QEMU
>>>>> version number to be passed in directly as an arg to the command
>>>>> you are running directly thus avoiding this horrificness.
>>>>
>>>> Qemu folks, would you care to chime in on this?
>>>>
>>>> Exactly how is the set-level command supposed to work? As I understand
>>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is
>>>> being run by qemu 1.0, then we want to ensure that any guest agent
>>>> command supported by qemu-ga 1.1 but requiring features of qemu not
>>>> present in qemu 1.0 will be properly rejected.
>>>
>>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by
>>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This
>>> way the set-support-level command allows you to specify that qemu 2.0 features
>>> are supported.
>>>
>>> Note that this is only about specific features that depend on host support,
>>> like S3 suspend which is known to be buggy in current and old qemu.
>>>
>>>> But whose job is it to tell the guest agent what version of qemu is
>>>> running? Based on the above conversation, it looks like the current
>>>> qemu implementation does not do any handshaking on its own when the
>>>> guest agent first comes alive, which means that you are forcing the work
>>>> on the management app (libvirt). And this is inherently racy - if the
>>>> guest is allowed to restart its qemu-ga process at will, and each
>>>> restart of that guest process triggers a need to redo the handshake,
>>>> then libvirt can never reliably know what version the agent is running at.
>>>
>>> Making the set-support-level persistent would solve it, wouldn't it?
>>
>> Yes and no. We still need an event when GA comes to life. Because if
>> anybody tries to write something for GA which is not running (and for
>> the purpose of this scenario assume it never will), like 'set-support-level',
>> and waits for an answer (which will never come), he will be blocked
>> indefinitely. However, if he writes it after the 1st event comes, everything
>> is OK.
>
> What if the event never reaches libvirt?
>
> This problem is a lot more general and is not related to the
> set-support-level command. Maybe adding shutdown & start events can serve as
> good hints, but they won't fix the problem.

Yeah, start-up events are a good indicator for when to issue the guest-sync
sequence (we had them at one point and planned to re-add them for QMP
integration; since libvirt is taking on this role for now, it might
make sense to re-add them now). But once that sequence is issued, the agent
can still be manually stopped, or the guest-sync sequence itself can
time out.

And there's no way to reliably send a stop indicator. We could maybe capture
shutdown events, but not consistently enough to base the
protocol on them (the agent might get pkill -9'd, for instance, and
virtio-serial doesn't currently plumb a guest-side disconnect through to the
chardev front-end, so you'd never know).

So the only indication you'll ever get that your "session" ended is
either a timeout or, if we add it, a start-up event. In either case the
response is to issue the reset sequence.

The way it would need to work with resets is that every time a command times
out, you:

1) report the timeout error to the libvirt client/management app and set
   guest_agent_available = 0, so that further libvirt calls that depend
   on it return "not currently available", or something to that effect.
2) issue guest-sync with a new unique session id.
3) read a JSON object/response.
   - if you time out, go to 2)
   - if the response doesn't have the session id you're expecting,
     repeat 3) (it may be a response to a previous guest-sync RPC that
     you timed out on prematurely, but you can't just wait indefinitely,
     since it may never arrive)
4) set guest_agent_available = 1 and proceed with normal operation until the
   next timeout (or start-up event, if we add one).
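
To make the reset sequence above concrete, here is a rough Python sketch of
steps 1) through 4). It is only an illustration, not libvirt or qemu code:
the AgentSession class, the send/recv callables and the max_attempts bound
are all hypothetical names invented for this example. It assumes recv()
returns one JSON line, or None when the read times out, and that guest-sync
echoes the id back in the "return" field of its reply. The sequence as
described retries indefinitely; the attempt limit here is an added
assumption so the sketch terminates.

```python
import itertools
import json


class AgentSession:
    """Hypothetical helper tracking whether the guest agent is reachable."""

    def __init__(self, send, recv, max_attempts=5):
        self.send = send                  # assumed: callable taking one JSON string
        self.recv = recv                  # assumed: returns a string, or None on timeout
        self.max_attempts = max_attempts  # assumption: the described loop is unbounded
        self.available = False            # plays the role of guest_agent_available
        self._ids = itertools.count(1)    # a real client would likely use random ids

    def resync(self):
        # Step 1: mark the agent unavailable so dependent calls can fail fast.
        self.available = False
        for _ in range(self.max_attempts):
            # Step 2: issue guest-sync with a new unique session id.
            session_id = next(self._ids)
            request = {"execute": "guest-sync", "arguments": {"id": session_id}}
            self.send(json.dumps(request))
            # Step 3: read responses until we see our id or hit a timeout.
            while True:
                raw = self.recv()
                if raw is None:
                    break  # timed out: go back to step 2 with a fresh id
                try:
                    reply = json.loads(raw)
                except ValueError:
                    continue  # garbage on the channel: keep reading
                if isinstance(reply, dict) and reply.get("return") == session_id:
                    # Step 4: the session is established again.
                    self.available = True
                    return True
                # Otherwise it is a stale reply to an earlier request: repeat step 3.
        return False
```

A caller would run resync() whenever a command times out (or a start-up
event arrives, if one is added) and refuse agent-dependent requests while
available is False.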
>
> IMHO, the best way to solve this is to issue the guest-sync command with
> a timeout. If you get no answer, then try again later.
>