From: Michael Roth <mdroth@linux.vnet.ibm.com>
To: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>,
aliguori@us.ibm.com, qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH v2] qemu-ga: Add the guest-suspend command
Date: Wed, 14 Dec 2011 19:27:19 -0600 [thread overview]
Message-ID: <4EE94CF7.2070601@linux.vnet.ibm.com> (raw)
In-Reply-To: <20111214215601.5820b735@doriath>
On 12/14/2011 05:56 PM, Luiz Capitulino wrote:
> On Wed, 14 Dec 2011 15:14:40 -0600
> Michael Roth<mdroth@linux.vnet.ibm.com> wrote:
>
>> On 12/14/2011 02:56 PM, Michael Roth wrote:
>>> On 12/14/2011 02:06 PM, Luiz Capitulino wrote:
>>>> On Wed, 14 Dec 2011 13:43:23 -0600
>>>> Michael Roth<mdroth@linux.vnet.ibm.com> wrote:
>>>>
>>>>> On 12/14/2011 12:17 PM, Luiz Capitulino wrote:
>>>>>> On Wed, 14 Dec 2011 14:38:55 -0200
>>>>>> Luiz Capitulino<lcapitulino@redhat.com> wrote:
>>>>>>
>>>>>>>>> I'm also wondering if we could use g_child_watch_add(), but it's
>>>>>>>>> not clear
>>>>>>>>> to me if it works with processes not created with g_spawn_*()
>>>>>>>>> functions.
>>>>>>>>
>>>>>>>> GPid's map to something other than PIDs on Windows, so I think
>>>>>>>> we'd have
>>>>>>>> issues there. But our fork() approach probably wouldn't work at
>>>>>>>> all on
>>>>>>>> Windows except maybe under cygwin, so at some point we'd probably
>>>>>>>> want
>>>>>>>> to switch over to g_spawn for this kind of stuff anyway...
>>>>>>>>
>>>>>>>> So this might be a good point to switch over to using the glib
>>>>>>>> functions.
>>>>>>>>
>>>>>>>> Would you mind trying to do the hibernate/zombie reaping stuff using
>>>>>>>> g_spawn+g_child_watch_add()? It might end up being the easiest route.
>>>>>>>> Otherwise I can take a look at it later today.
>>>>>>>
>>>>>>> Well, there are two problems with g_spawn wrt to the manual method of
>>>>>>> writing to the sysfs file. The first one is that I'm not sure if
>>>>>>> g_spawn()
>>>>>>> reports the file not found error synchronously. The other problem
>>>>>>> is that,
>>>>>>> I'd have to fork() anyway to write to the sysfs file (unless we
>>>>>>> decide that
>>>>>>> it's ok to do this synchronously, which seems ok to me).
>>>>>>
>>>>>> The version below uses g_spawn_async(). The code is a bit simpler
>>>>>> than previous
>>>>>> versions, but there are two important details about it:
>>>>>>
>>>>>> 1. I'm letting g_spawn_async() reap the child automatically. I don't
>>>>>> know
>>>>>> how it does it though. I'd guess it uses g_child_watch_add(), worst
>>>>>> case it ignores SIGCHLD (although I think this would be awful)
>>>>>
>>>>> Weird, doesn't seem to add a watcher or do anything with SIGCHLD. You're
>>>>> not seeing an error that's causing it to reap it immediately after
>>>>> the fork?
>>>>
>>>> No. I haven't tested it much though, but the basic use case seemed to
>>>> work
>>>> fine.
>>>>
>>>>>>
>>>>>> 2. The manual method of writing to the sysfs is done synchronously.
>>>>>> This
>>>>>> means that the command response will only be sent when the guest
>>>>>> resumes
>>>>>
>>>>> Want to avoid sync blocking calls as much as possible until timeouts are
>>>>> sorted out, particularly in this case. Currently, with hibernate at
>>>>> least, QEMU will exit (or is something being done to change that
>>>>> behavior?), libvirt will restart the VM later via some other interface,
>>>>> establish a connection to a new monitor and new agent channel, then gets
>>>>> a spurious response from the agent for a command it issue to a different
>>>>> QEMU process far in the past. So the common scenario would need nasty,
>>>>> special handling in libvirt and they'd probably hate us for it.
>>>>
>>>> Problem is: even with the other approach there's no way to guarantee that
>>>> the response will arrive before the guest suspends or shuts down, as it
>>>> depends on which process runs first.
>>>
>>>> If we want to guarantee that the client will see a response before the
>>>> guest suspends, then we need some form of synchronization. Like, the
>>>> child
>>>> should wait for the parent to send the response to the client before
>>>> exec()ing (or writing to the sysfs file).
>>>>
>>>
>>> There's no way to guarantee it though, even if we remove the window of
>>
>> Rather, there's no way to guarantee the client will get a response (or
>> that, given a response, the async command will succeed).
>
> Well, it's the best we can do, ie. only allow the child to run when the
> parent is done sending the response. For the error case, we don't report
> errors in the child back to the client anyway, we do it via syslog.
>
>> But you're that we could make it so that time-out/lack of response at
>> least implies that the command *won't* be carried out...
>>
>> That might be a nice guarantee, but your call on whether you want to do
>> the signalling approach for hibernate. Otherwise, I think async will
>> work well enough in the common case, and response timeout would imply
>> neither success nor failure of the request, which I think is the norm
>> for most things.
>
> Yes, I think the signaling approach is complex and I'm not sure we really
> we have a problem here. Maybe I should do it async for now, if this turns
> out to be a problem for libvirt we can try more complex approaches.
>
Agreed, it reduces corner-cases, but still requires similar
error-handling in place, so it's a good candidate for treating as a
future optimization. Although...
>>
>>> opportunity for a child to complete a shutdown/suspend/etc request
>>> before we write out a response, there's still the potential scenario
>>> where an admin kills off/restarts qemu-ga while it's processing a request.
>>>
>>> So relying on client-side timeouts to handle these corner cases is
>>> acceptable, since being able to handle that is a requirement of a robust
>>> client, especially since we're talking to a potentially malicious
>>> guest/agent. We just need to do our best to make these situations corner
>>> cases rather than common ones.
Anthony rightly pointed out on IRC that punting the problem to a
client-side timeout still requires a reset mechanism in the case of
suspend/hibernate, since, although it won't happen every single time as
it would with a synchronous call, we can still send a spurious response
when the guest wakes up (if we slept before delivery).
So the libvirt implementation will need to do a reset every every time
they reconnect after waking up a suspended/hibernated guest. It should
also be part of the general timeout logic, since there's no guarantee
the timeout wasn't premature and a spurious response won't still arrive.
I'll get the logic for doing qemu-ga resets added to the wiki... it's
not pretty, and I was hoping to hide it away in the QMP integration
since QEMU's JSON parser is already prepped to handle the reserved 0xFF
flush token that's required...but it works at least.
>>>
>>>>>
>>>>> So probably best to stick with your previous approach for now, and look
>>>>> for a glib solution later.
>>>>>
>>>>>>
>>>>>> If you think this approach is acceptable, I'll test it more, update
>>>>>> its doc,
>>>>>> etc and post it again.
>>>>>>
>>>>>> diff --git a/qapi-schema-guest.json b/qapi-schema-guest.json
>>>>>> index 5f8a18d..63f65a6 100644
>>>>>> --- a/qapi-schema-guest.json
>>>>>> +++ b/qapi-schema-guest.json
>>>>>> @@ -219,3 +219,20 @@
>>>>>> ##
>>>>>> { 'command': 'guest-fsfreeze-thaw',
>>>>>> 'returns': 'int' }
>>>>>> +
>>>>>> +##
>>>>>> +# @guest-suspend
>>>>>> +#
>>>>>> +# Suspend guest execution by entering ACPI power state S3 or S4.
>>>>>> +#
>>>>>> +# @mode: 'hibernate' RAM content is saved in the disk and the guest is
>>>>>> +# powered down (this corresponds to ACPI S4)
>>>>>> +# 'sleep' execution is suspended but the RAM retains its contents
>>>>>> +# (this corresponds to ACPI S3)
>>>>>> +#
>>>>>> +# Notes: This is an asynchronous request. There's no guarantee it will
>>>>>> +# succeed. Errors will be logged to guest's syslog.
>>>>>> +#
>>>>>> +# Since: 1.1
>>>>>> +##
>>>>>> +{ 'command': 'guest-suspend', 'data': { 'mode': 'str' } }
>>>>>> diff --git a/qga/guest-agent-commands.c b/qga/guest-agent-commands.c
>>>>>> index a09c8ca..0c6b78e 100644
>>>>>> --- a/qga/guest-agent-commands.c
>>>>>> +++ b/qga/guest-agent-commands.c
>>>>>> @@ -574,6 +574,56 @@ int64_t qmp_guest_fsfreeze_thaw(Error **err)
>>>>>> }
>>>>>> #endif
>>>>>>
>>>>>> +#define LINUX_SYS_STATE_FILE "/sys/power/state"
>>>>>> +
>>>>>> +void qmp_guest_suspend(const char *mode, Error **err)
>>>>>> +{
>>>>>> + GError *error = NULL;
>>>>>> + gchar *argv[2];
>>>>>> +
>>>>>> + if (strcmp(mode, "hibernate") == 0) {
>>>>>> + argv[0] = (gchar *) "pm-hibernate";
>>>>>> + } else if (strcmp(mode, "sleep") == 0) {
>>>>>> + argv[0] = (gchar *) "pm-suspend";
>>>>>> + } else if (strcmp(mode, "hybrid") == 0) {
>>>>>> + argv[0] = (gchar *) "pm-hybrid";
>>>>>> + } else {
>>>>>> + error_set(err, QERR_INVALID_PARAMETER, "mode");
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + argv[1] = NULL;
>>>>>> + if (g_spawn_async(NULL, argv, NULL,G_SPAWN_SEARCH_PATH |
>>>>>> + G_SPAWN_FILE_AND_ARGV_ZERO,
>>>>>> + NULL, NULL, NULL,&error)< 0) {
>>>>>> + int fd;
>>>>>> + const char *cmd;
>>>>>> +
>>>>>> + slog("%s\n", error->message);
>>>>>> + g_error_free(error);
>>>>>> +
>>>>>> + if (strcmp(mode, "hybrid") == 0) {
>>>>>> + error_set(err, QERR_UNDEFINED_ERROR);
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + slog("trying to suspend using the manual method...\n");
>>>>>> +
>>>>>> + fd = open(LINUX_SYS_STATE_FILE, O_WRONLY);
>>>>>> + if (fd< 0) {
>>>>>> + error_set(err, QERR_OPEN_FILE_FAILED, LINUX_SYS_STATE_FILE);
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + cmd = strcmp(mode, "sleep") == 0 ? "mem" : "disk";
>>>>>> + if (write(fd, cmd, strlen(cmd))< 0) {
>>>>>> + error_set(err, QERR_IO_ERROR);
>>>>>> + }
>>>>>> +
>>>>>> + close(fd);
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> /* register init/cleanup routines for stateful command groups */
>>>>>> void ga_command_state_init(GAState *s, GACommandState *cs)
>>>>>> {
>>>>>
>>>>
>>>
>>
>
next prev parent reply other threads:[~2011-12-15 1:27 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-13 18:28 [Qemu-devel] [PATCH v2] qemu-ga: Add the guest-suspend command Luiz Capitulino
2011-12-13 20:03 ` Michael Roth
2011-12-14 13:00 ` Luiz Capitulino
2011-12-14 15:54 ` Michael Roth
2011-12-14 16:38 ` Luiz Capitulino
2011-12-14 18:06 ` Michael Roth
2011-12-14 23:44 ` Luiz Capitulino
2011-12-14 18:17 ` Luiz Capitulino
2011-12-14 19:43 ` Michael Roth
2011-12-14 20:06 ` Luiz Capitulino
2011-12-14 20:56 ` Michael Roth
2011-12-14 21:14 ` Michael Roth
2011-12-14 23:56 ` Luiz Capitulino
2011-12-15 1:27 ` Michael Roth [this message]
2011-12-13 20:27 ` Michael Roth
2011-12-14 13:07 ` Luiz Capitulino
2011-12-14 15:50 ` Michael Roth
2011-12-13 23:13 ` Daniel P. Berrange
2011-12-14 13:08 ` Luiz Capitulino
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4EE94CF7.2070601@linux.vnet.ibm.com \
--to=mdroth@linux.vnet.ibm.com \
--cc=aliguori@us.ibm.com \
--cc=amit.shah@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).