From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:52460) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RavtK-0002G5-61 for qemu-devel@nongnu.org; Wed, 14 Dec 2011 15:57:31 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1RavtI-0005LW-DM for qemu-devel@nongnu.org; Wed, 14 Dec 2011 15:57:30 -0500 Received: from e7.ny.us.ibm.com ([32.97.182.137]:38849) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RavtI-0005L6-2R for qemu-devel@nongnu.org; Wed, 14 Dec 2011 15:57:28 -0500 Received: from /spool/local by e7.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 14 Dec 2011 15:57:26 -0500 Received: from d01av02.pok.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by d01relay06.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id pBEKuZpU3055730 for ; Wed, 14 Dec 2011 15:56:35 -0500 Received: from d01av02.pok.ibm.com (loopback [127.0.0.1]) by d01av02.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id pBEKuZ4o024435 for ; Wed, 14 Dec 2011 18:56:35 -0200 Message-ID: <4EE90D81.1030506@linux.vnet.ibm.com> Date: Wed, 14 Dec 2011 14:56:33 -0600 From: Michael Roth MIME-Version: 1.0 References: <20111213162850.4cd135a3@doriath> <4EE7AF7C.2060302@linux.vnet.ibm.com> <20111214110029.4549bf90@doriath> <4EE8C6B5.1010406@linux.vnet.ibm.com> <20111214143855.5652b0cb@doriath> <20111214161751.543e0b02@doriath> <4EE8FC5B.9040807@linux.vnet.ibm.com> <20111214180615.21e47786@doriath> In-Reply-To: <20111214180615.21e47786@doriath> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v2] qemu-ga: Add the guest-suspend command List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Luiz Capitulino Cc: Amit Shah , aliguori@us.ibm.com, qemu-devel On 12/14/2011 02:06 PM, Luiz Capitulino wrote: > On Wed, 14 Dec 2011 13:43:23 -0600 > Michael Roth wrote: > >> On 12/14/2011 12:17 PM, Luiz Capitulino wrote: >>> On Wed, 14 Dec 2011 14:38:55 -0200 >>> Luiz Capitulino wrote: >>> >>>>>> I'm also wondering if we could use g_child_watch_add(), but it's not clear >>>>>> to me if it works with processes not created with g_spawn_*() functions. >>>>> >>>>> GPid's map to something other than PIDs on Windows, so I think we'd have >>>>> issues there. But our fork() approach probably wouldn't work at all on >>>>> Windows except maybe under cygwin, so at some point we'd probably want >>>>> to switch over to g_spawn for this kind of stuff anyway... >>>>> >>>>> So this might be a good point to switch over to using the glib functions. >>>>> >>>>> Would you mind trying to do the hibernate/zombie reaping stuff using >>>>> g_spawn+g_child_watch_add()? It might end up being the easiest route. >>>>> Otherwise I can take a look at it later today. >>>> >>>> Well, there are two problems with g_spawn wrt to the manual method of >>>> writing to the sysfs file. The first one is that I'm not sure if g_spawn() >>>> reports the file not found error synchronously. The other problem is that, >>>> I'd have to fork() anyway to write to the sysfs file (unless we decide that >>>> it's ok to do this synchronously, which seems ok to me). >>> >>> The version below uses g_spawn_async(). The code is a bit simpler than previous >>> versions, but there are two important details about it: >>> >>> 1. I'm letting g_spawn_async() reap the child automatically. I don't know >>> how it does it though. I'd guess it uses g_child_watch_add(), worst >>> case it ignores SIGCHLD (although I think this would be awful) >> >> Weird, doesn't seem to add a watcher or do anything with SIGCHLD. You're >> not seeing an error that's causing it to reap it immediately after the fork? > > No. I haven't tested it much though, but the basic use case seemed to work > fine. > >>> >>> 2. The manual method of writing to the sysfs is done synchronously. This >>> means that the command response will only be sent when the guest resumes >> >> Want to avoid sync blocking calls as much as possible until timeouts are >> sorted out, particularly in this case. Currently, with hibernate at >> least, QEMU will exit (or is something being done to change that >> behavior?), libvirt will restart the VM later via some other interface, >> establish a connection to a new monitor and new agent channel, then gets >> a spurious response from the agent for a command it issue to a different >> QEMU process far in the past. So the common scenario would need nasty, >> special handling in libvirt and they'd probably hate us for it. > > Problem is: even with the other approach there's no way to guarantee that > the response will arrive before the guest suspends or shuts down, as it > depends on which process runs first. > If we want to guarantee that the client will see a response before the > guest suspends, then we need some form of synchronization. Like, the child > should wait for the parent to send the response to the client before > exec()ing (or writing to the sysfs file). > There's no way to guarantee it though, even if we remove the window of opportunity for a child to complete a shutdown/suspend/etc request before we write out a response, there's still the potential scenario where an admin kills off/restarts qemu-ga while it's processing a request. So relying on client-side timeouts to handle these corner cases is acceptable, since being able to handle that is a requirement of a robust client, especially since we're talking to a potentially malicious guest/agent. We just need to do our best to make these situations corner cases rather than common ones. >> >> So probably best to stick with your previous approach for now, and look >> for a glib solution later. >> >>> >>> If you think this approach is acceptable, I'll test it more, update its doc, >>> etc and post it again. >>> >>> diff --git a/qapi-schema-guest.json b/qapi-schema-guest.json >>> index 5f8a18d..63f65a6 100644 >>> --- a/qapi-schema-guest.json >>> +++ b/qapi-schema-guest.json >>> @@ -219,3 +219,20 @@ >>> ## >>> { 'command': 'guest-fsfreeze-thaw', >>> 'returns': 'int' } >>> + >>> +## >>> +# @guest-suspend >>> +# >>> +# Suspend guest execution by entering ACPI power state S3 or S4. >>> +# >>> +# @mode: 'hibernate' RAM content is saved in the disk and the guest is >>> +# powered down (this corresponds to ACPI S4) >>> +# 'sleep' execution is suspended but the RAM retains its contents >>> +# (this corresponds to ACPI S3) >>> +# >>> +# Notes: This is an asynchronous request. There's no guarantee it will >>> +# succeed. Errors will be logged to guest's syslog. >>> +# >>> +# Since: 1.1 >>> +## >>> +{ 'command': 'guest-suspend', 'data': { 'mode': 'str' } } >>> diff --git a/qga/guest-agent-commands.c b/qga/guest-agent-commands.c >>> index a09c8ca..0c6b78e 100644 >>> --- a/qga/guest-agent-commands.c >>> +++ b/qga/guest-agent-commands.c >>> @@ -574,6 +574,56 @@ int64_t qmp_guest_fsfreeze_thaw(Error **err) >>> } >>> #endif >>> >>> +#define LINUX_SYS_STATE_FILE "/sys/power/state" >>> + >>> +void qmp_guest_suspend(const char *mode, Error **err) >>> +{ >>> + GError *error = NULL; >>> + gchar *argv[2]; >>> + >>> + if (strcmp(mode, "hibernate") == 0) { >>> + argv[0] = (gchar *) "pm-hibernate"; >>> + } else if (strcmp(mode, "sleep") == 0) { >>> + argv[0] = (gchar *) "pm-suspend"; >>> + } else if (strcmp(mode, "hybrid") == 0) { >>> + argv[0] = (gchar *) "pm-hybrid"; >>> + } else { >>> + error_set(err, QERR_INVALID_PARAMETER, "mode"); >>> + return; >>> + } >>> + >>> + argv[1] = NULL; >>> + if (g_spawn_async(NULL, argv, NULL,G_SPAWN_SEARCH_PATH | >>> + G_SPAWN_FILE_AND_ARGV_ZERO, >>> + NULL, NULL, NULL,&error)< 0) { >>> + int fd; >>> + const char *cmd; >>> + >>> + slog("%s\n", error->message); >>> + g_error_free(error); >>> + >>> + if (strcmp(mode, "hybrid") == 0) { >>> + error_set(err, QERR_UNDEFINED_ERROR); >>> + return; >>> + } >>> + >>> + slog("trying to suspend using the manual method...\n"); >>> + >>> + fd = open(LINUX_SYS_STATE_FILE, O_WRONLY); >>> + if (fd< 0) { >>> + error_set(err, QERR_OPEN_FILE_FAILED, LINUX_SYS_STATE_FILE); >>> + return; >>> + } >>> + >>> + cmd = strcmp(mode, "sleep") == 0 ? "mem" : "disk"; >>> + if (write(fd, cmd, strlen(cmd))< 0) { >>> + error_set(err, QERR_IO_ERROR); >>> + } >>> + >>> + close(fd); >>> + } >>> +} >>> + >>> /* register init/cleanup routines for stateful command groups */ >>> void ga_command_state_init(GAState *s, GACommandState *cs) >>> { >> >