From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:49066) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RawAN-00006p-OM for qemu-devel@nongnu.org; Wed, 14 Dec 2011 16:15:09 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1RawAM-0000zi-20 for qemu-devel@nongnu.org; Wed, 14 Dec 2011 16:15:07 -0500 Received: from e8.ny.us.ibm.com ([32.97.182.138]:39687) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RawAL-0000xg-Ul for qemu-devel@nongnu.org; Wed, 14 Dec 2011 16:15:06 -0500 Received: from /spool/local by e8.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 14 Dec 2011 16:15:01 -0500 Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by d01relay07.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id pBELEflY1495242 for ; Wed, 14 Dec 2011 16:14:43 -0500 Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id pBELEfoM008910 for ; Wed, 14 Dec 2011 14:14:41 -0700 Message-ID: <4EE911C0.2000007@linux.vnet.ibm.com> Date: Wed, 14 Dec 2011 15:14:40 -0600 From: Michael Roth MIME-Version: 1.0 References: <20111213162850.4cd135a3@doriath> <4EE7AF7C.2060302@linux.vnet.ibm.com> <20111214110029.4549bf90@doriath> <4EE8C6B5.1010406@linux.vnet.ibm.com> <20111214143855.5652b0cb@doriath> <20111214161751.543e0b02@doriath> <4EE8FC5B.9040807@linux.vnet.ibm.com> <20111214180615.21e47786@doriath> <4EE90D81.1030506@linux.vnet.ibm.com> In-Reply-To: <4EE90D81.1030506@linux.vnet.ibm.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v2] qemu-ga: Add the guest-suspend command List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Luiz Capitulino Cc: Amit Shah , aliguori@us.ibm.com, qemu-devel On 12/14/2011 02:56 PM, Michael Roth wrote: > On 12/14/2011 02:06 PM, Luiz Capitulino wrote: >> On Wed, 14 Dec 2011 13:43:23 -0600 >> Michael Roth wrote: >> >>> On 12/14/2011 12:17 PM, Luiz Capitulino wrote: >>>> On Wed, 14 Dec 2011 14:38:55 -0200 >>>> Luiz Capitulino wrote: >>>> >>>>>>> I'm also wondering if we could use g_child_watch_add(), but it's >>>>>>> not clear >>>>>>> to me if it works with processes not created with g_spawn_*() >>>>>>> functions. >>>>>> >>>>>> GPid's map to something other than PIDs on Windows, so I think >>>>>> we'd have >>>>>> issues there. But our fork() approach probably wouldn't work at >>>>>> all on >>>>>> Windows except maybe under cygwin, so at some point we'd probably >>>>>> want >>>>>> to switch over to g_spawn for this kind of stuff anyway... >>>>>> >>>>>> So this might be a good point to switch over to using the glib >>>>>> functions. >>>>>> >>>>>> Would you mind trying to do the hibernate/zombie reaping stuff using >>>>>> g_spawn+g_child_watch_add()? It might end up being the easiest route. >>>>>> Otherwise I can take a look at it later today. >>>>> >>>>> Well, there are two problems with g_spawn wrt to the manual method of >>>>> writing to the sysfs file. The first one is that I'm not sure if >>>>> g_spawn() >>>>> reports the file not found error synchronously. The other problem >>>>> is that, >>>>> I'd have to fork() anyway to write to the sysfs file (unless we >>>>> decide that >>>>> it's ok to do this synchronously, which seems ok to me). >>>> >>>> The version below uses g_spawn_async(). The code is a bit simpler >>>> than previous >>>> versions, but there are two important details about it: >>>> >>>> 1. I'm letting g_spawn_async() reap the child automatically. I don't >>>> know >>>> how it does it though. I'd guess it uses g_child_watch_add(), worst >>>> case it ignores SIGCHLD (although I think this would be awful) >>> >>> Weird, doesn't seem to add a watcher or do anything with SIGCHLD. You're >>> not seeing an error that's causing it to reap it immediately after >>> the fork? >> >> No. I haven't tested it much though, but the basic use case seemed to >> work >> fine. >> >>>> >>>> 2. The manual method of writing to the sysfs is done synchronously. >>>> This >>>> means that the command response will only be sent when the guest >>>> resumes >>> >>> Want to avoid sync blocking calls as much as possible until timeouts are >>> sorted out, particularly in this case. Currently, with hibernate at >>> least, QEMU will exit (or is something being done to change that >>> behavior?), libvirt will restart the VM later via some other interface, >>> establish a connection to a new monitor and new agent channel, then gets >>> a spurious response from the agent for a command it issue to a different >>> QEMU process far in the past. So the common scenario would need nasty, >>> special handling in libvirt and they'd probably hate us for it. >> >> Problem is: even with the other approach there's no way to guarantee that >> the response will arrive before the guest suspends or shuts down, as it >> depends on which process runs first. > >> If we want to guarantee that the client will see a response before the >> guest suspends, then we need some form of synchronization. Like, the >> child >> should wait for the parent to send the response to the client before >> exec()ing (or writing to the sysfs file). >> > > There's no way to guarantee it though, even if we remove the window of Rather, there's no way to guarantee the client will get a response (or that, given a response, the async command will succeed). But you're that we could make it so that time-out/lack of response at least implies that the command *won't* be carried out... That might be a nice guarantee, but your call on whether you want to do the signalling approach for hibernate. Otherwise, I think async will work well enough in the common case, and response timeout would imply neither success nor failure of the request, which I think is the norm for most things. > opportunity for a child to complete a shutdown/suspend/etc request > before we write out a response, there's still the potential scenario > where an admin kills off/restarts qemu-ga while it's processing a request. > > So relying on client-side timeouts to handle these corner cases is > acceptable, since being able to handle that is a requirement of a robust > client, especially since we're talking to a potentially malicious > guest/agent. We just need to do our best to make these situations corner > cases rather than common ones. > >>> >>> So probably best to stick with your previous approach for now, and look >>> for a glib solution later. >>> >>>> >>>> If you think this approach is acceptable, I'll test it more, update >>>> its doc, >>>> etc and post it again. >>>> >>>> diff --git a/qapi-schema-guest.json b/qapi-schema-guest.json >>>> index 5f8a18d..63f65a6 100644 >>>> --- a/qapi-schema-guest.json >>>> +++ b/qapi-schema-guest.json >>>> @@ -219,3 +219,20 @@ >>>> ## >>>> { 'command': 'guest-fsfreeze-thaw', >>>> 'returns': 'int' } >>>> + >>>> +## >>>> +# @guest-suspend >>>> +# >>>> +# Suspend guest execution by entering ACPI power state S3 or S4. >>>> +# >>>> +# @mode: 'hibernate' RAM content is saved in the disk and the guest is >>>> +# powered down (this corresponds to ACPI S4) >>>> +# 'sleep' execution is suspended but the RAM retains its contents >>>> +# (this corresponds to ACPI S3) >>>> +# >>>> +# Notes: This is an asynchronous request. There's no guarantee it will >>>> +# succeed. Errors will be logged to guest's syslog. >>>> +# >>>> +# Since: 1.1 >>>> +## >>>> +{ 'command': 'guest-suspend', 'data': { 'mode': 'str' } } >>>> diff --git a/qga/guest-agent-commands.c b/qga/guest-agent-commands.c >>>> index a09c8ca..0c6b78e 100644 >>>> --- a/qga/guest-agent-commands.c >>>> +++ b/qga/guest-agent-commands.c >>>> @@ -574,6 +574,56 @@ int64_t qmp_guest_fsfreeze_thaw(Error **err) >>>> } >>>> #endif >>>> >>>> +#define LINUX_SYS_STATE_FILE "/sys/power/state" >>>> + >>>> +void qmp_guest_suspend(const char *mode, Error **err) >>>> +{ >>>> + GError *error = NULL; >>>> + gchar *argv[2]; >>>> + >>>> + if (strcmp(mode, "hibernate") == 0) { >>>> + argv[0] = (gchar *) "pm-hibernate"; >>>> + } else if (strcmp(mode, "sleep") == 0) { >>>> + argv[0] = (gchar *) "pm-suspend"; >>>> + } else if (strcmp(mode, "hybrid") == 0) { >>>> + argv[0] = (gchar *) "pm-hybrid"; >>>> + } else { >>>> + error_set(err, QERR_INVALID_PARAMETER, "mode"); >>>> + return; >>>> + } >>>> + >>>> + argv[1] = NULL; >>>> + if (g_spawn_async(NULL, argv, NULL,G_SPAWN_SEARCH_PATH | >>>> + G_SPAWN_FILE_AND_ARGV_ZERO, >>>> + NULL, NULL, NULL,&error)< 0) { >>>> + int fd; >>>> + const char *cmd; >>>> + >>>> + slog("%s\n", error->message); >>>> + g_error_free(error); >>>> + >>>> + if (strcmp(mode, "hybrid") == 0) { >>>> + error_set(err, QERR_UNDEFINED_ERROR); >>>> + return; >>>> + } >>>> + >>>> + slog("trying to suspend using the manual method...\n"); >>>> + >>>> + fd = open(LINUX_SYS_STATE_FILE, O_WRONLY); >>>> + if (fd< 0) { >>>> + error_set(err, QERR_OPEN_FILE_FAILED, LINUX_SYS_STATE_FILE); >>>> + return; >>>> + } >>>> + >>>> + cmd = strcmp(mode, "sleep") == 0 ? "mem" : "disk"; >>>> + if (write(fd, cmd, strlen(cmd))< 0) { >>>> + error_set(err, QERR_IO_ERROR); >>>> + } >>>> + >>>> + close(fd); >>>> + } >>>> +} >>>> + >>>> /* register init/cleanup routines for stateful command groups */ >>>> void ga_command_state_init(GAState *s, GACommandState *cs) >>>> { >>> >> >