From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:41680) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QHH1a-0006GY-DP for qemu-devel@nongnu.org; Tue, 03 May 2011 10:56:31 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QHH1Z-0002RP-7V for qemu-devel@nongnu.org; Tue, 03 May 2011 10:56:30 -0400 Received: from e34.co.us.ibm.com ([32.97.110.152]:34370) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QHH1Y-0002RL-W9 for qemu-devel@nongnu.org; Tue, 03 May 2011 10:56:29 -0400 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e34.co.us.ibm.com (8.14.4/8.13.1) with ESMTP id p43Ei46F003944 for ; Tue, 3 May 2011 08:44:04 -0600 Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p43EvTfU136698 for ; Tue, 3 May 2011 08:57:29 -0600 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p43F1N8c029852 for ; Tue, 3 May 2011 09:01:23 -0600 Message-ID: <4DC01795.5000304@linux.vnet.ibm.com> Date: Tue, 03 May 2011 09:56:21 -0500 From: Michael Roth MIME-Version: 1.0 References: <1303138953-1334-1-git-send-email-mdroth@linux.vnet.ibm.com> <4DAFFD05.20304@redhat.com> <4DB0376B.8010505@linux.vnet.ibm.com> <4DBFFA61.2020104@redhat.com> <4DC008E7.8010504@linux.vnet.ibm.com> <4DC00D3E.3090804@redhat.com> In-Reply-To: <4DC00D3E.3090804@redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [RFC][PATCH v2 00/11] QEMU Guest Agent: QMP-based host/guest communication (virtagent) List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jes Sorensen Cc: aliguori@linux.vnet.ibm.com, agl@linux.vnet.ibm.com, qemu-devel@nongnu.org On 05/03/2011 09:12 AM, Jes Sorensen wrote: > On 05/03/11 15:53, Michael Roth wrote: >>> >>> IMHO it is better to disable the commands rather than just logging, but >>> either way should allow it to drop in. >> >> Kinda agree, but logging seems to be the real dependency. With the >> server-side timeouts now in place even doing stuff like fopen/fwrite is >> permitted (it would just timeout if it blocked too long). It's the >> logging stuff that we don't really have a way to recover from, because >> it's not run in a thread we can just nuke after a certain amount of time. >> >> Even when we're not frozen, we can't guarantee an fopen/fwrite/fread >> will succeed, so failures shouldn't be too much of a surprise since they >> need to be handled anyway. And determining whether or not a command >> should be marked as executable during a freeze is somewhat nebulous >> (fopen might work for read-only access, but hang for write access when >> O_CREATE is set, fwrite might succeed if it doesn't require a flush, >> etc), plus internal things like logging need to be taken into account. >> >> So, for now at least I think it's a reasonable way to do it. > > Please be very careful with any fwrite() calls - it's not just logging. > Any writes to the frozen file systems will result in the caller being > put to sleep until the file system is unfrozen. It won't timeout, and > the agent will be stuck hanging in that call. > > It's fun playing with the fsfreeze stuff on your desktop system - doing > it in an xterm has 'interesting' effects..... :) > > This is why I prefer the 'disable function' rather 'disable logging' > approach. > I'll review the code to make sure, but what I mean is that the actual daemon only ever does disk i/o when it's doing logging. All the RPCs are handled in a separate worker thread, so if they do something fubar we should still be able to recover (the timeouts are done by pthread_cancel()'ing the worker thread). So things should be recoverable so long as we don't log when frozen. RPCs succeeding or failing we don't care too much about, since they'll just result in a nice "command timed out" response to the user eventually. Checking logging_enabled is more to ensure we don't let the user exploit the fs_freeze side-effects to do sensitive stuff under the guest owner's nose (not that we couldn't anyway from host, but it may important for piece of mind about have an externally accessible/privileged agent in your guest). works_under_fsfreeze is harder to determine beforehand, unless maybe you flag any command that did i/o other than logging...but there are actually use cases where that would be too restrictive. If they're using fwrite to pipe output through a character device, for instance, or twiddling knobs in sysfs. Better to keep the same semantics as you'd have when "physically" on the machine, when possible. Will need to do some good testing though, fsfreeze is definitely a strange state to be in :) >>> Sorry for the late reply, been a bit swamped here. >> >> No problem I have your patches in my tree now. They still need a little >> bit of love and testing but I should be able to get them out on the list >> shortly. > > Sounds great! > > Cheers, > Jes