Message-ID: <4E31FED5.80408@oss.ntt.co.jp>
Date: Fri, 29 Jul 2011 09:29:09 +0900
From: Fernando Luis Vazquez Cao
References: <20110727152457.GK18528@redhat.com> <1311821631.9256.11.camel@nexus.oss.ntt.co.jp> <20110728080313.GE3087@redhat.com> <4E317C24.3000102@linux.vnet.ibm.com>
In-Reply-To: <4E317C24.3000102@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] RFC: moving fsfreeze support from the userland guest agent to the guest kernel
To: Michael Roth
Cc: Andrea Arcangeli, Jes Sorensen, qemu-devel@nongnu.org, Luiz Capitulino

Michael Roth wrote:
> On 07/28/2011 03:03 AM, Andrea Arcangeli wrote:
>> On Thu, Jul 28, 2011 at 11:53:50AM +0900, Fernando Luis Vázquez Cao wrote:
>>> On Wed, 2011-07-27 at 17:24 +0200, Andrea Arcangeli wrote:
>>>> making
>>>> sure no lib is calling any I/O function to be able to defreeze the
>>>> filesystems later, making sure the oom killer or a wrong kill -9
>>>> $RANDOM isn't killing the agent by mistake while the I/O is blocked
>>>> and the copy is going.
>>>
>>> Yes, with the current API, if the agent is killed while the filesystems
>>> are frozen we are screwed.
>>>
>>> I have just submitted patches that implement a new API that should make
>>> the virtualization use case more reliable. Basically, I am adding a new
>>> ioctl, FIGETFREEZEFD, which freezes the indicated filesystem and returns
>>> a file descriptor; as long as that file descriptor is held open, the
>>> filesystem remains frozen. If the freeze file descriptor is closed (be it
>>> through an explicit call to close(2) or as part of process exit
>>> housekeeping) the associated filesystem is automatically thawed.
>>>
>>> - fsfreeze: add ioctl to create a fd for freeze control
>>> http://marc.info/?l=linux-fsdevel&m=131175212512290&w=2
>>> - fsfreeze: add freeze fd ioctls
>>> http://marc.info/?l=linux-fsdevel&m=131175220612341&w=2
>>
>> This is probably how the API should have been implemented originally
>> instead of FIFREEZE/FITHAW.
>>
>> It looks a bit overkill though, I would think it'd be enough to have
>> the fsfreeze forced at FIGETFREEZEFD, and the only way to thaw by
>> closing the file without requiring any of the
>> FS_FREEZE_FD/FS_THAW_FD/FS_ISFROZEN_FD. But I guess you have use cases
>
> One of the crappy things about the current implementation is the
> inability to determine whether or not a filesystem is frozen. In the
> context of the guest agent at least, it'd be nice if
> guest-fsfreeze-status checked the actual system state rather than some
> internal state that may not necessarily reflect reality (if we freeze,
> and some other application thaws, we currently still report the state
> as frozen).
>
> Also in the context of the guest agent, we are indeed screwed if the
> agent gets killed while in a frozen state, and remain screwed even if
> it's restarted since we have no way of determining whether or not
> we're in a frozen state and thus should disable logging operations.

That is precisely the reason I added the new API.
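
For reference, here is a minimal sketch of the freeze/work/thaw sequence with
the existing FIFREEZE/FITHAW ioctls, to show the window in which a killed
process leaves the filesystem frozen. with_frozen_fs() and its work callback
are hypothetical names used only for illustration; this is not code from the
guest agent or from the patches.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW */
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: freeze the filesystem mounted at `mountpoint`,
 * run `work(arg)` (may be NULL), then thaw.
 * Returns the result of `work` (0 if NULL) on success, or -1 with errno
 * set if open, freeze or thaw failed. Freezing needs CAP_SYS_ADMIN.
 */
int with_frozen_fs(const char *mountpoint, int (*work)(void *), void *arg)
{
    int fd = open(mountpoint, O_RDONLY);
    if (fd < 0)
        return -1;

    if (ioctl(fd, FIFREEZE, 0) < 0) {
        int saved = errno;  /* EBUSY: the filesystem was already frozen */
        close(fd);
        errno = saved;
        return -1;
    }

    /*
     * Danger window: if this process is killed here (oom killer, stray
     * kill -9), nothing issues FITHAW and the filesystem stays frozen.
     */
    int rc = work ? work(arg) : 0;

    int thaw_errno = 0;
    if (ioctl(fd, FITHAW, 0) < 0)
        thaw_errno = errno;
    close(fd);  /* with FIFREEZE, closing the fd does NOT thaw */

    if (thaw_errno) {
        errno = thaw_errno;
        return -1;
    }
    return rc;
}
```

With the proposed FIGETFREEZEFD, as described in the patch summary quoted
above, the explicit FITHAW step would not be needed: closing the returned
descriptor, including implicitly at process exit, would thaw the filesystem.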

> We could check status by looking for a failure from the freeze
> operation, but if you're just interested in getting the state, having
> to potentially induce a freeze just to get at the state is really
> heavy-handed.
>
> So having an open operation that doesn't force a freeze/thaw/status
> operation serves some fairly common use cases I think.

Yep. If you think there is something missing API-wise, let me know and I
will implement it.

Thanks,
Fernando