From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Xin Zhao"
Subject: Re: how do versioning filesystems take snapshot of opened files?
Date: Tue, 3 Jul 2007 17:17:53 -0400
Message-ID: <4ae3c140707031417vaccebben82d56df10692aa80@mail.gmail.com>
References: <20070703133830.0e83a6bf@think.oraclecorp.com>
To: "Bryan Henderson"
Cc: "Chris Mason", linux-fsdevel

On 7/3/07, Bryan Henderson wrote:
> >>we want a open/close consistency in snapshots.
> >
> >This depends on the transaction engine in your filesystem.  None of the
> >existing linux filesystems have a way to start a transaction when the
> >file opens and finish it when the file closes, or a way to roll back
> >individual operations that have happened inside a given transaction.
> >
> >It certainly could be done, but it would also introduce a great deal of
> >complexity to the FS.
>
> And I would be opposed as a matter of architecture to making open/close
> transactional.  People often read more into open/close than is there, but
> open is just about gaining access and close is just about releasing
> resources.  It isn't appropriate for close to _mean_ anything.
>
> There are filesystems that have transactions.  They use separate start
> transaction / end transaction system calls (not POSIX).
>
> >> Pausing apps itself
> >> does not solve this problem, because a file could be already opened
> >> and in the middle of write.
>
> Just to be clear: we're saying "pause," but we mean "quiesce."  I.e., tell
> the application to reach a point where it's not in the middle of anything
> and then tell you it's there.  Indeed, whether you use open/close or some
> other kind of transaction, just pausing the application doesn't help.  If
> you were to implement open/close transactions, the filesystem driver would
> just wait for the application to close and in the meantime block all new
> opens.

If we want to support open/close consistency, maybe we don't really
need help from the application. Suppose the filesystem is implemented
this way: when a file is opened for write, we copy its metadata and
create a CoW bitmap to keep track of what has changed. Before writing
any new data to the file, we copy the old data aside and then write
the new data. Then, when we take a snapshot and encounter an open file,
we can save the old data instead of the new data, since the old data is
in a consistent state. Of course, newly opened files should also be
handled this way.

The filesystem driver cannot wait for the application to close, I
think. If the application is snapshot-aware, the wait could be
tolerable. But if the application does not provide a way to process
the quiesce request, the wait could be infinite.

What do you think?

>
> --
> Bryan Henderson                     IBM Almaden Research Center
> San Jose CA                         Filesystems
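
To make the copy-on-write idea above concrete, here is a rough
user-space sketch in Python. It is only a toy model under stated
assumptions, not a design for a real filesystem driver: the class
CowFile, its methods, and the 4-byte BLOCK_SIZE are all hypothetical
names chosen for illustration, the "metadata" copied at open time is
reduced to the block count, and there is a single writer.

```python
BLOCK_SIZE = 4  # hypothetical tiny block size, chosen to keep the example readable


class CowFile:
    """Toy in-memory file that preserves its pre-open contents while open for write."""

    def __init__(self, data: bytes):
        # Split the initial contents into fixed-size blocks.
        self.blocks = [bytearray(data[i:i + BLOCK_SIZE])
                       for i in range(0, len(data), BLOCK_SIZE)]
        self.open_for_write = False
        self.saved_nblocks = 0  # copied "metadata": block count at open time
        self.cow_bitmap = []    # one flag per pre-existing block: changed since open?
        self.saved = {}         # block index -> contents as of open time

    def open_write(self):
        # Copy the metadata and start with an all-clear CoW bitmap.
        self.open_for_write = True
        self.saved_nblocks = len(self.blocks)
        self.cow_bitmap = [False] * self.saved_nblocks
        self.saved = {}

    def write_block(self, index: int, data: bytes):
        if not self.open_for_write:
            raise RuntimeError("file is not open for write")
        if len(data) > BLOCK_SIZE:
            raise ValueError("data larger than one block")
        # Grow the file if the write lands past the current end.
        while index >= len(self.blocks):
            self.blocks.append(bytearray())
        # First overwrite of a block that existed at open time:
        # copy the old data aside before writing the new data.
        if index < self.saved_nblocks and not self.cow_bitmap[index]:
            self.saved[index] = bytes(self.blocks[index])
            self.cow_bitmap[index] = True
        self.blocks[index] = bytearray(data)

    def close(self):
        # The new contents become the consistent state; drop the old copies.
        self.open_for_write = False
        self.cow_bitmap = []
        self.saved = {}

    def snapshot(self) -> bytes:
        if not self.open_for_write:
            return b"".join(bytes(b) for b in self.blocks)
        # File is open: return the state as of open time, taking the
        # saved copy wherever the bitmap says the block has changed.
        return b"".join(
            self.saved[i] if self.cow_bitmap[i] else bytes(self.blocks[i])
            for i in range(self.saved_nblocks)
        )


f = CowFile(b"AAAABBBBCCCC")
f.open_write()
f.write_block(1, b"xxxx")    # overwrite an existing block
f.write_block(3, b"yyyy")    # append a new block
mid_write = f.snapshot()     # taken while the file is still open
f.close()
after_close = f.snapshot()   # taken after close
```

In this model a snapshot taken mid-write sees the contents as of
open time, and one taken after close sees the new contents, without
any cooperation from the writer. It ignores concurrent writers, mmap,
and crash recovery, which a real implementation would have to deal
with.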