From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Charles P. Wright"
Subject: Re: [RFC] Support for stackable file systems on top of nfs
Date: Fri, 11 Nov 2005 10:27:25 -0500
Message-ID: <1131722845.10610.7.camel@localhost.localdomain>
References: <1131643942.9389.17.camel@kleikamp.austin.ibm.com>
 <20051110200741.GA23192@infradead.org>
 <200511102135.jAALZlfS016100@sumu.lexma.ibm.com>
 <1131676316.8804.93.camel@lade.trondhjem.org>
 <1131681856.8804.103.camel@lade.trondhjem.org>
 <200511111345.jABDjxvw020167@sumu.lexma.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Cc: Trond Myklebust , dhowells@redhat.com, nfsv4 , fsdevel
Return-path:
Received: from filer.fsl.cs.sunysb.edu ([130.245.126.2]:8071 "EHLO filer.fsl.cs.sunysb.edu")
 by vger.kernel.org with ESMTP id S1750812AbVKKP1h (ORCPT );
 Fri, 11 Nov 2005 10:27:37 -0500
To: "John T. Kohl"
In-Reply-To: <200511111345.jABDjxvw020167@sumu.lexma.ibm.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Fri, 2005-11-11 at 08:45 -0500, John T. Kohl wrote:
> >>>>> "Trond" == Trond Myklebust writes:
>
> Trond> On Thu, 2005-11-10 at 21:32 -0500, Trond Myklebust wrote:
> >> It sounds to me like you want to talk to the cachefs folks. They too
> >> need special hooks in the NFS low-level page cache routines in order to
> >> be able to mirror write requests to the local backing store and/or
> >> reroute read requests to that backing store.
>
> Trond> Note: I'm not saying that you should special case Clearcase for NFS, but
> Trond> if both you and cachefs have similar requirements for hooks, then
> Trond> perhaps we could look for a common solution (perhaps at the VFS level?).
>
> Thanks for the encouragement.
>
> It looks to me like the i_mapping and f_mapping stuff is intended to let
> a stacking file system share pages with a backing-store file system (we
> really want to share pages, it's efficient and avoids a whole host of
> cache coherency problems), but the interfaces are not adequate for that
> to work with NFS as the backing-store.
>
> Other than i_mapping/f_mapping, I don't think it's possible right now
> for stacking file systems to handle the address_space operations in our
> layer *and* share the same pages with the backing-store, since the struct
> pages are attached to the address space via file->f_mapping.

At Stony Brook, we've come across similar problems. Double caching is
relatively easy, but inefficient. Single caching is also relatively easy,
but then you don't get to intercept any of these interesting operations.
Getting both at once is tricky. Nikolai Joukov developed a method based on
pointer flipping, which he uses for Tracefs. Basically, we set the page's
mapping to the lower-level mapping before the operation, and unset it
afterwards.

> [Special-casing for NFS would be tricky and probably improper--should we
> really care what's below us? How would we determine that our backing
> store inode is an NFS inode (or any other sort that doesn't handle
> i_mapping hosting)? We don't have access to the NFS symbol names for
> the file_operations or address_space_operations, so we can't even cheat
> and determine whether the object below us is NFS.

One way to check is !strcmp(i_mapping->host->i_sb->s_type->name, "nfs").
We used this in Unionfs because NFS returns EACCES instead of EROFS
for read-only file systems.

Charles
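
P.S. To make the two tricks above concrete, here is a rough user-space
sketch. It is not the actual Tracefs or Unionfs code. The structs are
cut-down stand-ins for the kernel's struct page, address_space, inode,
super_block and file_system_type, and the names call_lower_flipped(),
mapping_is_nfs(), record_mapping() and seen_mapping are made up for
illustration. It also ignores page locking, which a real implementation
would presumably have to deal with.

```c
/* User-space illustration only; the structs below are simplified
 * stand-ins for the kernel types of the same name. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct file_system_type { const char *name; };
struct super_block { struct file_system_type *s_type; };
struct address_space;
struct inode {
	struct super_block *i_sb;
	struct address_space *i_mapping;
};
struct address_space { struct inode *host; };
struct page { struct address_space *mapping; };

/* Hypothetical lower-level operation that expects page->mapping
 * to point at the lower file system's mapping. */
typedef int (*lower_op_t)(struct page *page);

/* Pointer flipping: point the page at the lower mapping for the
 * duration of the lower call, then restore the saved mapping. */
static int call_lower_flipped(struct page *page,
			      struct address_space *lower, lower_op_t op)
{
	struct address_space *saved;
	int err;

	assert(page != NULL && lower != NULL);
	saved = page->mapping;
	page->mapping = lower;		/* flip before the operation */
	err = op(page);
	page->mapping = saved;		/* flip back afterwards */
	return err;
}

/* Name-based check: nonzero if the mapping's host inode lives on a
 * file system whose type name is "nfs". */
static int mapping_is_nfs(const struct address_space *mapping)
{
	return !strcmp(mapping->host->i_sb->s_type->name, "nfs");
}

/* Example lower operation: records which mapping it was called with. */
static struct address_space *seen_mapping;

static int record_mapping(struct page *page)
{
	seen_mapping = page->mapping;
	return 0;
}
```

Calling call_lower_flipped(&pg, &lower, record_mapping) on a page that
belongs to the upper mapping lets the lower operation see the lower
mapping, while the caller still finds pg.mapping unchanged once the call
returns. The name comparison only needs the generic pointers reachable
from the mapping, so it does not depend on any NFS symbols.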