From mboxrd@z Thu Jan 1 00:00:00 1970 From: David Howells Subject: Re: [RFC] Support for stackable file systems on top of nfs Date: Mon, 14 Nov 2005 16:02:13 +0000 Message-ID: <17963.1131984133@warthog.cambridge.redhat.com> References: Cc: "John T. Kohl" , Trond Myklebust , dhowells@redhat.com, nfsv4@linux-nfs.org, Charles Wright , linux-fsdevel@vger.kernel.org Return-path: Received: from mx1.redhat.com ([66.187.233.31]:54237 "EHLO mx1.redhat.com") by vger.kernel.org with ESMTP id S1751165AbVKNQCg (ORCPT ); Mon, 14 Nov 2005 11:02:36 -0500 In-Reply-To: To: Nikolai Joukov Sender: linux-fsdevel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org Nikolai Joukov wrote: > int stackable_readpage(file_t *file, page_t *page) > { > ... > page->mapping = lower_inode->i_mapping; > err = lower_inode->i_mapping->a_ops->readpage(lower_file, page); > page->mapping = inode->i_mapping; > ... > } This is a really bad idea for a number of reasons: (1) page->mapping isn't the only metadata in the page. (2) The lower_inode may interpret a hold as a block of zeros rather than bouncing the request back to the higher inode with ENODATA or something. (3) The lower_inode readpage() may not be complete at the time you switch the mapping pointer back. Obviously the page will be locked until at such time as completion or an error occurs. (4) The lower_inode readpage() may complete before it returns, in which case the VM may go and do something unspeakable to that page whilst it's still got the wrong mapping attached. (5) The lower_inode may have attached its own metadata to page->private, and this may refer back to this page, and may subsequently trip an assertion because the page's mapping pointer has been corrupted. David