From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:37484) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Qs05G-0001jl-F9 for qemu-devel@nongnu.org; Fri, 12 Aug 2011 18:20:07 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Qs05F-0007eV-EX for qemu-devel@nongnu.org; Fri, 12 Aug 2011 18:20:06 -0400 Received: from verein.lst.de ([213.95.11.211]:60068 helo=newverein.lst.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Qs05F-0007cq-6n for qemu-devel@nongnu.org; Fri, 12 Aug 2011 18:20:05 -0400 Date: Sat, 13 Aug 2011 00:20:03 +0200 From: Christoph Hellwig Message-ID: <20110812222003.GA8300@lst.de> References: <20110811064059.GU6342@yookeroo.fritz.box> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110811064059.GU6342@yookeroo.fritz.box> Subject: Re: [Qemu-devel] Fix refcounting in hugetlbfs quota handling List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Linus Torvalds , Avi Kivity , Marcelo Tosatti , Jan Kiszka , qemu-devel@nongnu.org, agraf@suse.de, kvm , Paul Mackerras , linux-kernel@vger.kernel.org On Thu, Aug 11, 2011 at 04:40:59PM +1000, David Gibson wrote: > Linus, please apply > > hugetlbfs tracks the current usage of hugepages per hugetlbfs > mountpoint. To correctly track this when hugepages are released, it > must find the right hugetlbfs super_block from the struct page > available in free_huge_page(). a superblock is not a mountpoint, it's a filesystem instance. You can happily have a single filesystem mounted at multiple mount points. > However, this usage is buggy, because nothing ensures that the > address_space is not freed before all the hugepages that belonged to > it are. In practice that will usually be the case, but if extra page > references have been taken by e.g. drivers or kvm doing > get_user_pages() then the file, inode and address space may be > destroyed before all the pages. > > In addition, the quota functions use the mapping only to get the inode > then the super_block. However, most of the callers already have the > inode anyway and have to get the mapping from there. > > This patch, therefore, stores a pointer to the inode instead of the > address_space in the page private data for hugepages. What's sthe point? The lifetime of inode->i_mapping is exactly the same as that of the inode, except for those few filesystem that use one from a different inode (and then for the whole lifetime of the inode), so I can't see how your patch will make a difference. > More > importantly it correctly adjusts the reference count on the inodes > when they're added to the page private data. This ensures that the > inode (and therefore the super block) will not be freed before we use > it from free_huge_page. That seems like the real fix. And even if you'd still do the other bits it should be a separate patch/