From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Morton Subject: Re: [PATCH] fsstack/vfs: fixes to fsstack_copy_inode_size Date: Thu, 3 Dec 2009 14:23:29 -0800 Message-ID: <20091203142329.46cdcc44.akpm@linux-foundation.org> References: <200912030021.nB30Lt16020106@agora.fsl.cs.sunysb.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: viro@zeniv.linux.org.uk, Hugh Dickins , linux-fsdevel@vger.kernel.org, Tyler Hicks , Dustin Kirkland To: Erez Zadok Return-path: Received: from smtp1.linux-foundation.org ([140.211.169.13]:42179 "EHLO smtp1.linux-foundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752749AbZLCWXt (ORCPT ); Thu, 3 Dec 2009 17:23:49 -0500 In-Reply-To: <200912030021.nB30Lt16020106@agora.fsl.cs.sunysb.edu> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Wed, 2 Dec 2009 19:21:55 -0500 Erez Zadok wrote: > > Andrew, > > This is a patch I've been using successfully in Unionfs for more than a year > and multiple kernel versions. The origin of patch was from Hugh Dickins, > who found problems with 32-bit/SMP systems. You and hch had some comments. > This is the final version which I've been using. See thread of discussion > starting here: > > > > The comments in this patch provide more details. I think this patch is > ready for upstream. At the moment, the only mainline user of this code is > ecryptfs. > um. None of that is usable as a changelog. > > diff --git a/fs/stack.c b/fs/stack.c > index 67716f6..3d7f5c3 100644 > --- a/fs/stack.c > +++ b/fs/stack.c > @@ -9,8 +9,54 @@ > */ > void fsstack_copy_inode_size(struct inode *dst, const struct inode *src) > { > - i_size_write(dst, i_size_read((struct inode *)src)); > - dst->i_blocks = src->i_blocks; > + loff_t i_size; > + blkcnt_t i_blocks; > + > + /* > + * i_size_read() includes its own seqlocking and protection from > + * preemption (see include/linux/fs.h): we need nothing extra for > + * that here, and prefer to avoid nesting locks than attempt to > + * keep i_size and i_blocks in synch together. > + */ > + i_size = i_size_read(src); > + > + /* > + * But if CONFIG_LSF (on 32-bit), we ought to make an effort to keep CONFIG_LBDAF? > + * the two halves of i_blocks in synch despite SMP or PREEMPT - though > + * stat's generic_fillattr() doesn't bother, and we won't be applying > + * quotas (where i_blocks does become important) at the upper level. > + * > + * We don't actually know what locking is used at the lower level; but > + * if it's a filesystem that supports quotas, it will be using i_lock > + * as in inode_add_bytes(). tmpfs uses other locking, and its 32-bit > + * is (just) able to exceed 2TB i_size with the aid of holes; but its > + * i_blocks cannot carry into the upper long without almost 2TB swap - > + * let's ignore that case. > + */ > + if (sizeof(i_blocks) > sizeof(long)) > + spin_lock((spinlock_t *) &src->i_lock); > + i_blocks = src->i_blocks; > + if (sizeof(i_blocks) > sizeof(long)) > + spin_unlock((spinlock_t *) &src->i_lock); And the typecasts are needed because `src' is const. urgh. Logically true but practically false. Perhaps just remove the const? > + /* > + * If CONFIG_SMP on 32-bit, it's vital for fsstack_copy_inode_size() And CONFIG_PREEMPT. > + * to hold some lock around i_size_write(), otherwise i_size_read() > + * may spin forever (see include/linux/fs.h). We don't necessarily > + * hold i_mutex when this is called, so take i_lock for that case. > + * > + * And if CONFIG_LSF (on 32-bit), continue our effort to keep the CONFIG_LBDAF? > + * two halves of i_blocks in synch despite SMP or PREEMPT: use i_lock > + * for that case too, and do both at once by combining the tests. > + * > + * There is none of this locking overhead in the 64-bit case. > + */ > + if (sizeof(i_size) > sizeof(long) || sizeof(i_blocks) > sizeof(long)) > + spin_lock(&dst->i_lock); > + i_size_write(dst, i_size); > + dst->i_blocks = i_blocks; > + if (sizeof(i_size) > sizeof(long) || sizeof(i_blocks) > sizeof(long)) > + spin_unlock(&dst->i_lock); > } > EXPORT_SYMBOL_GPL(fsstack_copy_inode_size); >