linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] fsstack/vfs: fixes to fsstack_copy_inode_size
@ 2009-12-03  0:21 Erez Zadok
  2009-12-03 22:23 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Erez Zadok @ 2009-12-03  0:21 UTC (permalink / raw)
  To: Andrew Morton, viro, Hugh Dickins
  Cc: linux-fsdevel, ezk, Tyler Hicks, Dustin Kirkland


Andrew,

This is a patch I've been using successfully in Unionfs for more than a year
and multiple kernel versions.  The origin of patch was from Hugh Dickins,
who found problems with 32-bit/SMP systems.  You and hch had some comments.
This is the final version which I've been using.  See thread of discussion
starting here:

	<http://marc.info/?l=linux-kernel&m=121469807803867&w=2>

The comments in this patch provide more details.  I think this patch is
ready for upstream.  At the moment, the only mainline user of this code is
ecryptfs.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>

diff --git a/fs/stack.c b/fs/stack.c
index 67716f6..3d7f5c3 100644
--- a/fs/stack.c
+++ b/fs/stack.c
@@ -9,8 +9,54 @@
  */
 void fsstack_copy_inode_size(struct inode *dst, const struct inode *src)
 {
-	i_size_write(dst, i_size_read((struct inode *)src));
-	dst->i_blocks = src->i_blocks;
+	loff_t i_size;
+	blkcnt_t i_blocks;
+
+	/*
+	 * i_size_read() includes its own seqlocking and protection from
+	 * preemption (see include/linux/fs.h): we need nothing extra for
+	 * that here, and prefer to avoid nesting locks than attempt to
+	 * keep i_size and i_blocks in synch together.
+	 */
+	i_size = i_size_read(src);
+
+	/*
+	 * But if CONFIG_LSF (on 32-bit), we ought to make an effort to keep
+	 * the two halves of i_blocks in synch despite SMP or PREEMPT - though
+	 * stat's generic_fillattr() doesn't bother, and we won't be applying
+	 * quotas (where i_blocks does become important) at the upper level.
+	 *
+	 * We don't actually know what locking is used at the lower level; but
+	 * if it's a filesystem that supports quotas, it will be using i_lock
+	 * as in inode_add_bytes().  tmpfs uses other locking, and its 32-bit
+	 * is (just) able to exceed 2TB i_size with the aid of holes; but its
+	 * i_blocks cannot carry into the upper long without almost 2TB swap -
+	 * let's ignore that case.
+	 */
+	if (sizeof(i_blocks) > sizeof(long))
+		spin_lock((spinlock_t *) &src->i_lock);
+	i_blocks = src->i_blocks;
+	if (sizeof(i_blocks) > sizeof(long))
+		spin_unlock((spinlock_t *) &src->i_lock);
+
+	/*
+	 * If CONFIG_SMP on 32-bit, it's vital for fsstack_copy_inode_size()
+	 * to hold some lock around i_size_write(), otherwise i_size_read()
+	 * may spin forever (see include/linux/fs.h).  We don't necessarily
+	 * hold i_mutex when this is called, so take i_lock for that case.
+	 *
+	 * And if CONFIG_LSF (on 32-bit), continue our effort to keep the
+	 * two halves of i_blocks in synch despite SMP or PREEMPT: use i_lock
+	 * for that case too, and do both at once by combining the tests.
+	 *
+	 * There is none of this locking overhead in the 64-bit case.
+	 */
+	if (sizeof(i_size) > sizeof(long) || sizeof(i_blocks) > sizeof(long))
+		spin_lock(&dst->i_lock);
+	i_size_write(dst, i_size);
+	dst->i_blocks = i_blocks;
+	if (sizeof(i_size) > sizeof(long) || sizeof(i_blocks) > sizeof(long))
+		spin_unlock(&dst->i_lock);
 }
 EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);
 

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2009-12-04  2:55 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-03  0:21 [PATCH] fsstack/vfs: fixes to fsstack_copy_inode_size Erez Zadok
2009-12-03 22:23 ` Andrew Morton
2009-12-04  2:54   ` Erez Zadok

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).