* [PATCH] fsstack/vfs: fixes to fsstack_copy_inode_size
@ 2009-12-03 0:21 Erez Zadok
2009-12-03 22:23 ` Andrew Morton
0 siblings, 1 reply; 3+ messages in thread
From: Erez Zadok @ 2009-12-03 0:21 UTC (permalink / raw)
To: Andrew Morton, viro, Hugh Dickins
Cc: linux-fsdevel, ezk, Tyler Hicks, Dustin Kirkland
Andrew,
This is a patch I've been using successfully in Unionfs for more than a year
and multiple kernel versions. The origin of patch was from Hugh Dickins,
who found problems with 32-bit/SMP systems. You and hch had some comments.
This is the final version which I've been using. See thread of discussion
starting here:
<http://marc.info/?l=linux-kernel&m=121469807803867&w=2>
The comments in this patch provide more details. I think this patch is
ready for upstream. At the moment, the only mainline user of this code is
ecryptfs.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
diff --git a/fs/stack.c b/fs/stack.c
index 67716f6..3d7f5c3 100644
--- a/fs/stack.c
+++ b/fs/stack.c
@@ -9,8 +9,54 @@
*/
void fsstack_copy_inode_size(struct inode *dst, const struct inode *src)
{
- i_size_write(dst, i_size_read((struct inode *)src));
- dst->i_blocks = src->i_blocks;
+ loff_t i_size;
+ blkcnt_t i_blocks;
+
+ /*
+ * i_size_read() includes its own seqlocking and protection from
+ * preemption (see include/linux/fs.h): we need nothing extra for
+ * that here, and prefer to avoid nesting locks than attempt to
+ * keep i_size and i_blocks in synch together.
+ */
+ i_size = i_size_read(src);
+
+ /*
+ * But if CONFIG_LSF (on 32-bit), we ought to make an effort to keep
+ * the two halves of i_blocks in synch despite SMP or PREEMPT - though
+ * stat's generic_fillattr() doesn't bother, and we won't be applying
+ * quotas (where i_blocks does become important) at the upper level.
+ *
+ * We don't actually know what locking is used at the lower level; but
+ * if it's a filesystem that supports quotas, it will be using i_lock
+ * as in inode_add_bytes(). tmpfs uses other locking, and its 32-bit
+ * is (just) able to exceed 2TB i_size with the aid of holes; but its
+ * i_blocks cannot carry into the upper long without almost 2TB swap -
+ * let's ignore that case.
+ */
+ if (sizeof(i_blocks) > sizeof(long))
+ spin_lock((spinlock_t *) &src->i_lock);
+ i_blocks = src->i_blocks;
+ if (sizeof(i_blocks) > sizeof(long))
+ spin_unlock((spinlock_t *) &src->i_lock);
+
+ /*
+ * If CONFIG_SMP on 32-bit, it's vital for fsstack_copy_inode_size()
+ * to hold some lock around i_size_write(), otherwise i_size_read()
+ * may spin forever (see include/linux/fs.h). We don't necessarily
+ * hold i_mutex when this is called, so take i_lock for that case.
+ *
+ * And if CONFIG_LSF (on 32-bit), continue our effort to keep the
+ * two halves of i_blocks in synch despite SMP or PREEMPT: use i_lock
+ * for that case too, and do both at once by combining the tests.
+ *
+ * There is none of this locking overhead in the 64-bit case.
+ */
+ if (sizeof(i_size) > sizeof(long) || sizeof(i_blocks) > sizeof(long))
+ spin_lock(&dst->i_lock);
+ i_size_write(dst, i_size);
+ dst->i_blocks = i_blocks;
+ if (sizeof(i_size) > sizeof(long) || sizeof(i_blocks) > sizeof(long))
+ spin_unlock(&dst->i_lock);
}
EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH] fsstack/vfs: fixes to fsstack_copy_inode_size
2009-12-03 0:21 [PATCH] fsstack/vfs: fixes to fsstack_copy_inode_size Erez Zadok
@ 2009-12-03 22:23 ` Andrew Morton
2009-12-04 2:54 ` Erez Zadok
0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2009-12-03 22:23 UTC (permalink / raw)
To: Erez Zadok
Cc: viro, Hugh Dickins, linux-fsdevel, Tyler Hicks, Dustin Kirkland
On Wed, 2 Dec 2009 19:21:55 -0500
Erez Zadok <ezk@cs.sunysb.edu> wrote:
>
> Andrew,
>
> This is a patch I've been using successfully in Unionfs for more than a year
> and multiple kernel versions. The origin of patch was from Hugh Dickins,
> who found problems with 32-bit/SMP systems. You and hch had some comments.
> This is the final version which I've been using. See thread of discussion
> starting here:
>
> <http://marc.info/?l=linux-kernel&m=121469807803867&w=2>
>
> The comments in this patch provide more details. I think this patch is
> ready for upstream. At the moment, the only mainline user of this code is
> ecryptfs.
>
um. None of that is usable as a changelog.
>
> diff --git a/fs/stack.c b/fs/stack.c
> index 67716f6..3d7f5c3 100644
> --- a/fs/stack.c
> +++ b/fs/stack.c
> @@ -9,8 +9,54 @@
> */
> void fsstack_copy_inode_size(struct inode *dst, const struct inode *src)
> {
> - i_size_write(dst, i_size_read((struct inode *)src));
> - dst->i_blocks = src->i_blocks;
> + loff_t i_size;
> + blkcnt_t i_blocks;
> +
> + /*
> + * i_size_read() includes its own seqlocking and protection from
> + * preemption (see include/linux/fs.h): we need nothing extra for
> + * that here, and prefer to avoid nesting locks than attempt to
> + * keep i_size and i_blocks in synch together.
> + */
> + i_size = i_size_read(src);
> +
> + /*
> + * But if CONFIG_LSF (on 32-bit), we ought to make an effort to keep
CONFIG_LBDAF?
> + * the two halves of i_blocks in synch despite SMP or PREEMPT - though
> + * stat's generic_fillattr() doesn't bother, and we won't be applying
> + * quotas (where i_blocks does become important) at the upper level.
> + *
> + * We don't actually know what locking is used at the lower level; but
> + * if it's a filesystem that supports quotas, it will be using i_lock
> + * as in inode_add_bytes(). tmpfs uses other locking, and its 32-bit
> + * is (just) able to exceed 2TB i_size with the aid of holes; but its
> + * i_blocks cannot carry into the upper long without almost 2TB swap -
> + * let's ignore that case.
> + */
> + if (sizeof(i_blocks) > sizeof(long))
> + spin_lock((spinlock_t *) &src->i_lock);
> + i_blocks = src->i_blocks;
> + if (sizeof(i_blocks) > sizeof(long))
> + spin_unlock((spinlock_t *) &src->i_lock);
And the typecasts are needed because `src' is const. urgh. Logically
true but practically false. Perhaps just remove the const?
> + /*
> + * If CONFIG_SMP on 32-bit, it's vital for fsstack_copy_inode_size()
And CONFIG_PREEMPT.
> + * to hold some lock around i_size_write(), otherwise i_size_read()
> + * may spin forever (see include/linux/fs.h). We don't necessarily
> + * hold i_mutex when this is called, so take i_lock for that case.
> + *
> + * And if CONFIG_LSF (on 32-bit), continue our effort to keep the
CONFIG_LBDAF?
> + * two halves of i_blocks in synch despite SMP or PREEMPT: use i_lock
> + * for that case too, and do both at once by combining the tests.
> + *
> + * There is none of this locking overhead in the 64-bit case.
> + */
> + if (sizeof(i_size) > sizeof(long) || sizeof(i_blocks) > sizeof(long))
> + spin_lock(&dst->i_lock);
> + i_size_write(dst, i_size);
> + dst->i_blocks = i_blocks;
> + if (sizeof(i_size) > sizeof(long) || sizeof(i_blocks) > sizeof(long))
> + spin_unlock(&dst->i_lock);
> }
> EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);
>
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH] fsstack/vfs: fixes to fsstack_copy_inode_size
2009-12-03 22:23 ` Andrew Morton
@ 2009-12-04 2:54 ` Erez Zadok
0 siblings, 0 replies; 3+ messages in thread
From: Erez Zadok @ 2009-12-04 2:54 UTC (permalink / raw)
To: Andrew Morton
Cc: Erez Zadok, viro, Hugh Dickins, linux-fsdevel, Tyler Hicks,
Dustin Kirkland
In message <20091203142329.46cdcc44.akpm@linux-foundation.org>, Andrew Morton writes:
> On Wed, 2 Dec 2009 19:21:55 -0500
> Erez Zadok <ezk@cs.sunysb.edu> wrote:
>
> >
> > Andrew,
> >
> > This is a patch I've been using successfully in Unionfs for more than a year
> > and multiple kernel versions. The origin of patch was from Hugh Dickins,
> > who found problems with 32-bit/SMP systems. You and hch had some comments.
> > This is the final version which I've been using. See thread of discussion
> > starting here:
> >
> > <http://marc.info/?l=linux-kernel&m=121469807803867&w=2>
> >
> > The comments in this patch provide more details. I think this patch is
> > ready for upstream. At the moment, the only mainline user of this code is
> > ecryptfs.
> >
>
> um. None of that is usable as a changelog.
Will fix.
> > diff --git a/fs/stack.c b/fs/stack.c
> > index 67716f6..3d7f5c3 100644
> > --- a/fs/stack.c
> > +++ b/fs/stack.c
> > @@ -9,8 +9,54 @@
> > */
> > void fsstack_copy_inode_size(struct inode *dst, const struct inode *src)
> > {
> > - i_size_write(dst, i_size_read((struct inode *)src));
> > - dst->i_blocks = src->i_blocks;
> > + loff_t i_size;
> > + blkcnt_t i_blocks;
> > +
> > + /*
> > + * i_size_read() includes its own seqlocking and protection from
> > + * preemption (see include/linux/fs.h): we need nothing extra for
> > + * that here, and prefer to avoid nesting locks than attempt to
> > + * keep i_size and i_blocks in synch together.
> > + */
> > + i_size = i_size_read(src);
> > +
> > + /*
> > + * But if CONFIG_LSF (on 32-bit), we ought to make an effort to keep
>
> CONFIG_LBDAF?
Ah, yes, the CONFIG option name has changed. Thanks.
I assume you meant teh comment is incorrect.
> > + * the two halves of i_blocks in synch despite SMP or PREEMPT - though
> > + * stat's generic_fillattr() doesn't bother, and we won't be applying
> > + * quotas (where i_blocks does become important) at the upper level.
> > + *
> > + * We don't actually know what locking is used at the lower level; but
> > + * if it's a filesystem that supports quotas, it will be using i_lock
> > + * as in inode_add_bytes(). tmpfs uses other locking, and its 32-bit
> > + * is (just) able to exceed 2TB i_size with the aid of holes; but its
> > + * i_blocks cannot carry into the upper long without almost 2TB swap -
> > + * let's ignore that case.
> > + */
> > + if (sizeof(i_blocks) > sizeof(long))
> > + spin_lock((spinlock_t *) &src->i_lock);
> > + i_blocks = src->i_blocks;
> > + if (sizeof(i_blocks) > sizeof(long))
> > + spin_unlock((spinlock_t *) &src->i_lock);
>
> And the typecasts are needed because `src' is const. urgh. Logically
> true but practically false. Perhaps just remove the const?
I agree. It's not much help here. I'll remove the const from the function
proto. The alternative would be to propagate the const down to
spin_{un}lock(), but I think it's best not to touch such a critical function
for the sake of seldom used fsstack code.
> > + /*
> > + * If CONFIG_SMP on 32-bit, it's vital for fsstack_copy_inode_size()
>
> And CONFIG_PREEMPT.
Will fix the comment here too.
Erez.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2009-12-04 2:55 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-03 0:21 [PATCH] fsstack/vfs: fixes to fsstack_copy_inode_size Erez Zadok
2009-12-03 22:23 ` Andrew Morton
2009-12-04 2:54 ` Erez Zadok
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).