linux-fsdevel.vger.kernel.org archive mirror
* [PATCH] fs: Make sure data stored into inode is properly seen before unlocking new inode
@ 2009-09-08 11:41 Jan Kara
  2009-09-08 18:42 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Jan Kara @ 2009-09-08 11:41 UTC (permalink / raw)
  To: LKML; +Cc: linux-fsdevel, Andrew Morton, hch, Jan Kara

In theory it could happen that on one CPU we initialize a new inode but clearing
of I_NEW | I_LOCK gets reordered before some of the initialization. Thus on
another CPU we return not fully uptodate inode from iget_locked().

This seems to fix a corruption issue on ext3 mounted over NFS.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/inode.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

  Since Al doesn't seem to be online, does anybody else have opinion on this
patch? I can merge it via my tree but I'd like to get a review from someone
else.

diff --git a/fs/inode.c b/fs/inode.c
index 901bad1..e9a8e77 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -696,6 +696,7 @@ void unlock_new_inode(struct inode *inode)
 	 * just created it (so there can be no old holders
 	 * that haven't tested I_LOCK).
 	 */
+	smp_mb();
 	WARN_ON((inode->i_state & (I_LOCK|I_NEW)) != (I_LOCK|I_NEW));
 	inode->i_state &= ~(I_LOCK|I_NEW);
 	wake_up_inode(inode);
-- 
1.6.0.2
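
The ordering problem described in the changelog can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions, not the kernel code: `struct demo_inode`, the `DEMO_*` flags and all `demo_*` functions are hypothetical names, `atomic_thread_fence(memory_order_seq_cst)` stands in for `smp_mb()`, and the acquire fence on the reader side is an assumption of this sketch rather than something the patch adds.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for I_NEW and I_LOCK. */
#define DEMO_NEW  0x1u
#define DEMO_LOCK 0x2u

/* Hypothetical, heavily simplified "inode". */
struct demo_inode {
	unsigned long ino;
	long size;
	atomic_uint state;
};

/* Start the object in the "new and locked" state. */
void demo_inode_prepare(struct demo_inode *d)
{
	d->ino = 0;
	d->size = 0;
	atomic_init(&d->state, DEMO_NEW | DEMO_LOCK);
}

/* Initialisation uses plain stores, as a filesystem would. */
void demo_inode_fill(struct demo_inode *d, unsigned long ino, long size)
{
	d->ino = ino;
	d->size = size;
}

/*
 * Publish the object.  The full fence keeps the stores made in
 * demo_inode_fill() ordered before the store that clears the flags.
 */
void demo_unlock_new(struct demo_inode *d)
{
	unsigned int st = atomic_load_explicit(&d->state, memory_order_relaxed);

	/* Loosely mirrors the WARN_ON() in the patch. */
	assert((st & (DEMO_NEW | DEMO_LOCK)) == (DEMO_NEW | DEMO_LOCK));
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&d->state, st & ~(DEMO_NEW | DEMO_LOCK),
			      memory_order_relaxed);
}

/*
 * Reader side (assumption of this sketch): returns 0 while the flags
 * are still set, otherwise copies the size out and returns 1.  The
 * acquire fence pairs with the fence in demo_unlock_new().
 */
int demo_inode_read(struct demo_inode *d, long *size_out)
{
	unsigned int st = atomic_load_explicit(&d->state, memory_order_relaxed);

	if (st & (DEMO_NEW | DEMO_LOCK))
		return 0;
	atomic_thread_fence(memory_order_acquire);
	*size_out = d->size;
	return 1;
}
```

In this sketch a caller would run demo_inode_prepare(), demo_inode_fill() and then demo_unlock_new() on one thread, while other threads call demo_inode_read() and use the size only when it returns 1. Without the fence in demo_unlock_new(), the flag-clearing store could become visible before the stores in demo_inode_fill(), which is the reordering the changelog describes.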



* Re: [PATCH] fs: Make sure data stored into inode is properly seen before unlocking new inode
  2009-09-08 11:41 [PATCH] fs: Make sure data stored into inode is properly seen before unlocking new inode Jan Kara
@ 2009-09-08 18:42 ` Christoph Hellwig
  2009-09-09 22:03 ` Andrew Morton
  2009-09-12 15:06 ` Al Viro
  2 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2009-09-08 18:42 UTC (permalink / raw)
  To: Jan Kara; +Cc: LKML, linux-fsdevel, Andrew Morton, hch

On Tue, Sep 08, 2009 at 01:41:03PM +0200, Jan Kara wrote:
> In theory it could happen that on one CPU we initialize a new inode but clearing
> of I_NEW | I_LOCK gets reordered before some of the initialization. Thus on
> another CPU we return not fully uptodate inode from iget_locked().
> 
> This seems to fix a corruption issue on ext3 mounted over NFS.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Looks good to me.  Impressive that this causes real life issues.


* Re: [PATCH] fs: Make sure data stored into inode is properly seen before unlocking new inode
  2009-09-08 11:41 [PATCH] fs: Make sure data stored into inode is properly seen before unlocking new inode Jan Kara
  2009-09-08 18:42 ` Christoph Hellwig
@ 2009-09-09 22:03 ` Andrew Morton
  2009-09-10  9:07   ` Jan Kara
  2009-09-12 15:06 ` Al Viro
  2 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2009-09-09 22:03 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel, linux-fsdevel, hch, jack, stable

On Tue,  8 Sep 2009 13:41:03 +0200
Jan Kara <jack@suse.cz> wrote:

> In theory it could happen that on one CPU we initialize a new inode but clearing
> of I_NEW | I_LOCK gets reordered before some of the initialization. Thus on
> another CPU we return not fully uptodate inode from iget_locked().
> 
> This seems to fix a corruption issue on ext3 mounted over NFS.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/inode.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
>   Since Al doesn't seem to be online, does anybody else have opinion on this
> patch? I can merge it via my tree but I'd like to get a review from someone
> else.

I'll merge it for 2.6.31.

Please always remember -stable kernels when preparing bugfixes!  This
one should have had a Cc:stable in the changelog and in the email
headers.

> diff --git a/fs/inode.c b/fs/inode.c
> index 901bad1..e9a8e77 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -696,6 +696,7 @@ void unlock_new_inode(struct inode *inode)
>  	 * just created it (so there can be no old holders
>  	 * that haven't tested I_LOCK).
>  	 */
> +	smp_mb();
>  	WARN_ON((inode->i_state & (I_LOCK|I_NEW)) != (I_LOCK|I_NEW));
>  	inode->i_state &= ~(I_LOCK|I_NEW);
>  	wake_up_inode(inode);

But an uncommented barrier is always a hard thing for a reader to
understand.  Let's add something to help people.  How's this look?

--- a/fs/inode.c~fs-make-sure-data-stored-into-inode-is-properly-seen-before-unlocking-new-inode-fix
+++ a/fs/inode.c
@@ -697,12 +697,13 @@ void unlock_new_inode(struct inode *inod
 	}
 #endif
 	/*
-	 * This is special!  We do not need the spinlock
-	 * when clearing I_LOCK, because we're guaranteed
-	 * that nobody else tries to do anything about the
-	 * state of the inode when it is locked, as we
-	 * just created it (so there can be no old holders
-	 * that haven't tested I_LOCK).
+	 * This is special!  We do not need the spinlock when clearing I_LOCK,
+	 * because we're guaranteed that nobody else tries to do anything about
+	 * the state of the inode when it is locked, as we just created it (so
+	 * there can be no old holders that haven't tested I_LOCK).
+	 * However we must emit the memory barrier so that other CPUs reliably
+	 * see the clearing of I_LOCK after the other inode initialisation has
+	 * completed.
 	 */
 	smp_mb();
 	WARN_ON((inode->i_state & (I_LOCK|I_NEW)) != (I_LOCK|I_NEW));
_



* Re: [PATCH] fs: Make sure data stored into inode is properly seen before unlocking new inode
  2009-09-09 22:03 ` Andrew Morton
@ 2009-09-10  9:07   ` Jan Kara
  0 siblings, 0 replies; 5+ messages in thread
From: Jan Kara @ 2009-09-10  9:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, linux-kernel, linux-fsdevel, hch, stable

On Wed 09-09-09 15:03:34, Andrew Morton wrote:
> On Tue,  8 Sep 2009 13:41:03 +0200
> Jan Kara <jack@suse.cz> wrote:
> 
> > In theory it could happen that on one CPU we initialize a new inode but clearing
> > of I_NEW | I_LOCK gets reordered before some of the initialization. Thus on
> > another CPU we return not fully uptodate inode from iget_locked().
> > 
> > This seems to fix a corruption issue on ext3 mounted over NFS.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/inode.c |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> > 
> >   Since Al doesn't seem to be online, does anybody else have opinion on this
> > patch? I can merge it via my tree but I'd like to get a review from someone
> > else.
> 
> I'll merge it for 2.6.31.
  Thanks!

> Please always remember -stable kernels when preparing bugfixes!  This
> one should have had a Cc:stable in the changelog and in the email
> headers.
  Good point. Thanks for reminding.

> > diff --git a/fs/inode.c b/fs/inode.c
> > index 901bad1..e9a8e77 100644
> > --- a/fs/inode.c
> > +++ b/fs/inode.c
> > @@ -696,6 +696,7 @@ void unlock_new_inode(struct inode *inode)
> >  	 * just created it (so there can be no old holders
> >  	 * that haven't tested I_LOCK).
> >  	 */
> > +	smp_mb();
> >  	WARN_ON((inode->i_state & (I_LOCK|I_NEW)) != (I_LOCK|I_NEW));
> >  	inode->i_state &= ~(I_LOCK|I_NEW);
> >  	wake_up_inode(inode);
> 
> But an uncommented barrier is always a hard thing for a reader to
> understand.  Let's add something to help people.  How's this look?
> 
> --- a/fs/inode.c~fs-make-sure-data-stored-into-inode-is-properly-seen-before-unlocking-new-inode-fix
> +++ a/fs/inode.c
> @@ -697,12 +697,13 @@ void unlock_new_inode(struct inode *inod
>  	}
>  #endif
>  	/*
> -	 * This is special!  We do not need the spinlock
> -	 * when clearing I_LOCK, because we're guaranteed
> -	 * that nobody else tries to do anything about the
> -	 * state of the inode when it is locked, as we
> -	 * just created it (so there can be no old holders
> -	 * that haven't tested I_LOCK).
> +	 * This is special!  We do not need the spinlock when clearing I_LOCK,
> +	 * because we're guaranteed that nobody else tries to do anything about
> +	 * the state of the inode when it is locked, as we just created it (so
> +	 * there can be no old holders that haven't tested I_LOCK).
> +	 * However we must emit the memory barrier so that other CPUs reliably
> +	 * see the clearing of I_LOCK after the other inode initialisation has
> +	 * completed.
>  	 */
>  	smp_mb();
>  	WARN_ON((inode->i_state & (I_LOCK|I_NEW)) != (I_LOCK|I_NEW));
  Looks good.

									Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH] fs: Make sure data stored into inode is properly seen before unlocking new inode
  2009-09-08 11:41 [PATCH] fs: Make sure data stored into inode is properly seen before unlocking new inode Jan Kara
  2009-09-08 18:42 ` Christoph Hellwig
  2009-09-09 22:03 ` Andrew Morton
@ 2009-09-12 15:06 ` Al Viro
  2 siblings, 0 replies; 5+ messages in thread
From: Al Viro @ 2009-09-12 15:06 UTC (permalink / raw)
  To: Jan Kara; +Cc: LKML, linux-fsdevel, Andrew Morton, hch

On Tue, Sep 08, 2009 at 01:41:03PM +0200, Jan Kara wrote:
> In theory it could happen that on one CPU we initialize a new inode but clearing
> of I_NEW | I_LOCK gets reordered before some of the initialization. Thus on
> another CPU we return not fully uptodate inode from iget_locked().
> 
> This seems to fix a corruption issue on ext3 mounted over NFS.

Nice catch.  ACK.

>   Since Al doesn't seem to be online, does anybody else have opinion on this
> patch? I can merge it via my tree but I'd like to get a review from someone
> else.

I'm back, actually, and finally had almost crawled from under the pile
of mail in mbox.  Will apply.

