public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: Fwd: Fw: [2.6.20.4] BUG: dentry xattrs still in use in shrink_dcache_for_umount() with reiserfs
       [not found]   ` <c04ebacf0704180152qdc72fcdnd0755dbf070b40d2@mail.gmail.com>
@ 2007-04-18 14:16     ` Vladimir V. Saveliev
  2007-04-18 15:00       ` Jeff Mahoney
  0 siblings, 1 reply; 11+ messages in thread
From: Vladimir V. Saveliev @ 2007-04-18 14:16 UTC (permalink / raw)
  To: Jeff Mahoney, a.righi; +Cc: linux-kernel, reiserfs-dev, Edward Shishkin, zam

Hello

On Wednesday 18 April 2007 12:52, ReiserFS Developers Mailing List wrote:
> *Forwarded Conversation*
> Subject: *[2.6.20.4] BUG: dentry xattrs still in use in
> shrink_dcache_for_umount() with reiserfs*
> ------------------------
> 
> * From: Andrea Righi* <righiandr@users.sourceforge.net> Reply-To:
> righiandr@users.sourceforge.net
> To: reiserfs-dev@namesys.com
> Cc: LKML <linux-kernel@vger.kernel.org>, Andrew Morton <
> akpm@linux-foundation.org>
> Date: Fri, Apr 13, 2007 at 3:04 PM
> Attachments: config.gz
> 
> I can reproduce the problem umounting my /var (reiserfs), but it doesn't
> occur with /usr or /opt, that are reiserfs too.
> 
> It seems very similar to this issue:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/hot-fixes/reiserfs-make-sure-all-dentries-refs-are-released-before-calling-kill_block_super-try-2.patch
> 
> How the xattrs->d_count can be 1 if the dentry is explicitly released
> in reiserfs_kill_sb(), before calling kill_super_block()?
> 

Jeff, in 2.6.20/fs/reiserfs/xattr.c we have get_xa_root,
which may increment xattr_root reference count either by 1 or 2:

If REISERFS_SB()->xattr_root != NULL - dentry reference count is incremented by 1 with dget.

If REISERFS_SB()->xattr_root == NULL, __get_xa_root is called which may increase xattr_root dentry reference count by 2:
once in lookup_one_len->__lookup_hash->cached_lookup->__d_lookup and once explicitly with dget in __get_xa_root.

Do you think that could be a reason of the extra reference count on   xattr_root dentry?


> (config attached)
> 
> -Andrea
> 
> BUG: Dentry dfcd2570{i=21bc,n=xattrs} still in use (1) [unmount of reiserfs
> dm-4]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fwd: Fw: [2.6.20.4] BUG: dentry xattrs still in use in shrink_dcache_for_umount() with reiserfs
  2007-04-18 14:16     ` Fwd: Fw: [2.6.20.4] BUG: dentry xattrs still in use in shrink_dcache_for_umount() with reiserfs Vladimir V. Saveliev
@ 2007-04-18 15:00       ` Jeff Mahoney
  2007-04-21  0:07         ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Mahoney @ 2007-04-18 15:00 UTC (permalink / raw)
  To: Vladimir V. Saveliev
  Cc: a.righi, linux-kernel, reiserfs-dev, Edward Shishkin, zam

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Vladimir V. Saveliev wrote:
> Hello
> 
> On Wednesday 18 April 2007 12:52, ReiserFS Developers Mailing List wrote:
>> *Forwarded Conversation*
>> Subject: *[2.6.20.4] BUG: dentry xattrs still in use in
>> shrink_dcache_for_umount() with reiserfs*
>> ------------------------
>>
>> * From: Andrea Righi* <righiandr@users.sourceforge.net> Reply-To:
>> righiandr@users.sourceforge.net
>> To: reiserfs-dev@namesys.com
>> Cc: LKML <linux-kernel@vger.kernel.org>, Andrew Morton <
>> akpm@linux-foundation.org>
>> Date: Fri, Apr 13, 2007 at 3:04 PM
>> Attachments: config.gz
>>
>> I can reproduce the problem umounting my /var (reiserfs), but it doesn't
>> occur with /usr or /opt, that are reiserfs too.
>>
>> It seems very similar to this issue:
>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/hot-fixes/reiserfs-make-sure-all-dentries-refs-are-released-before-calling-kill_block_super-try-2.patch
>>
>> How the xattrs->d_count can be 1 if the dentry is explicitly released
>> in reiserfs_kill_sb(), before calling kill_super_block()?
>>
> 
> Jeff, in 2.6.20/fs/reiserfs/xattr.c we have get_xa_root,
> which may increment xattr_root reference count either by 1 or 2:
> 
> If REISERFS_SB()->xattr_root != NULL - dentry reference count is incremented by 1 with dget.
> 
> If REISERFS_SB()->xattr_root == NULL, __get_xa_root is called which may increase xattr_root dentry reference count by 2:
> once in lookup_one_len->__lookup_hash->cached_lookup->__d_lookup and once explicitly with dget in __get_xa_root.

__get_xa_root() takes two references because it keeps one associated
with the superblock to avoid the lookup and locking later and passes one
back to the caller. When the file system is umounted, the saved
reference is freed.

The fix Andrea mentioned is unrelated. It's a fix for another patch in
- -mm. David Howells had a patch to tear down the entire dentry tree
associated with a superblock, but it depended on there being no live
references other than s_root at the time generic_shutdown_super is
called. It just moves the freeing of the dentries to ->put_super rather
than ->kill_sb.

> Do you think that could be a reason of the extra reference count on   xattr_root dentry?

No, I don't think it is. Looking at the code now, it seems obvious, but
I didn't notice it before and nobody else has reported a problem.

getxattr() doesn't require any VFS locking. When we get down into the
reiserfs code, it takes a read lock. If we get two concurrent threads
looking up an xattr before the root has been saved, there's a window
where REISERFS_SB(s)->xattr_root is NULL but we've already looked it up
and taken a reference on it.

I have a patch set to clean up the extended attribute code that fixes
this problem along the way by killing off the xattr locks and using the
backing files/dirs i_mutex instead. I'll post them to the reiserfs
mailing list.

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFGJjJwLPWxlyuTD7IRAle+AJ9erx2RC68KZWfkaIT/YkqFfYRsJgCgpEKA
fVDdl8KSCE4OUeOQhtpaLEY=
=YaYd
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fwd: Fw: [2.6.20.4] BUG: dentry xattrs still in use in shrink_dcache_for_umount() with reiserfs
  2007-04-18 15:00       ` Jeff Mahoney
@ 2007-04-21  0:07         ` Andrew Morton
  2007-04-21  1:45           ` Jeff Mahoney
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2007-04-21  0:07 UTC (permalink / raw)
  To: Jeff Mahoney
  Cc: Vladimir V. Saveliev, a.righi, linux-kernel, reiserfs-dev,
	Edward Shishkin, zam

On Wed, 18 Apr 2007 11:00:00 -0400
Jeff Mahoney <jeffm@suse.com> wrote:

> > Do you think that could be a reason of the extra reference count on   xattr_root dentry?
> 
> No, I don't think it is. Looking at the code now, it seems obvious, but
> I didn't notice it before and nobody else has reported a problem.
> 
> getxattr() doesn't require any VFS locking. When we get down into the
> reiserfs code, it takes a read lock. If we get two concurrent threads
> looking up an xattr before the root has been saved, there's a window
> where REISERFS_SB(s)->xattr_root is NULL but we've already looked it up
> and taken a reference on it.
> 
> I have a patch set to clean up the extended attribute code that fixes
> this problem along the way by killing off the xattr locks and using the
> backing files/dirs i_mutex instead. I'll post them to the reiserfs
> mailing list.

Do we have anything suitable for 2.6.21 which will address this crash?

Also, it's not clear to me how many users we can expect to be impacted by it.
I assume that if the same bug is in 2.6.20 then the answer is "not many".
How come Andrea is able to keep hitting it?

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fwd: Fw: [2.6.20.4] BUG: dentry xattrs still in use in shrink_dcache_for_umount() with reiserfs
  2007-04-21  0:07         ` Andrew Morton
@ 2007-04-21  1:45           ` Jeff Mahoney
  2007-04-21 13:17             ` Andrea Righi
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Mahoney @ 2007-04-21  1:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vladimir V. Saveliev, a.righi, linux-kernel, reiserfs-dev,
	Edward Shishkin, zam, ak

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Morton wrote:
> On Wed, 18 Apr 2007 11:00:00 -0400
> Jeff Mahoney <jeffm@suse.com> wrote:
> 
>>> Do you think that could be a reason of the extra reference count on   xattr_root dentry?
>> No, I don't think it is. Looking at the code now, it seems obvious, but
>> I didn't notice it before and nobody else has reported a problem.
>>
>> getxattr() doesn't require any VFS locking. When we get down into the
>> reiserfs code, it takes a read lock. If we get two concurrent threads
>> looking up an xattr before the root has been saved, there's a window
>> where REISERFS_SB(s)->xattr_root is NULL but we've already looked it up
>> and taken a reference on it.
>>
>> I have a patch set to clean up the extended attribute code that fixes
>> this problem along the way by killing off the xattr locks and using the
>> backing files/dirs i_mutex instead. I'll post them to the reiserfs
>> mailing list.
> 
> Do we have anything suitable for 2.6.21 which will address this crash?
> 
> Also, it's not clear to me how many users we can expect to be impacted by it.
> I assume that if the same bug is in 2.6.20 then the answer is "not many".
> How come Andrea is able to keep hitting it?

I have the patchset that I mentioned, but I'm not proposing it for 2.6.21.
It's much too invasive to be introduced in an -rc7, but it does include
locking changes that I believe avoid this bug.

Vladimir was right in his analysis that sometimes get_xa_root() takes the
reference once and other times twice, but not for the right reasons.  I save
a reference to the xattr dir to avoid a lookup later, but when there are multiple
getxattrs or listxattrs as the first xattr operation on the file system, we can end
up taking a second reference when we shouldn't. This is because those operations
are protected by read locks and the ->xattr_root pointer isn't protected by anything
else. A quick fix would be to just extend the protection of the priv root's i_mutex
around the assignment, and test first. The right fix involves a complete rework of
the locking, and I have code to do that, it's just too late to include it in 2.6.21.

I'd love to know what Andrea (and now Andi Kleen) are doing to hit this now. There
haven't been any changes in this code in a while, and the shrink_dcache_for_umount()
has been around since October. I'm unable to reproduce locally so far, so if Andrea
or Andi could see if this fixes it for them, I'd appreciate it.

- -Jeff

- --- a/fs/reiserfs/xattr.c	2007-04-20 21:19:05.000000000 -0400
+++ b/fs/reiserfs/xattr.c	2007-04-20 21:41:16.000000000 -0400
@@ -72,14 +72,16 @@
 		err =
 		    privroot->d_inode->i_op->mkdir(privroot->d_inode, xaroot,
 						   0700);
- -		mutex_unlock(&privroot->d_inode->i_mutex);
 
 		if (err) {
+			mutex_unlock(&privroot->d_inode->i_mutex);
 			dput(xaroot);
 			dput(privroot);
 			return ERR_PTR(err);
 		}
- -		REISERFS_SB(sb)->xattr_root = dget(xaroot);
+		if (REISERFS_SB(sb)->xattr_root == NULL)
+			REISERFS_SB(sb)->xattr_root = dget(xaroot);
+		mutex_unlock(&privroot->d_inode->i_mutex);
 	}
 
       out:

- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFGKWy4LPWxlyuTD7IRAhcVAJ9vpYk2ayYf7xP7eB40inFpkiERvgCglayP
P7pDkPouMuBlw07rs1qaKPo=
=jdRe
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fwd: Fw: [2.6.20.4] BUG: dentry xattrs still in use in shrink_dcache_for_umount() with reiserfs
  2007-04-21  1:45           ` Jeff Mahoney
@ 2007-04-21 13:17             ` Andrea Righi
  2007-04-21 14:31               ` Jeff Mahoney
  0 siblings, 1 reply; 11+ messages in thread
From: Andrea Righi @ 2007-04-21 13:17 UTC (permalink / raw)
  To: Jeff Mahoney
  Cc: Andrew Morton, Vladimir V. Saveliev, linux-kernel, reiserfs-dev,
	Edward Shishkin, zam, ak

Jeff Mahoney wrote:
> I have the patchset that I mentioned, but I'm not proposing it for 2.6.21.
> It's much too invasive to be introduced in an -rc7, but it does include
> locking changes that I believe avoid this bug.
> 
> Vladimir was right in his analysis that sometimes get_xa_root() takes the
> reference once and other times twice, but not for the right reasons.  I save
> a reference to the xattr dir to avoid a lookup later, but when there are multiple
> getxattrs or listxattrs as the first xattr operation on the file system, we can end
> up taking a second reference when we shouldn't. This is because those operations
> are protected by read locks and the ->xattr_root pointer isn't protected by anything
> else. A quick fix would be to just extend the protection of the priv root's i_mutex
> around the assignment, and test first. The right fix involves a complete rework of
> the locking, and I have code to do that, it's just too late to include it in 2.6.21.
> 
> I'd love to know what Andrea (and now Andi Kleen) are doing to hit this now. There
> haven't been any changes in this code in a while, and the shrink_dcache_for_umount()
> has been around since October. I'm unable to reproduce locally so far, so if Andrea
> or Andi could see if this fixes it for them, I'd appreciate it.

Jeff, your patch doesn't resolve, but you hit the problem. Actually, I
think the xattr_root pointer must be protected also in get_xa_root()
using the same private root i_mutex.

I've tested the following patch and it resolves in my case. What do you
think about it? could it be a reasonable fix?
 
Thanks!
-Andrea

--- linux/fs/reiserfs/xattr.c.orig	2007-04-14 18:53:02.000000000 +0200
+++ linux/fs/reiserfs/xattr.c	2007-04-21 14:16:02.000000000 +0200
@@ -72,14 +72,16 @@ static struct dentry *create_xa_root(str
 		err =
 		    privroot->d_inode->i_op->mkdir(privroot->d_inode, xaroot,
 						   0700);
-		mutex_unlock(&privroot->d_inode->i_mutex);
 
 		if (err) {
+			mutex_unlock(&privroot->d_inode->i_mutex);
 			dput(xaroot);
 			dput(privroot);
 			return ERR_PTR(err);
 		}
-		REISERFS_SB(sb)->xattr_root = dget(xaroot);
+		if (REISERFS_SB(sb)->xattr_root == NULL)
+			REISERFS_SB(sb)->xattr_root = dget(xaroot);
+		mutex_unlock(&privroot->d_inode->i_mutex);
 	}
 
       out:
@@ -122,12 +124,26 @@ static struct dentry *__get_xa_root(stru
  */
 static inline struct dentry *get_xa_root(struct super_block *s)
 {
-	struct dentry *dentry = dget(REISERFS_SB(s)->xattr_root);
+	struct dentry *privroot = dget(REISERFS_SB(s)->priv_root);
+	struct dentry *xaroot = NULL;
+
+	if (!privroot)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	if (!privroot->d_inode)
+		goto out;
+
+	mutex_lock(&privroot->d_inode->i_mutex);
+
+	xaroot = (REISERFS_SB(s)->xattr_root) ?
+			dget(REISERFS_SB(s)->xattr_root) :
+			__get_xa_root(s);
 
-	if (!dentry)
-		dentry = __get_xa_root(s);
+	mutex_unlock(&privroot->d_inode->i_mutex);
 
-	return dentry;
+      out:
+	dput(privroot);
+	return xaroot;
 }
 
 /* Opens the directory corresponding to the inode's extended attribute store.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fwd: Fw: [2.6.20.4] BUG: dentry xattrs still in use in shrink_dcache_for_umount() with reiserfs
  2007-04-21 13:17             ` Andrea Righi
@ 2007-04-21 14:31               ` Jeff Mahoney
  2007-04-21 14:34                 ` Jeff Mahoney
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Mahoney @ 2007-04-21 14:31 UTC (permalink / raw)
  To: a.righi
  Cc: Andrew Morton, Vladimir V. Saveliev, linux-kernel, reiserfs-dev,
	Edward Shishkin, zam, ak

Andrea Righi wrote:
> Jeff Mahoney wrote:
>> I have the patchset that I mentioned, but I'm not proposing it for 2.6.21.
>> It's much too invasive to be introduced in an -rc7, but it does include
>> locking changes that I believe avoid this bug.
>>
>> Vladimir was right in his analysis that sometimes get_xa_root() takes the
>> reference once and other times twice, but not for the right reasons.  I save
>> a reference to the xattr dir to avoid a lookup later, but when there are multiple
>> getxattrs or listxattrs as the first xattr operation on the file system, we can end
>> up taking a second reference when we shouldn't. This is because those operations
>> are protected by read locks and the ->xattr_root pointer isn't protected by anything
>> else. A quick fix would be to just extend the protection of the priv root's i_mutex
>> around the assignment, and test first. The right fix involves a complete rework of
>> the locking, and I have code to do that, it's just too late to include it in 2.6.21.
>>
>> I'd love to know what Andrea (and now Andi Kleen) are doing to hit this now. There
>> haven't been any changes in this code in a while, and the shrink_dcache_for_umount()
>> has been around since October. I'm unable to reproduce locally so far, so if Andrea
>> or Andi could see if this fixes it for them, I'd appreciate it.
> 
> Jeff, your patch doesn't resolve, but you hit the problem. Actually, I
> think the xattr_root pointer must be protected also in get_xa_root()
> using the same private root i_mutex.
> 
> I've tested the following patch and it resolves in my case. What do you
> think about it? could it be a reasonable fix?

Your patch fixes the problem. That'll teach me to spin a patch up while I'm headed
out the door. I think a cleaner solution for now is just to refactor get_xa_root,
__get_xa_root, and create_xa_root into one function to simplify the lookup and the
locking. I think I'll add another step to my patch set that removes the caching of
xattr_root entirely or creates in on mount, as the priv root is created. The privroot
must be cached to avoid a deadlock on the fs-root directory's i_mutex, but xattr_root
was cached as an optimzation. If we need to take privroot->i_mutex every time, there's
no point in caching it since it will probably be in the dcache anyway.

-Jeff

---
 fs/reiserfs/xattr.c |   89 ++++++++++++----------------------------------------
 1 file changed, 22 insertions(+), 67 deletions(-)

--- a/fs/reiserfs/xattr.c	2007-04-21 10:15:00.000000000 -0400
+++ b/fs/reiserfs/xattr.c	2007-04-21 10:27:40.000000000 -0400
@@ -54,7 +54,12 @@
 static struct reiserfs_xattr_handler *find_xattr_handler_prefix(const char
 								*prefix);
 
-static struct dentry *create_xa_root(struct super_block *sb)
+/* Returns the dentry referring to the root of the extended attribute
+ * directory tree. If it has already been retrieved, it is used. If it
+ * hasn't been created and the flags indicate creation is allowed, we
+ * attempt to create it. On error, we return a pointer-encoded error.
+ */
+static struct dentry *get_xa_root(struct super_block *sb, int flags)
 {
 	struct dentry *privroot = dget(REISERFS_SB(sb)->priv_root);
 	struct dentry *xaroot;
@@ -63,73 +68,33 @@ static struct dentry *create_xa_root(str
 	if (!privroot)
 		return ERR_PTR(-EOPNOTSUPP);
 
-	xaroot = lookup_one_len(XAROOT_NAME, privroot, strlen(XAROOT_NAME));
-	if (IS_ERR(xaroot)) {
+	mutex_lock(&privroot->d_inode->i_mutex);
+	if (REISERFS_SB(sb)->xattr_root) {
+		xaroot = dget(REISERFS_SB(sb)->xattr_root);
 		goto out;
-	} else if (!xaroot->d_inode) {
-		int err;
-		mutex_lock(&privroot->d_inode->i_mutex);
-		err =
-		    privroot->d_inode->i_op->mkdir(privroot->d_inode, xaroot,
-						   0700);
-		mutex_unlock(&privroot->d_inode->i_mutex);
-
-		if (err) {
-			dput(xaroot);
-			dput(privroot);
-			return ERR_PTR(err);
-		}
-		REISERFS_SB(sb)->xattr_root = dget(xaroot);
 	}
 
-      out:
-	dput(privroot);
-	return xaroot;
-}
-
-/* This will return a dentry, or error, refering to the xa root directory.
- * If the xa root doesn't exist yet, the dentry will be returned without
- * an associated inode. This dentry can be used with ->mkdir to create
- * the xa directory. */
-static struct dentry *__get_xa_root(struct super_block *s)
-{
-	struct dentry *privroot = dget(REISERFS_SB(s)->priv_root);
-	struct dentry *xaroot = NULL;
-
-	if (IS_ERR(privroot) || !privroot)
-		return privroot;
-
 	xaroot = lookup_one_len(XAROOT_NAME, privroot, strlen(XAROOT_NAME));
 	if (IS_ERR(xaroot)) {
 		goto out;
 	} else if (!xaroot->d_inode) {
-		dput(xaroot);
-		xaroot = NULL;
-		goto out;
+		int err = -ENODATA;
+		if (flags == 0 || flags & XATTR_CREATE)
+			err = privroot->d_inode->i_op->mkdir(privroot->d_inode,
+			                                     xaroot, 0700);
+		if (err) {
+			dput(xaroot);
+			goto out;
+		}
 	}
-
-	REISERFS_SB(s)->xattr_root = dget(xaroot);
+	REISERFS_SB(sb)->xattr_root = dget(xaroot);
 
       out:
+	mutex_unlock(&privroot->d_inode->i_mutex);
 	dput(privroot);
 	return xaroot;
 }
 
-/* Returns the dentry (or NULL) referring to the root of the extended
- * attribute directory tree. If it has already been retrieved, it is used.
- * Otherwise, we attempt to retrieve it from disk. It may also return
- * a pointer-encoded error.
- */
-static inline struct dentry *get_xa_root(struct super_block *s)
-{
-	struct dentry *dentry = dget(REISERFS_SB(s)->xattr_root);
-
-	if (!dentry)
-		dentry = __get_xa_root(s);
-
-	return dentry;
-}
-
 /* Opens the directory corresponding to the inode's extended attribute store.
  * If flags allow, the tree to the directory may be created. If creation is
  * prohibited, -ENODATA is returned. */
@@ -138,21 +103,11 @@ static struct dentry *open_xa_dir(const 
 	struct dentry *xaroot, *xadir;
 	char namebuf[17];
 
-	xaroot = get_xa_root(inode->i_sb);
-	if (IS_ERR(xaroot)) {
+	xaroot = get_xa_root(inode->i_sb, flags);
+	if (IS_ERR(xaroot))
 		return xaroot;
-	} else if (!xaroot) {
-		if (flags == 0 || flags & XATTR_CREATE) {
-			xaroot = create_xa_root(inode->i_sb);
-			if (IS_ERR(xaroot))
-				return xaroot;
-		}
-		if (!xaroot)
-			return ERR_PTR(-ENODATA);
-	}
 
 	/* ok, we have xaroot open */
-
 	snprintf(namebuf, sizeof(namebuf), "%X.%X",
 		 le32_to_cpu(INODE_PKEY(inode)->k_objectid),
 		 inode->i_generation);
@@ -821,7 +776,7 @@ int reiserfs_delete_xattrs(struct inode 
 
 	/* Leftovers besides . and .. -- that's not good. */
 	if (dir->d_inode->i_nlink <= 2) {
-		root = get_xa_root(inode->i_sb);
+		root = get_xa_root(inode->i_sb, XATTR_REPLACE);
 		reiserfs_write_lock_xattrs(inode->i_sb);
 		err = vfs_rmdir(root->d_inode, dir);
 		reiserfs_write_unlock_xattrs(inode->i_sb);


-- 
Jeff Mahoney
SUSE Labs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fwd: Fw: [2.6.20.4] BUG: dentry xattrs still in use in shrink_dcache_for_umount() with reiserfs
  2007-04-21 14:31               ` Jeff Mahoney
@ 2007-04-21 14:34                 ` Jeff Mahoney
  2007-04-21 15:26                   ` [PATCH] reiserfs: fix xattr root locking/refcount bug Jeff Mahoney
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Mahoney @ 2007-04-21 14:34 UTC (permalink / raw)
  To: Jeff Mahoney
  Cc: a.righi, Andrew Morton, Vladimir V. Saveliev, linux-kernel,
	reiserfs-dev, Edward Shishkin, zam, ak

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jeff Mahoney wrote:
> Andrea Righi wrote:
>> Jeff Mahoney wrote:
>>> I have the patchset that I mentioned, but I'm not proposing it for 2.6.21.
>>> It's much too invasive to be introduced in an -rc7, but it does include
>>> locking changes that I believe avoid this bug.
>>>
>>> Vladimir was right in his analysis that sometimes get_xa_root() takes the
>>> reference once and other times twice, but not for the right reasons.  I save
>>> a reference to the xattr dir to avoid a lookup later, but when there are multiple
>>> getxattrs or listxattrs as the first xattr operation on the file system, we can end
>>> up taking a second reference when we shouldn't. This is because those operations
>>> are protected by read locks and the ->xattr_root pointer isn't protected by anything
>>> else. A quick fix would be to just extend the protection of the priv root's i_mutex
>>> around the assignment, and test first. The right fix involves a complete rework of
>>> the locking, and I have code to do that, it's just too late to include it in 2.6.21.
>>>
>>> I'd love to know what Andrea (and now Andi Kleen) are doing to hit this now. There
>>> haven't been any changes in this code in a while, and the shrink_dcache_for_umount()
>>> has been around since October. I'm unable to reproduce locally so far, so if Andrea
>>> or Andi could see if this fixes it for them, I'd appreciate it.
>> Jeff, your patch doesn't resolve, but you hit the problem. Actually, I
>> think the xattr_root pointer must be protected also in get_xa_root()
>> using the same private root i_mutex.
>>
>> I've tested the following patch and it resolves in my case. What do you
>> think about it? could it be a reasonable fix?
> 
> Your patch fixes the problem. That'll teach me to spin a patch up while I'm headed
> out the door. I think a cleaner solution for now is just to refactor get_xa_root,
> __get_xa_root, and create_xa_root into one function to simplify the lookup and the
> locking. I think I'll add another step to my patch set that removes the caching of
> xattr_root entirely or creates in on mount, as the priv root is created. The privroot
> must be cached to avoid a deadlock on the fs-root directory's i_mutex, but xattr_root
> was cached as an optimzation. If we need to take privroot->i_mutex every time, there's
> no point in caching it since it will probably be in the dcache anyway.
> 
> -Jeff
> 

Ignore this patch. I missed a spot.

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFGKiEHLPWxlyuTD7IRApRTAJ0WeJzkH7julSus9iqx+LYoAAlFOwCfahu0
4qN7BR/monh7ZcQWZ7Vmz3Y=
=UNNw
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] reiserfs: fix xattr root locking/refcount bug
  2007-04-21 14:34                 ` Jeff Mahoney
@ 2007-04-21 15:26                   ` Jeff Mahoney
  2007-04-21 19:39                     ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Mahoney @ 2007-04-21 15:26 UTC (permalink / raw)
  To: a.righi
  Cc: Andrew Morton, Vladimir V. Saveliev, linux-kernel, reiserfs-dev,
	Edward Shishkin, zam, ak

 The listxattr() and getxattr() operations are only protected by a read
 lock. As a result, if either of these operations run in parallel, a race
 condition exists where the xattr_root will end up being cached twice,
 which results in the leaking of a reference and a BUG() on umount.

 This patch refactors get_xa_root(), __get_xa_root(), and
 create_xa_root(), into one get_xa_root() function that takes
 the appropriate locking around the entire critical section.

 Changes from initial version:
 With the previous version, established file systems worked, but freshly
 created file systems would fail to mount due to a circular dependency
 with the privroot. New inodes created in directories without I_PRIVATE
 will inherit the parent's default ACL. When creating the privroot,
 we'd get -EOPNOTSUPP due to the privroot not existing while looking up
 the default ACL. The old get_xa_dir() only checked for a negative
 dentry and returned -ENODATA. This version treats a missing privroot
 as -ENODATA. This is safe since xattrs are disabled automatically in
 reiserfs_xattr_init() when the privroot can't be created or found.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>

---
 fs/reiserfs/xattr.c |   92 +++++++++++++---------------------------------------
 1 file changed, 24 insertions(+), 68 deletions(-)

--- a/fs/reiserfs/xattr.c	2007-04-21 10:15:00.000000000 -0400
+++ b/fs/reiserfs/xattr.c	2007-04-21 11:21:02.000000000 -0400
@@ -54,82 +54,48 @@
 static struct reiserfs_xattr_handler *find_xattr_handler_prefix(const char
 								*prefix);
 
-static struct dentry *create_xa_root(struct super_block *sb)
+/* Returns the dentry referring to the root of the extended attribute
+ * directory tree. If it has already been retrieved, it is used. If it
+ * hasn't been created and the flags indicate creation is allowed, we
+ * attempt to create it. On error, we return a pointer-encoded error.
+ */
+static struct dentry *get_xa_root(struct super_block *sb, int flags)
 {
 	struct dentry *privroot = dget(REISERFS_SB(sb)->priv_root);
 	struct dentry *xaroot;
 
 	/* This needs to be created at mount-time */
 	if (!privroot)
-		return ERR_PTR(-EOPNOTSUPP);
+		return ERR_PTR(-ENODATA);
 
-	xaroot = lookup_one_len(XAROOT_NAME, privroot, strlen(XAROOT_NAME));
-	if (IS_ERR(xaroot)) {
+	mutex_lock(&privroot->d_inode->i_mutex);
+	if (REISERFS_SB(sb)->xattr_root) {
+		xaroot = dget(REISERFS_SB(sb)->xattr_root);
 		goto out;
-	} else if (!xaroot->d_inode) {
-		int err;
-		mutex_lock(&privroot->d_inode->i_mutex);
-		err =
-		    privroot->d_inode->i_op->mkdir(privroot->d_inode, xaroot,
-						   0700);
-		mutex_unlock(&privroot->d_inode->i_mutex);
-
-		if (err) {
-			dput(xaroot);
-			dput(privroot);
-			return ERR_PTR(err);
-		}
-		REISERFS_SB(sb)->xattr_root = dget(xaroot);
 	}
 
-      out:
-	dput(privroot);
-	return xaroot;
-}
-
-/* This will return a dentry, or error, refering to the xa root directory.
- * If the xa root doesn't exist yet, the dentry will be returned without
- * an associated inode. This dentry can be used with ->mkdir to create
- * the xa directory. */
-static struct dentry *__get_xa_root(struct super_block *s)
-{
-	struct dentry *privroot = dget(REISERFS_SB(s)->priv_root);
-	struct dentry *xaroot = NULL;
-
-	if (IS_ERR(privroot) || !privroot)
-		return privroot;
-
 	xaroot = lookup_one_len(XAROOT_NAME, privroot, strlen(XAROOT_NAME));
 	if (IS_ERR(xaroot)) {
 		goto out;
 	} else if (!xaroot->d_inode) {
-		dput(xaroot);
-		xaroot = NULL;
-		goto out;
+		int err = -ENODATA;
+		if (flags == 0 || flags & XATTR_CREATE)
+			err = privroot->d_inode->i_op->mkdir(privroot->d_inode,
+			                                     xaroot, 0700);
+		if (err) {
+			dput(xaroot);
+			xaroot = ERR_PTR(err);
+			goto out;
+		}
 	}
-
-	REISERFS_SB(s)->xattr_root = dget(xaroot);
+	REISERFS_SB(sb)->xattr_root = dget(xaroot);
 
       out:
+	mutex_unlock(&privroot->d_inode->i_mutex);
 	dput(privroot);
 	return xaroot;
 }
 
-/* Returns the dentry (or NULL) referring to the root of the extended
- * attribute directory tree. If it has already been retrieved, it is used.
- * Otherwise, we attempt to retrieve it from disk. It may also return
- * a pointer-encoded error.
- */
-static inline struct dentry *get_xa_root(struct super_block *s)
-{
-	struct dentry *dentry = dget(REISERFS_SB(s)->xattr_root);
-
-	if (!dentry)
-		dentry = __get_xa_root(s);
-
-	return dentry;
-}
-
 /* Opens the directory corresponding to the inode's extended attribute store.
  * If flags allow, the tree to the directory may be created. If creation is
  * prohibited, -ENODATA is returned. */
@@ -138,21 +104,11 @@ static struct dentry *open_xa_dir(const 
 	struct dentry *xaroot, *xadir;
 	char namebuf[17];
 
-	xaroot = get_xa_root(inode->i_sb);
-	if (IS_ERR(xaroot)) {
+	xaroot = get_xa_root(inode->i_sb, flags);
+	if (IS_ERR(xaroot))
 		return xaroot;
-	} else if (!xaroot) {
-		if (flags == 0 || flags & XATTR_CREATE) {
-			xaroot = create_xa_root(inode->i_sb);
-			if (IS_ERR(xaroot))
-				return xaroot;
-		}
-		if (!xaroot)
-			return ERR_PTR(-ENODATA);
-	}
 
 	/* ok, we have xaroot open */
-
 	snprintf(namebuf, sizeof(namebuf), "%X.%X",
 		 le32_to_cpu(INODE_PKEY(inode)->k_objectid),
 		 inode->i_generation);
@@ -821,7 +777,7 @@ int reiserfs_delete_xattrs(struct inode 
 
 	/* Leftovers besides . and .. -- that's not good. */
 	if (dir->d_inode->i_nlink <= 2) {
-		root = get_xa_root(inode->i_sb);
+		root = get_xa_root(inode->i_sb, XATTR_REPLACE);
 		reiserfs_write_lock_xattrs(inode->i_sb);
 		err = vfs_rmdir(root->d_inode, dir);
 		reiserfs_write_unlock_xattrs(inode->i_sb);

-- 
Jeff Mahoney
SUSE Labs


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] reiserfs: fix xattr root locking/refcount bug
  2007-04-21 15:26                   ` [PATCH] reiserfs: fix xattr root locking/refcount bug Jeff Mahoney
@ 2007-04-21 19:39                     ` Andrew Morton
  2007-04-22 10:28                       ` Andrea Righi
  2007-04-22 20:49                       ` Jeff Mahoney
  0 siblings, 2 replies; 11+ messages in thread
From: Andrew Morton @ 2007-04-21 19:39 UTC (permalink / raw)
  To: Jeff Mahoney
  Cc: a.righi, Vladimir V. Saveliev, linux-kernel, reiserfs-dev,
	Edward Shishkin, zam, ak

On Sat, 21 Apr 2007 11:26:31 -0400 Jeff Mahoney <jeffm@suse.com> wrote:

>  The listxattr() and getxattr() operations are only protected by a read
>  lock. As a result, if either of these operations run in parallel, a race
>  condition exists where the xattr_root will end up being cached twice,
>  which results in the leaking of a reference and a BUG() on umount.
> 
>  This patch refactors get_xa_root(), __get_xa_root(), and
>  create_xa_root(), into one get_xa_root() function that takes
>  the appropriate locking around the entire critical section.

Great, thanks.

Now we need to work out the timing.  Our options are to shove
it into 2.6.21 immediately, or to give it a run in 2.6.22-rc1 then
backport into 2.6.21.x.

What is everyone's confidence level?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] reiserfs: fix xattr root locking/refcount bug
  2007-04-21 19:39                     ` Andrew Morton
@ 2007-04-22 10:28                       ` Andrea Righi
  2007-04-22 20:49                       ` Jeff Mahoney
  1 sibling, 0 replies; 11+ messages in thread
From: Andrea Righi @ 2007-04-22 10:28 UTC (permalink / raw)
  To: Andrew Morton, Jeff Mahoney
  Cc: Vladimir V. Saveliev, linux-kernel, reiserfs-dev, Edward Shishkin,
	zam, ak

Andrew Morton wrote:
> On Sat, 21 Apr 2007 11:26:31 -0400 Jeff Mahoney <jeffm@suse.com> wrote:
> 
>>  The listxattr() and getxattr() operations are only protected by a read
>>  lock. As a result, if either of these operations run in parallel, a race
>>  condition exists where the xattr_root will end up being cached twice,
>>  which results in the leaking of a reference and a BUG() on umount.
>>
>>  This patch refactors get_xa_root(), __get_xa_root(), and
>>  create_xa_root(), into one get_xa_root() function that takes
>>  the appropriate locking around the entire critical section.
> 
> Great, thanks.
> 
> Now we need to work out the timing.  Our options are to shove
> it into 2.6.21 immediately, or to give it a run in 2.6.22-rc1 then
> backport into 2.6.21.x.
> 
> What is everyone's confidence level?
> 

Tested the patch and I confirm that it fixes the problem. Regarding the
confidence level, in my case I can say that on a suse 10.0 I can always
reproduce the bug on every shutdown. Anyway, I'm not using the default
kernel shipped with suse 10.0, but maybe the same bug could be seen in
opensuse 10.3 that (if I'm not wrong) uses a 2.6.20.

BTW there are not enough details, but a similar problem seems to be
already reported here: https://bugzilla.novell.com/show_bug.cgi?id=259215

-Andrea

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] reiserfs: fix xattr root locking/refcount bug
  2007-04-21 19:39                     ` Andrew Morton
  2007-04-22 10:28                       ` Andrea Righi
@ 2007-04-22 20:49                       ` Jeff Mahoney
  1 sibling, 0 replies; 11+ messages in thread
From: Jeff Mahoney @ 2007-04-22 20:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: a.righi, Vladimir V. Saveliev, linux-kernel, reiserfs-dev,
	Edward Shishkin, zam, ak

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Morton wrote:
> On Sat, 21 Apr 2007 11:26:31 -0400 Jeff Mahoney <jeffm@suse.com> wrote:
> 
>>  The listxattr() and getxattr() operations are only protected by a read
>>  lock. As a result, if either of these operations run in parallel, a race
>>  condition exists where the xattr_root will end up being cached twice,
>>  which results in the leaking of a reference and a BUG() on umount.
>>
>>  This patch refactors get_xa_root(), __get_xa_root(), and
>>  create_xa_root(), into one get_xa_root() function that takes
>>  the appropriate locking around the entire critical section.
> 
> Great, thanks.
>  
> Now we need to work out the timing.  Our options are to shove
> it into 2.6.21 immediately, or to give it a run in 2.6.22-rc1 then
> backport into 2.6.21.x.
> 
> What is everyone's confidence level?

I'm pretty confident in this patch for 2.6.21, but I wouldn't object to
waiting until 2.6.22-rc1 either. Operationally, the change isn't that
big and makes the locking more clear and the code simpler. I've tested
on populated file systems, virgin file systems, and tested the error
handling path with each. There is still a bug lurking in how a failure
in ACL inheritance is cleaned up that I ran into while testing, but this
patch didn't introduce it or exacerbate it. I'll add that to my xattr
patch queue.

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFGK8p0LPWxlyuTD7IRAthXAJ9Hk4f40vwuir2fp2dyte5U1juzlgCeLyiW
UnrEFDbKp/iVAE+CrFVSmqs=
=6s1J
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2007-04-22 20:50 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20070413095219.02075323.akpm@linux-foundation.org>
     [not found] ` <462536B2.5060308@users.sourceforge.net>
     [not found]   ` <c04ebacf0704180152qdc72fcdnd0755dbf070b40d2@mail.gmail.com>
2007-04-18 14:16     ` Fwd: Fw: [2.6.20.4] BUG: dentry xattrs still in use in shrink_dcache_for_umount() with reiserfs Vladimir V. Saveliev
2007-04-18 15:00       ` Jeff Mahoney
2007-04-21  0:07         ` Andrew Morton
2007-04-21  1:45           ` Jeff Mahoney
2007-04-21 13:17             ` Andrea Righi
2007-04-21 14:31               ` Jeff Mahoney
2007-04-21 14:34                 ` Jeff Mahoney
2007-04-21 15:26                   ` [PATCH] reiserfs: fix xattr root locking/refcount bug Jeff Mahoney
2007-04-21 19:39                     ` Andrew Morton
2007-04-22 10:28                       ` Andrea Righi
2007-04-22 20:49                       ` Jeff Mahoney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox