From mboxrd@z Thu Jan 1 00:00:00 1970 From: Uwe =?iso-8859-1?Q?Kleine-K=F6nig?= Subject: Re: [PATCH] fs: Work around NFS wreckage Date: Thu, 13 Jan 2011 15:12:50 +0100 Message-ID: <20110113141250.GR24920@pengutronix.de> References: Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Nick Piggin , LKML , Linus Torvalds , linux-fsdevel@vger.kernel.org, "Ramirez Luna, Omar" To: Thomas Gleixner Return-path: Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org Hello, On Thu, Jan 13, 2011 at 02:54:30PM +0100, Thomas Gleixner wrote: > The dcache scalability work broke NFS root filesystems. >=20 > "cd /" results in the following problem: >=20 > link_path_walk("/",...); > jumps to return_reval > need_reval_dot() returns true for NFS > d_revalidate() > dentry->d_op->d_revalidate(dentry, nd); > returns -ECHILD due to nd->flags & LOOKUP_RCU > nameidata_dentry_drop_rcu() > spin_lock(&parent->d_lock); > spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED); >=20 > This deadlocks because dentry =3D=3D parent >=20 > This problem exists for any filesystem which implements d_revalidate. >=20 > Uwe bisected is down to commit 34286d6(fs: rcu-walk aware d_revalidat= e > method), but reverting that patch causes different wreckage to show u= p. >=20 > Check for parent equal dentry and skip the nested lock to avoid the > deadlock. I'm sure this is the wrong fix, but at least it "works" :) >=20 > Reported-by: Uwe Kleine-Koenig > Reported-by: "Ramirez Luna, Omar" > Not-Signed-off-by: Thomas Gleixner > --- > fs/namei.c | 4 ++++ > 1 file changed, 4 insertions(+) >=20 > Index: linux-2.6/fs/namei.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- linux-2.6.orig/fs/namei.c > +++ linux-2.6/fs/namei.c > @@ -487,6 +487,8 @@ static int nameidata_dentry_drop_rcu(str > goto err_root; > } > spin_lock(&parent->d_lock); > + if (parent =3D=3D dentry) > + goto same; > spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED); > if (!__d_rcu_to_refcount(dentry, nd->seq)) > goto err; > @@ -499,6 +501,8 @@ static int nameidata_dentry_drop_rcu(str > BUG_ON(!parent->d_count); > parent->d_count++; > spin_unlock(&dentry->d_lock); > + > +same: > spin_unlock(&parent->d_lock); > if (nd->root.mnt) { > path_get(&nd->root); >=20 Note there is a different patch available in the thread here: http://thread.gmane.org/gmane.linux.kernel/1087013/focus=3D1087048 The differences are that it tests for IS_ROOT(dentry) instead of parent= =3D=3D dentry (which looks more reasonable IMVHO) and that it increases parent->d_count even if the test triggered. (And it doesn't skip the BUG_ONs which hopefully doesn't make a difference.) Note I really have no glue about the code below fs/, but I wonder if the toplevel directories of mounts need some treatment here, too. (But I expect that they don't. So I ask just in case ...) Best regards Uwe --=20 Pengutronix e.K. | Uwe Kleine-K=F6nig = | Industrial Linux Solutions | http://www.pengutronix.de/= |