From: NeilBrown <neilb@suse.de>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
Nick Piggin <npiggin@gmail.com>, Nick Piggin <npiggin@kernel.dk>,
linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] fs/dcache: allow __d_obtain_alias() to return unhashed dentries
Date: Mon, 2 Jul 2012 09:15:11 +1000 [thread overview]
Message-ID: <20120702091511.314fc8e8@notabene.brown> (raw)
In-Reply-To: <20120629202903.GB17103@fieldses.org>
[-- Attachment #1: Type: text/plain, Size: 6463 bytes --]
On Fri, 29 Jun 2012 16:29:03 -0400 "J. Bruce Fields" <bfields@fieldses.org>
wrote:
> On Fri, Jun 29, 2012 at 04:10:34PM -0400, J. Bruce Fields wrote:
> > On Thu, Jun 28, 2012 at 09:59:27AM -0400, J. Bruce Fields wrote:
> > > Coming back to this now, just trying to review the
> > > filehandle-lookup/dcache interactions:
> > >
> > > On Fri, Mar 11, 2011 at 03:07:49PM +1100, NeilBrown wrote:
> > > > 1/ Originally DCACHE_DISCONNECTED didn't really mean much - it's presence
> > > > was only a hint, its absence was a strong statement.
> > > > If the flag is set, the dentry might not be linked to the root.
> > > > If it is clear, it definitely is link through to the root.
> > > > However I think it was used with stronger intent than that.
> > > >
> > > > Now it seems to mean a little bit more: If it is set and the dentry
> > > > is hashed, then it must be on the sb->s_anon list.
> > >
> > > The code that makes that assumption is __d_shrink (which does the work
> > > of d_drop)--it uses DCACHE_DISCONECTED to decide which hash chain to
> > > lock.
> > >
> > > I can't find any basis for that assumption. The only code that clears
> > > DCACHE_DISCONNECTED is in expfs.c, and it isn't done at the same time as
> > > hashing. Am I missing something?
> > >
> > > > This is a significant
> > > > which I never noticed (I haven't been watching). Originally a
> > > > disconnected dentry would be attached (and hashed) to its parent. Then
> > > > that parent would get its own parent and so on until it was attached all
> > > > the way to the root. Only then would be start clearing
> > > > DCACHE_DISCONNECTED. It seems we must clear it sooner now... I wonder if
> > > > that is correct.
> > >
> > > It looks wrong to me:
> > >
> > > If we clear DCACHE_DISCONNECTED too early, then we risk a filehandle
> > > lookup thinking the dentry is OK to use. That could mean for example
> > > trying to rename across directories that don't have any ancestor
> > > relationship to each other in the dcache yet.
> > >
> > > So we need to wait to clear DCACHE_DISCONNECTED until we *know* the
> > > dentry's parents go all the way back to the root. As you say, that's
> > > what the current code does.
> > >
> > > But that means DCACHE_DISCONNECTED dentries can be hashed to their
> > > parents, and __d_shrink can be handed such dentries and then get the
> > > locking wrong.
> > >
> > > It looks like this bug might originate with Nick Piggin's ceb5bdc2d246
> > > "fs: dcache per-bucket dcache hash locking"? There's no discussion in
> > > the changelog, so probably it was just based on an unexamined assumption
> > > about DCACHE_DISCONNECTED.
> > >
> > > I wonder if an IS_ROOT() test could replace the DCACHE_DISCONNECTED test
> > > in __d_shrink(), or if we need another flag, or ?
> >
> > Bah, sorry, and I only just noticed that you already said as much later
> > and did the IS_ROOT() thing in your patch.
> >
> > Anyway, here's just that one change with a slightly more painstaking
> > changelog.
> >
> > --b.
> >
> > commit b1fa644355122627424fe2240a9fc60cbef4c349
> > Author: J. Bruce Fields <bfields@redhat.com>
> > Date: Thu Jun 28 12:10:55 2012 -0400
> >
> > dcache: use IS_ROOT to decide where dentry is hashed
> >
> > Every hashed dentry is either hashed in the dentry_hashtable, or a
> > superblock's s_anon list.
> >
> > __d_shrink assumes it can determine which is the case by checking
> > DCACHE_DISCONNECTED; this is not true.
> >
> > It is true that when DCACHE_DISCONNECTED is cleared, the dentry is not
> > only hashed on dentry_hashtable, but is fully connected to its parents
> > back to the root.
> >
> > But the converse is *not* true: fs/exportfs/expfs.c:reconnect_path()
> > attempts to connect a directory (found by filehandle lookup) back to
> > root by ascending to parents and performing lookups one at a time. It
> > does not clear DCACHE_DISCONNECTED until its done, and that is not at
> > all an atomic process.
> >
> > In particular, it is possible for DCACHE_DISCONNECTED to be set on a
> > dentry which is hashed on the dentry_hashtable.
> >
> > Instead, use IS_ROOT() to check which hash chain a dentry is on. This
> > *does* work:
> >
> > Dentries are hashed only by:
> >
> > - d_obtain_alias, which adds an IS_ROOT() dentry to sb_anon.
> >
> > - __d_rehash, called by _d_rehash: hashes to the dentry's
> > parent, and all callers of _d_rehash appear to have d_parent
> > set to a "real" parent.
> > - __d_rehash, called by __d_move: rehashes the moved dentry to
> > hash chain determined by target, and assigns target's d_parent
> > to its d_parent, before dropping the dentry's d_lock.
> >
> > Therefore I believe it's safe for a holder of a dentry's d_lock to
> > assume that it is hashed on sb_anon if and only if IS_ROOT(dentry) is
> > true.
> >
> > I believe the incorrect assumption about DCACHE_DISCONNECTED was
> > originally introduced by ceb5bdc2d246 "fs: dcache per-bucket dcache hash
> > locking".
> >
> > Cc: Neil Brown <neilb@suse.de>
> > Cc: Nick Piggin <npiggin@kernel.dk>
> > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> >
> > diff --git a/fs/dcache.c b/fs/dcache.c
> > index 87c2da7..b2b382c 100644
> > --- a/fs/dcache.c
> > +++ b/fs/dcache.c
> > @@ -410,7 +410,7 @@ static void __d_shrink(struct dentry *dentry)
> > {
> > if (!d_unhashed(dentry)) {
> > struct hlist_bl_head *b;
> > - if (unlikely(dentry->d_flags & DCACHE_DISCONNECTED))
> > + if (unlikely(IS_ROOT(dentry->d_flags)))
>
> Um, right--I'll send an actual tested version along with some other
> stuff later.
:-)
If that tested version looks like:
if (unlikely(IS_ROOT(dentry)))
you can add a
Reviewed-by: NeilBrown <neilb@suse.de>
Thanks,
NeilBrown
>
> --b.
>
> > b = &dentry->d_sb->s_anon;
> > else
> > b = d_hash(dentry->d_parent, dentry->d_name.hash);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
prev parent reply other threads:[~2012-07-01 23:15 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-11-12 18:43 lifetime of DCACHE_DISCONECTED dentries J. Bruce Fields
2010-11-13 11:53 ` Nick Piggin
2010-11-15 17:48 ` J. Bruce Fields
[not found] ` <20101115174837.GB10044-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-11-16 6:45 ` Nick Piggin
2010-11-29 3:56 ` Nick Piggin
2010-11-29 19:32 ` J. Bruce Fields
[not found] ` <20101129193248.GA9897-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-11-30 1:00 ` Nick Piggin
[not found] ` <AANLkTikwzDJ_q65==uxDsAhp3h8bU7Rkt7U9gVgRAK0D-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-11-30 18:39 ` J. Bruce Fields
2010-12-03 22:33 ` [PATCH] nfsd4: allow __d_obtain_alias() to return unhashed dentries J. Bruce Fields
2010-12-13 5:19 ` Nick Piggin
2010-12-14 22:01 ` J. Bruce Fields
[not found] ` <20101214220102.GM24828-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-12-17 17:53 ` [PATCH] fs/dcache: use standard list macro for d_find_alias J. Bruce Fields
2010-12-17 18:00 ` [PATCH 2/2] fs/dcache: allow __d_obtain_alias() to return unhashed dentries J. Bruce Fields
2010-12-18 2:01 ` Nick Piggin
2010-12-18 16:16 ` J. Bruce Fields
[not found] ` <20101218161609.GA22150-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-12-19 14:53 ` Nick Piggin
[not found] ` <AANLkTingRv_gtRSctGzMfYrKg02M_sKj97HSQPRm_mA_-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-12-27 23:46 ` [PATCH] " J. Bruce Fields
[not found] ` <20101227234641.GA22248-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2011-01-18 20:45 ` J. Bruce Fields
[not found] ` <20110118204509.GA10903-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2011-01-18 22:02 ` Nick Piggin
[not found] ` <AANLkTikL2CDSWQJ1QH_Y4G-j70Vd=VesNMMnYTmMGHC9-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-01-18 22:08 ` J. Bruce Fields
[not found] ` <20110118220817.GF10903-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2011-03-08 18:13 ` J. Bruce Fields
[not found] ` <20110308181320.GA15566-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2011-03-10 10:58 ` Al Viro
[not found] ` <20110310105821.GE22723-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2011-03-11 4:07 ` NeilBrown
[not found] ` <20110311150749.2fa2be66-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2012-02-14 17:03 ` J. Bruce Fields
[not found] ` <20120214170300.GA4309-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-02-15 16:56 ` J. Bruce Fields
[not found] ` <20120215165633.GE12490-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-02-16 3:06 ` NeilBrown
[not found] ` <20120216140603.08cb4900-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2012-02-16 11:51 ` J. Bruce Fields
[not found] ` <20120216115133.GA20279-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-02-16 16:08 ` J. Bruce Fields
2012-02-16 22:30 ` J. Bruce Fields
[not found] ` <20120216223011.GA23997-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-02-17 16:34 ` Peng Tao
2012-03-13 20:55 ` J. Bruce Fields
2012-03-13 20:58 ` [PATCH 1/2] vfs: stop d_splice_alias creating directory aliases J. Bruce Fields
2012-03-13 20:58 ` [PATCH 2/2] vfs: remove unused __d_splice_alias argument J. Bruce Fields
2012-02-20 2:55 ` [PATCH] fs/dcache: allow __d_obtain_alias() to return unhashed dentries NeilBrown
[not found] ` <20120220135537.3078e20b-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2012-02-29 23:10 ` J. Bruce Fields
2012-06-28 13:59 ` J. Bruce Fields
[not found] ` <20120628135927.GA6406-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-06-29 20:10 ` J. Bruce Fields
[not found] ` <20120629201034.GA17103-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-06-29 20:29 ` J. Bruce Fields
2012-07-01 23:15 ` NeilBrown [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120702091511.314fc8e8@notabene.brown \
--to=neilb@suse.de \
--cc=bfields@fieldses.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=npiggin@gmail.com \
--cc=npiggin@kernel.dk \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).