From: Josef Bacik <jbacik@fb.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
<linux-fsdevel@vger.kernel.org>
Subject: Re: find_fh_dentry returned a DISCONNECTED directory
Date: Fri, 14 Feb 2014 12:02:23 -0500 [thread overview]
Message-ID: <52FE4C1F.2020907@fb.com> (raw)
In-Reply-To: <20140214164507.GI21982@fieldses.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 02/14/2014 11:45 AM, J. Bruce Fields wrote:
> On Fri, Feb 14, 2014 at 11:38:25AM -0500, Josef Bacik wrote:
>
>
> On 02/14/2014 11:14 AM, J. Bruce Fields wrote:
>>>> On Fri, Feb 14, 2014 at 07:49:35AM -0800, Eric W. Biederman
>>>> wrote:
>>>>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>>>>
>>>>>> On Thu, Feb 13, 2014 at 08:25:43PM -0800, Eric W.
>>>>>> Biederman wrote:
>>>>>>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>>>>>>
>>>>>>>> On Thu, Feb 13, 2014 at 03:45:16PM -0800, Eric W.
>>>>>>>> Biederman wrote:
>>>>>>>>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>>>>>>>>
>>>>>>>>>> Yesterday you passed on a report of this printk
>>>>>>>>>> from nfsdfh.c firing:
>>>>>>>>>>
>>>>>>>>>> printk("nfsd: find_fh_dentry returned a
>>>>>>>>>> DISCONNECTED directory: %pd2\n", dentry);
>>>>>>>>>>
>>>>>>>>>> I think the dentry probably comes from the
>>>>>>>>>> FILEID_ROOT case of:
>>>>>>>>>>
>>>>>>>>>> if (fileid_type == FILEID_ROOT) dentry =
>>>>>>>>>> dget(exp->ex_path.dentry); else { dentry =
>>>>>>>>>> exportfs_decode_fh(exp->ex_path.mnt, fid,
>>>>>>>>>> data_left, fileid_type, nfsd_acceptable, exp); }
>>>>>>>>>>
>>>>>>>>>> In that case the dentry was found using ordinary
>>>>>>>>>> filesystem lookups, so doesn't go through the
>>>>>>>>>> same DISCONNECTED-clearing logic as in the case
>>>>>>>>>> of lookups by filehandle.
>>>>>>>>>>
>>>>>>>>>> Probably they have an export root that's not a
>>>>>>>>>> filesystem root, and the lookups happened in the
>>>>>>>>>> right order?
>>>>>>>>>>
>>>>>>>>>> I suspect that's fine, and that the printk is
>>>>>>>>>> just stupid, but maybe we should clear
>>>>>>>>>> DISCONNECTED when possible on normal lookups.
>>>>>>>>>> The following is my attempt, though I'm not sure
>>>>>>>>>> if d_alloc is the right place to do this. In any
>>>>>>>>>> case it might help confirm this is what's
>>>>>>>>>> happening.
>>>>>>>>>>
>>>>>>>>>> So if you pass along this patch to the person who
>>>>>>>>>> was seeing that printk I'd be interested in the
>>>>>>>>>> results.
>>>>>>>>>
>>>>>>>>> I have been reading through the dentry code for
>>>>>>>>> other reasons and your patch definitely won't
>>>>>>>>> change anything. __d_alloc sets d_flags = 0.
>>>>>>>>> Therefore d_alloc always returns with d_flags ==
>>>>>>>>> 0.
>>>>>>>>
>>>>>>>> You're right, of course. I wasn't thinking
>>>>>>>> straight.
>>>>>>>>
>>>>>>>> So the only dentries with DISCONNECTED set are those
>>>>>>>> created with d_obtain_alias, which is normally only
>>>>>>>> used when you're looking up by filehandle.
>>>>>>>>
>>>>>>>> Except btrfs has a weird use in get_default_root().
>>>>>>>> So maybe they were running into the dentry that
>>>>>>>> created?
>>>>>>>>
>>>>>>>> So btrfs should probably be using something else, I'm
>>>>>>>> not sure what.
>>>>>>>
>>>>>>> The nfs client also has the case where it uses
>>>>>>> DISCONNECTED dentries for directories that are not root
>>>>>>> on the server. Which seems very similiar to the btrfs
>>>>>>> case.
>>>>>>
>>>>>> I don't think there's any reason for those to be flagged
>>>>>> DISCONNECTED either.
>>>>>
>>>>> The only practical difference between the two cases is how
>>>>> quickly it is desirable to connect the entries.
>>>>>
>>>>> The disconnected dentries processed by exportfs are
>>>>> dentries that we want to connect immediately, and it is an
>>>>> error/problem to have the disconnected after processing.
>>>>>
>>>>> The dentries that are the roots of file systems we want to
>>>>> connect them if we get the chance with d_materialise_unique
>>>>> but we don't care if they go long periods without being
>>>>> connected.
>>>>>
>>>>> I believe we want both groups of dentries on the s_anon
>>>>> list so that if they remain disconnected when the
>>>>> filesystem is unmounted we can find them and deal with
>>>>> them.
>>>>
>>>> Note it's IS_ROOT(), not DCACHE_DISCONNECTED, that
>>>> determines whether a hashed dentry is on s_anon or not. (See
>>>> 7632e465feb182cadc3c9aa1282a057201818a8c for more detailed
>>>> discussion.)
>>>>
>>>>> I can see distinguishing between dentries that are supposed
>>>>> to be disconnected for a short time, and dentries that are
>>>>> supposed to be disconnected indefinitely but we currently
>>>>> (as of 3.14-rc1) don't have that distinction.
>>>>
>>>> I believe we do: DCACHE_DISCONNECTED is for the former, and
>>>> the latter should be IS_ROOT(), !DCACHE_DISCONNECTED.
>>>>
>>>> DCACHE_DISCONNECTED was intended to be used only for
>>>> dentries created while performing lookup-by-filehandle which
>>>> have not yet been confirmed to be linked back to filesystem
>>>> root by a chain of ->d_parent pointers.
>>>>
>>>>> But a blanket statement that the long term disconnected
>>>>> dentries are doing it wrong seems off base.
>>>>>
>>>>> If those dentires can tolerate not being on the s_anon list
>>>>> d_alloc_pseudo or d_make_root looks like it will serve
>>>>> just as well
>>>>
>>>> The difference is that d_alloc_pseudo and d_make_root
>>>> unconditionally create new dentries, whereas
>>>> d_materialise_unique lets us reuse an existing dentry.
>>>>
>>>> (In the NFS client case, I believe this happens for example
>>>> when you mount export A from a server, then export B from the
>>>> same server, and then one day you look up A/foo and find out
>>>> that it's the same directory as B's root. You don't want to
>>>> duplicate the inode or give it multiple dentries, so you
>>>> instead reuse the existing IS_ROOT dentry for B to represent
>>>> A/foo.
>>>>
>>>> In the btrfs case I guess it's a similar situation but with
>>>> two subvolumes instead of two exports?)
>>>>
>>>>> from the perspective of d_materialise_unique, but that
>>>>> leaves me with the queasy feeling that we will leak
>>>>> dentries and inodes when we unmount the filesystems in
>>>>> question, if those dentries have never been attached.
>>>>
>>>> In the NFS server (lookup-by-filehandle) case, I believe
>>>> dentries in the process of being reconnected are all children
>>>> of some IS_ROOT dentry which is on the s_anon list, so
>>>> everyone's accounted for.
>>>>
>>>>
>>>> Thanks for looking at this as I've found myself easily
>>>> confused by it all.... (And judging by some of the code in
>>>> the tree I'm not alone.)
>>>>
>
> Ok using d_make_root makes us leak inodes on umount (I don't know
> why, I'm not drunk enough yet.)
>
>> Oh, well, a little early in the morning, but that's a solveable
>> problem.
>
> I haven't tried it yet but I'm almost 100% positive if I just clear
> out DISCONNECTED we will BUG_ON() in d_splice_alias() when we go to
> walk into the subvol we've mounted with the default subvol option.
>
>> I'd expect it not to BUG_ON(), but instead to fall into the !new
>> case and create a second alias for that directory. Which
>> probably actually causes the same problems d_make_root does.
>
(weird that thunderbird is treating my replies as the most recent
reply, but whatever)
So neither of these cases happened, I just didn't see _anything_ in
the directory when I mounted the fs. Once I mounted the parent subvol
and walked into the default subvol I could see everything properly.
This is with clearing DISCONNECTED and still using d_splice_alias() in
lookup.
Once I switched d_splice_alias() to d_materialise_unique() in
btrfs_lookup() everything went back to working as normal. So I'm
calling it a win and forgetting we ever had this conversation. Thanks,
Josef
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iQIcBAEBAgAGBQJS/kwfAAoJEANb+wAKly3BHSIP/jx01XDxT98Tk4TCRz2R12Cb
HmgWZ8etu82MPp/KChuXnidxJU00H7BOfaY4vs1q8XFz+MZg3ZCiazp5zu2RVPct
yjMYRY7XPEQcce0Oj5h40dfhclJ61eSupuUd7u+gNpNErsUdaFCpbtXGqVeGAuI4
lMbi8Tqd5YKNa3e+bs8OHEGRG+pQv1Ak2D+dcS+KDn3/Np38wC9eGRsQsZocubQf
EpjERcX3R8ug3k5C/sX15SkpVfHWZS2ydL0ypnOVAI5XZD9ufT+HyyAqhgFZFPlK
WhDTWH54gnq6e2RxU3bKN+q7VzwBD2L/82wHk0/bykXDL+56JRJ0MMptBimWt2M2
QH7+aUN3pM/hm0TiW3qC6WkfM4WBdEME/Mzf8s5vVJT4gQj7u0Xh3RlOX0Y+OEwV
MvICzlXNNT9jbp7EHHzDeJuSbvpLmvVOL24Tf9UmjYO2WXHIJK2bSW1ai2V5v+zx
jZ+NeAg8tfkpxKANbDxtOAzrehHTk8gPimsO3QXFlRvUyUggZdd524Chl07dxnQZ
JNvn4C9JrRkYE32d4dKrjmJvnQXsFKw1KlCvp14nLpQP6q5+J70KRo85NtcpK3fo
tUU24jzCD7m6W+Y5tT5rJgDLz7IAmTZ/1WTI9bc9YMk61ci7GlLE0g4hYbTWfS5h
Q0A5+xddyQT/I5HtCoPZ
=dBiR
-----END PGP SIGNATURE-----
next prev parent reply other threads:[~2014-02-14 17:02 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-13 21:27 find_fh_dentry returned a DISCONNECTED directory J. Bruce Fields
2014-02-13 23:45 ` Eric W. Biederman
2014-02-14 3:30 ` J. Bruce Fields
2014-02-14 4:25 ` Eric W. Biederman
2014-02-14 14:46 ` J. Bruce Fields
2014-02-14 15:49 ` Eric W. Biederman
2014-02-14 16:14 ` J. Bruce Fields
2014-02-14 16:38 ` Josef Bacik
2014-02-14 16:45 ` J. Bruce Fields
2014-02-14 17:02 ` Josef Bacik [this message]
2014-02-14 17:14 ` Eric W. Biederman
2014-02-14 17:11 ` Eric W. Biederman
2014-02-14 17:02 ` Eric W. Biederman
2014-02-14 22:19 ` J. Bruce Fields
2014-02-14 22:41 ` J. Bruce Fields
2014-02-14 14:17 ` Josef Bacik
2014-02-14 15:13 ` Josef Bacik
2014-02-14 15:38 ` J. Bruce Fields
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52FE4C1F.2020907@fb.com \
--to=jbacik@fb.com \
--cc=bfields@fieldses.org \
--cc=ebiederm@xmission.com \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).