From: Josef Bacik <jbacik@fb.com>
To: "J. Bruce Fields" <bfields@fieldses.org>,
"Eric W. Biederman" <ebiederm@xmission.com>
Cc: <linux-fsdevel@vger.kernel.org>
Subject: Re: find_fh_dentry returned a DISCONNECTED directory
Date: Fri, 14 Feb 2014 11:38:25 -0500 [thread overview]
Message-ID: <52FE4681.40901@fb.com> (raw)
In-Reply-To: <20140214161412.GH21982@fieldses.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 02/14/2014 11:14 AM, J. Bruce Fields wrote:
> On Fri, Feb 14, 2014 at 07:49:35AM -0800, Eric W. Biederman wrote:
>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>
>>> On Thu, Feb 13, 2014 at 08:25:43PM -0800, Eric W. Biederman
>>> wrote:
>>>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>>>
>>>>> On Thu, Feb 13, 2014 at 03:45:16PM -0800, Eric W. Biederman
>>>>> wrote:
>>>>>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>>>>>
>>>>>>> Yesterday you passed on a report of this printk from
>>>>>>> nfsdfh.c firing:
>>>>>>>
>>>>>>> printk("nfsd: find_fh_dentry returned a DISCONNECTED
>>>>>>> directory: %pd2\n", dentry);
>>>>>>>
>>>>>>> I think the dentry probably comes from the FILEID_ROOT
>>>>>>> case of:
>>>>>>>
>>>>>>> if (fileid_type == FILEID_ROOT) dentry =
>>>>>>> dget(exp->ex_path.dentry); else { dentry =
>>>>>>> exportfs_decode_fh(exp->ex_path.mnt, fid, data_left,
>>>>>>> fileid_type, nfsd_acceptable, exp); }
>>>>>>>
>>>>>>> In that case the dentry was found using ordinary
>>>>>>> filesystem lookups, so doesn't go through the same
>>>>>>> DISCONNECTED-clearing logic as in the case of lookups
>>>>>>> by filehandle.
>>>>>>>
>>>>>>> Probably they have an export root that's not a
>>>>>>> filesystem root, and the lookups happened in the right
>>>>>>> order?
>>>>>>>
>>>>>>> I suspect that's fine, and that the printk is just
>>>>>>> stupid, but maybe we should clear DISCONNECTED when
>>>>>>> possible on normal lookups. The following is my
>>>>>>> attempt, though I'm not sure if d_alloc is the right
>>>>>>> place to do this. In any case it might help confirm
>>>>>>> this is what's happening.
>>>>>>>
>>>>>>> So if you pass along this patch to the person who was
>>>>>>> seeing that printk I'd be interested in the results.
>>>>>>
>>>>>> I have been reading through the dentry code for other
>>>>>> reasons and your patch definitely won't change anything.
>>>>>> __d_alloc sets d_flags = 0. Therefore d_alloc always
>>>>>> returns with d_flags == 0.
>>>>>
>>>>> You're right, of course. I wasn't thinking straight.
>>>>>
>>>>> So the only dentries with DISCONNECTED set are those
>>>>> created with d_obtain_alias, which is normally only used
>>>>> when you're looking up by filehandle.
>>>>>
>>>>> Except btrfs has a weird use in get_default_root(). So
>>>>> maybe they were running into the dentry that created?
>>>>>
>>>>> So btrfs should probably be using something else, I'm not
>>>>> sure what.
>>>>
>>>> The nfs client also has the case where it uses DISCONNECTED
>>>> dentries for directories that are not root on the server.
>>>> Which seems very similiar to the btrfs case.
>>>
>>> I don't think there's any reason for those to be flagged
>>> DISCONNECTED either.
>>
>> The only practical difference between the two cases is how
>> quickly it is desirable to connect the entries.
>>
>> The disconnected dentries processed by exportfs are dentries that
>> we want to connect immediately, and it is an error/problem to
>> have the disconnected after processing.
>>
>> The dentries that are the roots of file systems we want to
>> connect them if we get the chance with d_materialise_unique but
>> we don't care if they go long periods without being connected.
>>
>> I believe we want both groups of dentries on the s_anon list so
>> that if they remain disconnected when the filesystem is unmounted
>> we can find them and deal with them.
>
> Note it's IS_ROOT(), not DCACHE_DISCONNECTED, that determines
> whether a hashed dentry is on s_anon or not. (See
> 7632e465feb182cadc3c9aa1282a057201818a8c for more detailed
> discussion.)
>
>> I can see distinguishing between dentries that are supposed to
>> be disconnected for a short time, and dentries that are supposed
>> to be disconnected indefinitely but we currently (as of 3.14-rc1)
>> don't have that distinction.
>
> I believe we do: DCACHE_DISCONNECTED is for the former, and the
> latter should be IS_ROOT(), !DCACHE_DISCONNECTED.
>
> DCACHE_DISCONNECTED was intended to be used only for dentries
> created while performing lookup-by-filehandle which have not yet
> been confirmed to be linked back to filesystem root by a chain of
> ->d_parent pointers.
>
>> But a blanket statement that the long term disconnected dentries
>> are doing it wrong seems off base.
>>
>> If those dentires can tolerate not being on the s_anon list
>> d_alloc_pseudo or d_make_root looks like it will serve just as
>> well
>
> The difference is that d_alloc_pseudo and d_make_root
> unconditionally create new dentries, whereas d_materialise_unique
> lets us reuse an existing dentry.
>
> (In the NFS client case, I believe this happens for example when
> you mount export A from a server, then export B from the same
> server, and then one day you look up A/foo and find out that it's
> the same directory as B's root. You don't want to duplicate the
> inode or give it multiple dentries, so you instead reuse the
> existing IS_ROOT dentry for B to represent A/foo.
>
> In the btrfs case I guess it's a similar situation but with two
> subvolumes instead of two exports?)
>
>> from the perspective of d_materialise_unique, but that leaves me
>> with the queasy feeling that we will leak dentries and inodes
>> when we unmount the filesystems in question, if those dentries
>> have never been attached.
>
> In the NFS server (lookup-by-filehandle) case, I believe dentries
> in the process of being reconnected are all children of some
> IS_ROOT dentry which is on the s_anon list, so everyone's accounted
> for.
>
>
> Thanks for looking at this as I've found myself easily confused by
> it all.... (And judging by some of the code in the tree I'm not
> alone.)
>
Ok using d_make_root makes us leak inodes on umount (I don't know why,
I'm not drunk enough yet.) I haven't tried it yet but I'm almost 100%
positive if I just clear out DISCONNECTED we will BUG_ON() in
d_splice_alias() when we go to walk into the subvol we've mounted with
the default subvol option. So what I think I need to do is use use
d_materialise_unique in our lookup function instead of
d_splice_alias() and keep using d_obtain_alias() in get_default_root()
and just clear DISOCONNECTED on success. Does this sound crazy to
anybody else? Thanks,
Josef
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iQIcBAEBAgAGBQJS/kaBAAoJEANb+wAKly3BXp8P/RiWEPYsJXwmD9N2JamQZawa
p5Fh507ZX3PRaOAZ6zgWB2uN4xRQtqFpsIjoZl6c2NoxnR4L89d4x/ExBF39WE2t
PmK3Ot/KJ6GXBVHSyexWJMix8R8u40eKHJcyywqj35HhHML47Ll1fy6k2vmRno5q
FVP+E0kLiBOH0W2ae8o7kPImLp2Ue7moDvExlqKTI3jeRPQI5u+/lH4uj8oVQXx4
/HNoZ7+htzi0Eb+mA+ve0k2/efiUlPloerE0oFUaCrrOF0HMLAVoXxhiDR7eDM96
nZ1LHZ422DOBRCBQ6kpDo7iAebQyRKx6rdkdr/n5/5Bs0iTevg3bWy8WEbMP+NvW
OR+mQ4r3X8J7t91Dp15OJuiEPhyH6lb2McPcxq/ozRDcGR0enZ1iYGl/GNw02eaw
1/JS4+nwokLCcEvTiW16n8sLGv+iU4fXawisjdKs76KQGO+rzdOd2msYjnOF8ZgT
YTO4k689qlWJu6TtY8iC0fRStIksTAMesMmCoCBl+2zkGsHMS9tMQt3kzIF91UP4
WTuZOBxdqByKmQiWy0INKhYXOSVxAUs27JuaHPIBUz65fDDu80IW+2Ih1sro73y/
+t9+7vSueviUub1azUocEdYxuc8mDQ7totpzrJBYHCF7C1T2DZV5nEpGUHD/d19R
3QwsV2oPiu230jh+nJR2
=VIUq
-----END PGP SIGNATURE-----
next prev parent reply other threads:[~2014-02-14 16:38 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-13 21:27 find_fh_dentry returned a DISCONNECTED directory J. Bruce Fields
2014-02-13 23:45 ` Eric W. Biederman
2014-02-14 3:30 ` J. Bruce Fields
2014-02-14 4:25 ` Eric W. Biederman
2014-02-14 14:46 ` J. Bruce Fields
2014-02-14 15:49 ` Eric W. Biederman
2014-02-14 16:14 ` J. Bruce Fields
2014-02-14 16:38 ` Josef Bacik [this message]
2014-02-14 16:45 ` J. Bruce Fields
2014-02-14 17:02 ` Josef Bacik
2014-02-14 17:14 ` Eric W. Biederman
2014-02-14 17:11 ` Eric W. Biederman
2014-02-14 17:02 ` Eric W. Biederman
2014-02-14 22:19 ` J. Bruce Fields
2014-02-14 22:41 ` J. Bruce Fields
2014-02-14 14:17 ` Josef Bacik
2014-02-14 15:13 ` Josef Bacik
2014-02-14 15:38 ` J. Bruce Fields
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52FE4681.40901@fb.com \
--to=jbacik@fb.com \
--cc=bfields@fieldses.org \
--cc=ebiederm@xmission.com \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).