linux-fsdevel.vger.kernel.org archive mirror
From: Josef Bacik <jbacik@fb.com>
To: "J. Bruce Fields" <bfields@fieldses.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>
Cc: <linux-fsdevel@vger.kernel.org>
Subject: Re: find_fh_dentry returned a DISCONNECTED directory
Date: Fri, 14 Feb 2014 11:38:25 -0500
Message-ID: <52FE4681.40901@fb.com>
In-Reply-To: <20140214161412.GH21982@fieldses.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



On 02/14/2014 11:14 AM, J. Bruce Fields wrote:
> On Fri, Feb 14, 2014 at 07:49:35AM -0800, Eric W. Biederman wrote:
>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>> 
>>> On Thu, Feb 13, 2014 at 08:25:43PM -0800, Eric W. Biederman
>>> wrote:
>>>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>>> 
>>>>> On Thu, Feb 13, 2014 at 03:45:16PM -0800, Eric W. Biederman
>>>>> wrote:
>>>>>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>>>>> 
>>>>>>> Yesterday you passed on a report of this printk from
>>>>>>> nfsfh.c firing:
>>>>>>> 
>>>>>>> printk("nfsd: find_fh_dentry returned a DISCONNECTED directory: %pd2\n",
>>>>>>>        dentry);
>>>>>>> 
>>>>>>> I think the dentry probably comes from the FILEID_ROOT
>>>>>>> case of:
>>>>>>> 
>>>>>>> if (fileid_type == FILEID_ROOT)
>>>>>>>         dentry = dget(exp->ex_path.dentry);
>>>>>>> else {
>>>>>>>         dentry = exportfs_decode_fh(exp->ex_path.mnt, fid,
>>>>>>>                                     data_left, fileid_type,
>>>>>>>                                     nfsd_acceptable, exp);
>>>>>>> }
>>>>>>> 
>>>>>>> In that case the dentry was found using ordinary
>>>>>>> filesystem lookups, so doesn't go through the same
>>>>>>> DISCONNECTED-clearing logic as in the case of lookups
>>>>>>> by filehandle.
>>>>>>> 
>>>>>>> Probably they have an export root that's not a
>>>>>>> filesystem root, and the lookups happened in the right
>>>>>>> order?
>>>>>>> 
>>>>>>> I suspect that's fine, and that the printk is just
>>>>>>> stupid, but maybe we should clear DISCONNECTED when
>>>>>>> possible on normal lookups.  The following is my
>>>>>>> attempt, though I'm not sure if d_alloc is the right 
>>>>>>> place to do this.  In any case it might help confirm
>>>>>>> this is what's happening.
>>>>>>> 
>>>>>>> So if you pass along this patch to the person who was
>>>>>>> seeing that printk I'd be interested in the results.
>>>>>> 
>>>>>> I have been reading through the dentry code for other
>>>>>> reasons and your patch definitely won't change anything.
>>>>>> __d_alloc sets d_flags = 0. Therefore d_alloc always
>>>>>> returns with d_flags == 0.
>>>>> 
>>>>> You're right, of course.  I wasn't thinking straight.
>>>>> 
>>>>> So the only dentries with DISCONNECTED set are those
>>>>> created with d_obtain_alias, which is normally only used
>>>>> when you're looking up by filehandle.
>>>>> 
>>>>> Except btrfs has a weird use in get_default_root().  So
>>>>> maybe they were running into the dentry that created?
>>>>> 
>>>>> So btrfs should probably be using something else, I'm not
>>>>> sure what.
>>>> 
>>>> The nfs client also has the case where it uses DISCONNECTED
>>>> dentries for directories that are not root on the server.
>>>> Which seems very similar to the btrfs case.
>>> 
>>> I don't think there's any reason for those to be flagged
>>> DISCONNECTED either.
>> 
>> The only practical difference between the two cases is how
>> quickly it is desirable to connect the entries.
>> 
>> The disconnected dentries processed by exportfs are dentries that
>> we want to connect immediately, and it is an error/problem to
>> have them disconnected after processing.
>> 
>> For the dentries that are the roots of file systems, we want to
>> connect them if we get the chance with d_materialise_unique, but
>> we don't care if they go long periods without being connected.
>> 
>> I believe we want both groups of dentries on the s_anon list so
>> that if they remain disconnected when the filesystem is unmounted
>> we can find them and deal with them.
> 
> Note it's IS_ROOT(), not DCACHE_DISCONNECTED, that determines
> whether a hashed dentry is on s_anon or not.  (See 
> 7632e465feb182cadc3c9aa1282a057201818a8c for more detailed
> discussion.)
> 
>> I can see distinguishing between dentries that are supposed to
>> be disconnected for a short time, and dentries that are supposed
>> to be disconnected indefinitely but we currently (as of 3.14-rc1)
>> don't have that distinction.
> 
> I believe we do: DCACHE_DISCONNECTED is for the former, and the
> latter should be IS_ROOT(), !DCACHE_DISCONNECTED.
> 
> DCACHE_DISCONNECTED was intended to be used only for dentries
> created while performing lookup-by-filehandle which have not yet
> been confirmed to be linked back to filesystem root by a chain of
> ->d_parent pointers.
> 
>> But a blanket statement that the long term disconnected dentries
>> are doing it wrong seems off base.
>> 
>> If those dentries can tolerate not being on the s_anon list
>> d_alloc_pseudo or d_make_root looks like it will serve just as
>> well
> 
> The difference is that d_alloc_pseudo and d_make_root
> unconditionally create new dentries, whereas d_materialise_unique
> lets us reuse an existing dentry.
> 
> (In the NFS client case, I believe this happens for example when
> you mount export A from a server, then export B from the same
> server, and then one day you look up A/foo and find out that it's
> the same directory as B's root.  You don't want to duplicate the
> inode or give it multiple dentries, so you instead reuse the
> existing IS_ROOT dentry for B to represent A/foo.
> 
> In the btrfs case I guess it's a similar situation but with two 
> subvolumes instead of two exports?)
> 
>> from the perspective of d_materialise_unique, but that leaves me
>> with the queasy feeling that we will leak dentries and inodes
>> when we unmount the filesystems in question, if those dentries
>> have never been attached.
> 
> In the NFS server (lookup-by-filehandle) case, I believe dentries
> in the process of being reconnected are all children of some
> IS_ROOT dentry which is on the s_anon list, so everyone's accounted
> for.
> 
> 
> Thanks for looking at this as I've found myself easily confused by
> it all....  (And judging by some of the code in the tree I'm not
> alone.)
> 

Ok, using d_make_root makes us leak inodes on umount (I don't know why,
I'm not drunk enough yet).  I haven't tried it yet, but I'm almost 100%
positive that if I just clear DISCONNECTED we will BUG_ON() in
d_splice_alias() when we go to walk into the subvol we've mounted with
the default subvol option.  So what I think I need to do is use
d_materialise_unique() in our lookup function instead of
d_splice_alias(), keep using d_obtain_alias() in get_default_root(),
and just clear DISCONNECTED on success.  Does this sound crazy to
anybody else?  Thanks,

Josef
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJS/kaBAAoJEANb+wAKly3BXp8P/RiWEPYsJXwmD9N2JamQZawa
p5Fh507ZX3PRaOAZ6zgWB2uN4xRQtqFpsIjoZl6c2NoxnR4L89d4x/ExBF39WE2t
PmK3Ot/KJ6GXBVHSyexWJMix8R8u40eKHJcyywqj35HhHML47Ll1fy6k2vmRno5q
FVP+E0kLiBOH0W2ae8o7kPImLp2Ue7moDvExlqKTI3jeRPQI5u+/lH4uj8oVQXx4
/HNoZ7+htzi0Eb+mA+ve0k2/efiUlPloerE0oFUaCrrOF0HMLAVoXxhiDR7eDM96
nZ1LHZ422DOBRCBQ6kpDo7iAebQyRKx6rdkdr/n5/5Bs0iTevg3bWy8WEbMP+NvW
OR+mQ4r3X8J7t91Dp15OJuiEPhyH6lb2McPcxq/ozRDcGR0enZ1iYGl/GNw02eaw
1/JS4+nwokLCcEvTiW16n8sLGv+iU4fXawisjdKs76KQGO+rzdOd2msYjnOF8ZgT
YTO4k689qlWJu6TtY8iC0fRStIksTAMesMmCoCBl+2zkGsHMS9tMQt3kzIF91UP4
WTuZOBxdqByKmQiWy0INKhYXOSVxAUs27JuaHPIBUz65fDDu80IW+2Ih1sro73y/
+t9+7vSueviUub1azUocEdYxuc8mDQ7totpzrJBYHCF7C1T2DZV5nEpGUHD/d19R
3QwsV2oPiu230jh+nJR2
=VIUq
-----END PGP SIGNATURE-----
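
Editorial illustration (not part of the original message): the quoted
discussion separates two properties of a dentry that are easy to mix up.
Being parentless (IS_ROOT) is what puts a hashed dentry on s_anon, while
DCACHE_DISCONNECTED only marks a dentry whose chain of ->d_parent
pointers back to the filesystem root has not been confirmed yet. The
sketch below is a toy userspace model of that distinction under those
stated assumptions. Every name in it (model_dentry, MODEL_DISCONNECTED,
model_reconnect and so on) is hypothetical, and none of it is kernel
code or the actual dcache implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. The flag value is arbitrary. */
#define MODEL_DISCONNECTED 0x1u

struct model_dentry {
	struct model_dentry *parent;	/* self-pointer means "no parent", like IS_ROOT() */
	unsigned int flags;
	bool hashed;
};

/* Parentless test: mirrors the idea of IS_ROOT() in the thread. */
static bool model_is_root(const struct model_dentry *d)
{
	return d->parent == d;
}

/*
 * Per the thread: for a hashed dentry, being parentless (not the
 * DISCONNECTED flag) decides whether it sits on the anonymous list.
 */
static bool model_on_anon_list(const struct model_dentry *d)
{
	return d->hashed && model_is_root(d);
}

/* Create a parentless, hashed dentry with the given flags. */
static void model_init(struct model_dentry *d, unsigned int flags)
{
	d->parent = d;
	d->flags = flags;
	d->hashed = true;
}

/* Give a dentry a parent; after this it is no longer parentless. */
static void model_attach(struct model_dentry *child, struct model_dentry *parent)
{
	assert(child != parent);
	child->parent = parent;
}

/*
 * Walk the parent pointers from d. Only if the chain ends at fs_root,
 * clear MODEL_DISCONNECTED on every non-root dentry along the way and
 * return true. Otherwise leave all flags alone and return false.
 */
static bool model_reconnect(struct model_dentry *d, const struct model_dentry *fs_root)
{
	struct model_dentry *p = d;

	while (!model_is_root(p))
		p = p->parent;
	if (p != fs_root)
		return false;
	for (p = d; !model_is_root(p); p = p->parent)
		p->flags &= ~MODEL_DISCONNECTED;
	return true;
}
```

In this model a long-lived export or subvolume root is simply
model_init(&d, 0): parentless and on the anonymous list, but never
flagged. A dentry found by filehandle starts as
model_init(&d, MODEL_DISCONNECTED) and keeps the flag until
model_reconnect() confirms a parent chain to the filesystem root.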

Thread overview: 18+ messages
2014-02-13 21:27 find_fh_dentry returned a DISCONNECTED directory J. Bruce Fields
2014-02-13 23:45 ` Eric W. Biederman
2014-02-14  3:30   ` J. Bruce Fields
2014-02-14  4:25     ` Eric W. Biederman
2014-02-14 14:46       ` J. Bruce Fields
2014-02-14 15:49         ` Eric W. Biederman
2014-02-14 16:14           ` J. Bruce Fields
2014-02-14 16:38             ` Josef Bacik [this message]
2014-02-14 16:45               ` J. Bruce Fields
2014-02-14 17:02                 ` Josef Bacik
2014-02-14 17:14                   ` Eric W. Biederman
2014-02-14 17:11               ` Eric W. Biederman
2014-02-14 17:02             ` Eric W. Biederman
2014-02-14 22:19               ` J. Bruce Fields
2014-02-14 22:41                 ` J. Bruce Fields
2014-02-14 14:17     ` Josef Bacik
2014-02-14 15:13     ` Josef Bacik
2014-02-14 15:38       ` J. Bruce Fields
