All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <jbacik@fb.com>
To: "J. Bruce Fields" <bfields@fieldses.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>
Cc: <linux-fsdevel@vger.kernel.org>
Subject: Re: find_fh_dentry returned a DISCONNECTED directory
Date: Fri, 14 Feb 2014 11:38:25 -0500	[thread overview]
Message-ID: <52FE4681.40901@fb.com> (raw)
In-Reply-To: <20140214161412.GH21982@fieldses.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



On 02/14/2014 11:14 AM, J. Bruce Fields wrote:
> On Fri, Feb 14, 2014 at 07:49:35AM -0800, Eric W. Biederman wrote:
>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>> 
>>> On Thu, Feb 13, 2014 at 08:25:43PM -0800, Eric W. Biederman
>>> wrote:
>>>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>>> 
>>>>> On Thu, Feb 13, 2014 at 03:45:16PM -0800, Eric W. Biederman
>>>>> wrote:
>>>>>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>>>>> 
>>>>>>> Yesterday you passed on a report of this printk from
>>>>>>> nfsdfh.c firing:
>>>>>>> 
>>>>>>> printk("nfsd: find_fh_dentry returned a DISCONNECTED
>>>>>>> directory: %pd2\n", dentry);
>>>>>>> 
>>>>>>> I think the dentry probably comes from the FILEID_ROOT
>>>>>>> case of:
>>>>>>> 
>>>>>>> if (fileid_type == FILEID_ROOT) dentry =
>>>>>>> dget(exp->ex_path.dentry); else { dentry =
>>>>>>> exportfs_decode_fh(exp->ex_path.mnt, fid, data_left,
>>>>>>> fileid_type, nfsd_acceptable, exp); }
>>>>>>> 
>>>>>>> In that case the dentry was found using ordinary
>>>>>>> filesystem lookups, so doesn't go through the same
>>>>>>> DISCONNECTED-clearing logic as in the case of lookups
>>>>>>> by filehandle.
>>>>>>> 
>>>>>>> Probably they have an export root that's not a
>>>>>>> filesystem root, and the lookups happened in the right
>>>>>>> order?
>>>>>>> 
>>>>>>> I suspect that's fine, and that the printk is just
>>>>>>> stupid, but maybe we should clear DISCONNECTED when
>>>>>>> possible on normal lookups.  The following is my
>>>>>>> attempt, though I'm not sure if d_alloc is the right 
>>>>>>> place to do this.  In any case it might help confirm
>>>>>>> this is what's happening.
>>>>>>> 
>>>>>>> So if you pass along this patch to the person who was
>>>>>>> seeing that printk I'd be interested in the results.
>>>>>> 
>>>>>> I have been reading through the dentry code for other
>>>>>> reasons and your patch definitely won't change anything.
>>>>>> __d_alloc sets d_flags = 0. Therefore d_alloc always
>>>>>> returns with d_flags == 0.
>>>>> 
>>>>> You're right, of course.  I wasn't thinking straight.
>>>>> 
>>>>> So the only dentries with DISCONNECTED set are those
>>>>> created with d_obtain_alias, which is normally only used
>>>>> when you're looking up by filehandle.
>>>>> 
>>>>> Except btrfs has a weird use in get_default_root().  So
>>>>> maybe they were running into the dentry that created?
>>>>> 
>>>>> So btrfs should probably be using something else, I'm not
>>>>> sure what.
>>>> 
>>>> The nfs client also has the case where it uses DISCONNECTED
>>>> dentries for directories that are not root on the server.
>>>> Which seems very similiar to the btrfs case.
>>> 
>>> I don't think there's any reason for those to be flagged
>>> DISCONNECTED either.
>> 
>> The only practical difference between the two cases is how
>> quickly it is desirable to connect the entries.
>> 
>> The disconnected dentries processed by exportfs are dentries that
>> we want to connect immediately, and it is an error/problem to
>> have the disconnected after processing.
>> 
>> The dentries that are the roots of file systems we want to
>> connect them if we get the chance with d_materialise_unique but
>> we don't care if they go long periods without being connected.
>> 
>> I believe we want both groups of dentries on the s_anon list so
>> that if they remain disconnected when the filesystem is unmounted
>> we can find them and deal with them.
> 
> Note it's IS_ROOT(), not DCACHE_DISCONNECTED, that determines
> whether a hashed dentry is on s_anon or not.  (See 
> 7632e465feb182cadc3c9aa1282a057201818a8c for more detailed
> discussion.)
> 
>> I can see distinguishing between dentries that are supposed to
>> be disconnected for a short time, and dentries that are supposed
>> to be disconnected indefinitely but we currently (as of 3.14-rc1)
>> don't have that distinction.
> 
> I believe we do: DCACHE_DISCONNECTED is for the former, and the
> latter should be IS_ROOT(), !DCACHE_DISCONNECTED.
> 
> DCACHE_DISCONNECTED was intended to be used only for dentries
> created while performing lookup-by-filehandle which have not yet
> been confirmed to be linked back to filesystem root by a chain of
> ->d_parent pointers.
> 
>> But a blanket statement that the long term disconnected dentries
>> are doing it wrong seems off base.
>> 
>> If those dentires can tolerate not being on the s_anon list 
>> d_alloc_pseudo or d_make_root looks like it will serve just as
>> well
> 
> The difference is that d_alloc_pseudo and d_make_root
> unconditionally create new dentries, whereas d_materialise_unique
> lets us reuse an existing dentry.
> 
> (In the NFS client case, I believe this happens for example when
> you mount export A from a server, then export B from the same
> server, and then one day you look up A/foo and find out that it's
> the same directory as B's root.  You don't want to duplicate the
> inode or give it multiple dentries, so you instead reuse the
> existing IS_ROOT dentry for B to represent A/foo.
> 
> In the btrfs case I guess it's a similar situation but with two 
> subvolumes instead of two exports?)
> 
>> from the perspective of d_materialise_unique, but that leaves me
>> with the queasy feeling that we will leak dentries and inodes
>> when we unmount the filesystems in question, if those dentries
>> have never been attached.
> 
> In the NFS server (lookup-by-filehandle) case, I believe dentries
> in the process of being reconnected are all children of some
> IS_ROOT dentry which is on the s_anon list, so everyone's accounted
> for.
> 
> 
> Thanks for looking at this as I've found myself easily confused by
> it all....  (And judging by some of the code in the tree I'm not
> alone.)
> 

Ok using d_make_root makes us leak inodes on umount (I don't know why,
I'm not drunk enough yet.)  I haven't tried it yet but I'm almost 100%
positive if I just clear out DISCONNECTED we will BUG_ON() in
d_splice_alias() when we go to walk into the subvol we've mounted with
the default subvol option.  So what I think I need to do is use use
d_materialise_unique in our lookup function instead of
d_splice_alias() and keep using d_obtain_alias() in get_default_root()
and just clear DISOCONNECTED on success.  Does this sound crazy to
anybody else?  Thanks,

Josef
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJS/kaBAAoJEANb+wAKly3BXp8P/RiWEPYsJXwmD9N2JamQZawa
p5Fh507ZX3PRaOAZ6zgWB2uN4xRQtqFpsIjoZl6c2NoxnR4L89d4x/ExBF39WE2t
PmK3Ot/KJ6GXBVHSyexWJMix8R8u40eKHJcyywqj35HhHML47Ll1fy6k2vmRno5q
FVP+E0kLiBOH0W2ae8o7kPImLp2Ue7moDvExlqKTI3jeRPQI5u+/lH4uj8oVQXx4
/HNoZ7+htzi0Eb+mA+ve0k2/efiUlPloerE0oFUaCrrOF0HMLAVoXxhiDR7eDM96
nZ1LHZ422DOBRCBQ6kpDo7iAebQyRKx6rdkdr/n5/5Bs0iTevg3bWy8WEbMP+NvW
OR+mQ4r3X8J7t91Dp15OJuiEPhyH6lb2McPcxq/ozRDcGR0enZ1iYGl/GNw02eaw
1/JS4+nwokLCcEvTiW16n8sLGv+iU4fXawisjdKs76KQGO+rzdOd2msYjnOF8ZgT
YTO4k689qlWJu6TtY8iC0fRStIksTAMesMmCoCBl+2zkGsHMS9tMQt3kzIF91UP4
WTuZOBxdqByKmQiWy0INKhYXOSVxAUs27JuaHPIBUz65fDDu80IW+2Ih1sro73y/
+t9+7vSueviUub1azUocEdYxuc8mDQ7totpzrJBYHCF7C1T2DZV5nEpGUHD/d19R
3QwsV2oPiu230jh+nJR2
=VIUq
-----END PGP SIGNATURE-----

  reply	other threads:[~2014-02-14 16:38 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-13 21:27 find_fh_dentry returned a DISCONNECTED directory J. Bruce Fields
2014-02-13 23:45 ` Eric W. Biederman
2014-02-14  3:30   ` J. Bruce Fields
2014-02-14  4:25     ` Eric W. Biederman
2014-02-14 14:46       ` J. Bruce Fields
2014-02-14 15:49         ` Eric W. Biederman
2014-02-14 16:14           ` J. Bruce Fields
2014-02-14 16:38             ` Josef Bacik [this message]
2014-02-14 16:45               ` J. Bruce Fields
2014-02-14 17:02                 ` Josef Bacik
2014-02-14 17:14                   ` Eric W. Biederman
2014-02-14 17:11               ` Eric W. Biederman
2014-02-14 17:02             ` Eric W. Biederman
2014-02-14 22:19               ` J. Bruce Fields
2014-02-14 22:41                 ` J. Bruce Fields
2014-02-14 14:17     ` Josef Bacik
2014-02-14 15:13     ` Josef Bacik
2014-02-14 15:38       ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52FE4681.40901@fb.com \
    --to=jbacik@fb.com \
    --cc=bfields@fieldses.org \
    --cc=ebiederm@xmission.com \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.