From: Amir Goldstein <amir73il@gmail.com>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
"chuck.lever@oracle.com" <chuck.lever@oracle.com>
Subject: Re: [PATCH] knfsd: fix the fallback implementation of the get_name export operation
Date: Sun, 31 Dec 2023 12:44:57 +0200
Message-ID: <CAOQ4uxh5xpJSvmYxWRKe_i=h1PRPy+nEA=vAcCD0rCJQKnm1Ww@mail.gmail.com>
In-Reply-To: <9c4867cf1f94a8e46c2271bfd5a91d30d49ada70.camel@hammerspace.com>

On Sat, Dec 30, 2023 at 9:36 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> On Sat, 2023-12-30 at 08:23 +0200, Amir Goldstein wrote:
> > On Sat, Dec 30, 2023 at 1:50 AM Trond Myklebust
> > <trondmy@hammerspace.com> wrote:
> > >
> > > On Fri, 2023-12-29 at 18:29 -0500, Chuck Lever wrote:
> > > > On Fri, Dec 29, 2023 at 07:44:20PM +0200, Amir Goldstein wrote:
> > > > > On Fri, Dec 29, 2023 at 4:35 PM Chuck Lever
> > > > > <chuck.lever@oracle.com> wrote:
> > > > > >
> > > > > > On Fri, Dec 29, 2023 at 07:46:54AM +0200, Amir Goldstein
> > > > > > wrote:
> > > > > > > [CC: fsdevel, viro]
> > > > > >
> > > > > > Thanks for picking this up, Amir, and for copying
> > > > > > viro/fsdevel. I
> > > > > > was planning to repost this next week when more folks are
> > > > > > back,
> > > > > > but
> > > > > > this works too.
> > > > > >
> > > > > > Trond, if you'd like, I can handle review changes if you
> > > > > > don't
> > > > > > have
> > > > > > time to follow up.
> > > > > >
> > > > > >
> > > > > > > On Thu, Dec 28, 2023 at 10:22 PM <trondmy@kernel.org>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > > > > > >
> > > > > > > > The fallback implementation for the get_name export
> > > > > > > > operation
> > > > > > > > uses
> > > > > > > > readdir() to try to match the inode number to a filename.
> > > > > > > > That filename
> > > > > > > > is then used together with lookup_one() to produce a
> > > > > > > > dentry.
> > > > > > > > A problem arises when we match the '.' or '..' entries,
> > > > > > > > since
> > > > > > > > that
> > > > > > > > causes lookup_one() to fail. This has sometimes been seen
> > > > > > > > to
> > > > > > > > occur for
> > > > > > > > filesystems that violate POSIX requirements around
> > > > > > > > uniqueness
> > > > > > > > of inode
> > > > > > > > numbers, something that is common for snapshot
> > > > > > > > directories.
> > > > > > >
> > > > > > > Ouch. Nasty.
> > > > > > >
> > > > > > > Looks to me like the root cause is "filesystems that
> > > > > > > violate
> > > > > > > POSIX
> > > > > > > requirements around uniqueness of inode numbers".
> > > > > > > This violation can cause any of the parent's children to
> > > > > > > wrongly match
> > > > > > > get_name() not only '.' and '..' and fail the d_inode
> > > > > > > sanity
> > > > > > > check after
> > > > > > > lookup_one().
> > > > > > >
> > > > > > > I understand why this would be common with parent of
> > > > > > > snapshot
> > > > > > > dir,
> > > > > > > but the only fs that support snapshots that I know of
> > > > > > > (btrfs,
> > > > > > > bcachefs)
> > > > > > > do implement ->get_name(), so which filesystem did you
> > > > > > > encounter
> > > > > > > this behavior with? can it be fixed by implementing a
> > > > > > > snapshot
> > > > > > > aware ->get_name()?
> > > > > > >
> > > > > > > > This patch just ensures that we skip '.' and '..' rather
> > > > > > > > than
> > > > > > > > allowing a
> > > > > > > > match.
> > > > > > >
> > > > > > > I agree that skipping '.' and '..' makes sense, but...
> > > > > >
> > > > > > Does skipping '.' and '..' make sense for file systems that
> > > > > > do
> > > > >
> > > > > It makes sense because if the child's name in its parent would
> > > > > have been "." or ".." it would have been its own parent or its
> > > > > own
> > > > > grandparent (ELOOP situation).
> > > > > IOW, we can safely skip "." and "..", regardless of anything
> > > > > else.
> > > >
> > > > This new comment:
> > > >
> > > > + /* Ignore the '.' and '..' entries */
> > > >
> > > > then seems inadequate to explain why dot and dot-dot are now
> > > > never
> > > > matched. Perhaps the function's documenting comment could expand
> > > > on
> > > > this a little. I'll give it some thought.
> > >
> > > The point of this code is to attempt to create a valid path that
> > > connects the inode found by the filehandle to the export point. The
> > > readdir() must determine a valid name for a dentry that is a
> > > component
> > > of that path, which is why '.' and '..' can never be acceptable.
> > >
> > > This is why I think we should keep the 'Fixes:' line. The commit it
> > > points to explains quite concisely why this patch is needed.
> > >
> >
> > By all means, mention this commit, just not with a fixed tag please.
> > IIUC, commit 21d8a15ac333 did not introduce a regression that this
> > patch fixes. Right?
> > So why insist on abusing Fixes: tag instead of a mention?
>
> I don't see it as being that straightforward.
>
> Prior to commit 21d8a15ac333, the call to lookup_one_len() could return
> a dentry (albeit one with an invalid name) depending on whether or not
> the filesystem lookup succeeds. Note that knfsd does support a lookup
> of "." and "..", as do several other NFS servers.
>
> With commit 21d8a15ac333 applied, however, lookup_one_len()
> automatically returns an EACCES error.
>
> So while I agree that there are good reasons for introducing commit
> 21d8a15ac333, it does change the behaviour in this code path.
>

I feel that we are miscommunicating.
Let me explain how I understand the code and please tell me where I am wrong.

The way I see it, before 21d8a15ac333, exportfs_decode_fh_raw() would
call lookup_one() and may get a dentry (with invalid name), but then the
sanity check following lookup_one() would surely fail, because no fs should
allow a directory to be its own parent/grandparent:

	if (unlikely(nresult->d_inode != result->d_inode)) {
		dput(nresult);
		nresult = ERR_PTR(-ESTALE);
	}

The way I see it, the only thing that commit 21d8a15ac333 changed in
this code is the return value of exportfs_decode_fh_raw(), from -ESTALE
to -EACCES.

exportfs_decode_fh() converts both of these errors to -ESTALE, and
so does nfsd_set_fh_dentry().

Bottom line, if I am reading the code correctly, commit 21d8a15ac333 did
not change the behaviour for knfsd, nor any user-visible behaviour of
open_by_handle_at() for userspace nfsd.

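To make the error mapping concrete, here is a small userspace model of
what I mean. It is not kernel code: collapse_decode_error() is a name I
made up for illustration, and the -ENOMEM pass-through reflects my reading
of exportfs_decode_fh(), so please check it against the tree.

```c
#include <errno.h>

/*
 * Hypothetical model of the wrapper around exportfs_decode_fh_raw():
 * success is passed through, -ENOMEM is (as I read the code) passed
 * through, and every other error is reported to the caller as -ESTALE.
 */
static int collapse_decode_error(int raw_err)
{
	if (raw_err == 0)
		return 0;
	if (raw_err == -ENOMEM)
		return raw_err;
	return -ESTALE;
}
```

With this model, both the old -ESTALE and the new -EACCES from the raw
helper come out as -ESTALE, which is why I do not see a behaviour change
for the callers.
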
Your fix is good because:
1. It saves an unneeded call to lookup_one()
2. Skipping "." and ".." increases the chance of finding the correct child
   name in the case of non-unique ino

So I have no objection to your fix in generic code, but I do not see
it being a regression fix.

Where are we miscommunicating? What am I missing?

Thanks,
Amir.