From: "NeilBrown" <neil@brown.name>
To: "Al Viro" <viro@zeniv.linux.org.uk>
Cc: "Christian Brauner" <brauner@kernel.org>,
"Jan Kara" <jack@suse.cz>, "David Howells" <dhowells@redhat.com>,
"Marc Dionne" <marc.dionne@auristor.com>,
"Xiubo Li" <xiubli@redhat.com>,
"Ilya Dryomov" <idryomov@gmail.com>,
"Tyler Hicks" <code@tyhicks.com>,
"Miklos Szeredi" <miklos@szeredi.hu>,
"Richard Weinberger" <richard@nod.at>,
"Anton Ivanov" <anton.ivanov@cambridgegreys.com>,
"Johannes Berg" <johannes@sipsolutions.net>,
"Trond Myklebust" <trondmy@kernel.org>,
"Anna Schumaker" <anna@kernel.org>,
"Chuck Lever" <chuck.lever@oracle.com>,
"Jeff Layton" <jlayton@kernel.org>,
"Amir Goldstein" <amir73il@gmail.com>,
"Steve French" <sfrench@samba.org>,
"Namjae Jeon" <linkinjeon@kernel.org>,
"Carlos Maiolino" <cem@kernel.org>,
linux-fsdevel@vger.kernel.org, linux-afs@lists.infradead.org,
netfs@lists.linux.dev, ceph-devel@vger.kernel.org,
ecryptfs@vger.kernel.org, linux-um@lists.infradead.org,
linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org,
linux-cifs@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 02/11] VFS: introduce dentry_lookup() and friends
Date: Wed, 13 Aug 2025 17:48:09 +1000 [thread overview]
Message-ID: <175507128953.2234665.9075244835979746809@noble.neil.brown.name> (raw)
In-Reply-To: <20250813041253.GY222315@ZenIV>
On Wed, 13 Aug 2025, Al Viro wrote:
> On Tue, Aug 12, 2025 at 12:25:05PM +1000, NeilBrown wrote:
> > This patch is the first step in introducing a new API for locked
> > operation on names in directories. It supports operations that create or
> > remove names. Rename operations will also be part of this new API but
> > require different specific interfaces.
> >
> > The plan is to lock just the dentry (or dentries), not the whole
> > directory. dentry_lookup() combines locking the directory and
> > performing a lookup prior to a change to the directory. On success it
> > returns a dentry which is consider to be locked, though at this stage
> > the whole parent directory is actually locked.
> >
> > dentry_lookup_noperm() does the same without needing a mnt_idmap and
> > without checking permissions. This is useful for internal filesystem
> > management (e.g. creating virtual files in response to events) and in
> > other cases similar to lookup_noperm().
>
> Details, please. I seriously hope that simple_start_creating() will
> end up used for all of those; your variant allows passing LOOKUP_...
> flags and I would like to understand what the usecases will be.
simple_start_creating() would meet a lot of needs.
A corresponding simple_start_deleting() would suit
cachefiles_lookup_for_cull(), fuse_reverse_inval_entry(),
nfsd4_unlink_clid_dir() etc.
btrfs_ioctl_snap_destroy() would want a simple_start_deleting() but also
wants killable.
cachefiles_get_directory() wants a simple_start_creating() without the
LOOKUP_EXCL so that if is already exists, it can go ahead and use the
dentry without creating.
cachefiles_commit_tmpfile() has a similar need - if it exists it will
unlink and repeat the lookup. Once it doesn't it it will be target of
vfs_link()
nfsd3_create_file() wants simple_start_creating without LOOKUP_EXCL.
as do a few other nfsd functions.
nfsd4_list_rec_dir() effectively wants simple_start_deleting() (i.e.
fail if it doesn't exist) but sometimes it won't delete, it will do
something else.
All calls pass one of:
0
LOOKUP_CREATE
LOOKUP_CREATE | LOOKUP_EXCL
The first two aren't reliably called for any particular task so a
"simple_start_XXX" sort of name doesn't seem appropriate.
>
> What's more, IME the "intent" arguments are best avoided - better have
> separate primitives; if internally they call a common helper with some
> flags, etc., it's their business, but exposing that to callers ends
> up with very unpleasant audits down the road. As soon as you get
> callers that pass something other than explicit constants, you get
> data flow into the mix ("which values can end up passed in this one?")
> and that's asking for trouble.
lookup_no_create, lookup_may_create, lookup_must_create ???
Either as function names, or as an enum to pass to the function?
If we had separate functions we would need _noperm and potentially
_killable versions of each. Fortunately there is no current need for
_noperm_killable. Maybe that combinatorial explosion isn't too bad.
>
> > __dentry_lookup() is a VFS-internal interface which does no permissions
> > checking and assumes that the hash of the name has already been stored
> > in the qstr. This is useful following filename_parentat().
> >
> > done_dentry_lookup() is provided which performs the inverse of putting
> > the dentry and unlocking.
>
> Not sure I like the name, TBH...
I'm open to suggestions for alternatives.
>
> > Like lookup_one_qstr_excl(), dentry_lookup() returns -ENOENT if
> > LOOKUP_CREATE was NOT given and the name cannot be found,, and returns
> > -EEXIST if LOOKUP_EXCL WAS given and the name CAN be found.
> >
> > These functions replace all uses of lookup_one_qstr_excl() in namei.c
> > except for those used for rename.
> >
> > They also allow simple_start_creating() to be simplified into a
> > static-inline.
>
> Umm... You've also moved it into linux/namei.h; we'd better verify that
> it's included in all places that need that...
I added includes where necessary.
>
> > A __free() class is provided to allow done_dentry_lookup() to be called
> > transparently on scope exit. dget() is extended to ignore ERR_PTR()s
> > so that "return dget(dentry);" is always safe when dentry was provided
> > by dentry_lookup() and the variable was declared __free(dentry_lookup).
>
> Please separate RAII stuff from the rest of that commit. Deciding if
> it's worth doing in any given case is hard to do upfront.
I'd rather not - it does make a few changes much nicer. But I can if it
is necessary.
>
> > lookup_noperm_common() and lookup_one_common() are moved earlier in
> > namei.c.
>
> Again, separate commit - reduce the noise in less trivial ones.
>
> > +struct dentry *dentry_lookup(struct mnt_idmap *idmap,
> > + struct qstr *last,
> > + struct dentry *base,
> > + unsigned int lookup_flags)
>
> Same problem with flags, *ESPECIALLY* if your endgame involves the
> locking of result dependent upon those.
Locking the result happens precisely if a non-error is returned. The
lookup flags indicate which circumstances result in errors.
>
> > - dput(dentry);
> > + done_dentry_lookup(dentry);
> > dentry = ERR_PTR(error);
> > -unlock:
> > - inode_unlock(path->dentry->d_inode);
>
> Incidentally, this combination (dput()+unlock+return ERR_PTR())
> is common enough. Might be worth a helper (taking error as argument;
> that's one of the reasons why I'm not sure RAII is a good fit for this
> problem space)
I found RAII worked quite well in several places and very well in a few.
I think the main reason I had for *not* using RAII is that you really
need to use it for everything and I didn't want to change code too much.
>
> > +/* no_free_ptr() must not be used here - use dget() */
> > +DEFINE_FREE(dentry_lookup, struct dentry *, if (_T) done_dentry_lookup(_T))
>
> UGH. Please, take that to the very end of the series. And the comment
> re no_free_ptr() and dget() is really insufficient - you will get a dentry
> reference that outlives your destructor, except that locking environment will
> change. Which is subtle enough to warrant a discussion.
>
> Besides, I'm not fond of mixing that with dget(); put another way, this
> subset of dget() uses is worth being clearly marked as such. A primitive
> that calls dget() to do work? Sure, no problem.
>
> We have too many dget() callers with very little indication of what we are
> doing there (besides "bump the refcount"); tree-in-dcache series will
> at least peel off the ones that are about pinning a created object in
> ramfs-style filesystems. That's not going to cover everything (not even
> close), but let's at least not add to the already messy pile...
>
Thanks for the review,
NeilBrown
next prev parent reply other threads:[~2025-08-13 7:48 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-12 2:25 [PATCH 00/11] VFS: prepare for changes to directory locking NeilBrown
2025-08-12 2:25 ` [PATCH 01/11] VFS: discard err2 in filename_create() NeilBrown
2025-08-13 3:22 ` Al Viro
2025-08-12 2:25 ` [PATCH 02/11] VFS: introduce dentry_lookup() and friends NeilBrown
2025-08-13 4:12 ` Al Viro
2025-08-13 7:48 ` NeilBrown [this message]
2025-08-12 2:25 ` [PATCH 03/11] VFS: add dentry_lookup_killable() NeilBrown
2025-08-13 4:15 ` Al Viro
2025-08-13 7:50 ` NeilBrown
2025-08-12 2:25 ` [PATCH 04/11] VFS: introduce dentry_lookup_continue() NeilBrown
2025-08-13 4:22 ` Al Viro
2025-08-13 7:53 ` NeilBrown
2025-08-18 12:39 ` Amir Goldstein
2025-08-18 21:52 ` NeilBrown
2025-08-19 8:37 ` Amir Goldstein
2025-08-12 2:25 ` [PATCH 05/11] VFS: add rename_lookup() NeilBrown
2025-08-13 4:35 ` Al Viro
2025-08-13 8:04 ` NeilBrown
2025-08-14 1:40 ` Al Viro
2025-08-12 2:25 ` [PATCH 06/11] VFS: unify old_mnt_idmap and new_mnt_idmap in renamedata NeilBrown
2025-08-13 4:36 ` Al Viro
2025-08-12 2:25 ` [PATCH 07/11] VFS: Change vfs_mkdir() to unlock on failure NeilBrown
2025-08-13 7:22 ` Amir Goldstein
2025-08-14 1:13 ` NeilBrown
2025-08-14 13:29 ` Amir Goldstein
2025-08-12 2:25 ` [PATCH 08/11] VFS: allow d_splice_alias() and d_add() to work on hashed dentries NeilBrown
2025-08-13 5:07 ` Al Viro
2025-08-12 2:25 ` [PATCH 09/11] VFS: use global wait-queue table for d_alloc_parallel() NeilBrown
2025-08-13 6:44 ` Al Viro
2025-08-14 1:31 ` NeilBrown
2025-08-12 2:25 ` [PATCH 10/11] VFS: use d_alloc_parallel() in lookup_one_qstr_excl() NeilBrown
2025-08-13 5:19 ` Al Viro
2025-08-14 0:56 ` NeilBrown
2025-08-12 2:25 ` [PATCH 11/11] VFS: introduce d_alloc_noblock() and d_alloc_locked() NeilBrown
2025-08-13 6:53 ` Al Viro
2025-08-14 2:07 ` NeilBrown
2025-08-14 13:47 ` Amir Goldstein
2025-08-13 0:01 ` [PATCH 00/11] VFS: prepare for changes to directory locking Al Viro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=175507128953.2234665.9075244835979746809@noble.neil.brown.name \
--to=neil@brown.name \
--cc=amir73il@gmail.com \
--cc=anna@kernel.org \
--cc=anton.ivanov@cambridgegreys.com \
--cc=brauner@kernel.org \
--cc=cem@kernel.org \
--cc=ceph-devel@vger.kernel.org \
--cc=chuck.lever@oracle.com \
--cc=code@tyhicks.com \
--cc=dhowells@redhat.com \
--cc=ecryptfs@vger.kernel.org \
--cc=idryomov@gmail.com \
--cc=jack@suse.cz \
--cc=jlayton@kernel.org \
--cc=johannes@sipsolutions.net \
--cc=linkinjeon@kernel.org \
--cc=linux-afs@lists.infradead.org \
--cc=linux-cifs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=linux-um@lists.infradead.org \
--cc=linux-unionfs@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=marc.dionne@auristor.com \
--cc=miklos@szeredi.hu \
--cc=netfs@lists.linux.dev \
--cc=richard@nod.at \
--cc=sfrench@samba.org \
--cc=trondmy@kernel.org \
--cc=viro@zeniv.linux.org.uk \
--cc=xiubli@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).