From: Al Viro <viro@zeniv.linux.org.uk>
To: Bernd Schubert <bschubert@ddn.com>
Cc: linux-fsdevel@vger.kernel.org, bernd.schubert@fastmail.fm,
miklos@szeredi.hu, dsingh@ddn.com,
Christian Brauner <brauner@kernel.org>,
Amir Goldstein <amir73il@gmail.com>
Subject: Re: [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT)
Date: Sat, 28 Oct 2023 05:46:46 +0100 [thread overview]
Message-ID: <20231028044646.GS800259@ZenIV> (raw)
In-Reply-To: <20231023183035.11035-5-bschubert@ddn.com>
On Mon, Oct 23, 2023 at 08:30:31PM +0200, Bernd Schubert wrote:
> Previous patch allowed atomic-open on a positive dentry when
> O_CREAT was set (in lookup_open). This adds in atomic-open
> when O_CREAT is not set.
>
> Code wise it would be possible to just drop the dentry in
> open_last_lookups and then fall through to lookup_open.
> But then this would add some overhead for dentry drop,
> re-lookup and actually also call into d_revalidate.
> So as suggested by Miklos, this adds a helper function
> (atomic_revalidate_open) to immediately open the dentry
> with atomic_open.
>
> Signed-off-by: Bernd Schubert <bschubert@ddn.com>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Dharmendra Singh <dsingh@ddn.com>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: linux-fsdevel@vger.kernel.org
> ---
> fs/namei.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 63 insertions(+), 3 deletions(-)
This is bloody awful.
> diff --git a/fs/namei.c b/fs/namei.c
> index ff913e6b12b4..5e2d569ffe38 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1614,10 +1614,11 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name,
> }
> EXPORT_SYMBOL(lookup_one_qstr_excl);
>
> -static struct dentry *lookup_fast(struct nameidata *nd)
> +static struct dentry *lookup_fast(struct nameidata *nd, bool *atomic_revalidate)
Yechhh... Note that absolute majority of calls will be nowhere near
the case when that atomic_revalidate thing might possibly be set.
> {
> struct dentry *dentry, *parent = nd->path.dentry;
> int status = 1;
> + *atomic_revalidate = false;
>
> /*
> * Rename seqlock is not required here because in the off chance
> @@ -1659,6 +1660,10 @@ static struct dentry *lookup_fast(struct nameidata *nd)
> dput(dentry);
> return ERR_PTR(status);
> }
> +
> + if (status == D_REVALIDATE_ATOMIC)
> + *atomic_revalidate = true;
> +
> return dentry;
> }
> @@ -1984,6 +1989,7 @@ static const char *handle_dots(struct nameidata *nd, int type)
> static const char *walk_component(struct nameidata *nd, int flags)
> {
> struct dentry *dentry;
> + bool atomic_revalidate;
> /*
> * "." and ".." are special - ".." especially so because it has
> * to be able to know about the current root directory and
> @@ -1994,7 +2000,7 @@ static const char *walk_component(struct nameidata *nd, int flags)
> put_link(nd);
> return handle_dots(nd, nd->last_type);
> }
> - dentry = lookup_fast(nd);
> + dentry = lookup_fast(nd, &atomic_revalidate);
> if (IS_ERR(dentry))
> return ERR_CAST(dentry);
> if (unlikely(!dentry)) {
> @@ -2002,6 +2008,9 @@ static const char *walk_component(struct nameidata *nd, int flags)
> if (IS_ERR(dentry))
> return ERR_CAST(dentry);
> }
> +
> + WARN_ON_ONCE(atomic_revalidate);
> +
> if (!(flags & WALK_MORE) && nd->depth)
> put_link(nd);
> return step_into(nd, flags, dentry);
> @@ -3383,6 +3392,42 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
> return dentry;
> }
> +static struct dentry *atomic_revalidate_open(struct dentry *dentry,
> + struct nameidata *nd,
> + struct file *file,
> + const struct open_flags *op,
> + bool *got_write)
> +{
> + struct mnt_idmap *idmap;
> + struct dentry *dir = nd->path.dentry;
> + struct inode *dir_inode = dir->d_inode;
> + int open_flag = op->open_flag;
> + umode_t mode = op->mode;
> +
> + if (unlikely(IS_DEADDIR(dir_inode)))
> + return ERR_PTR(-ENOENT);
What's the point of doing that check when there's nothing to stop
directory from being removed right under you? Note that similar
check in lookup_open() is done after the caller has locked the
damn thing.
> + file->f_mode &= ~FMODE_CREATED;
> +
> + if (WARN_ON_ONCE(open_flag & O_CREAT))
> + return ERR_PTR(-EINVAL);
Really. With the only caller being under
int open_flag = op->open_flag;
...
if (!(open_flag & O_CREAT)) {
> +
> + if (open_flag & (O_TRUNC | O_WRONLY | O_RDWR))
> + *got_write = !mnt_want_write(nd->path.mnt);
> + else
> + *got_write = false;
> +
> + if (!*got_write)
> + open_flag &= ~O_TRUNC;
> +
> + inode_lock_shared(dir->d_inode);
> + dentry = atomic_open(nd, dentry, file, open_flag, mode);
> + inode_unlock_shared(dir->d_inode);
What will happen if you get that thing called with NULL ->i_op->atomic_open()?
> +
> + return dentry;
> +
> +}
> +
> /*
> * Look up and maybe create and open the last component.
> *
> @@ -3527,12 +3572,26 @@ static const char *open_last_lookups(struct nameidata *nd,
> }
>
> if (!(open_flag & O_CREAT)) {
> + bool atomic_revalidate;
> +
> if (nd->last.name[nd->last.len])
> nd->flags |= LOOKUP_FOLLOW | LOOKUP_DIRECTORY;
> /* we _can_ be in RCU mode here */
> - dentry = lookup_fast(nd);
> + dentry = lookup_fast(nd, &atomic_revalidate);
> if (IS_ERR(dentry))
> return ERR_CAST(dentry);
> + if (dentry && unlikely(atomic_revalidate)) {
> + /* The file system shall not claim to support atomic
> + * revalidate in RCU mode
> + */
> + if (WARN_ON_ONCE(nd->flags & LOOKUP_RCU)) {
> + dput(dentry);
dput() under rcu_read_lock()? For one thing, it's completely wrong
as far as recovery strategy goes; we do *NOT* grab references under
LOOKUP_RCU, so whatever we got here is not a counting reference.
What's more, your comment is actively misleading - you set that
atomic_revalidate thing in the very end of lookup_fast() and
there is no way to get there with LOOKUP_RCU. Look:
static struct dentry *lookup_fast(struct nameidata *nd)
{
...
if (nd->flags & LOOKUP_RCU) {
...
status = d_revalidate(dentry, nd->flags);
if (likely(status > 0))
return dentry;
That's where we leave if we'd found and successfully
revalidated a dentry in RCU mode.
if (!try_to_unlazy_next(nd, dentry))
return ERR_PTR(-ECHILD);
... and this is where we'd already left the RCU mode.
if (status == -ECHILD)
/* we'd been told to redo it in non-rcu mode */
status = d_revalidate(dentry, nd->flags);
} else {
here we hadn't been in RCU mode to start with and we *never*
switch from non-RCU to RCU.
...
}
// and here you set that flag of yours.
So no matter what your ->d_revalidate() returns, you are
not going to see atomic_... shite set in RCU mode. It's not
a matter of filesystem behaviour, contrary to your comment.
> + return ERR_PTR(-ECHILD);
> + }
> + dentry = atomic_revalidate_open(dentry, nd, file, op,
> + &got_write);
> + goto drop_write;
> + }
> if (likely(dentry))
> goto finish_lookup;
>
> @@ -3569,6 +3628,7 @@ static const char *open_last_lookups(struct nameidata *nd,
> else
> inode_unlock_shared(dir->d_inode);
>
> +drop_write:
> if (got_write)
> mnt_drop_write(nd->path.mnt);
That helper of yours is a bad idea. Control flow in that area is
messy and hard to follow as it is, and we had _MANY_ bugs stemming
from that. You are making it harder to follow; this stuff really
should've gone into lookup_open().
And I really hate that 'atomic_revalidate' thing of yours.
Especially since the reader gets to do major head-scratching about
the WARN_ON_ONCE(atomic_revalidate) in walk_component(). Takes
guessing that it's probably a matter of LOOKUP_OPEN *not* being
there in walk_component() and always being there in the
open_last_lookups() (we never get there for O_PATH opens, so
op->intent will have it). So at a guess you mean to have
->d_revalidate() only return that magical value if LOOKUP_OPEN
is present in flags. Which seems to be the case, judging by
the subsequent patches in the series.
_IF_ we want to go in that direction, at least make it
if (status == THAT_MAGIC_VALUE) {
if (unlikely(!atomic_revalidate)) {
if (WARN_ON_ONCE(nd->flags & LOOKUP_OPEN))
// insane caller
;
else
// insane ->d_revalidate() instance
WARN_ON_ONCE(1);
} else {
*atomic_revalidate = true;
}
}
and pass it NULL when calling it from walk_component().
Again, I'm not at all sure it's a good idea to start with. Hard to
tell without seeing how it'll look after massage that would move
that new call of atomic_open() down into lookup_open().
next prev parent reply other threads:[~2023-10-28 4:46 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-23 18:30 [PATCH v10 0/8] fuse: full atomic open and atomic-open-revalidate Bernd Schubert
2023-10-23 18:30 ` [PATCH v10 1/8] fuse: rename fuse_create_open Bernd Schubert
2023-10-23 18:30 ` [PATCH v10 2/8] fuse: introduce atomic open Bernd Schubert
2023-10-24 10:12 ` Yuan Yao
2023-10-24 12:36 ` Bernd Schubert
2023-10-26 2:42 ` Yuan Yao
2023-11-29 6:46 ` [PATCH 0/1] Adapt atomic open to fuse no_open/no_open_dir Yuan Yao
2023-11-29 6:46 ` [PATCH 1/1] fuse: Handle no_open/no_opendir in atomic_open Yuan Yao
2023-11-29 22:21 ` Bernd Schubert
2023-10-28 3:03 ` [PATCH v10 2/8] fuse: introduce atomic open Al Viro
2023-10-30 15:21 ` Bernd Schubert
2024-01-23 8:40 ` [PATCH 0/1] Fix-atomic_open-not-using-negative-d_entry Yuan Yao
2024-01-23 8:40 ` [PATCH 1/1] fuse: Make atomic_open use negative d_entry Yuan Yao
2024-01-27 13:38 ` Bernd Schubert
2024-02-09 7:46 ` Yuan Yao
2024-02-19 11:37 ` Bernd Schubert
2024-03-13 10:25 ` Yuan Yao
2024-03-13 23:00 ` Bernd Schubert
2024-03-14 10:34 ` [PATCH] fuse: Do NULL check instead of IS_ERR in atomic_open Keiichi Watanabe
2024-03-15 13:09 ` Keiichi Watanabe
2024-03-24 4:32 ` Al Viro
2023-10-23 18:30 ` [PATCH v10 3/8] [RFC] Allow atomic_open() on positive dentry (O_CREAT) Bernd Schubert
2023-10-23 18:30 ` [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT) Bernd Schubert
2023-10-24 13:46 ` Bernd Schubert
2023-10-28 4:46 ` Al Viro [this message]
2023-10-23 18:30 ` [PATCH v10 5/8] fuse: Revalidate positive entries in fuse_atomic_open Bernd Schubert
2023-10-28 5:18 ` Al Viro
2023-10-28 5:25 ` Al Viro
2023-10-23 18:30 ` [PATCH v10 6/8] fuse: Return D_REVALIDATE_ATOMIC for cached dentries Bernd Schubert
2023-10-23 18:30 ` [PATCH v10 7/8] fuse: Avoid code duplication in atomic open Bernd Schubert
2023-10-23 18:30 ` [PATCH v10 8/8] fuse atomic open: No fallback for symlinks, just call finish_no_open Bernd Schubert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231028044646.GS800259@ZenIV \
--to=viro@zeniv.linux.org.uk \
--cc=amir73il@gmail.com \
--cc=bernd.schubert@fastmail.fm \
--cc=brauner@kernel.org \
--cc=bschubert@ddn.com \
--cc=dsingh@ddn.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=miklos@szeredi.hu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).