* [PATCH v3 0/3] Allow knfsd to use atomic_open()
@ 2025-11-26 14:31 Benjamin Coddington
2025-11-26 14:31 ` [PATCH v3 1/3] VFS: move dentry_create() from fs/open.c to fs/namei.c Benjamin Coddington
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Benjamin Coddington @ 2025-11-26 14:31 UTC (permalink / raw)
To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
Jeff Layton, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Mike Snitzer
Cc: linux-fsdevel, linux-kernel, linux-nfs
We have workloads that will benefit from allowing knfsd to use atomic_open()
in the open/create path. There are two benefits; the first is the original
matter of correctness: when knfsd must perform both vfs_create() and
vfs_open() in series there can be races or error results that cause the
caller to receive unexpected results. The second benefit is that for some
network filesystems, we can reduce the number of remote round-trip
operations by using a single atomic_open() path which provides a performance
benefit.
I've implemented this with the simplest possible change - by modifying
dentry_create() which has a single user: knfsd. The changes cause us to
insert ourselves part-way into the previously closed/static atomic_open()
path, so I expect VFS folks to have some good ideas about potentially
superior approaches.
Previous work on commit fb70bf124b05 ("NFSD: Instantiate a struct file when
creating a regular NFSv4 file") addressed most of the atomicity issues, but
there are still a few gaps on network filesystems.
The problem was noticed on a test that did open O_CREAT with mode 0 which
will succeed in creating the file but will return -EACCES from vfs_open() -
this specific test is mentioned in 3/3 description.
Also, Trond notes that independently of the permissions issues, atomic_open
also solves races in open(O_CREAT|O_TRUNC). The NFS client now uses it for
both NFSv4 and NFSv3 for that reason. See commit 7c6c5249f061 "NFS: add
atomic_open for NFSv3 to handle O_TRUNC correctly."
Changes on v3:
- rebased onto v6.18-rc7
- R-b on 3/3 thanks to Chuck Lever
Changes on v2:
- R-b thanks to Jeff Layton
- improvements to patch descriptions thanks to Chuck Lever, Neil
Brown, and Trond Myklebust
- improvements to dentry_create()'s doc comment to clarify dentry
handling thanks to Mike Snitzer
Thanks for any additional comment and critique.
Benjamin Coddington (3):
VFS: move dentry_create() from fs/open.c to fs/namei.c
VFS: Prepare atomic_open() for dentry_create()
VFS/knfsd: Teach dentry_create() to use atomic_open()
fs/namei.c | 86 ++++++++++++++++++++++++++++++++++++++++++----
fs/nfsd/nfs4proc.c | 8 +++--
fs/open.c | 41 ----------------------
include/linux/fs.h | 2 +-
4 files changed, 85 insertions(+), 52 deletions(-)
--
2.50.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v3 1/3] VFS: move dentry_create() from fs/open.c to fs/namei.c
2025-11-26 14:31 [PATCH v3 0/3] Allow knfsd to use atomic_open() Benjamin Coddington
@ 2025-11-26 14:31 ` Benjamin Coddington
2025-11-26 14:31 ` [PATCH v3 2/3] VFS: Prepare atomic_open() for dentry_create() Benjamin Coddington
2025-11-26 14:31 ` [PATCH v3 3/3] VFS/knfsd: Teach dentry_create() to use atomic_open() Benjamin Coddington
2 siblings, 0 replies; 4+ messages in thread
From: Benjamin Coddington @ 2025-11-26 14:31 UTC (permalink / raw)
To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
Jeff Layton, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Mike Snitzer
Cc: linux-fsdevel, linux-kernel, linux-nfs
To prepare knfsd's helper dentry_create(), move it to namei.c so that it
can access static functions within. Callers of dentry_create() can be
viewed as being mostly done with lookup, but still need to perform a few
final checks. In order to use atomic_open() we want dentry_create() to
be able to access:
- vfs_prepare_mode
- may_o_create
- atomic_open
.. all of which have static declarations.
Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
---
fs/namei.c | 41 +++++++++++++++++++++++++++++++++++++++++
fs/open.c | 41 -----------------------------------------
2 files changed, 41 insertions(+), 41 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index 7377020a2cba..88f82cb5f7a0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4278,6 +4278,47 @@ inline struct dentry *start_creating_user_path(
}
EXPORT_SYMBOL(start_creating_user_path);
+/**
+ * dentry_create - Create and open a file
+ * @path: path to create
+ * @flags: O_ flags
+ * @mode: mode bits for new file
+ * @cred: credentials to use
+ *
+ * Caller must hold the parent directory's lock, and have prepared
+ * a negative dentry, placed in @path->dentry, for the new file.
+ *
+ * Caller sets @path->mnt to the vfsmount of the filesystem where
+ * the new file is to be created. The parent directory and the
+ * negative dentry must reside on the same filesystem instance.
+ *
+ * On success, returns a "struct file *". Otherwise a ERR_PTR
+ * is returned.
+ */
+struct file *dentry_create(const struct path *path, int flags, umode_t mode,
+ const struct cred *cred)
+{
+ struct file *file;
+ int error;
+
+ file = alloc_empty_file(flags, cred);
+ if (IS_ERR(file))
+ return file;
+
+ error = vfs_create(mnt_idmap(path->mnt),
+ d_inode(path->dentry->d_parent),
+ path->dentry, mode, true);
+ if (!error)
+ error = vfs_open(path, file);
+
+ if (unlikely(error)) {
+ fput(file);
+ return ERR_PTR(error);
+ }
+ return file;
+}
+EXPORT_SYMBOL(dentry_create);
+
/**
* vfs_mknod - create device node or file
* @idmap: idmap of the mount the inode was found from
diff --git a/fs/open.c b/fs/open.c
index 3d64372ecc67..d6dbb43d41bd 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1144,47 +1144,6 @@ struct file *dentry_open_nonotify(const struct path *path, int flags,
return f;
}
-/**
- * dentry_create - Create and open a file
- * @path: path to create
- * @flags: O_ flags
- * @mode: mode bits for new file
- * @cred: credentials to use
- *
- * Caller must hold the parent directory's lock, and have prepared
- * a negative dentry, placed in @path->dentry, for the new file.
- *
- * Caller sets @path->mnt to the vfsmount of the filesystem where
- * the new file is to be created. The parent directory and the
- * negative dentry must reside on the same filesystem instance.
- *
- * On success, returns a "struct file *". Otherwise a ERR_PTR
- * is returned.
- */
-struct file *dentry_create(const struct path *path, int flags, umode_t mode,
- const struct cred *cred)
-{
- struct file *f;
- int error;
-
- f = alloc_empty_file(flags, cred);
- if (IS_ERR(f))
- return f;
-
- error = vfs_create(mnt_idmap(path->mnt),
- d_inode(path->dentry->d_parent),
- path->dentry, mode, true);
- if (!error)
- error = vfs_open(path, f);
-
- if (unlikely(error)) {
- fput(f);
- return ERR_PTR(error);
- }
- return f;
-}
-EXPORT_SYMBOL(dentry_create);
-
/**
* kernel_file_open - open a file for kernel internal use
* @path: path of the file to open
--
2.50.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v3 2/3] VFS: Prepare atomic_open() for dentry_create()
2025-11-26 14:31 [PATCH v3 0/3] Allow knfsd to use atomic_open() Benjamin Coddington
2025-11-26 14:31 ` [PATCH v3 1/3] VFS: move dentry_create() from fs/open.c to fs/namei.c Benjamin Coddington
@ 2025-11-26 14:31 ` Benjamin Coddington
2025-11-26 14:31 ` [PATCH v3 3/3] VFS/knfsd: Teach dentry_create() to use atomic_open() Benjamin Coddington
2 siblings, 0 replies; 4+ messages in thread
From: Benjamin Coddington @ 2025-11-26 14:31 UTC (permalink / raw)
To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
Jeff Layton, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Mike Snitzer
Cc: linux-fsdevel, linux-kernel, linux-nfs
The next patch allows dentry_create() to call atomic_open(), but it does
not have fabricated nameidata. Let atomic_open() take a path instead.
Since atomic_open() currently takes a nameidata of which it only uses the
path and the flags, and flags are only used to update open_flags, then the
flag update can happen before calling atomic_open(). Then, only the path
needs be passed to atomic_open() rather than the whole nameidata. This
makes it easier for dentry_create() To call atomic_open().
Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
---
fs/namei.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index 88f82cb5f7a0..389f91a4d121 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3640,19 +3640,16 @@ static int may_o_create(struct mnt_idmap *idmap,
*
* Returns an error code otherwise.
*/
-static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
+static struct dentry *atomic_open(const struct path *path, struct dentry *dentry,
struct file *file,
int open_flag, umode_t mode)
{
struct dentry *const DENTRY_NOT_SET = (void *) -1UL;
- struct inode *dir = nd->path.dentry->d_inode;
+ struct inode *dir = path->dentry->d_inode;
int error;
- if (nd->flags & LOOKUP_DIRECTORY)
- open_flag |= O_DIRECTORY;
-
file->__f_path.dentry = DENTRY_NOT_SET;
- file->__f_path.mnt = nd->path.mnt;
+ file->__f_path.mnt = path->mnt;
error = dir->i_op->atomic_open(dir, dentry, file,
open_to_namei_flags(open_flag), mode);
d_lookup_done(dentry);
@@ -3764,7 +3761,9 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
if (create_error)
open_flag &= ~O_CREAT;
if (dir_inode->i_op->atomic_open) {
- dentry = atomic_open(nd, dentry, file, open_flag, mode);
+ if (nd->flags & LOOKUP_DIRECTORY)
+ open_flag |= O_DIRECTORY;
+ dentry = atomic_open(&nd->path, dentry, file, open_flag, mode);
if (unlikely(create_error) && dentry == ERR_PTR(-ENOENT))
dentry = ERR_PTR(create_error);
return dentry;
--
2.50.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v3 3/3] VFS/knfsd: Teach dentry_create() to use atomic_open()
2025-11-26 14:31 [PATCH v3 0/3] Allow knfsd to use atomic_open() Benjamin Coddington
2025-11-26 14:31 ` [PATCH v3 1/3] VFS: move dentry_create() from fs/open.c to fs/namei.c Benjamin Coddington
2025-11-26 14:31 ` [PATCH v3 2/3] VFS: Prepare atomic_open() for dentry_create() Benjamin Coddington
@ 2025-11-26 14:31 ` Benjamin Coddington
2 siblings, 0 replies; 4+ messages in thread
From: Benjamin Coddington @ 2025-11-26 14:31 UTC (permalink / raw)
To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
Jeff Layton, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Mike Snitzer
Cc: linux-fsdevel, linux-kernel, linux-nfs
While knfsd offers combined exclusive create and open results to clients,
on some filesystems those results may not be atomic. This behavior can be
observed. For example, an open O_CREAT with mode 0 will succeed in creating
the file but unexpectedly return -EACCES from vfs_open().
Additionally reducing the number of remote RPC calls required for O_CREAT
on network filesystem provides a performance benefit in the open path.
Teach knfsd's helper dentry_create() to use atomic_open() for filesystems
that support it. The previously const @path is passed up to atomic_open()
and may be modified depending on whether an existing entry was found or if
the atomic_open() returned an error and consumed the passed-in dentry.
Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/namei.c | 46 +++++++++++++++++++++++++++++++++++++++-------
fs/nfsd/nfs4proc.c | 8 +++++---
include/linux/fs.h | 2 +-
3 files changed, 45 insertions(+), 11 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index 389f91a4d121..d70fd7362107 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4286,6 +4286,9 @@ EXPORT_SYMBOL(start_creating_user_path);
*
* Caller must hold the parent directory's lock, and have prepared
* a negative dentry, placed in @path->dentry, for the new file.
+ * If the file was looked up only or didn't need to be created,
+ * FMODE_OPENED will not be set, and @path will be updated with the
+ * new dentry. The dentry may be negative.
*
* Caller sets @path->mnt to the vfsmount of the filesystem where
* the new file is to be created. The parent directory and the
@@ -4294,21 +4297,50 @@ EXPORT_SYMBOL(start_creating_user_path);
* On success, returns a "struct file *". Otherwise a ERR_PTR
* is returned.
*/
-struct file *dentry_create(const struct path *path, int flags, umode_t mode,
+struct file *dentry_create(struct path *path, int flags, umode_t mode,
const struct cred *cred)
{
+ struct dentry *dentry = path->dentry;
+ struct dentry *dir = dentry->d_parent;
+ struct inode *dir_inode = d_inode(dir);
+ struct mnt_idmap *idmap;
struct file *file;
- int error;
+ int error, create_error;
file = alloc_empty_file(flags, cred);
if (IS_ERR(file))
return file;
- error = vfs_create(mnt_idmap(path->mnt),
- d_inode(path->dentry->d_parent),
- path->dentry, mode, true);
- if (!error)
- error = vfs_open(path, file);
+ idmap = mnt_idmap(path->mnt);
+
+ if (dir_inode->i_op->atomic_open) {
+ path->dentry = dir;
+ mode = vfs_prepare_mode(idmap, dir_inode, mode, S_IALLUGO, S_IFREG);
+
+ create_error = may_o_create(idmap, path, dentry, mode);
+ if (create_error)
+ flags &= ~O_CREAT;
+
+ dentry = atomic_open(path, dentry, file, flags, mode);
+ error = PTR_ERR_OR_ZERO(dentry);
+
+ if (unlikely(create_error) && error == -ENOENT)
+ error = create_error;
+
+ if (!error) {
+ if (file->f_mode & FMODE_CREATED)
+ fsnotify_create(dir->d_inode, dentry);
+ if (file->f_mode & FMODE_OPENED)
+ fsnotify_open(file);
+ }
+
+ path->dentry = dentry;
+
+ } else {
+ error = vfs_create(idmap, dir_inode, dentry, mode, true);
+ if (!error)
+ error = vfs_open(path, file);
+ }
if (unlikely(error)) {
fput(file);
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 7f7e6bb23a90..7e39234e0649 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -194,7 +194,7 @@ static inline bool nfsd4_create_is_exclusive(int createmode)
}
static __be32
-nfsd4_vfs_create(struct svc_fh *fhp, struct dentry *child,
+nfsd4_vfs_create(struct svc_fh *fhp, struct dentry **child,
struct nfsd4_open *open)
{
struct file *filp;
@@ -214,9 +214,11 @@ nfsd4_vfs_create(struct svc_fh *fhp, struct dentry *child,
}
path.mnt = fhp->fh_export->ex_path.mnt;
- path.dentry = child;
+ path.dentry = *child;
filp = dentry_create(&path, oflags, open->op_iattr.ia_mode,
current_cred());
+ *child = path.dentry;
+
if (IS_ERR(filp))
return nfserrno(PTR_ERR(filp));
@@ -353,7 +355,7 @@ nfsd4_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp,
status = fh_fill_pre_attrs(fhp);
if (status != nfs_ok)
goto out;
- status = nfsd4_vfs_create(fhp, child, open);
+ status = nfsd4_vfs_create(fhp, &child, open);
if (status != nfs_ok)
goto out;
open->op_created = true;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index dd3b57cfadee..2d3fcb343993 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2883,7 +2883,7 @@ struct file *dentry_open(const struct path *path, int flags,
const struct cred *creds);
struct file *dentry_open_nonotify(const struct path *path, int flags,
const struct cred *cred);
-struct file *dentry_create(const struct path *path, int flags, umode_t mode,
+struct file *dentry_create(struct path *path, int flags, umode_t mode,
const struct cred *cred);
const struct path *backing_file_user_path(const struct file *f);
--
2.50.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-11-26 14:31 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-26 14:31 [PATCH v3 0/3] Allow knfsd to use atomic_open() Benjamin Coddington
2025-11-26 14:31 ` [PATCH v3 1/3] VFS: move dentry_create() from fs/open.c to fs/namei.c Benjamin Coddington
2025-11-26 14:31 ` [PATCH v3 2/3] VFS: Prepare atomic_open() for dentry_create() Benjamin Coddington
2025-11-26 14:31 ` [PATCH v3 3/3] VFS/knfsd: Teach dentry_create() to use atomic_open() Benjamin Coddington
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).