public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry
@ 2026-01-08 13:35 fdmanana
  2026-01-08 13:35 ` [PATCH 1/4] fs: export may_delete() as may_delete_dentry() fdmanana
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: fdmanana @ 2026-01-08 13:35 UTC (permalink / raw)
  To: linux-btrfs; +Cc: linux-fsdevel, brauner, viro, Filipe Manana

From: Filipe Manana <fdmanana@suse.com>

Currently btrfs has copies of two unexported functions from fs/namei.c
used in the snapshot/subvolume creation and deletion. This patchset
exports those functions and makes btrfs use them, to avoid duplication
and the burden of keeping the copies up to date.

Filipe Manana (4):
  fs: export may_delete() as may_delete_dentry()
  fs: export may_create() as may_create_dentry()
  btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy()
  btrfs: use may_create_dentry() in btrfs_mksubvol()

 fs/btrfs/ioctl.c   | 73 ++--------------------------------------------
 fs/namei.c         | 36 ++++++++++++-----------
 include/linux/fs.h |  5 ++++
 3 files changed, 26 insertions(+), 88 deletions(-)

-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] fs: export may_delete() as may_delete_dentry()
  2026-01-08 13:35 [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry fdmanana
@ 2026-01-08 13:35 ` fdmanana
  2026-01-08 13:35 ` [PATCH 2/4] fs: export may_create() as may_create_dentry() fdmanana
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: fdmanana @ 2026-01-08 13:35 UTC (permalink / raw)
  To: linux-btrfs; +Cc: linux-fsdevel, brauner, viro, Filipe Manana

From: Filipe Manana <fdmanana@suse.com>

For many years btrfs as been using a copy of may_delete() in
fs/btrfs/ioctl.c:btrfs_may_delete(). Everytime may_delete() is updated we
need to update the btrfs copy, and this is a maintenance burden. Currently
there are minor differences between both because the btrfs side lacks
updates done in may_delete().

Export may_delete() so that btrfs can use it and with the less generic
name may_delete_dentry(). While at it change the calls in vfs_rmdir() to
pass a boolean literal instead of 1 and 0 as the last argument since the
argument has a bool type.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/namei.c         | 17 +++++++++--------
 include/linux/fs.h |  3 +++
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index bf0f66f0e9b9..28aebc786e8f 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3604,7 +3604,7 @@ EXPORT_SYMBOL(__check_sticky);
  * 11. We don't allow removal of NFS sillyrenamed files; it's handled by
  *     nfs_async_unlink().
  */
-static int may_delete(struct mnt_idmap *idmap, struct inode *dir,
+int may_delete_dentry(struct mnt_idmap *idmap, struct inode *dir,
 		      struct dentry *victim, bool isdir)
 {
 	struct inode *inode = d_backing_inode(victim);
@@ -3646,6 +3646,7 @@ static int may_delete(struct mnt_idmap *idmap, struct inode *dir,
 		return -EBUSY;
 	return 0;
 }
+EXPORT_SYMBOL(may_delete_dentry);
 
 /*	Check whether we can create an object with dentry child in directory
  *  dir.
@@ -5209,7 +5210,7 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode)
 int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
 	      struct dentry *dentry, struct delegated_inode *delegated_inode)
 {
-	int error = may_delete(idmap, dir, dentry, 1);
+	int error = may_delete_dentry(idmap, dir, dentry, true);
 
 	if (error)
 		return error;
@@ -5344,7 +5345,7 @@ int vfs_unlink(struct mnt_idmap *idmap, struct inode *dir,
 	       struct dentry *dentry, struct delegated_inode *delegated_inode)
 {
 	struct inode *target = dentry->d_inode;
-	int error = may_delete(idmap, dir, dentry, 0);
+	int error = may_delete_dentry(idmap, dir, dentry, false);
 
 	if (error)
 		return error;
@@ -5816,7 +5817,7 @@ int vfs_rename(struct renamedata *rd)
 	if (source == target)
 		return 0;
 
-	error = may_delete(rd->mnt_idmap, old_dir, old_dentry, is_dir);
+	error = may_delete_dentry(rd->mnt_idmap, old_dir, old_dentry, is_dir);
 	if (error)
 		return error;
 
@@ -5826,11 +5827,11 @@ int vfs_rename(struct renamedata *rd)
 		new_is_dir = d_is_dir(new_dentry);
 
 		if (!(flags & RENAME_EXCHANGE))
-			error = may_delete(rd->mnt_idmap, new_dir,
-					   new_dentry, is_dir);
+			error = may_delete_dentry(rd->mnt_idmap, new_dir,
+						  new_dentry, is_dir);
 		else
-			error = may_delete(rd->mnt_idmap, new_dir,
-					   new_dentry, new_is_dir);
+			error = may_delete_dentry(rd->mnt_idmap, new_dir,
+						  new_dentry, new_is_dir);
 	}
 	if (error)
 		return error;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f5c9cf28c4dc..319aaeb876fd 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2657,6 +2657,9 @@ static inline int path_permission(const struct path *path, int mask)
 int __check_sticky(struct mnt_idmap *idmap, struct inode *dir,
 		   struct inode *inode);
 
+int may_delete_dentry(struct mnt_idmap *idmap, struct inode *dir,
+		      struct dentry *victim, bool isdir);
+
 static inline bool execute_ok(struct inode *inode)
 {
 	return (inode->i_mode & S_IXUGO) || S_ISDIR(inode->i_mode);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/4] fs: export may_create() as may_create_dentry()
  2026-01-08 13:35 [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry fdmanana
  2026-01-08 13:35 ` [PATCH 1/4] fs: export may_delete() as may_delete_dentry() fdmanana
@ 2026-01-08 13:35 ` fdmanana
  2026-01-08 13:35 ` [PATCH 3/4] btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy() fdmanana
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: fdmanana @ 2026-01-08 13:35 UTC (permalink / raw)
  To: linux-btrfs; +Cc: linux-fsdevel, brauner, viro, Filipe Manana

From: Filipe Manana <fdmanana@suse.com>

For many years btrfs as been using a copy of may_create() in
fs/btrfs/ioctl.c:btrfs_may_create(). Everytime may_create() is updated we
need to update the btrfs copy, and this is a maintenance burden. Currently
there are minor differences between both because the btrfs side lacks
updates done in may_create().

Export may_create() so that btrfs can use it and with the less generic
name may_create_dentry().

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/namei.c         | 19 ++++++++++---------
 include/linux/fs.h |  2 ++
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 28aebc786e8f..676b8c016839 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3657,8 +3657,8 @@ EXPORT_SYMBOL(may_delete_dentry);
  *  4. We should have write and exec permissions on dir
  *  5. We can't do it if dir is immutable (done in permission())
  */
-static inline int may_create(struct mnt_idmap *idmap,
-			     struct inode *dir, struct dentry *child)
+int may_create_dentry(struct mnt_idmap *idmap,
+		      struct inode *dir, struct dentry *child)
 {
 	audit_inode_child(dir, child, AUDIT_TYPE_CHILD_CREATE);
 	if (child->d_inode)
@@ -3670,6 +3670,7 @@ static inline int may_create(struct mnt_idmap *idmap,
 
 	return inode_permission(idmap, dir, MAY_WRITE | MAY_EXEC);
 }
+EXPORT_SYMBOL(may_create_dentry);
 
 // p1 != p2, both are on the same filesystem, ->s_vfs_rename_mutex is held
 static struct dentry *lock_two_directories(struct dentry *p1, struct dentry *p2)
@@ -4116,7 +4117,7 @@ int vfs_create(struct mnt_idmap *idmap, struct dentry *dentry, umode_t mode,
 	struct inode *dir = d_inode(dentry->d_parent);
 	int error;
 
-	error = may_create(idmap, dir, dentry);
+	error = may_create_dentry(idmap, dir, dentry);
 	if (error)
 		return error;
 
@@ -4142,7 +4143,7 @@ int vfs_mkobj(struct dentry *dentry, umode_t mode,
 		void *arg)
 {
 	struct inode *dir = dentry->d_parent->d_inode;
-	int error = may_create(&nop_mnt_idmap, dir, dentry);
+	int error = may_create_dentry(&nop_mnt_idmap, dir, dentry);
 	if (error)
 		return error;
 
@@ -4961,7 +4962,7 @@ int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	      struct delegated_inode *delegated_inode)
 {
 	bool is_whiteout = S_ISCHR(mode) && dev == WHITEOUT_DEV;
-	int error = may_create(idmap, dir, dentry);
+	int error = may_create_dentry(idmap, dir, dentry);
 
 	if (error)
 		return error;
@@ -5107,7 +5108,7 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	unsigned max_links = dir->i_sb->s_max_links;
 	struct dentry *de;
 
-	error = may_create(idmap, dir, dentry);
+	error = may_create_dentry(idmap, dir, dentry);
 	if (error)
 		goto err;
 
@@ -5497,7 +5498,7 @@ int vfs_symlink(struct mnt_idmap *idmap, struct inode *dir,
 {
 	int error;
 
-	error = may_create(idmap, dir, dentry);
+	error = may_create_dentry(idmap, dir, dentry);
 	if (error)
 		return error;
 
@@ -5605,7 +5606,7 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap,
 	if (!inode)
 		return -ENOENT;
 
-	error = may_create(idmap, dir, new_dentry);
+	error = may_create_dentry(idmap, dir, new_dentry);
 	if (error)
 		return error;
 
@@ -5822,7 +5823,7 @@ int vfs_rename(struct renamedata *rd)
 		return error;
 
 	if (!target) {
-		error = may_create(rd->mnt_idmap, new_dir, new_dentry);
+		error = may_create_dentry(rd->mnt_idmap, new_dir, new_dentry);
 	} else {
 		new_is_dir = d_is_dir(new_dentry);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 319aaeb876fd..558056e1e843 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2659,6 +2659,8 @@ int __check_sticky(struct mnt_idmap *idmap, struct inode *dir,
 
 int may_delete_dentry(struct mnt_idmap *idmap, struct inode *dir,
 		      struct dentry *victim, bool isdir);
+int may_create_dentry(struct mnt_idmap *idmap,
+		      struct inode *dir, struct dentry *child);
 
 static inline bool execute_ok(struct inode *inode)
 {
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/4] btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy()
  2026-01-08 13:35 [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry fdmanana
  2026-01-08 13:35 ` [PATCH 1/4] fs: export may_delete() as may_delete_dentry() fdmanana
  2026-01-08 13:35 ` [PATCH 2/4] fs: export may_create() as may_create_dentry() fdmanana
@ 2026-01-08 13:35 ` fdmanana
  2026-01-08 22:05   ` David Sterba
  2026-01-08 13:35 ` [PATCH 4/4] btrfs: use may_create_dentry() in btrfs_mksubvol() fdmanana
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: fdmanana @ 2026-01-08 13:35 UTC (permalink / raw)
  To: linux-btrfs; +Cc: linux-fsdevel, brauner, viro, Filipe Manana

From: Filipe Manana <fdmanana@suse.com>

There is no longer the need to use btrfs_may_delete(), which was a copy
of the VFS private function may_delete(), since now that functionality
is exported by the VFS as a function named may_delete_dentry(). In fact
our local copy of may_delete() lacks an update that happened to that
function which is point number 7 in that function's comment:

  "7. If the victim has an unknown uid or gid we can't change the inode."

which corresponds to this code:

	/* Inode writeback is not safe when the uid or gid are invalid. */
	if (!vfsuid_valid(i_uid_into_vfsuid(idmap, inode)) ||
	    !vfsgid_valid(i_gid_into_vfsgid(idmap, inode)))
		return -EOVERFLOW;

As long as we keep a separate copy, duplicating code, we are also prone
to updates to the VFS being missed in our local copy.

So change btrfs_ioctl_snap_destroy() to use the VFS function and remove
btrfs_may_delete().

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ioctl.c | 58 +-----------------------------------------------
 1 file changed, 1 insertion(+), 57 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index d9e7dd317670..0cb3cd3d05a5 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -815,62 +815,6 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	return ret;
 }
 
-/*  copy of may_delete in fs/namei.c()
- *	Check whether we can remove a link victim from directory dir, check
- *  whether the type of victim is right.
- *  1. We can't do it if dir is read-only (done in permission())
- *  2. We should have write and exec permissions on dir
- *  3. We can't remove anything from append-only dir
- *  4. We can't do anything with immutable dir (done in permission())
- *  5. If the sticky bit on dir is set we should either
- *	a. be owner of dir, or
- *	b. be owner of victim, or
- *	c. have CAP_FOWNER capability
- *  6. If the victim is append-only or immutable we can't do anything with
- *     links pointing to it.
- *  7. If we were asked to remove a directory and victim isn't one - ENOTDIR.
- *  8. If we were asked to remove a non-directory and victim isn't one - EISDIR.
- *  9. We can't remove a root or mountpoint.
- * 10. We don't allow removal of NFS sillyrenamed files; it's handled by
- *     nfs_async_unlink().
- */
-
-static int btrfs_may_delete(struct mnt_idmap *idmap,
-			    struct inode *dir, struct dentry *victim, int isdir)
-{
-	int ret;
-
-	if (d_really_is_negative(victim))
-		return -ENOENT;
-
-	/* The @victim is not inside @dir. */
-	if (d_inode(victim->d_parent) != dir)
-		return -EINVAL;
-	audit_inode_child(dir, victim, AUDIT_TYPE_CHILD_DELETE);
-
-	ret = inode_permission(idmap, dir, MAY_WRITE | MAY_EXEC);
-	if (ret)
-		return ret;
-	if (IS_APPEND(dir))
-		return -EPERM;
-	if (check_sticky(idmap, dir, d_inode(victim)) ||
-	    IS_APPEND(d_inode(victim)) || IS_IMMUTABLE(d_inode(victim)) ||
-	    IS_SWAPFILE(d_inode(victim)))
-		return -EPERM;
-	if (isdir) {
-		if (!d_is_dir(victim))
-			return -ENOTDIR;
-		if (IS_ROOT(victim))
-			return -EBUSY;
-	} else if (d_is_dir(victim))
-		return -EISDIR;
-	if (IS_DEADDIR(dir))
-		return -ENOENT;
-	if (victim->d_flags & DCACHE_NFSFS_RENAMED)
-		return -EBUSY;
-	return 0;
-}
-
 /* copy of may_create in fs/namei.c() */
 static inline int btrfs_may_create(struct mnt_idmap *idmap,
 				   struct inode *dir, const struct dentry *child)
@@ -2420,7 +2364,7 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 	}
 
 	/* check if subvolume may be deleted by a user */
-	ret = btrfs_may_delete(idmap, dir, dentry, 1);
+	ret = may_delete_dentry(idmap, dir, dentry, true);
 	if (ret)
 		goto out_end_removing;
 
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/4] btrfs: use may_create_dentry() in btrfs_mksubvol()
  2026-01-08 13:35 [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry fdmanana
                   ` (2 preceding siblings ...)
  2026-01-08 13:35 ` [PATCH 3/4] btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy() fdmanana
@ 2026-01-08 13:35 ` fdmanana
  2026-01-08 21:48   ` David Sterba
  2026-01-08 22:09 ` [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry David Sterba
  2026-01-12 12:48 ` Christian Brauner
  5 siblings, 1 reply; 14+ messages in thread
From: fdmanana @ 2026-01-08 13:35 UTC (permalink / raw)
  To: linux-btrfs; +Cc: linux-fsdevel, brauner, viro, Filipe Manana

From: Filipe Manana <fdmanana@suse.com>

There is no longer the need to use btrfs_may_create(), which was a copy
of the VFS private function may_create(), since now that functionality
is exported by the VFS as a function named may_create_dentry(). So change
btrfs_mksubvol() to use the VFS function and remove btrfs_may_create().

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ioctl.c | 15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 0cb3cd3d05a5..9cf37459ef6d 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -815,19 +815,6 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	return ret;
 }
 
-/* copy of may_create in fs/namei.c() */
-static inline int btrfs_may_create(struct mnt_idmap *idmap,
-				   struct inode *dir, const struct dentry *child)
-{
-	if (d_really_is_positive(child))
-		return -EEXIST;
-	if (IS_DEADDIR(dir))
-		return -ENOENT;
-	if (!fsuidgid_has_mapping(dir->i_sb, idmap))
-		return -EOVERFLOW;
-	return inode_permission(idmap, dir, MAY_WRITE | MAY_EXEC);
-}
-
 /*
  * Create a new subvolume below @parent.  This is largely modeled after
  * sys_mkdirat and vfs_mkdir, but we only do a single component lookup
@@ -849,7 +836,7 @@ static noinline int btrfs_mksubvol(struct dentry *parent,
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
-	ret = btrfs_may_create(idmap, dir, dentry);
+	ret = may_create_dentry(idmap, dir, dentry);
 	if (ret)
 		goto out_dput;
 
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] btrfs: use may_create_dentry() in btrfs_mksubvol()
  2026-01-08 13:35 ` [PATCH 4/4] btrfs: use may_create_dentry() in btrfs_mksubvol() fdmanana
@ 2026-01-08 21:48   ` David Sterba
  2026-01-09 17:08     ` Filipe Manana
  0 siblings, 1 reply; 14+ messages in thread
From: David Sterba @ 2026-01-08 21:48 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs, linux-fsdevel, brauner, viro, Filipe Manana

On Thu, Jan 08, 2026 at 01:35:34PM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> There is no longer the need to use btrfs_may_create(), which was a copy
> of the VFS private function may_create(), since now that functionality
> is exported by the VFS as a function named may_create_dentry(). So change
> btrfs_mksubvol() to use the VFS function and remove btrfs_may_create().
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>  fs/btrfs/ioctl.c | 15 +--------------
>  1 file changed, 1 insertion(+), 14 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 0cb3cd3d05a5..9cf37459ef6d 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -815,19 +815,6 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
>  	return ret;
>  }
>  
> -/* copy of may_create in fs/namei.c() */
> -static inline int btrfs_may_create(struct mnt_idmap *idmap,
> -				   struct inode *dir, const struct dentry *child)
> -{

The difference to the VFS version is lack of audit_inode_child() in
this place, so this may be good to mention in the changelog.
Functionally the audit subsystem missed the event of subvolume creation.

> -	if (d_really_is_positive(child))
> -		return -EEXIST;
> -	if (IS_DEADDIR(dir))
> -		return -ENOENT;
> -	if (!fsuidgid_has_mapping(dir->i_sb, idmap))
> -		return -EOVERFLOW;
> -	return inode_permission(idmap, dir, MAY_WRITE | MAY_EXEC);
> -}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy()
  2026-01-08 13:35 ` [PATCH 3/4] btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy() fdmanana
@ 2026-01-08 22:05   ` David Sterba
  2026-01-09 17:06     ` Filipe Manana
  0 siblings, 1 reply; 14+ messages in thread
From: David Sterba @ 2026-01-08 22:05 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs, linux-fsdevel, brauner, viro, Filipe Manana

On Thu, Jan 08, 2026 at 01:35:33PM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> There is no longer the need to use btrfs_may_delete(), which was a copy
> of the VFS private function may_delete(), since now that functionality
> is exported by the VFS as a function named may_delete_dentry(). In fact
> our local copy of may_delete() lacks an update that happened to that
> function which is point number 7 in that function's comment:
> 
>   "7. If the victim has an unknown uid or gid we can't change the inode."
> 
> which corresponds to this code:
> 
> 	/* Inode writeback is not safe when the uid or gid are invalid. */
> 	if (!vfsuid_valid(i_uid_into_vfsuid(idmap, inode)) ||
> 	    !vfsgid_valid(i_gid_into_vfsgid(idmap, inode)))
> 		return -EOVERFLOW;
> 
> As long as we keep a separate copy, duplicating code, we are also prone
> to updates to the VFS being missed in our local copy.
> 
> So change btrfs_ioctl_snap_destroy() to use the VFS function and remove
> btrfs_may_delete().
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>  fs/btrfs/ioctl.c | 58 +-----------------------------------------------
>  1 file changed, 1 insertion(+), 57 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index d9e7dd317670..0cb3cd3d05a5 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -815,62 +815,6 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
>  	return ret;
>  }
>  
> -/*  copy of may_delete in fs/namei.c()
> - *	Check whether we can remove a link victim from directory dir, check
> - *  whether the type of victim is right.
> - *  1. We can't do it if dir is read-only (done in permission())
> - *  2. We should have write and exec permissions on dir
> - *  3. We can't remove anything from append-only dir
> - *  4. We can't do anything with immutable dir (done in permission())
> - *  5. If the sticky bit on dir is set we should either
> - *	a. be owner of dir, or
> - *	b. be owner of victim, or
> - *	c. have CAP_FOWNER capability
> - *  6. If the victim is append-only or immutable we can't do anything with
> - *     links pointing to it.
> - *  7. If we were asked to remove a directory and victim isn't one - ENOTDIR.
> - *  8. If we were asked to remove a non-directory and victim isn't one - EISDIR.
> - *  9. We can't remove a root or mountpoint.
> - * 10. We don't allow removal of NFS sillyrenamed files; it's handled by
> - *     nfs_async_unlink().
> - */
> -
> -static int btrfs_may_delete(struct mnt_idmap *idmap,
> -			    struct inode *dir, struct dentry *victim, int isdir)
> -{
> -	int ret;

There are some differences in VFS may_delete that I don't know if are
significant, they seem to be releated to stacked filesystems.

For example the associated inode of the victim dentry is obtained as
d_backing_inode() vs our simple d_inode().

> -
> -	if (d_really_is_negative(victim))

VFS does d_is_negative() which does not check for NULL pointer but some
other internal state.

> -		return -ENOENT;
> -
> -	/* The @victim is not inside @dir. */
> -	if (d_inode(victim->d_parent) != dir)
> -		return -EINVAL;

We handle that properly, while VFS does BUG_ON, so this can be fixed
separeately in the VFS version.

There are no changes in the rest of the function (other than the
different way how inode is obtained).

The original commit 4260f7c7516f4c ("Btrfs: allow subvol deletion by
unprivileged user with -o user_subvol_rm_allowed") adding this helper
says something about adding the write and exec checks and size checks
but I don't see what it's referring to, neither in the current nor in
the old code.

> -	audit_inode_child(dir, victim, AUDIT_TYPE_CHILD_DELETE);
> -
> -	ret = inode_permission(idmap, dir, MAY_WRITE | MAY_EXEC);
> -	if (ret)
> -		return ret;
> -	if (IS_APPEND(dir))
> -		return -EPERM;
> -	if (check_sticky(idmap, dir, d_inode(victim)) ||
> -	    IS_APPEND(d_inode(victim)) || IS_IMMUTABLE(d_inode(victim)) ||
> -	    IS_SWAPFILE(d_inode(victim)))
> -		return -EPERM;
> -	if (isdir) {
> -		if (!d_is_dir(victim))
> -			return -ENOTDIR;
> -		if (IS_ROOT(victim))
> -			return -EBUSY;
> -	} else if (d_is_dir(victim))
> -		return -EISDIR;
> -	if (IS_DEADDIR(dir))
> -		return -ENOENT;
> -	if (victim->d_flags & DCACHE_NFSFS_RENAMED)
> -		return -EBUSY;
> -	return 0;
> -}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry
  2026-01-08 13:35 [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry fdmanana
                   ` (3 preceding siblings ...)
  2026-01-08 13:35 ` [PATCH 4/4] btrfs: use may_create_dentry() in btrfs_mksubvol() fdmanana
@ 2026-01-08 22:09 ` David Sterba
  2026-01-12 12:48 ` Christian Brauner
  5 siblings, 0 replies; 14+ messages in thread
From: David Sterba @ 2026-01-08 22:09 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs, linux-fsdevel, brauner, viro, Filipe Manana

On Thu, Jan 08, 2026 at 01:35:30PM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> Currently btrfs has copies of two unexported functions from fs/namei.c
> used in the snapshot/subvolume creation and deletion. This patchset
> exports those functions and makes btrfs use them, to avoid duplication
> and the burden of keeping the copies up to date.
> 
> Filipe Manana (4):
>   fs: export may_delete() as may_delete_dentry()
>   fs: export may_create() as may_create_dentry()
>   btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy()
>   btrfs: use may_create_dentry() in btrfs_mksubvol()

Great, thanks, we should have done that years ago. We should use the VFS
interfaces though I'm not sure about some of the implementation
differences.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy()
  2026-01-08 22:05   ` David Sterba
@ 2026-01-09 17:06     ` Filipe Manana
  0 siblings, 0 replies; 14+ messages in thread
From: Filipe Manana @ 2026-01-09 17:06 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs, linux-fsdevel, brauner, viro, Filipe Manana

On Thu, Jan 8, 2026 at 10:05 PM David Sterba <dsterba@suse.cz> wrote:
>
> On Thu, Jan 08, 2026 at 01:35:33PM +0000, fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > There is no longer the need to use btrfs_may_delete(), which was a copy
> > of the VFS private function may_delete(), since now that functionality
> > is exported by the VFS as a function named may_delete_dentry(). In fact
> > our local copy of may_delete() lacks an update that happened to that
> > function which is point number 7 in that function's comment:
> >
> >   "7. If the victim has an unknown uid or gid we can't change the inode."
> >
> > which corresponds to this code:
> >
> >       /* Inode writeback is not safe when the uid or gid are invalid. */
> >       if (!vfsuid_valid(i_uid_into_vfsuid(idmap, inode)) ||
> >           !vfsgid_valid(i_gid_into_vfsgid(idmap, inode)))
> >               return -EOVERFLOW;
> >
> > As long as we keep a separate copy, duplicating code, we are also prone
> > to updates to the VFS being missed in our local copy.
> >
> > So change btrfs_ioctl_snap_destroy() to use the VFS function and remove
> > btrfs_may_delete().
> >
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > ---
> >  fs/btrfs/ioctl.c | 58 +-----------------------------------------------
> >  1 file changed, 1 insertion(+), 57 deletions(-)
> >
> > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> > index d9e7dd317670..0cb3cd3d05a5 100644
> > --- a/fs/btrfs/ioctl.c
> > +++ b/fs/btrfs/ioctl.c
> > @@ -815,62 +815,6 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
> >       return ret;
> >  }
> >
> > -/*  copy of may_delete in fs/namei.c()
> > - *   Check whether we can remove a link victim from directory dir, check
> > - *  whether the type of victim is right.
> > - *  1. We can't do it if dir is read-only (done in permission())
> > - *  2. We should have write and exec permissions on dir
> > - *  3. We can't remove anything from append-only dir
> > - *  4. We can't do anything with immutable dir (done in permission())
> > - *  5. If the sticky bit on dir is set we should either
> > - *   a. be owner of dir, or
> > - *   b. be owner of victim, or
> > - *   c. have CAP_FOWNER capability
> > - *  6. If the victim is append-only or immutable we can't do anything with
> > - *     links pointing to it.
> > - *  7. If we were asked to remove a directory and victim isn't one - ENOTDIR.
> > - *  8. If we were asked to remove a non-directory and victim isn't one - EISDIR.
> > - *  9. We can't remove a root or mountpoint.
> > - * 10. We don't allow removal of NFS sillyrenamed files; it's handled by
> > - *     nfs_async_unlink().
> > - */
> > -
> > -static int btrfs_may_delete(struct mnt_idmap *idmap,
> > -                         struct inode *dir, struct dentry *victim, int isdir)
> > -{
> > -     int ret;
>
> There are some differences in VFS may_delete that I don't know if are
> significant, they seem to be releated to stacked filesystems.
>
> For example the associated inode of the victim dentry is obtained as
> d_backing_inode() vs our simple d_inode().
>
> > -
> > -     if (d_really_is_negative(victim))
>
> VFS does d_is_negative() which does not check for NULL pointer but some
> other internal state.
>
> > -             return -ENOENT;
> > -
> > -     /* The @victim is not inside @dir. */
> > -     if (d_inode(victim->d_parent) != dir)
> > -             return -EINVAL;
>
> We handle that properly, while VFS does BUG_ON, so this can be fixed
> separeately in the VFS version.

Yes, but it's one of those cases that should never happen.
In fact we used to have the BUG_ON in btrfs too, you converted it to
proper error handling in:

commit 1686570265559ebfa828c1b784a31407ec2877bd
Author: David Sterba <dsterba@suse.com>
Date:   Fri Jan 19 20:23:56 2024 +0100

    btrfs: handle directory and dentry mismatch in btrfs_may_delete()



>
> There are no changes in the rest of the function (other than the
> different way how inode is obtained).

Yes, small differences like that. Some of those things seem to be
because our btrfs copy was not updated.
fstests pass with this patchset.

>
> The original commit 4260f7c7516f4c ("Btrfs: allow subvol deletion by
> unprivileged user with -o user_subvol_rm_allowed") adding this helper
> says something about adding the write and exec checks and size checks
> but I don't see what it's referring to, neither in the current nor in
> the old code.

I noticed that, but the VFS may_delete() does those checks, which is this line:

error = inode_permission(idmap, dir, MAY_WRITE | MAY_EXEC);

Exactly the same as our btrfs copy.


>
> > -     audit_inode_child(dir, victim, AUDIT_TYPE_CHILD_DELETE);
> > -
> > -     ret = inode_permission(idmap, dir, MAY_WRITE | MAY_EXEC);
> > -     if (ret)
> > -             return ret;
> > -     if (IS_APPEND(dir))
> > -             return -EPERM;
> > -     if (check_sticky(idmap, dir, d_inode(victim)) ||
> > -         IS_APPEND(d_inode(victim)) || IS_IMMUTABLE(d_inode(victim)) ||
> > -         IS_SWAPFILE(d_inode(victim)))
> > -             return -EPERM;
> > -     if (isdir) {
> > -             if (!d_is_dir(victim))
> > -                     return -ENOTDIR;
> > -             if (IS_ROOT(victim))
> > -                     return -EBUSY;
> > -     } else if (d_is_dir(victim))
> > -             return -EISDIR;
> > -     if (IS_DEADDIR(dir))
> > -             return -ENOENT;
> > -     if (victim->d_flags & DCACHE_NFSFS_RENAMED)
> > -             return -EBUSY;
> > -     return 0;
> > -}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] btrfs: use may_create_dentry() in btrfs_mksubvol()
  2026-01-08 21:48   ` David Sterba
@ 2026-01-09 17:08     ` Filipe Manana
  0 siblings, 0 replies; 14+ messages in thread
From: Filipe Manana @ 2026-01-09 17:08 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs, linux-fsdevel, brauner, viro, Filipe Manana

On Thu, Jan 8, 2026 at 9:48 PM David Sterba <dsterba@suse.cz> wrote:
>
> On Thu, Jan 08, 2026 at 01:35:34PM +0000, fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > There is no longer the need to use btrfs_may_create(), which was a copy
> > of the VFS private function may_create(), since now that functionality
> > is exported by the VFS as a function named may_create_dentry(). So change
> > btrfs_mksubvol() to use the VFS function and remove btrfs_may_create().
> >
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > ---
> >  fs/btrfs/ioctl.c | 15 +--------------
> >  1 file changed, 1 insertion(+), 14 deletions(-)
> >
> > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> > index 0cb3cd3d05a5..9cf37459ef6d 100644
> > --- a/fs/btrfs/ioctl.c
> > +++ b/fs/btrfs/ioctl.c
> > @@ -815,19 +815,6 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
> >       return ret;
> >  }
> >
> > -/* copy of may_create in fs/namei.c() */
> > -static inline int btrfs_may_create(struct mnt_idmap *idmap,
> > -                                struct inode *dir, const struct dentry *child)
> > -{
>
> The difference to the VFS version is lack of audit_inode_child() in
> this place, so this may be good to mention in the changelog.
> Functionally the audit subsystem missed the event of subvolume creation.

I'll add that to the changelog in the next version, after getting
comments from the VFS people.

>
> > -     if (d_really_is_positive(child))
> > -             return -EEXIST;
> > -     if (IS_DEADDIR(dir))
> > -             return -ENOENT;
> > -     if (!fsuidgid_has_mapping(dir->i_sb, idmap))
> > -             return -EOVERFLOW;
> > -     return inode_permission(idmap, dir, MAY_WRITE | MAY_EXEC);
> > -}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry
  2026-01-08 13:35 [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry fdmanana
                   ` (4 preceding siblings ...)
  2026-01-08 22:09 ` [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry David Sterba
@ 2026-01-12 12:48 ` Christian Brauner
  2026-01-12 13:49   ` Filipe Manana
  5 siblings, 1 reply; 14+ messages in thread
From: Christian Brauner @ 2026-01-12 12:48 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs, linux-fsdevel, viro, Filipe Manana

On Thu, Jan 08, 2026 at 01:35:30PM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> Currently btrfs has copies of two unexported functions from fs/namei.c
> used in the snapshot/subvolume creation and deletion. This patchset
> exports those functions and makes btrfs use them, to avoid duplication
> and the burden of keeping the copies up to date.

Seems like a good idea to me.
Let me know once it's ready and I'll give you a stable branch with the
VFS bits applied where you can apply the btrfs specific changes on top.
Thanks!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry
  2026-01-12 12:48 ` Christian Brauner
@ 2026-01-12 13:49   ` Filipe Manana
  2026-01-13  1:16     ` David Sterba
  0 siblings, 1 reply; 14+ messages in thread
From: Filipe Manana @ 2026-01-12 13:49 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-btrfs, linux-fsdevel, viro, Filipe Manana

On Mon, Jan 12, 2026 at 12:48 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Thu, Jan 08, 2026 at 01:35:30PM +0000, fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > Currently btrfs has copies of two unexported functions from fs/namei.c
> > used in the snapshot/subvolume creation and deletion. This patchset
> > exports those functions and makes btrfs use them, to avoid duplication
> > and the burden of keeping the copies up to date.
>
> Seems like a good idea to me.
> Let me know once it's ready and I'll give you a stable branch with the
> VFS bits applied where you can apply the btrfs specific changes on top.

Right now what's missing is just an update to the changelog of patch
4/4 to mention that the btrfs copy is missing the audit_inode_child()
call.

If there are no other comments, I can prepare a v2 with that update,
and then you can pick the patches into a branch.
For the btrfs patches, you can probably pick them too as they are
trivial, though I'll defer to David in case he has a different
preference.

Thanks.

> Thanks!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry
  2026-01-12 13:49   ` Filipe Manana
@ 2026-01-13  1:16     ` David Sterba
  2026-01-13  9:00       ` Christian Brauner
  0 siblings, 1 reply; 14+ messages in thread
From: David Sterba @ 2026-01-13  1:16 UTC (permalink / raw)
  To: Filipe Manana
  Cc: Christian Brauner, linux-btrfs, linux-fsdevel, viro,
	Filipe Manana

On Mon, Jan 12, 2026 at 01:49:07PM +0000, Filipe Manana wrote:
> On Mon, Jan 12, 2026 at 12:48 PM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Thu, Jan 08, 2026 at 01:35:30PM +0000, fdmanana@kernel.org wrote:
> > > From: Filipe Manana <fdmanana@suse.com>
> > >
> > > Currently btrfs has copies of two unexported functions from fs/namei.c
> > > used in the snapshot/subvolume creation and deletion. This patchset
> > > exports those functions and makes btrfs use them, to avoid duplication
> > > and the burden of keeping the copies up to date.
> >
> > Seems like a good idea to me.
> > Let me know once it's ready and I'll give you a stable branch with the
> > VFS bits applied where you can apply the btrfs specific changes on top.
> 
> Right now what's missing is just an update to the changelog of patch
> 4/4 to mention that the btrfs copy is missing the audit_inode_child()
> call.
> 
> If there are no other comments, I can prepare a v2 with that update,
> and then you can pick the patches into a branch.
> For the btrfs patches, you can probably pick them too as they are
> trivial, though I'll defer to David in case he has a different
> preference.

I agree that this is relatively trivial, also it does not affect
anything in particular we have in for-next, so Christian feel free to
take it via the vfs trees. Thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry
  2026-01-13  1:16     ` David Sterba
@ 2026-01-13  9:00       ` Christian Brauner
  0 siblings, 0 replies; 14+ messages in thread
From: Christian Brauner @ 2026-01-13  9:00 UTC (permalink / raw)
  To: David Sterba
  Cc: Filipe Manana, linux-btrfs, linux-fsdevel, viro, Filipe Manana

On Tue, Jan 13, 2026 at 02:16:43AM +0100, David Sterba wrote:
> On Mon, Jan 12, 2026 at 01:49:07PM +0000, Filipe Manana wrote:
> > On Mon, Jan 12, 2026 at 12:48 PM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > On Thu, Jan 08, 2026 at 01:35:30PM +0000, fdmanana@kernel.org wrote:
> > > > From: Filipe Manana <fdmanana@suse.com>
> > > >
> > > > Currently btrfs has copies of two unexported functions from fs/namei.c
> > > > used in the snapshot/subvolume creation and deletion. This patchset
> > > > exports those functions and makes btrfs use them, to avoid duplication
> > > > and the burden of keeping the copies up to date.
> > >
> > > Seems like a good idea to me.
> > > Let me know once it's ready and I'll give you a stable branch with the
> > > VFS bits applied where you can apply the btrfs specific changes on top.
> > 
> > Right now what's missing is just an update to the changelog of patch
> > 4/4 to mention that the btrfs copy is missing the audit_inode_child()
> > call.
> > 
> > If there are no other comments, I can prepare a v2 with that update,
> > and then you can pick the patches into a branch.
> > For the btrfs patches, you can probably pick them too as they are
> > trivial, though I'll defer to David in case he has a different
> > preference.
> 
> I agree that this is relatively trivial, also it does not affect
> anything in particular we have in for-next, so Christian feel free to
> take it via the vfs trees. Thanks.

Thanks for the trust, appreciate it.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2026-01-13  9:00 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-08 13:35 [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry fdmanana
2026-01-08 13:35 ` [PATCH 1/4] fs: export may_delete() as may_delete_dentry() fdmanana
2026-01-08 13:35 ` [PATCH 2/4] fs: export may_create() as may_create_dentry() fdmanana
2026-01-08 13:35 ` [PATCH 3/4] btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy() fdmanana
2026-01-08 22:05   ` David Sterba
2026-01-09 17:06     ` Filipe Manana
2026-01-08 13:35 ` [PATCH 4/4] btrfs: use may_create_dentry() in btrfs_mksubvol() fdmanana
2026-01-08 21:48   ` David Sterba
2026-01-09 17:08     ` Filipe Manana
2026-01-08 22:09 ` [PATCH 0/4] btrfs: stop duplicating VFS code for subvolume/snapshot dentry David Sterba
2026-01-12 12:48 ` Christian Brauner
2026-01-12 13:49   ` Filipe Manana
2026-01-13  1:16     ` David Sterba
2026-01-13  9:00       ` Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox