linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v1 0/2] Fix shmem_rename2 directory offset calculation
@ 2024-04-11 18:26 cel
  2024-04-11 18:26 ` [PATCH v1 1/2] shmem: Fix shmem_rename2() cel
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: cel @ 2024-04-11 18:26 UTC
  To: Christian Brauner; +Cc: linux-fsdevel, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

The existing code in shmem_rename2() allocates a fresh directory
offset value when renaming over an existing destination entry. User
space does not expect this behavior. In particular, applications
that rename while walking a directory can loop indefinitely because
they never reach the end of the directory.

The first patch in this series corrects that problem, which exists
in v6.6 - current. The second patch is a clean-up and can be deferred
until v6.10.
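
To make the failure mode concrete, here is a small user-space model of
it. This is only a sketch under stated assumptions, not kernel code:
the model_* names, the fixed-size slot array, and the offset cap are
all invented for illustration. The walker visits entries in offset
order and, for each one, creates a temporary entry and renames it over
the entry just visited.

```c
#include <assert.h>
#include <string.h>

#define MODEL_MAX_OFFSET 64	/* arbitrary cap so the model always halts */
#define MODEL_MIN_OFFSET 1	/* model's choice; slot value 0 means "empty" */

/* Hypothetical stand-in for a directory's offset map. */
struct model_dir {
	int slot[MODEL_MAX_OFFSET];	/* entry id at each offset, 0 = empty */
	int next_offset;		/* next fresh offset to hand out */
};

/* Place @id at a fresh offset; returns the offset, or -1 at the cap. */
static int model_add(struct model_dir *d, int id)
{
	if (d->next_offset >= MODEL_MAX_OFFSET)
		return -1;
	d->slot[d->next_offset] = id;
	return d->next_offset++;
}

/*
 * Rename the entry at offset @src over the entry at offset @dst.
 * With @preserve set, the moved entry inherits @dst; otherwise it is
 * given a fresh offset, as described for the existing code.
 */
static int model_rename_over(struct model_dir *d, int src, int dst,
			     int preserve)
{
	int id = d->slot[src];

	assert(id != 0);
	d->slot[src] = 0;
	d->slot[dst] = 0;
	if (preserve) {
		d->slot[dst] = id;
		return dst;
	}
	return model_add(d, id);
}

/*
 * Walk a three-entry directory, replacing every entry visited.
 * Returns the number of entries visited, or -1 if the walk was still
 * finding entries when the offset cap was reached.
 */
static int model_walk(int preserve)
{
	struct model_dir d;
	int visited = 0, next_id = 1, cursor, off, tmp;

	memset(&d, 0, sizeof(d));
	d.next_offset = MODEL_MIN_OFFSET;
	for (int i = 0; i < 3; i++)
		model_add(&d, next_id++);

	for (cursor = MODEL_MIN_OFFSET; ; cursor = off + 1) {
		/* Find the next occupied offset at or after the cursor. */
		for (off = cursor; off < MODEL_MAX_OFFSET && !d.slot[off]; off++)
			;
		if (off == MODEL_MAX_OFFSET)
			return visited;		/* reached end of directory */
		visited++;
		tmp = model_add(&d, next_id++);
		if (tmp < 0 || model_rename_over(&d, tmp, off, preserve) < 0)
			return -1;		/* never reached the end */
	}
}
```

In this model, model_walk(1) visits each of the three original entries
exactly once, while model_walk(0) keeps rediscovering the renamed
entries at ever-higher offsets until the cap is hit.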

Chuck Lever (2):
  shmem: Fix shmem_rename2()
  libfs: Clean up the simple_offset API

 fs/libfs.c         | 89 ++++++++++++++++++++++++++++++++++------------
 include/linux/fs.h | 10 +++---
 mm/shmem.c         | 17 +++++----
 3 files changed, 81 insertions(+), 35 deletions(-)

-- 
2.44.0



* [PATCH v1 1/2] shmem: Fix shmem_rename2()
  2024-04-11 18:26 [PATCH v1 0/2] Fix shmem_rename2 directory offset calculation cel
@ 2024-04-11 18:26 ` cel
  2024-04-11 18:26 ` [PATCH v1 2/2] libfs: Clean up the simple_offset API cel
  2024-04-12 14:29 ` [PATCH v1 0/2] Fix shmem_rename2 directory offset calculation Chuck Lever
  2 siblings, 0 replies; 5+ messages in thread
From: cel @ 2024-04-11 18:26 UTC
  To: Christian Brauner; +Cc: linux-fsdevel, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

When renaming onto an existing directory entry, user space expects
the replacement entry to have the same directory offset as the
original one.

The details of handling directory offsets during a rename are moved
to fs/libfs.c so that they can be reused by other API consumers.
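
The decision this patch moves into the new helper can be sketched in
user space as follows. This is an illustrative model only: the sketch_*
types and functions are hypothetical stand-ins for the dentry and
offset_ctx handling, and the fixed-size slot array replaces the real
offset map.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SLOTS 16

/* Hypothetical stand-in for a dentry; offset 0 means "no offset". */
struct sketch_entry {
	long offset;
};

/* Hypothetical stand-in for a directory and its offset map. */
struct sketch_dir {
	struct sketch_entry *slot[SKETCH_SLOTS];
	long next_offset;
};

/* Drop @e from @dir's map and forget its offset. */
static void sketch_remove(struct sketch_dir *dir, struct sketch_entry *e)
{
	if (e->offset == 0)
		return;
	dir->slot[e->offset] = NULL;
	e->offset = 0;
}

/* Give @e a freshly allocated offset in @dir. */
static int sketch_add(struct sketch_dir *dir, struct sketch_entry *e)
{
	if (dir->next_offset >= SKETCH_SLOTS)
		return -1;
	e->offset = dir->next_offset++;
	dir->slot[e->offset] = e;
	return 0;
}

/* Place @e at a caller-chosen offset @index in @dir. */
static int sketch_store(struct sketch_dir *dir, struct sketch_entry *e,
			long index)
{
	e->offset = index;
	dir->slot[index] = e;
	return 0;
}

/*
 * Mirror of the rename decision: reuse the destination's offset when
 * the destination already has one, else allocate a fresh offset.
 * As in the patch, @new_e's own offset field is left untouched; what
 * the caller does with the replaced entry is outside this sketch.
 */
static int sketch_rename(struct sketch_dir *old_dir, struct sketch_entry *old_e,
			 struct sketch_dir *new_dir, struct sketch_entry *new_e)
{
	long new_index = new_e->offset;

	sketch_remove(old_dir, old_e);
	if (new_index)
		return sketch_store(new_dir, old_e, new_index);
	return sketch_add(new_dir, old_e);
}
```

Renaming onto an entry that holds offset 2 leaves the moved entry at
offset 2 without consuming a new offset; renaming onto a destination
with no offset falls back to a fresh allocation.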

For backporting to stable kernels: use xa_store() rather than
mtree_store(), and octx->xa rather than octx->mt. See commit
0e4a862174f2 ("libfs: Convert simple directory offsets to use a
Maple Tree") for details.

Link: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15966
Fixes: a2e459555c5f ("shmem: stable directory offsets")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/libfs.c         | 53 ++++++++++++++++++++++++++++++++++++++++------
 include/linux/fs.h |  2 ++
 mm/shmem.c         |  3 +--
 3 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index 3a6f2cb364f8..ff767288e5dd 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -295,6 +295,17 @@ int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
 	return 0;
 }
 
+/*
+ * Internal helper for use when it is known that the tree entry at
+ * @index is already NULL.
+ */
+static int simple_offset_store(struct offset_ctx *octx, struct dentry *dentry,
+			       long index)
+{
+	offset_set(dentry, index);
+	return mtree_store(&octx->mt, index, dentry, GFP_KERNEL);
+}
+
 /**
  * simple_offset_remove - Remove an entry to a directory's offset map
  * @octx: directory offset ctx to be updated
@@ -345,6 +356,35 @@ int simple_offset_empty(struct dentry *dentry)
 	return ret;
 }
 
+/**
+ * simple_offset_rename - handle directory offsets for rename
+ * @old_dir: parent directory of source entry
+ * @old_dentry: dentry of source entry
+ * @new_dir: parent directory of destination entry
+ * @new_dentry: dentry of destination
+ *
+ * Caller provides appropriate serialization.
+ *
+ * Returns zero on success, a negative errno value on failure.
+ */
+int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry,
+			 struct inode *new_dir, struct dentry *new_dentry)
+{
+	struct offset_ctx *old_ctx = old_dir->i_op->get_offset_ctx(old_dir);
+	struct offset_ctx *new_ctx = new_dir->i_op->get_offset_ctx(new_dir);
+	long new_index = dentry2offset(new_dentry);
+
+	simple_offset_remove(old_ctx, old_dentry);
+
+	/*
+	 * When the destination entry already exists, user space expects
+	 * its directory offset value to be unchanged after the rename.
+	 */
+	if (new_index)
+		return simple_offset_store(new_ctx, old_dentry, new_index);
+	return simple_offset_add(new_ctx, old_dentry);
+}
+
 /**
  * simple_offset_rename_exchange - exchange rename with directory offsets
  * @old_dir: parent of dentry being moved
@@ -352,6 +392,9 @@ int simple_offset_empty(struct dentry *dentry)
  * @new_dir: destination parent
  * @new_dentry: destination dentry
  *
+ * This API preserves the directory offset values. Caller provides
+ * appropriate serialization.
+ *
  * Returns zero on success. Otherwise a negative errno is returned and the
  * rename is rolled back.
  */
@@ -369,11 +412,11 @@ int simple_offset_rename_exchange(struct inode *old_dir,
 	simple_offset_remove(old_ctx, old_dentry);
 	simple_offset_remove(new_ctx, new_dentry);
 
-	ret = simple_offset_add(new_ctx, old_dentry);
+	ret = simple_offset_store(new_ctx, old_dentry, new_index);
 	if (ret)
 		goto out_restore;
 
-	ret = simple_offset_add(old_ctx, new_dentry);
+	ret = simple_offset_store(old_ctx, new_dentry, old_index);
 	if (ret) {
 		simple_offset_remove(new_ctx, old_dentry);
 		goto out_restore;
@@ -388,10 +431,8 @@ int simple_offset_rename_exchange(struct inode *old_dir,
 	return 0;
 
 out_restore:
-	offset_set(old_dentry, old_index);
-	mtree_store(&old_ctx->mt, old_index, old_dentry, GFP_KERNEL);
-	offset_set(new_dentry, new_index);
-	mtree_store(&new_ctx->mt, new_index, new_dentry, GFP_KERNEL);
+	(void)simple_offset_store(old_ctx, old_dentry, old_index);
+	(void)simple_offset_store(new_ctx, new_dentry, new_index);
 	return ret;
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8dfd53b52744..b09f14132110 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3340,6 +3340,8 @@ void simple_offset_init(struct offset_ctx *octx);
 int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry);
 void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry);
 int simple_offset_empty(struct dentry *dentry);
+int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry,
+			 struct inode *new_dir, struct dentry *new_dentry);
 int simple_offset_rename_exchange(struct inode *old_dir,
 				  struct dentry *old_dentry,
 				  struct inode *new_dir,
diff --git a/mm/shmem.c b/mm/shmem.c
index 0aad0d9a621b..c0fb65223963 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3473,8 +3473,7 @@ static int shmem_rename2(struct mnt_idmap *idmap,
 			return error;
 	}
 
-	simple_offset_remove(shmem_get_offset_ctx(old_dir), old_dentry);
-	error = simple_offset_add(shmem_get_offset_ctx(new_dir), old_dentry);
+	error = simple_offset_rename(old_dir, old_dentry, new_dir, new_dentry);
 	if (error)
 		return error;
 
-- 
2.44.0



* [PATCH v1 2/2] libfs: Clean up the simple_offset API
  2024-04-11 18:26 [PATCH v1 0/2] Fix shmem_rename2 directory offset calculation cel
  2024-04-11 18:26 ` [PATCH v1 1/2] shmem: Fix shmem_rename2() cel
@ 2024-04-11 18:26 ` cel
  2024-04-12 14:29 ` [PATCH v1 0/2] Fix shmem_rename2 directory offset calculation Chuck Lever
  2 siblings, 0 replies; 5+ messages in thread
From: cel @ 2024-04-11 18:26 UTC
  To: Christian Brauner; +Cc: linux-fsdevel, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

The original plan was to avoid an indirect function call on every
call to simple_offset_add() and simple_offset_remove().

But that clutters the call sites with duplicated code and makes
observability difficult because some API functions take an inode
pointer and others take an offset_ctx.

Instead, have every simple_offset API function take the directory's
inode and look up the offset_ctx internally via ->get_offset_ctx().
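
The trade-off between the two API shapes can be sketched in user space
like this. Everything below is a hypothetical model: the demo_* types
only imitate the inode / i_op / offset_ctx relationship and are not
the kernel definitions.

```c
/* Hypothetical stand-in for struct offset_ctx. */
struct demo_ctx {
	unsigned long next_offset;
};

struct demo_inode;

/* Hypothetical stand-in for the ->get_offset_ctx() inode operation. */
struct demo_inode_ops {
	struct demo_ctx *(*get_offset_ctx)(struct demo_inode *dir);
};

struct demo_inode {
	const struct demo_inode_ops *i_op;
	struct demo_ctx ctx;
};

static struct demo_ctx *demo_get_offset_ctx(struct demo_inode *dir)
{
	return &dir->ctx;
}

static const struct demo_inode_ops demo_ops = {
	.get_offset_ctx = demo_get_offset_ctx,
};

/*
 * Old shape: the caller resolves the context itself (a direct call in
 * the filesystem) and passes it in.
 */
static unsigned long demo_add_by_ctx(struct demo_ctx *octx)
{
	return octx->next_offset++;
}

/*
 * New shape: the callee takes the directory and resolves the context
 * through the ops table, at the cost of one indirect call per use.
 */
static unsigned long demo_add_by_dir(struct demo_inode *dir)
{
	struct demo_ctx *octx = dir->i_op->get_offset_ctx(dir);

	return octx->next_offset++;
}
```

With the second shape every entry point has the same first argument,
so call sites shrink to demo_add_by_dir(&dir) and a tracer keyed on
the directory sees a uniform signature.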

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/libfs.c         | 56 +++++++++++++++++++++++++---------------------
 include/linux/fs.h |  8 +++----
 mm/shmem.c         | 14 ++++++------
 3 files changed, 41 insertions(+), 37 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index ff767288e5dd..1015111657b9 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -260,11 +260,13 @@ static struct lock_class_key simple_offset_lock_class;
 
 /**
  * simple_offset_init - initialize an offset_ctx
- * @octx: directory offset map to be initialized
+ * @dir: directory to be initialized
  *
  */
-void simple_offset_init(struct offset_ctx *octx)
+void simple_offset_init(struct inode *dir)
 {
+	struct offset_ctx *octx = dir->i_op->get_offset_ctx(dir);
+
 	mt_init_flags(&octx->mt, MT_FLAGS_ALLOC_RANGE);
 	lockdep_set_class(&octx->mt.ma_lock, &simple_offset_lock_class);
 	octx->next_offset = DIR_OFFSET_MIN;
@@ -272,14 +274,15 @@ void simple_offset_init(struct offset_ctx *octx)
 
 /**
  * simple_offset_add - Add an entry to a directory's offset map
- * @octx: directory offset ctx to be updated
+ * @dir: directory to be updated
  * @dentry: new dentry being added
  *
- * Returns zero on success. @octx and the dentry's offset are updated.
+ * Returns zero on success. @dir and the dentry's offset are updated.
  * Otherwise, a negative errno value is returned.
  */
-int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
+int simple_offset_add(struct inode *dir, struct dentry *dentry)
 {
+	struct offset_ctx *octx = dir->i_op->get_offset_ctx(dir);
 	unsigned long offset;
 	int ret;
 
@@ -299,21 +302,24 @@ int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
  * Internal helper for use when it is known that the tree entry at
  * @index is already NULL.
  */
-static int simple_offset_store(struct offset_ctx *octx, struct dentry *dentry,
+static int simple_offset_store(struct inode *dir, struct dentry *dentry,
 			       long index)
 {
+	struct offset_ctx *octx = dir->i_op->get_offset_ctx(dir);
+
 	offset_set(dentry, index);
 	return mtree_store(&octx->mt, index, dentry, GFP_KERNEL);
 }
 
 /**
  * simple_offset_remove - Remove an entry to a directory's offset map
- * @octx: directory offset ctx to be updated
+ * @dir: directory to be updated
  * @dentry: dentry being removed
  *
  */
-void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry)
+void simple_offset_remove(struct inode *dir, struct dentry *dentry)
 {
+	struct offset_ctx *octx = dir->i_op->get_offset_ctx(dir);
 	long offset;
 
 	offset = dentry2offset(dentry);
@@ -370,19 +376,17 @@ int simple_offset_empty(struct dentry *dentry)
 int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry,
 			 struct inode *new_dir, struct dentry *new_dentry)
 {
-	struct offset_ctx *old_ctx = old_dir->i_op->get_offset_ctx(old_dir);
-	struct offset_ctx *new_ctx = new_dir->i_op->get_offset_ctx(new_dir);
 	long new_index = dentry2offset(new_dentry);
 
-	simple_offset_remove(old_ctx, old_dentry);
+	simple_offset_remove(old_dir, old_dentry);
 
 	/*
 	 * When the destination entry already exists, user space expects
 	 * its directory offset value to be unchanged after the rename.
 	 */
 	if (new_index)
-		return simple_offset_store(new_ctx, old_dentry, new_index);
-	return simple_offset_add(new_ctx, old_dentry);
+		return simple_offset_store(new_dir, old_dentry, new_index);
+	return simple_offset_add(new_dir, old_dentry);
 }
 
 /**
@@ -403,48 +407,48 @@ int simple_offset_rename_exchange(struct inode *old_dir,
 				  struct inode *new_dir,
 				  struct dentry *new_dentry)
 {
-	struct offset_ctx *old_ctx = old_dir->i_op->get_offset_ctx(old_dir);
-	struct offset_ctx *new_ctx = new_dir->i_op->get_offset_ctx(new_dir);
 	long old_index = dentry2offset(old_dentry);
 	long new_index = dentry2offset(new_dentry);
 	int ret;
 
-	simple_offset_remove(old_ctx, old_dentry);
-	simple_offset_remove(new_ctx, new_dentry);
+	simple_offset_remove(old_dir, old_dentry);
+	simple_offset_remove(new_dir, new_dentry);
 
-	ret = simple_offset_store(new_ctx, old_dentry, new_index);
+	ret = simple_offset_store(new_dir, old_dentry, new_index);
 	if (ret)
 		goto out_restore;
 
-	ret = simple_offset_store(old_ctx, new_dentry, old_index);
+	ret = simple_offset_store(old_dir, new_dentry, old_index);
 	if (ret) {
-		simple_offset_remove(new_ctx, old_dentry);
+		simple_offset_remove(new_dir, old_dentry);
 		goto out_restore;
 	}
 
 	ret = simple_rename_exchange(old_dir, old_dentry, new_dir, new_dentry);
 	if (ret) {
-		simple_offset_remove(new_ctx, old_dentry);
-		simple_offset_remove(old_ctx, new_dentry);
+		simple_offset_remove(new_dir, old_dentry);
+		simple_offset_remove(old_dir, new_dentry);
 		goto out_restore;
 	}
 	return 0;
 
 out_restore:
-	(void)simple_offset_store(old_ctx, old_dentry, old_index);
-	(void)simple_offset_store(new_ctx, new_dentry, new_index);
+	(void)simple_offset_store(old_dir, old_dentry, old_index);
+	(void)simple_offset_store(new_dir, new_dentry, new_index);
 	return ret;
 }
 
 /**
  * simple_offset_destroy - Release offset map
- * @octx: directory offset ctx that is about to be destroyed
+ * @dir: directory that is about to be destroyed
  *
  * During fs teardown (eg. umount), a directory's offset map might still
  * contain entries. xa_destroy() cleans out anything that remains.
  */
-void simple_offset_destroy(struct offset_ctx *octx)
+void simple_offset_destroy(struct inode *dir)
 {
+	struct offset_ctx *octx = dir->i_op->get_offset_ctx(dir);
+
 	mtree_destroy(&octx->mt);
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b09f14132110..26c98dfa3397 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3336,9 +3336,9 @@ struct offset_ctx {
 	unsigned long		next_offset;
 };
 
-void simple_offset_init(struct offset_ctx *octx);
-int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry);
-void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry);
+void simple_offset_init(struct inode *dir);
+int simple_offset_add(struct inode *dir, struct dentry *dentry);
+void simple_offset_remove(struct inode *dir, struct dentry *dentry);
 int simple_offset_empty(struct dentry *dentry);
 int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry,
 			 struct inode *new_dir, struct dentry *new_dentry);
@@ -3346,7 +3346,7 @@ int simple_offset_rename_exchange(struct inode *old_dir,
 				  struct dentry *old_dentry,
 				  struct inode *new_dir,
 				  struct dentry *new_dentry);
-void simple_offset_destroy(struct offset_ctx *octx);
+void simple_offset_destroy(struct inode *dir);
 
 extern const struct file_operations simple_offset_dir_operations;
 
diff --git a/mm/shmem.c b/mm/shmem.c
index c0fb65223963..ac4f59f536cd 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2555,7 +2555,7 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
 		inode->i_size = 2 * BOGO_DIRENT_SIZE;
 		inode->i_op = &shmem_dir_inode_operations;
 		inode->i_fop = &simple_offset_dir_operations;
-		simple_offset_init(shmem_get_offset_ctx(inode));
+		simple_offset_init(inode);
 		break;
 	case S_IFLNK:
 		/*
@@ -3284,7 +3284,7 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	if (error && error != -EOPNOTSUPP)
 		goto out_iput;
 
-	error = simple_offset_add(shmem_get_offset_ctx(dir), dentry);
+	error = simple_offset_add(dir, dentry);
 	if (error)
 		goto out_iput;
 
@@ -3368,7 +3368,7 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir,
 			goto out;
 	}
 
-	ret = simple_offset_add(shmem_get_offset_ctx(dir), dentry);
+	ret = simple_offset_add(dir, dentry);
 	if (ret) {
 		if (inode->i_nlink)
 			shmem_free_inode(inode->i_sb, 0);
@@ -3394,7 +3394,7 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
 	if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
 		shmem_free_inode(inode->i_sb, 0);
 
-	simple_offset_remove(shmem_get_offset_ctx(dir), dentry);
+	simple_offset_remove(dir, dentry);
 
 	dir->i_size -= BOGO_DIRENT_SIZE;
 	inode_set_mtime_to_ts(dir,
@@ -3518,7 +3518,7 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
 	if (error && error != -EOPNOTSUPP)
 		goto out_iput;
 
-	error = simple_offset_add(shmem_get_offset_ctx(dir), dentry);
+	error = simple_offset_add(dir, dentry);
 	if (error)
 		goto out_iput;
 
@@ -3551,7 +3551,7 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
 	return 0;
 
 out_remove_offset:
-	simple_offset_remove(shmem_get_offset_ctx(dir), dentry);
+	simple_offset_remove(dir, dentry);
 out_iput:
 	iput(inode);
 	return error;
@@ -4490,7 +4490,7 @@ static void shmem_destroy_inode(struct inode *inode)
 	if (S_ISREG(inode->i_mode))
 		mpol_free_shared_policy(&SHMEM_I(inode)->policy);
 	if (S_ISDIR(inode->i_mode))
-		simple_offset_destroy(shmem_get_offset_ctx(inode));
+		simple_offset_destroy(inode);
 }
 
 static void shmem_init_inode(void *foo)
-- 
2.44.0



* Re: [PATCH v1 0/2] Fix shmem_rename2 directory offset calculation
  2024-04-11 18:26 [PATCH v1 0/2] Fix shmem_rename2 directory offset calculation cel
  2024-04-11 18:26 ` [PATCH v1 1/2] shmem: Fix shmem_rename2() cel
  2024-04-11 18:26 ` [PATCH v1 2/2] libfs: Clean up the simple_offset API cel
@ 2024-04-12 14:29 ` Chuck Lever
  2024-04-15 14:48   ` Christian Brauner
  2 siblings, 1 reply; 5+ messages in thread
From: Chuck Lever @ 2024-04-12 14:29 UTC
  To: cel; +Cc: Christian Brauner, linux-fsdevel

On Thu, Apr 11, 2024 at 02:26:09PM -0400, cel@kernel.org wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> The existing code in shmem_rename2() allocates a fresh directory
> offset value when renaming over an existing destination entry. User
> space does not expect this behavior. In particular, applications
> that rename while walking a directory can loop indefinitely because
> they never reach the end of the directory.
> 
> The first patch in this series corrects that problem, which exists
> in v6.6 - current. The second patch is a clean-up and can be deferred
> until v6.10.
> 
> Chuck Lever (2):
>   shmem: Fix shmem_rename2()
>   libfs: Clean up the simple_offset API
> 
>  fs/libfs.c         | 89 ++++++++++++++++++++++++++++++++++------------
>  include/linux/fs.h | 10 +++---
>  mm/shmem.c         | 17 +++++----
>  3 files changed, 81 insertions(+), 35 deletions(-)

A cursory pass with fstests seemed to work fine, but a number of
tests in the git regression suite are failing. Please feel free
to send review comments, but do not merge this series yet.


-- 
Chuck Lever


* Re: [PATCH v1 0/2] Fix shmem_rename2 directory offset calculation
  2024-04-12 14:29 ` [PATCH v1 0/2] Fix shmem_rename2 directory offset calculation Chuck Lever
@ 2024-04-15 14:48   ` Christian Brauner
  0 siblings, 0 replies; 5+ messages in thread
From: Christian Brauner @ 2024-04-15 14:48 UTC
  To: Chuck Lever; +Cc: cel, linux-fsdevel

On Fri, Apr 12, 2024 at 10:29:23AM -0400, Chuck Lever wrote:
> On Thu, Apr 11, 2024 at 02:26:09PM -0400, cel@kernel.org wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> > 
> > The existing code in shmem_rename2() allocates a fresh directory
> > offset value when renaming over an existing destination entry. User
> > space does not expect this behavior. In particular, applications
> > that rename while walking a directory can loop indefinitely because
> > they never reach the end of the directory.
> > 
> > The first patch in this series corrects that problem, which exists
> > in v6.6 - current. The second patch is a clean-up and can be deferred
> > until v6.10.
> > 
> > Chuck Lever (2):
> >   shmem: Fix shmem_rename2()
> >   libfs: Clean up the simple_offset API
> > 
> >  fs/libfs.c         | 89 ++++++++++++++++++++++++++++++++++------------
> >  include/linux/fs.h | 10 +++---
> >  mm/shmem.c         | 17 +++++----
> >  3 files changed, 81 insertions(+), 35 deletions(-)
> 
> A cursory pass with fstests seemed to work fine, but a number of
> tests in the git regression suite are failing. Please feel free
> to send review comments, but do not merge this series yet.

Ok!


end of thread, other threads:[~2024-04-15 14:48 UTC | newest]

Thread overview: 5+ messages
2024-04-11 18:26 [PATCH v1 0/2] Fix shmem_rename2 directory offset calculation cel
2024-04-11 18:26 ` [PATCH v1 1/2] shmem: Fix shmem_rename2() cel
2024-04-11 18:26 ` [PATCH v1 2/2] libfs: Clean up the simple_offset API cel
2024-04-12 14:29 ` [PATCH v1 0/2] Fix shmem_rename2 directory offset calculation Chuck Lever
2024-04-15 14:48   ` Christian Brauner
