* [PATCH 0/4] mnt_idmapping: decouple from namespaces
@ 2023-11-22 12:44 Christian Brauner
  2023-11-22 12:44 ` [PATCH 1/4] mnt_idmapping: remove check_fsmapping() Christian Brauner
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Christian Brauner @ 2023-11-22 12:44 UTC (permalink / raw)
  To: linux-fsdevel, Seth Forshee, Christian Brauner

Hey,

This is a tiny series to fully decouple idmapped mounts from namespaces.
We already have a dedicated type and nothing matters from a namespace
apart from its permissions. So just get rid of it. This also means we
could extend this to allow changing the idmapping completely
independently of namespaces in the future. There's no need to tie them
that closely together.

Survives xfstests for btrfs, ext4, xfs and specifically the idmapped
mount tests.

Thanks!
Christian

Signed-off-by: Christian Brauner <brauner@kernel.org>

---
base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
change-id: 20231101-vfs-mnt_idmap-c3b7502f409d
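
A minimal userspace sketch of what an idmapping looks like once it no
longer hangs off a namespace: a list of extents plus two lookups. The
names (struct extent, struct idmap_model, model_map_down(),
model_map_up()) are hypothetical stand-ins, not the kernel's types. The
linear scan and the (u32)-1 "unmapped" sentinel are assumptions chosen
to mirror the (uid_t)-1 checks visible in the patches.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one contiguous range of an idmapping. */
struct extent {
	uint32_t first;       /* first ID in the upper ID range */
	uint32_t lower_first; /* first ID in the lower ID range */
	uint32_t count;       /* number of consecutive IDs covered */
};

/* Hypothetical idmapping that carries no namespace pointer at all. */
struct idmap_model {
	const struct extent *extents;
	size_t nr_extents;
};

#define INVALID_ID ((uint32_t)-1)

/* Translate @id from the upper range into the lower range. */
uint32_t model_map_down(const struct idmap_model *map, uint32_t id)
{
	for (size_t i = 0; i < map->nr_extents; i++) {
		const struct extent *e = &map->extents[i];

		/* The subtraction cannot wrap because id >= e->first. */
		if (id >= e->first && id - e->first < e->count)
			return e->lower_first + (id - e->first);
	}
	return INVALID_ID;
}

/* Translate @id from the lower range back into the upper range. */
uint32_t model_map_up(const struct idmap_model *map, uint32_t id)
{
	for (size_t i = 0; i < map->nr_extents; i++) {
		const struct extent *e = &map->extents[i];

		if (id >= e->lower_first && id - e->lower_first < e->count)
			return e->first + (id - e->lower_first);
	}
	return INVALID_ID;
}
```

Nothing in the lookups needs to know which namespace the extents came
from, which is the property the series relies on.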



* [PATCH 1/4] mnt_idmapping: remove check_fsmapping()
  2023-11-22 12:44 [PATCH 0/4] mnt_idmapping: decouple from namespaces Christian Brauner
@ 2023-11-22 12:44 ` Christian Brauner
  2023-11-22 12:44 ` [PATCH 2/4] mnt_idmapping: remove nop check Christian Brauner
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Christian Brauner @ 2023-11-22 12:44 UTC (permalink / raw)
  To: linux-fsdevel, Seth Forshee, Christian Brauner

The helper is a bit pointless. Just open-code the check.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/mnt_idmapping.c            | 17 ++---------------
 fs/namespace.c                |  2 +-
 include/linux/mnt_idmapping.h |  3 ---
 3 files changed, 3 insertions(+), 19 deletions(-)
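
A small userspace sketch of why the helper can be open-coded: the
removed function only compared two pointers, so negating its result is
the same as testing for equality. All types and names below
(struct user_ns_model, struct sb_model, model_check_fsmapping(),
model_can_idmap_mount()) are hypothetical stand-ins for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a user namespace and a super block. */
struct user_ns_model { int unused; };
struct sb_model { const struct user_ns_model *s_user_ns; };

/* Shape of the removed helper: true if the mount idmapping is allowed. */
bool model_check_fsmapping(const struct user_ns_model *mnt_userns,
			   const struct sb_model *sb)
{
	return mnt_userns != sb->s_user_ns;
}

/* Open-coded variant: reject a mount that reuses the fs idmapping. */
int model_can_idmap_mount(const struct user_ns_model *mnt_userns,
			  const struct sb_model *sb)
{
	if (mnt_userns == sb->s_user_ns)
		return -EINVAL;
	return 0;
}
```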

diff --git a/fs/mnt_idmapping.c b/fs/mnt_idmapping.c
index 57d1dedf3f8f..2674942311c3 100644
--- a/fs/mnt_idmapping.c
+++ b/fs/mnt_idmapping.c
@@ -25,19 +25,6 @@ struct mnt_idmap nop_mnt_idmap = {
 };
 EXPORT_SYMBOL_GPL(nop_mnt_idmap);
 
-/**
- * check_fsmapping - check whether an mount idmapping is allowed
- * @idmap: idmap of the relevent mount
- * @sb:    super block of the filesystem
- *
- * Return: true if @idmap is allowed, false if not.
- */
-bool check_fsmapping(const struct mnt_idmap *idmap,
-		     const struct super_block *sb)
-{
-	return idmap->owner != sb->s_user_ns;
-}
-
 /**
  * initial_idmapping - check whether this is the initial mapping
  * @ns: idmapping to check
@@ -94,8 +81,8 @@ static inline bool no_idmapping(const struct user_namespace *mnt_userns,
  */
 
 vfsuid_t make_vfsuid(struct mnt_idmap *idmap,
-				   struct user_namespace *fs_userns,
-				   kuid_t kuid)
+		     struct user_namespace *fs_userns,
+		     kuid_t kuid)
 {
 	uid_t uid;
 	struct user_namespace *mnt_userns = idmap->owner;
diff --git a/fs/namespace.c b/fs/namespace.c
index fbf0e596fcd3..736baf07115c 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4288,7 +4288,7 @@ static int can_idmap_mount(const struct mount_kattr *kattr, struct mount *mnt)
 	 * Creating an idmapped mount with the filesystem wide idmapping
 	 * doesn't make sense so block that. We don't allow mushy semantics.
 	 */
-	if (!check_fsmapping(kattr->mnt_idmap, m->mnt_sb))
+	if (kattr->mnt_userns == m->mnt_sb->s_user_ns)
 		return -EINVAL;
 
 	/*
diff --git a/include/linux/mnt_idmapping.h b/include/linux/mnt_idmapping.h
index b8da2db4ecd2..cd4d5c8781f5 100644
--- a/include/linux/mnt_idmapping.h
+++ b/include/linux/mnt_idmapping.h
@@ -244,7 +244,4 @@ static inline kgid_t mapped_fsgid(struct mnt_idmap *idmap,
 	return from_vfsgid(idmap, fs_userns, VFSGIDT_INIT(current_fsgid()));
 }
 
-bool check_fsmapping(const struct mnt_idmap *idmap,
-		     const struct super_block *sb);
-
 #endif /* _LINUX_MNT_IDMAPPING_H */

-- 
2.42.0



* [PATCH 2/4] mnt_idmapping: remove nop check
  2023-11-22 12:44 [PATCH 0/4] mnt_idmapping: decouple from namespaces Christian Brauner
  2023-11-22 12:44 ` [PATCH 1/4] mnt_idmapping: remove check_fsmapping() Christian Brauner
@ 2023-11-22 12:44 ` Christian Brauner
  2023-11-22 12:44 ` [PATCH 3/4] mnt_idmapping: decouple from namespaces Christian Brauner
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Christian Brauner @ 2023-11-22 12:44 UTC (permalink / raw)
  To: linux-fsdevel, Seth Forshee, Christian Brauner

All mounts default to nop_mnt_idmap and we don't allow creating idmapped
mounts that reuse the idmapping of the filesystem. So unless someone
passes a non-superblock namespace to these helpers, the
mnt_userns == fs_userns check in no_idmapping() will always be false.
Remove no_idmapping() and replace it with a simple check for
nop_mnt_idmap.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/mnt_idmapping.c | 36 ++++++++----------------------------
 1 file changed, 8 insertions(+), 28 deletions(-)
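
A userspace sketch of the argument above, using hypothetical stand-in
types (struct ns_model, struct mnt_idmap_model). It assumes, as the
commit message states, that non-idmapped mounts always carry the nop
idmap and that an idmapped mount never reuses the filesystem's
namespace. It further assumes an idmapped mount is never owned by the
initial namespace. Under those assumptions the old two-part predicate
and the new pointer comparison agree.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a user namespace and a mount idmapping. */
struct ns_model { int unused; };
struct mnt_idmap_model { const struct ns_model *owner; };

/* Models the initial namespace and the default "no idmapping" object. */
const struct ns_model model_init_ns = { 0 };
const struct mnt_idmap_model model_nop_idmap = { &model_init_ns };

/* Old predicate: skip remapping for the initial or the fs namespace. */
bool model_old_skip(const struct mnt_idmap_model *idmap,
		    const struct ns_model *fs_ns)
{
	return idmap->owner == &model_init_ns || idmap->owner == fs_ns;
}

/* New predicate: skip remapping only for the nop idmap itself. */
bool model_new_skip(const struct mnt_idmap_model *idmap)
{
	return idmap == &model_nop_idmap;
}
```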

diff --git a/fs/mnt_idmapping.c b/fs/mnt_idmapping.c
index 2674942311c3..35d78cb3c38a 100644
--- a/fs/mnt_idmapping.c
+++ b/fs/mnt_idmapping.c
@@ -39,26 +39,6 @@ static inline bool initial_idmapping(const struct user_namespace *ns)
 	return ns == &init_user_ns;
 }
 
-/**
- * no_idmapping - check whether we can skip remapping a kuid/gid
- * @mnt_userns: the mount's idmapping
- * @fs_userns: the filesystem's idmapping
- *
- * This function can be used to check whether a remapping between two
- * idmappings is required.
- * An idmapped mount is a mount that has an idmapping attached to it that
- * is different from the filsystem's idmapping and the initial idmapping.
- * If the initial mapping is used or the idmapping of the mount and the
- * filesystem are identical no remapping is required.
- *
- * Return: true if remapping can be skipped, false if not.
- */
-static inline bool no_idmapping(const struct user_namespace *mnt_userns,
-				const struct user_namespace *fs_userns)
-{
-	return initial_idmapping(mnt_userns) || mnt_userns == fs_userns;
-}
-
 /**
  * make_vfsuid - map a filesystem kuid according to an idmapping
  * @idmap: the mount's idmapping
@@ -68,8 +48,8 @@ static inline bool no_idmapping(const struct user_namespace *mnt_userns,
  * Take a @kuid and remap it from @fs_userns into @idmap. Use this
  * function when preparing a @kuid to be reported to userspace.
  *
- * If no_idmapping() determines that this is not an idmapped mount we can
- * simply return @kuid unchanged.
+ * If initial_idmapping() determines that this is not an idmapped mount
+ * we can simply return @kuid unchanged.
  * If initial_idmapping() tells us that the filesystem is not mounted with an
  * idmapping we know the value of @kuid won't change when calling
  * from_kuid() so we can simply retrieve the value via __kuid_val()
@@ -87,7 +67,7 @@ vfsuid_t make_vfsuid(struct mnt_idmap *idmap,
 	uid_t uid;
 	struct user_namespace *mnt_userns = idmap->owner;
 
-	if (no_idmapping(mnt_userns, fs_userns))
+	if (idmap == &nop_mnt_idmap)
 		return VFSUIDT_INIT(kuid);
 	if (initial_idmapping(fs_userns))
 		uid = __kuid_val(kuid);
@@ -108,8 +88,8 @@ EXPORT_SYMBOL_GPL(make_vfsuid);
  * Take a @kgid and remap it from @fs_userns into @idmap. Use this
  * function when preparing a @kgid to be reported to userspace.
  *
- * If no_idmapping() determines that this is not an idmapped mount we can
- * simply return @kgid unchanged.
+ * If initial_idmapping() determines that this is not an idmapped mount
+ * we can simply return @kgid unchanged.
  * If initial_idmapping() tells us that the filesystem is not mounted with an
  * idmapping we know the value of @kgid won't change when calling
  * from_kgid() so we can simply retrieve the value via __kgid_val()
@@ -125,7 +105,7 @@ vfsgid_t make_vfsgid(struct mnt_idmap *idmap,
 	gid_t gid;
 	struct user_namespace *mnt_userns = idmap->owner;
 
-	if (no_idmapping(mnt_userns, fs_userns))
+	if (idmap == &nop_mnt_idmap)
 		return VFSGIDT_INIT(kgid);
 	if (initial_idmapping(fs_userns))
 		gid = __kgid_val(kgid);
@@ -154,7 +134,7 @@ kuid_t from_vfsuid(struct mnt_idmap *idmap,
 	uid_t uid;
 	struct user_namespace *mnt_userns = idmap->owner;
 
-	if (no_idmapping(mnt_userns, fs_userns))
+	if (idmap == &nop_mnt_idmap)
 		return AS_KUIDT(vfsuid);
 	uid = from_kuid(mnt_userns, AS_KUIDT(vfsuid));
 	if (uid == (uid_t)-1)
@@ -182,7 +162,7 @@ kgid_t from_vfsgid(struct mnt_idmap *idmap,
 	gid_t gid;
 	struct user_namespace *mnt_userns = idmap->owner;
 
-	if (no_idmapping(mnt_userns, fs_userns))
+	if (idmap == &nop_mnt_idmap)
 		return AS_KGIDT(vfsgid);
 	gid = from_kgid(mnt_userns, AS_KGIDT(vfsgid));
 	if (gid == (gid_t)-1)

-- 
2.42.0



* [PATCH 3/4] mnt_idmapping: decouple from namespaces
  2023-11-22 12:44 [PATCH 0/4] mnt_idmapping: decouple from namespaces Christian Brauner
  2023-11-22 12:44 ` [PATCH 1/4] mnt_idmapping: remove check_fsmapping() Christian Brauner
  2023-11-22 12:44 ` [PATCH 2/4] mnt_idmapping: remove nop check Christian Brauner
@ 2023-11-22 12:44 ` Christian Brauner
  2023-11-22 14:26   ` Josef Bacik
  2023-11-22 12:44 ` [PATCH 4/4] fs: reformat idmapped mounts entry Christian Brauner
  2023-11-24  7:52 ` [PATCH 0/4] mnt_idmapping: decouple from namespaces Christian Brauner
  4 siblings, 1 reply; 9+ messages in thread
From: Christian Brauner @ 2023-11-22 12:44 UTC (permalink / raw)
  To: linux-fsdevel, Seth Forshee, Christian Brauner

There's no reason we need to couple mnt idmapping to namespaces in the
way we currently do. Copy the idmapping when an idmapped mount is
created and don't take any reference on the namespace at all.

We also can't easily refcount struct uid_gid_map because it needs to
stay the size of a cacheline, otherwise we risk performance regressions
(ignoring for a second that struct uid_gid_map currently isn't actually
64 bytes but 72; that's a fix for another patch series).

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/mnt_idmapping.c      | 106 +++++++++++++++++++++++++++++++++++++++++-------
 include/linux/uidgid.h  |  13 ++++++
 kernel/user_namespace.c |   4 +-
 3 files changed, 106 insertions(+), 17 deletions(-)
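
A userspace sketch of the copy-on-create idea, with hypothetical names
(struct map_model, model_copy_map(), model_free_map()). It assumes small
maps live inline and larger ones use two heap arrays, with
MODEL_MAX_BASE_EXTENTS standing in for UID_GID_MAP_MAX_BASE_EXTENTS.
malloc()/memcpy() replace kmemdup(), and the memory barriers of the real
code are deliberately left out.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_MAX_BASE_EXTENTS 5 /* assumed inline capacity */

struct extent_model { uint32_t first, lower_first, count; };

/* Hypothetical map: inline extents when small, heap arrays when large. */
struct map_model {
	uint32_t nr_extents;
	union {
		struct extent_model extent[MODEL_MAX_BASE_EXTENTS];
		struct {
			struct extent_model *forward;
			struct extent_model *reverse;
		};
	};
};

/* Copy @from into @to so that @to owns all of its memory. */
int model_copy_map(const struct map_model *from, struct map_model *to)
{
	struct extent_model *forward, *reverse;
	uint32_t nr = from->nr_extents;
	size_t size;

	if (nr == 0)
		return 0; /* nothing written yet, leave @to empty */

	if (nr <= MODEL_MAX_BASE_EXTENTS) {
		*to = *from; /* inline extents: a plain struct copy suffices */
		return 0;
	}

	size = (size_t)nr * sizeof(*forward);
	forward = malloc(size);
	if (!forward)
		return -ENOMEM;
	reverse = malloc(size);
	if (!reverse) {
		free(forward);
		return -ENOMEM;
	}
	memcpy(forward, from->forward, size);
	memcpy(reverse, from->reverse, size);

	to->forward = forward;
	to->reverse = reverse;
	to->nr_extents = nr;
	return 0;
}

/* Release heap arrays, which exist only for large maps. */
void model_free_map(struct map_model *map)
{
	if (map->nr_extents > MODEL_MAX_BASE_EXTENTS) {
		free(map->forward);
		free(map->reverse);
	}
}
```

Because the copy owns its extents, the source can go away afterwards
without affecting the copy, which is why no namespace reference is
needed in this model.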

diff --git a/fs/mnt_idmapping.c b/fs/mnt_idmapping.c
index 35d78cb3c38a..64c5205e2b5e 100644
--- a/fs/mnt_idmapping.c
+++ b/fs/mnt_idmapping.c
@@ -9,8 +9,16 @@
 
 #include "internal.h"
 
+/*
+ * Outside of this file vfs{g,u}id_t are always created from k{g,u}id_t,
+ * never from raw values. These are just internal helpers.
+ */
+#define VFSUIDT_INIT_RAW(val) (vfsuid_t){ val }
+#define VFSGIDT_INIT_RAW(val) (vfsgid_t){ val }
+
 struct mnt_idmap {
-	struct user_namespace *owner;
+	struct uid_gid_map uid_map;
+	struct uid_gid_map gid_map;
 	refcount_t count;
 };
 
@@ -20,7 +28,6 @@ struct mnt_idmap {
  * mapped to {g,u}id 1, [...], {g,u}id 1000 to {g,u}id 1000, [...].
  */
 struct mnt_idmap nop_mnt_idmap = {
-	.owner	= &init_user_ns,
 	.count	= REFCOUNT_INIT(1),
 };
 EXPORT_SYMBOL_GPL(nop_mnt_idmap);
@@ -65,7 +72,6 @@ vfsuid_t make_vfsuid(struct mnt_idmap *idmap,
 		     kuid_t kuid)
 {
 	uid_t uid;
-	struct user_namespace *mnt_userns = idmap->owner;
 
 	if (idmap == &nop_mnt_idmap)
 		return VFSUIDT_INIT(kuid);
@@ -75,7 +81,7 @@ vfsuid_t make_vfsuid(struct mnt_idmap *idmap,
 		uid = from_kuid(fs_userns, kuid);
 	if (uid == (uid_t)-1)
 		return INVALID_VFSUID;
-	return VFSUIDT_INIT(make_kuid(mnt_userns, uid));
+	return VFSUIDT_INIT_RAW(map_id_down(&idmap->uid_map, uid));
 }
 EXPORT_SYMBOL_GPL(make_vfsuid);
 
@@ -103,7 +109,6 @@ vfsgid_t make_vfsgid(struct mnt_idmap *idmap,
 		     struct user_namespace *fs_userns, kgid_t kgid)
 {
 	gid_t gid;
-	struct user_namespace *mnt_userns = idmap->owner;
 
 	if (idmap == &nop_mnt_idmap)
 		return VFSGIDT_INIT(kgid);
@@ -113,7 +118,7 @@ vfsgid_t make_vfsgid(struct mnt_idmap *idmap,
 		gid = from_kgid(fs_userns, kgid);
 	if (gid == (gid_t)-1)
 		return INVALID_VFSGID;
-	return VFSGIDT_INIT(make_kgid(mnt_userns, gid));
+	return VFSGIDT_INIT_RAW(map_id_down(&idmap->gid_map, gid));
 }
 EXPORT_SYMBOL_GPL(make_vfsgid);
 
@@ -132,11 +137,10 @@ kuid_t from_vfsuid(struct mnt_idmap *idmap,
 		   struct user_namespace *fs_userns, vfsuid_t vfsuid)
 {
 	uid_t uid;
-	struct user_namespace *mnt_userns = idmap->owner;
 
 	if (idmap == &nop_mnt_idmap)
 		return AS_KUIDT(vfsuid);
-	uid = from_kuid(mnt_userns, AS_KUIDT(vfsuid));
+	uid = map_id_up(&idmap->uid_map, __vfsuid_val(vfsuid));
 	if (uid == (uid_t)-1)
 		return INVALID_UID;
 	if (initial_idmapping(fs_userns))
@@ -160,11 +164,10 @@ kgid_t from_vfsgid(struct mnt_idmap *idmap,
 		   struct user_namespace *fs_userns, vfsgid_t vfsgid)
 {
 	gid_t gid;
-	struct user_namespace *mnt_userns = idmap->owner;
 
 	if (idmap == &nop_mnt_idmap)
 		return AS_KGIDT(vfsgid);
-	gid = from_kgid(mnt_userns, AS_KGIDT(vfsgid));
+	gid = map_id_up(&idmap->gid_map, __vfsgid_val(vfsgid));
 	if (gid == (gid_t)-1)
 		return INVALID_GID;
 	if (initial_idmapping(fs_userns))
@@ -195,16 +198,91 @@ int vfsgid_in_group_p(vfsgid_t vfsgid)
 #endif
 EXPORT_SYMBOL_GPL(vfsgid_in_group_p);
 
+static int copy_mnt_idmap(struct uid_gid_map *map_from,
+			  struct uid_gid_map *map_to)
+{
+	struct uid_gid_extent *forward, *reverse;
+	u32 nr_extents = READ_ONCE(map_from->nr_extents);
+	/* Pairs with smp_wmb() when writing the idmapping. */
+	smp_rmb();
+
+	/*
+	 * Don't blindly copy @map_to into @map_from if nr_extents is
+	 * smaller or equal to UID_GID_MAP_MAX_BASE_EXTENTS. Since we
+	 * read @nr_extents someone could have written an idmapping and
+	 * then we might end up with inconsistent data. So just don't do
+	 * anything at all.
+	 */
+	if (nr_extents == 0)
+		return 0;
+
+	/*
+	 * Here we know that nr_extents is greater than zero which means
+	 * a map has been written. Since idmappings can't be changed
+	 * once they have been written we know that we can safely copy
+	 * from @map_to into @map_from.
+	 */
+
+	if (nr_extents <= UID_GID_MAP_MAX_BASE_EXTENTS) {
+		*map_to = *map_from;
+		return 0;
+	}
+
+	forward = kmemdup(map_from->forward,
+			  nr_extents * sizeof(struct uid_gid_extent),
+			  GFP_KERNEL_ACCOUNT);
+	if (!forward)
+		return -ENOMEM;
+
+	reverse = kmemdup(map_from->reverse,
+			  nr_extents * sizeof(struct uid_gid_extent),
+			  GFP_KERNEL_ACCOUNT);
+	if (!reverse) {
+		kfree(forward);
+		return -ENOMEM;
+	}
+
+	/*
+	 * The idmapping isn't exposed anywhere so we don't need to care
+	 * about ordering between extent pointers and @nr_extents
+	 * initialization.
+	 */
+	map_to->forward = forward;
+	map_to->reverse = reverse;
+	map_to->nr_extents = nr_extents;
+	return 0;
+}
+
+static void free_mnt_idmap(struct mnt_idmap *idmap)
+{
+	if (idmap->uid_map.nr_extents > UID_GID_MAP_MAX_BASE_EXTENTS) {
+		kfree(idmap->uid_map.forward);
+		kfree(idmap->uid_map.reverse);
+	}
+	if (idmap->gid_map.nr_extents > UID_GID_MAP_MAX_BASE_EXTENTS) {
+		kfree(idmap->gid_map.forward);
+		kfree(idmap->gid_map.reverse);
+	}
+	kfree(idmap);
+}
+
 struct mnt_idmap *alloc_mnt_idmap(struct user_namespace *mnt_userns)
 {
 	struct mnt_idmap *idmap;
+	int ret;
 
 	idmap = kzalloc(sizeof(struct mnt_idmap), GFP_KERNEL_ACCOUNT);
 	if (!idmap)
 		return ERR_PTR(-ENOMEM);
 
-	idmap->owner = get_user_ns(mnt_userns);
 	refcount_set(&idmap->count, 1);
+	ret = copy_mnt_idmap(&mnt_userns->uid_map, &idmap->uid_map);
+	if (!ret)
+		ret = copy_mnt_idmap(&mnt_userns->gid_map, &idmap->gid_map);
+	if (ret) {
+		free_mnt_idmap(idmap);
+		idmap = ERR_PTR(ret);
+	}
 	return idmap;
 }
 
@@ -234,9 +312,7 @@ EXPORT_SYMBOL_GPL(mnt_idmap_get);
  */
 void mnt_idmap_put(struct mnt_idmap *idmap)
 {
-	if (idmap != &nop_mnt_idmap && refcount_dec_and_test(&idmap->count)) {
-		put_user_ns(idmap->owner);
-		kfree(idmap);
-	}
+	if (idmap != &nop_mnt_idmap && refcount_dec_and_test(&idmap->count))
+		free_mnt_idmap(idmap);
 }
 EXPORT_SYMBOL_GPL(mnt_idmap_put);
diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h
index b0542cd11aeb..7806e93b907d 100644
--- a/include/linux/uidgid.h
+++ b/include/linux/uidgid.h
@@ -17,6 +17,7 @@
 
 struct user_namespace;
 extern struct user_namespace init_user_ns;
+struct uid_gid_map;
 
 typedef struct {
 	uid_t val;
@@ -138,6 +139,9 @@ static inline bool kgid_has_mapping(struct user_namespace *ns, kgid_t gid)
 	return from_kgid(ns, gid) != (gid_t) -1;
 }
 
+u32 map_id_down(struct uid_gid_map *map, u32 id);
+u32 map_id_up(struct uid_gid_map *map, u32 id);
+
 #else
 
 static inline kuid_t make_kuid(struct user_namespace *from, uid_t uid)
@@ -186,6 +190,15 @@ static inline bool kgid_has_mapping(struct user_namespace *ns, kgid_t gid)
 	return gid_valid(gid);
 }
 
+static inline u32 map_id_down(struct uid_gid_map *map, u32 id)
+{
+	return id;
+}
+
+static inline u32 map_id_up(struct uid_gid_map *map, u32 id);
+{
+	return id;
+}
 #endif /* CONFIG_USER_NS */
 
 #endif /* _LINUX_UIDGID_H */
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index eabe8bcc7042..a649e58e3b6a 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -332,7 +332,7 @@ static u32 map_id_range_down(struct uid_gid_map *map, u32 id, u32 count)
 	return id;
 }
 
-static u32 map_id_down(struct uid_gid_map *map, u32 id)
+u32 map_id_down(struct uid_gid_map *map, u32 id)
 {
 	return map_id_range_down(map, id, 1);
 }
@@ -375,7 +375,7 @@ map_id_up_max(unsigned extents, struct uid_gid_map *map, u32 id)
 		       sizeof(struct uid_gid_extent), cmp_map_id);
 }
 
-static u32 map_id_up(struct uid_gid_map *map, u32 id)
+u32 map_id_up(struct uid_gid_map *map, u32 id)
 {
 	struct uid_gid_extent *extent;
 	unsigned extents = map->nr_extents;

-- 
2.42.0



* [PATCH 4/4] fs: reformat idmapped mounts entry
  2023-11-22 12:44 [PATCH 0/4] mnt_idmapping: decouple from namespaces Christian Brauner
                   ` (2 preceding siblings ...)
  2023-11-22 12:44 ` [PATCH 3/4] mnt_idmapping: decouple from namespaces Christian Brauner
@ 2023-11-22 12:44 ` Christian Brauner
  2023-11-24  7:52 ` [PATCH 0/4] mnt_idmapping: decouple from namespaces Christian Brauner
  4 siblings, 0 replies; 9+ messages in thread
From: Christian Brauner @ 2023-11-22 12:44 UTC (permalink / raw)
  To: linux-fsdevel, Seth Forshee, Christian Brauner

Reformat the idmapped mounts entry to clearly mark where it belongs.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 MAINTAINERS            | 20 ++++++++++----------
 include/linux/uidgid.h |  2 +-
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 97f51d5ec1cf..d0a7b6f357ce 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8177,6 +8177,16 @@ F:	fs/exportfs/
 F:	fs/fhandle.c
 F:	include/linux/exportfs.h
 
+FILESYSTEMS [IDMAPPED MOUNTS]
+M:	Christian Brauner <brauner@kernel.org>
+M:	Seth Forshee <sforshee@kernel.org>
+L:	linux-fsdevel@vger.kernel.org
+S:	Maintained
+F:	Documentation/filesystems/idmappings.rst
+F:	fs/mnt_idmapping.c
+F:	include/linux/mnt_idmapping.*
+F:	tools/testing/selftests/mount_setattr/
+
 FILESYSTEMS [IOMAP]
 M:	Christian Brauner <brauner@kernel.org>
 R:	Darrick J. Wong <djwong@kernel.org>
@@ -10252,16 +10262,6 @@ S:	Maintained
 W:	https://github.com/o2genum/ideapad-slidebar
 F:	drivers/input/misc/ideapad_slidebar.c
 
-IDMAPPED MOUNTS
-M:	Christian Brauner <brauner@kernel.org>
-M:	Seth Forshee <sforshee@kernel.org>
-L:	linux-fsdevel@vger.kernel.org
-S:	Maintained
-T:	git git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping.git
-F:	Documentation/filesystems/idmappings.rst
-F:	include/linux/mnt_idmapping.*
-F:	tools/testing/selftests/mount_setattr/
-
 IDT VersaClock 5 CLOCK DRIVER
 M:	Luca Ceresoli <luca@lucaceresoli.net>
 S:	Maintained
diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h
index 7806e93b907d..415a7ca2b882 100644
--- a/include/linux/uidgid.h
+++ b/include/linux/uidgid.h
@@ -195,7 +195,7 @@ static inline u32 map_id_down(struct uid_gid_map *map, u32 id)
 	return id;
 }
 
-static inline u32 map_id_up(struct uid_gid_map *map, u32 id);
+static inline u32 map_id_up(struct uid_gid_map *map, u32 id)
 {
 	return id;
 }

-- 
2.42.0



* Re: [PATCH 3/4] mnt_idmapping: decouple from namespaces
  2023-11-22 12:44 ` [PATCH 3/4] mnt_idmapping: decouple from namespaces Christian Brauner
@ 2023-11-22 14:26   ` Josef Bacik
  2023-11-22 14:34     ` Christian Brauner
  0 siblings, 1 reply; 9+ messages in thread
From: Josef Bacik @ 2023-11-22 14:26 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, Seth Forshee

On Wed, Nov 22, 2023 at 01:44:39PM +0100, Christian Brauner wrote:
> There's no reason we need to couple mnt idmapping to namespaces in the
> way we currently do. Copy the idmapping when an idmapped mount is
> created and don't take any reference on the namespace at all.
> 
> We also can't easily refcount struct uid_gid_map because it needs to
> stay the size of a cacheline, otherwise we risk performance regressions
> (ignoring for a second that struct uid_gid_map currently isn't actually
> 64 bytes but 72; that's a fix for another patch series).
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
>  fs/mnt_idmapping.c      | 106 +++++++++++++++++++++++++++++++++++++++++-------
>  include/linux/uidgid.h  |  13 ++++++
>  kernel/user_namespace.c |   4 +-
>  3 files changed, 106 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/mnt_idmapping.c b/fs/mnt_idmapping.c
> index 35d78cb3c38a..64c5205e2b5e 100644
> --- a/fs/mnt_idmapping.c
> +++ b/fs/mnt_idmapping.c
> @@ -9,8 +9,16 @@
>  
>  #include "internal.h"
>  
> +/*
> + * Outside of this file vfs{g,u}id_t are always created from k{g,u}id_t,
> + * never from raw values. These are just internal helpers.
> + */
> +#define VFSUIDT_INIT_RAW(val) (vfsuid_t){ val }
> +#define VFSGIDT_INIT_RAW(val) (vfsgid_t){ val }
> +
>  struct mnt_idmap {
> -	struct user_namespace *owner;
> +	struct uid_gid_map uid_map;
> +	struct uid_gid_map gid_map;
>  	refcount_t count;
>  };
>  
> @@ -20,7 +28,6 @@ struct mnt_idmap {
>   * mapped to {g,u}id 1, [...], {g,u}id 1000 to {g,u}id 1000, [...].
>   */
>  struct mnt_idmap nop_mnt_idmap = {
> -	.owner	= &init_user_ns,
>  	.count	= REFCOUNT_INIT(1),
>  };
>  EXPORT_SYMBOL_GPL(nop_mnt_idmap);
> @@ -65,7 +72,6 @@ vfsuid_t make_vfsuid(struct mnt_idmap *idmap,
>  		     kuid_t kuid)
>  {
>  	uid_t uid;
> -	struct user_namespace *mnt_userns = idmap->owner;
>  
>  	if (idmap == &nop_mnt_idmap)
>  		return VFSUIDT_INIT(kuid);
> @@ -75,7 +81,7 @@ vfsuid_t make_vfsuid(struct mnt_idmap *idmap,
>  		uid = from_kuid(fs_userns, kuid);
>  	if (uid == (uid_t)-1)
>  		return INVALID_VFSUID;
> -	return VFSUIDT_INIT(make_kuid(mnt_userns, uid));
> +	return VFSUIDT_INIT_RAW(map_id_down(&idmap->uid_map, uid));
>  }
>  EXPORT_SYMBOL_GPL(make_vfsuid);
>  
> @@ -103,7 +109,6 @@ vfsgid_t make_vfsgid(struct mnt_idmap *idmap,
>  		     struct user_namespace *fs_userns, kgid_t kgid)
>  {
>  	gid_t gid;
> -	struct user_namespace *mnt_userns = idmap->owner;
>  
>  	if (idmap == &nop_mnt_idmap)
>  		return VFSGIDT_INIT(kgid);
> @@ -113,7 +118,7 @@ vfsgid_t make_vfsgid(struct mnt_idmap *idmap,
>  		gid = from_kgid(fs_userns, kgid);
>  	if (gid == (gid_t)-1)
>  		return INVALID_VFSGID;
> -	return VFSGIDT_INIT(make_kgid(mnt_userns, gid));
> +	return VFSGIDT_INIT_RAW(map_id_down(&idmap->gid_map, gid));
>  }
>  EXPORT_SYMBOL_GPL(make_vfsgid);
>  
> @@ -132,11 +137,10 @@ kuid_t from_vfsuid(struct mnt_idmap *idmap,
>  		   struct user_namespace *fs_userns, vfsuid_t vfsuid)
>  {
>  	uid_t uid;
> -	struct user_namespace *mnt_userns = idmap->owner;
>  
>  	if (idmap == &nop_mnt_idmap)
>  		return AS_KUIDT(vfsuid);
> -	uid = from_kuid(mnt_userns, AS_KUIDT(vfsuid));
> +	uid = map_id_up(&idmap->uid_map, __vfsuid_val(vfsuid));
>  	if (uid == (uid_t)-1)
>  		return INVALID_UID;
>  	if (initial_idmapping(fs_userns))
> @@ -160,11 +164,10 @@ kgid_t from_vfsgid(struct mnt_idmap *idmap,
>  		   struct user_namespace *fs_userns, vfsgid_t vfsgid)
>  {
>  	gid_t gid;
> -	struct user_namespace *mnt_userns = idmap->owner;
>  
>  	if (idmap == &nop_mnt_idmap)
>  		return AS_KGIDT(vfsgid);
> -	gid = from_kgid(mnt_userns, AS_KGIDT(vfsgid));
> +	gid = map_id_up(&idmap->gid_map, __vfsgid_val(vfsgid));
>  	if (gid == (gid_t)-1)
>  		return INVALID_GID;
>  	if (initial_idmapping(fs_userns))
> @@ -195,16 +198,91 @@ int vfsgid_in_group_p(vfsgid_t vfsgid)
>  #endif
>  EXPORT_SYMBOL_GPL(vfsgid_in_group_p);
>  
> +static int copy_mnt_idmap(struct uid_gid_map *map_from,
> +			  struct uid_gid_map *map_to)
> +{
> +	struct uid_gid_extent *forward, *reverse;
> +	u32 nr_extents = READ_ONCE(map_from->nr_extents);
> +	/* Pairs with smp_wmb() when writing the idmapping. */
> +	smp_rmb();
> +
> +	/*
> +	 * Don't blindly copy @map_to into @map_from if nr_extents is
> +	 * smaller or equal to UID_GID_MAP_MAX_BASE_EXTENTS. Since we
> +	 * read @nr_extents someone could have written an idmapping and
> +	 * then we might end up with inconsistent data. So just don't do
> +	 * anything at all.
> +	 */
> +	if (nr_extents == 0)
> +		return 0;
> +
> +	/*
> +	 * Here we know that nr_extents is greater than zero which means
> +	 * a map has been written. Since idmappings can't be changed
> +	 * once they have been written we know that we can safely copy
> +	 * from @map_to into @map_from.
> +	 */
> +
> +	if (nr_extents <= UID_GID_MAP_MAX_BASE_EXTENTS) {
> +		*map_to = *map_from;
> +		return 0;
> +	}
> +
> +	forward = kmemdup(map_from->forward,
> +			  nr_extents * sizeof(struct uid_gid_extent),
> +			  GFP_KERNEL_ACCOUNT);
> +	if (!forward)
> +		return -ENOMEM;
> +
> +	reverse = kmemdup(map_from->reverse,
> +			  nr_extents * sizeof(struct uid_gid_extent),
> +			  GFP_KERNEL_ACCOUNT);
> +	if (!reverse) {
> +		kfree(forward);
> +		return -ENOMEM;
> +	}
> +
> +	/*
> +	 * The idmapping isn't exposed anywhere so we don't need to care
> +	 * about ordering between extent pointers and @nr_extents
> +	 * initialization.
> +	 */
> +	map_to->forward = forward;
> +	map_to->reverse = reverse;
> +	map_to->nr_extents = nr_extents;
> +	return 0;
> +}
> +
> +static void free_mnt_idmap(struct mnt_idmap *idmap)
> +{
> +	if (idmap->uid_map.nr_extents > UID_GID_MAP_MAX_BASE_EXTENTS) {
> +		kfree(idmap->uid_map.forward);
> +		kfree(idmap->uid_map.reverse);
> +	}
> +	if (idmap->gid_map.nr_extents > UID_GID_MAP_MAX_BASE_EXTENTS) {
> +		kfree(idmap->gid_map.forward);
> +		kfree(idmap->gid_map.reverse);
> +	}
> +	kfree(idmap);
> +}
> +
>  struct mnt_idmap *alloc_mnt_idmap(struct user_namespace *mnt_userns)
>  {
>  	struct mnt_idmap *idmap;
> +	int ret;
>  
>  	idmap = kzalloc(sizeof(struct mnt_idmap), GFP_KERNEL_ACCOUNT);
>  	if (!idmap)
>  		return ERR_PTR(-ENOMEM);
>  
> -	idmap->owner = get_user_ns(mnt_userns);
>  	refcount_set(&idmap->count, 1);
> +	ret = copy_mnt_idmap(&mnt_userns->uid_map, &idmap->uid_map);
> +	if (!ret)
> +		ret = copy_mnt_idmap(&mnt_userns->gid_map, &idmap->gid_map);
> +	if (ret) {
> +		free_mnt_idmap(idmap);
> +		idmap = ERR_PTR(ret);
> +	}
>  	return idmap;
>  }
>  
> @@ -234,9 +312,7 @@ EXPORT_SYMBOL_GPL(mnt_idmap_get);
>   */
>  void mnt_idmap_put(struct mnt_idmap *idmap)
>  {
> -	if (idmap != &nop_mnt_idmap && refcount_dec_and_test(&idmap->count)) {
> -		put_user_ns(idmap->owner);
> -		kfree(idmap);
> -	}
> +	if (idmap != &nop_mnt_idmap && refcount_dec_and_test(&idmap->count))
> +		free_mnt_idmap(idmap);
>  }
>  EXPORT_SYMBOL_GPL(mnt_idmap_put);
> diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h
> index b0542cd11aeb..7806e93b907d 100644
> --- a/include/linux/uidgid.h
> +++ b/include/linux/uidgid.h
> @@ -17,6 +17,7 @@
>  
>  struct user_namespace;
>  extern struct user_namespace init_user_ns;
> +struct uid_gid_map;
>  
>  typedef struct {
>  	uid_t val;
> @@ -138,6 +139,9 @@ static inline bool kgid_has_mapping(struct user_namespace *ns, kgid_t gid)
>  	return from_kgid(ns, gid) != (gid_t) -1;
>  }
>  
> +u32 map_id_down(struct uid_gid_map *map, u32 id);
> +u32 map_id_up(struct uid_gid_map *map, u32 id);
> +
>  #else
>  
>  static inline kuid_t make_kuid(struct user_namespace *from, uid_t uid)
> @@ -186,6 +190,15 @@ static inline bool kgid_has_mapping(struct user_namespace *ns, kgid_t gid)
>  	return gid_valid(gid);
>  }
>  
> +static inline u32 map_id_down(struct uid_gid_map *map, u32 id)
> +{
> +	return id;
> +}
> +
> +static inline u32 map_id_up(struct uid_gid_map *map, u32 id);

You accidentally put a ; here, and then fix it up in the next patch, it needs to
be fixed here.  Thanks,

Josef


* Re: [PATCH 3/4] mnt_idmapping: decouple from namespaces
  2023-11-22 14:26   ` Josef Bacik
@ 2023-11-22 14:34     ` Christian Brauner
  2023-11-22 15:14       ` Josef Bacik
  0 siblings, 1 reply; 9+ messages in thread
From: Christian Brauner @ 2023-11-22 14:34 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-fsdevel, Seth Forshee

> You accidentally put a ; here, and then fix it up in the next patch, it needs to
> be fixed here.  Thanks,

Bah, fixed this now. Thanks!


* Re: [PATCH 3/4] mnt_idmapping: decouple from namespaces
  2023-11-22 14:34     ` Christian Brauner
@ 2023-11-22 15:14       ` Josef Bacik
  0 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2023-11-22 15:14 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, Seth Forshee

On Wed, Nov 22, 2023 at 03:34:39PM +0100, Christian Brauner wrote:
> > You accidentally put a ; here, and then fix it up in the next patch, it needs to
> > be fixed here.  Thanks,
> 
> Bah, fixed this now. Thanks!

You can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef


* Re: [PATCH 0/4] mnt_idmapping: decouple from namespaces
  2023-11-22 12:44 [PATCH 0/4] mnt_idmapping: decouple from namespaces Christian Brauner
                   ` (3 preceding siblings ...)
  2023-11-22 12:44 ` [PATCH 4/4] fs: reformat idmapped mounts entry Christian Brauner
@ 2023-11-24  7:52 ` Christian Brauner
  4 siblings, 0 replies; 9+ messages in thread
From: Christian Brauner @ 2023-11-24  7:52 UTC (permalink / raw)
  To: linux-fsdevel, Seth Forshee, Christian Brauner

On Wed, 22 Nov 2023 13:44:36 +0100, Christian Brauner wrote:
> Hey,
> 
> This is a tiny series to fully decouple idmapped mounts from namespaces.
> We already have a dedicated type and nothing matters from a namespace
> apart from its permissions. So just get rid of it. This also means we
> could extend this to allow changing the idmapping completely
> independently of namespaces in the future. There's no need to tie them
> that closely together.
> 
> [...]

Applied to the vfs.misc branch of the vfs/vfs.git tree.
Patches in the vfs.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series, allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.misc

[1/4] mnt_idmapping: remove check_fsmapping()
      https://git.kernel.org/vfs/vfs/c/a4fd34a68d61
[2/4] mnt_idmapping: remove nop check
      https://git.kernel.org/vfs/vfs/c/b77a69e35261
[3/4] mnt_idmapping: decouple from namespaces
      https://git.kernel.org/vfs/vfs/c/cc8ac0ea8188
[4/4] fs: reformat idmapped mounts entry
      https://git.kernel.org/vfs/vfs/c/5c7b656ebb3b
