linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Use guest mem inodes instead of anonymous inodes
@ 2025-06-02 19:17 Ackerley Tng
  2025-06-02 19:17 ` [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode Ackerley Tng
  2025-06-02 19:17 ` [PATCH 2/2] KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes Ackerley Tng
  0 siblings, 2 replies; 15+ messages in thread
From: Ackerley Tng @ 2025-06-02 19:17 UTC (permalink / raw)
  To: kvm, linux-mm, linux-kernel, x86, linux-fsdevel
  Cc: ackerleytng, aik, ajones, akpm, amoorthy, anthony.yznaga, anup,
	aou, bfoster, binbin.wu, brauner, catalin.marinas, chao.p.peng,
	chenhuacai, dave.hansen, david, dmatlack, dwmw, erdemaktas,
	fan.du, fvdl, graf, haibo1.xu, hch, hughd, ira.weiny,
	isaku.yamahata, jack, james.morse, jarkko, jgg, jgowans, jhubbard,
	jroedel, jthoughton, jun.miao, kai.huang, keirf, kent.overstreet,
	kirill.shutemov, liam.merwick, maciej.wieczor-retman, mail, maz,
	mic, michael.roth, mpe, muchun.song, nikunj, nsaenz, oliver.upton,
	palmer, pankaj.gupta, paul.walmsley, pbonzini, pdurrant, peterx,
	pgonda, pvorel, qperret, quic_cvanscha, quic_eberman,
	quic_mnalajal, quic_pderrin, quic_pheragu, quic_svaddagi,
	quic_tsoni, richard.weiyang, rick.p.edgecombe, rientjes, roypat,
	rppt, seanjc, shuah, steven.price, steven.sistare, suzuki.poulose,
	tabba, thomas.lendacky, vannapurve, vbabka, viro, vkuznets,
	wei.w.wang, will, willy, xiaoyao.li, yan.y.zhao, yilun.xu,
	yuzenghui, zhiquan1.li

Hi,

This small patch series makes guest_memfd use guest mem inodes instead
of anonymous inodes and also includes some refactoring to expose a new
function that allocates an inode and runs security checks.

This patch series will serve as a common base for some in-flight series:

* Add NUMA mempolicy support for KVM guest-memfd [1]
* New KVM ioctl to link a gmem inode to a new gmem file [2]
* Restricted mapping of guest_memfd at the host and arm64 support [3]
  aka shared/private conversion support for guest_memfd

[1] https://lore.kernel.org/all/20250408112402.181574-1-shivankg@amd.com/
[2] https://lore.kernel.org/lkml/cover.1747368092.git.afranji@google.com/
[3] https://lore.kernel.org/all/20250328153133.3504118-1-tabba@google.com/

Ackerley Tng (2):
  fs: Provide function that allocates a secure anonymous inode
  KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes

 fs/anon_inodes.c           |  22 ++++--
 include/linux/fs.h         |   1 +
 include/uapi/linux/magic.h |   1 +
 mm/secretmem.c             |   9 +--
 virt/kvm/guest_memfd.c     | 134 +++++++++++++++++++++++++++++++------
 virt/kvm/kvm_main.c        |   7 +-
 virt/kvm/kvm_mm.h          |   9 ++-
 7 files changed, 143 insertions(+), 40 deletions(-)


base-commit: a5806cd506af5a7c19bcd596e4708b5c464bfd21
--
2.49.0.1204.g71687c7c1d-goog

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-02 19:17 [PATCH 0/2] Use guest mem inodes instead of anonymous inodes Ackerley Tng
@ 2025-06-02 19:17 ` Ackerley Tng
  2025-06-02 20:02   ` David Hildenbrand
                     ` (3 more replies)
  2025-06-02 19:17 ` [PATCH 2/2] KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes Ackerley Tng
  1 sibling, 4 replies; 15+ messages in thread
From: Ackerley Tng @ 2025-06-02 19:17 UTC (permalink / raw)
  To: kvm, linux-mm, linux-kernel, x86, linux-fsdevel
  Cc: ackerleytng, aik, ajones, akpm, amoorthy, anthony.yznaga, anup,
	aou, bfoster, binbin.wu, brauner, catalin.marinas, chao.p.peng,
	chenhuacai, dave.hansen, david, dmatlack, dwmw, erdemaktas,
	fan.du, fvdl, graf, haibo1.xu, hch, hughd, ira.weiny,
	isaku.yamahata, jack, james.morse, jarkko, jgg, jgowans, jhubbard,
	jroedel, jthoughton, jun.miao, kai.huang, keirf, kent.overstreet,
	kirill.shutemov, liam.merwick, maciej.wieczor-retman, mail, maz,
	mic, michael.roth, mpe, muchun.song, nikunj, nsaenz, oliver.upton,
	palmer, pankaj.gupta, paul.walmsley, pbonzini, pdurrant, peterx,
	pgonda, pvorel, qperret, quic_cvanscha, quic_eberman,
	quic_mnalajal, quic_pderrin, quic_pheragu, quic_svaddagi,
	quic_tsoni, richard.weiyang, rick.p.edgecombe, rientjes, roypat,
	rppt, seanjc, shuah, steven.price, steven.sistare, suzuki.poulose,
	tabba, thomas.lendacky, vannapurve, vbabka, viro, vkuznets,
	wei.w.wang, will, willy, xiaoyao.li, yan.y.zhao, yilun.xu,
	yuzenghui, zhiquan1.li

The new function, alloc_anon_secure_inode(), returns an inode after
running checks in security_inode_init_security_anon().

Also refactor secretmem's file creation process to use the new
function.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
 fs/anon_inodes.c   | 22 ++++++++++++++++------
 include/linux/fs.h |  1 +
 mm/secretmem.c     |  9 +--------
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index 583ac81669c2..4c3110378647 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -55,17 +55,20 @@ static struct file_system_type anon_inode_fs_type = {
 	.kill_sb	= kill_anon_super,
 };

-static struct inode *anon_inode_make_secure_inode(
-	const char *name,
-	const struct inode *context_inode)
+static struct inode *anon_inode_make_secure_inode(struct super_block *s,
+		const char *name, const struct inode *context_inode,
+		bool fs_internal)
 {
 	struct inode *inode;
 	int error;

-	inode = alloc_anon_inode(anon_inode_mnt->mnt_sb);
+	inode = alloc_anon_inode(s);
 	if (IS_ERR(inode))
 		return inode;
-	inode->i_flags &= ~S_PRIVATE;
+
+	if (!fs_internal)
+		inode->i_flags &= ~S_PRIVATE;
+
 	error =	security_inode_init_security_anon(inode, &QSTR(name),
 						  context_inode);
 	if (error) {
@@ -75,6 +78,12 @@ static struct inode *anon_inode_make_secure_inode(
 	return inode;
 }

+struct inode *alloc_anon_secure_inode(struct super_block *s, const char *name)
+{
+	return anon_inode_make_secure_inode(s, name, NULL, true);
+}
+EXPORT_SYMBOL_GPL(alloc_anon_secure_inode);
+
 static struct file *__anon_inode_getfile(const char *name,
 					 const struct file_operations *fops,
 					 void *priv, int flags,
@@ -88,7 +97,8 @@ static struct file *__anon_inode_getfile(const char *name,
 		return ERR_PTR(-ENOENT);

 	if (make_inode) {
-		inode =	anon_inode_make_secure_inode(name, context_inode);
+		inode = anon_inode_make_secure_inode(anon_inode_mnt->mnt_sb,
+						     name, context_inode, false);
 		if (IS_ERR(inode)) {
 			file = ERR_CAST(inode);
 			goto err;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 016b0fe1536e..0fded2e3c661 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3550,6 +3550,7 @@ extern int simple_write_begin(struct file *file, struct address_space *mapping,
 extern const struct address_space_operations ram_aops;
 extern int always_delete_dentry(const struct dentry *);
 extern struct inode *alloc_anon_inode(struct super_block *);
+extern struct inode *alloc_anon_secure_inode(struct super_block *, const char *);
 extern int simple_nosetlease(struct file *, int, struct file_lease **, void **);
 extern const struct dentry_operations simple_dentry_operations;

diff --git a/mm/secretmem.c b/mm/secretmem.c
index 1b0a214ee558..c0e459e58cb6 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -195,18 +195,11 @@ static struct file *secretmem_file_create(unsigned long flags)
 	struct file *file;
 	struct inode *inode;
 	const char *anon_name = "[secretmem]";
-	int err;

-	inode = alloc_anon_inode(secretmem_mnt->mnt_sb);
+	inode = alloc_anon_secure_inode(secretmem_mnt->mnt_sb, anon_name);
 	if (IS_ERR(inode))
 		return ERR_CAST(inode);

-	err = security_inode_init_security_anon(inode, &QSTR(anon_name), NULL);
-	if (err) {
-		file = ERR_PTR(err);
-		goto err_free_inode;
-	}
-
 	file = alloc_file_pseudo(inode, secretmem_mnt, "secretmem",
 				 O_RDWR, &secretmem_fops);
 	if (IS_ERR(file))
--
2.49.0.1204.g71687c7c1d-goog

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes
  2025-06-02 19:17 [PATCH 0/2] Use guest mem inodes instead of anonymous inodes Ackerley Tng
  2025-06-02 19:17 ` [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode Ackerley Tng
@ 2025-06-02 19:17 ` Ackerley Tng
  2025-06-02 20:13   ` David Hildenbrand
  1 sibling, 1 reply; 15+ messages in thread
From: Ackerley Tng @ 2025-06-02 19:17 UTC (permalink / raw)
  To: kvm, linux-mm, linux-kernel, x86, linux-fsdevel
  Cc: ackerleytng, aik, ajones, akpm, amoorthy, anthony.yznaga, anup,
	aou, bfoster, binbin.wu, brauner, catalin.marinas, chao.p.peng,
	chenhuacai, dave.hansen, david, dmatlack, dwmw, erdemaktas,
	fan.du, fvdl, graf, haibo1.xu, hch, hughd, ira.weiny,
	isaku.yamahata, jack, james.morse, jarkko, jgg, jgowans, jhubbard,
	jroedel, jthoughton, jun.miao, kai.huang, keirf, kent.overstreet,
	kirill.shutemov, liam.merwick, maciej.wieczor-retman, mail, maz,
	mic, michael.roth, mpe, muchun.song, nikunj, nsaenz, oliver.upton,
	palmer, pankaj.gupta, paul.walmsley, pbonzini, pdurrant, peterx,
	pgonda, pvorel, qperret, quic_cvanscha, quic_eberman,
	quic_mnalajal, quic_pderrin, quic_pheragu, quic_svaddagi,
	quic_tsoni, richard.weiyang, rick.p.edgecombe, rientjes, roypat,
	rppt, seanjc, shuah, steven.price, steven.sistare, suzuki.poulose,
	tabba, thomas.lendacky, vannapurve, vbabka, viro, vkuznets,
	wei.w.wang, will, willy, xiaoyao.li, yan.y.zhao, yilun.xu,
	yuzenghui, zhiquan1.li

guest_memfd's inode represents memory the guest_memfd is
providing. guest_memfd's file represents a struct kvm's view of that
memory.

Using a custom inode allows customization of the inode teardown
process via callbacks. For example, ->evict_inode() allows
customization of the truncation process on file close, and
->destroy_inode() and ->free_inode() allow customization of the inode
freeing process.

Customizing the truncation process allows flexibility in management of
guest_memfd memory and customization of the inode freeing process
allows proper cleanup of memory metadata stored on the inode.

Memory metadata is more appropriately stored on the inode (as opposed
to the file), since the metadata is for the memory and is not unique
to a specific binding and struct kvm.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/uapi/linux/magic.h |   1 +
 virt/kvm/guest_memfd.c     | 134 +++++++++++++++++++++++++++++++------
 virt/kvm/kvm_main.c        |   7 +-
 virt/kvm/kvm_mm.h          |   9 ++-
 4 files changed, 125 insertions(+), 26 deletions(-)

diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index bb575f3ab45e..638ca21b7a90 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -103,5 +103,6 @@
 #define DEVMEM_MAGIC		0x454d444d	/* "DMEM" */
 #define SECRETMEM_MAGIC		0x5345434d	/* "SECM" */
 #define PID_FS_MAGIC		0x50494446	/* "PIDF" */
+#define GUEST_MEMFD_MAGIC	0x474d454d	/* "GMEM" */

 #endif /* __LINUX_MAGIC_H__ */
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index b2aa6bf24d3a..1283b85aeb44 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1,12 +1,16 @@
 // SPDX-License-Identifier: GPL-2.0
+#include <linux/anon_inodes.h>
 #include <linux/backing-dev.h>
 #include <linux/falloc.h>
+#include <linux/fs.h>
 #include <linux/kvm_host.h>
+#include <linux/pseudo_fs.h>
 #include <linux/pagemap.h>
-#include <linux/anon_inodes.h>

 #include "kvm_mm.h"

+static struct vfsmount *kvm_gmem_mnt;
+
 struct kvm_gmem {
 	struct kvm *kvm;
 	struct xarray bindings;
@@ -318,9 +322,51 @@ static struct file_operations kvm_gmem_fops = {
 	.fallocate	= kvm_gmem_fallocate,
 };

-void kvm_gmem_init(struct module *module)
+static const struct super_operations kvm_gmem_super_operations = {
+	.statfs		= simple_statfs,
+};
+
+static int kvm_gmem_init_fs_context(struct fs_context *fc)
+{
+	struct pseudo_fs_context *ctx;
+
+	if (!init_pseudo(fc, GUEST_MEMFD_MAGIC))
+		return -ENOMEM;
+
+	ctx = fc->fs_private;
+	ctx->ops = &kvm_gmem_super_operations;
+
+	return 0;
+}
+
+static struct file_system_type kvm_gmem_fs = {
+	.name		 = "kvm_guest_memory",
+	.init_fs_context = kvm_gmem_init_fs_context,
+	.kill_sb	 = kill_anon_super,
+};
+
+static int kvm_gmem_init_mount(void)
+{
+	kvm_gmem_mnt = kern_mount(&kvm_gmem_fs);
+
+	if (WARN_ON_ONCE(IS_ERR(kvm_gmem_mnt)))
+		return PTR_ERR(kvm_gmem_mnt);
+
+	kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
+	return 0;
+}
+
+int kvm_gmem_init(struct module *module)
 {
 	kvm_gmem_fops.owner = module;
+
+	return kvm_gmem_init_mount();
+}
+
+void kvm_gmem_exit(void)
+{
+	kern_unmount(kvm_gmem_mnt);
+	kvm_gmem_mnt = NULL;
 }

 static int kvm_gmem_migrate_folio(struct address_space *mapping,
@@ -402,11 +448,71 @@ static const struct inode_operations kvm_gmem_iops = {
 	.setattr	= kvm_gmem_setattr,
 };

+static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
+						      loff_t size, u64 flags)
+{
+	struct inode *inode;
+
+	inode = alloc_anon_secure_inode(kvm_gmem_mnt->mnt_sb, name);
+	if (IS_ERR(inode))
+		return inode;
+
+	inode->i_private = (void *)(unsigned long)flags;
+	inode->i_op = &kvm_gmem_iops;
+	inode->i_mapping->a_ops = &kvm_gmem_aops;
+	inode->i_mode |= S_IFREG;
+	inode->i_size = size;
+	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
+	mapping_set_inaccessible(inode->i_mapping);
+	/* Unmovable mappings are supposed to be marked unevictable as well. */
+	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
+
+	return inode;
+}
+
+static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
+						  u64 flags)
+{
+	static const char *name = "[kvm-gmem]";
+	struct inode *inode;
+	struct file *file;
+	int err;
+
+	err = -ENOENT;
+	if (!try_module_get(kvm_gmem_fops.owner))
+		goto err;
+
+	inode = kvm_gmem_inode_make_secure_inode(name, size, flags);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		goto err_put_module;
+	}
+
+	file = alloc_file_pseudo(inode, kvm_gmem_mnt, name, O_RDWR,
+				 &kvm_gmem_fops);
+	if (IS_ERR(file)) {
+		err = PTR_ERR(file);
+		goto err_put_inode;
+	}
+
+	file->f_flags |= O_LARGEFILE;
+	file->private_data = priv;
+
+out:
+	return file;
+
+err_put_inode:
+	iput(inode);
+err_put_module:
+	module_put(kvm_gmem_fops.owner);
+err:
+	file = ERR_PTR(err);
+	goto out;
+}
+
 static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 {
-	const char *anon_name = "[kvm-gmem]";
 	struct kvm_gmem *gmem;
-	struct inode *inode;
 	struct file *file;
 	int fd, err;

@@ -420,32 +526,16 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_fd;
 	}

-	file = anon_inode_create_getfile(anon_name, &kvm_gmem_fops, gmem,
-					 O_RDWR, NULL);
+	file = kvm_gmem_inode_create_getfile(gmem, size, flags);
 	if (IS_ERR(file)) {
 		err = PTR_ERR(file);
 		goto err_gmem;
 	}

-	file->f_flags |= O_LARGEFILE;
-
-	inode = file->f_inode;
-	WARN_ON(file->f_mapping != inode->i_mapping);
-
-	inode->i_private = (void *)(unsigned long)flags;
-	inode->i_op = &kvm_gmem_iops;
-	inode->i_mapping->a_ops = &kvm_gmem_aops;
-	inode->i_mode |= S_IFREG;
-	inode->i_size = size;
-	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
-	mapping_set_inaccessible(inode->i_mapping);
-	/* Unmovable mappings are supposed to be marked unevictable as well. */
-	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
-
 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
 	xa_init(&gmem->bindings);
-	list_add(&gmem->entry, &inode->i_mapping->i_private_list);
+	list_add(&gmem->entry, &file_inode(file)->i_mapping->i_private_list);

 	fd_install(fd, file);
 	return fd;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e85b33a92624..094cc0ad31fb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -6420,7 +6420,9 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 	if (WARN_ON_ONCE(r))
 		goto err_vfio;

-	kvm_gmem_init(module);
+	r = kvm_gmem_init(module);
+	if (r)
+		goto err_gmem;

 	r = kvm_init_virtualization();
 	if (r)
@@ -6441,6 +6443,8 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 err_register:
 	kvm_uninit_virtualization();
 err_virt:
+	kvm_gmem_exit();
+err_gmem:
 	kvm_vfio_ops_exit();
 err_vfio:
 	kvm_async_pf_deinit();
@@ -6472,6 +6476,7 @@ void kvm_exit(void)
 	for_each_possible_cpu(cpu)
 		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
 	kmem_cache_destroy(kvm_vcpu_cache);
+	kvm_gmem_exit();
 	kvm_vfio_ops_exit();
 	kvm_async_pf_deinit();
 	kvm_irqfd_exit();
diff --git a/virt/kvm/kvm_mm.h b/virt/kvm/kvm_mm.h
index acef3f5c582a..dcacb76b8f00 100644
--- a/virt/kvm/kvm_mm.h
+++ b/virt/kvm/kvm_mm.h
@@ -68,17 +68,20 @@ static inline void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm,
 #endif /* HAVE_KVM_PFNCACHE */

 #ifdef CONFIG_KVM_PRIVATE_MEM
-void kvm_gmem_init(struct module *module);
+int kvm_gmem_init(struct module *module);
+void kvm_gmem_exit(void);
 int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args);
 int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
 		  unsigned int fd, loff_t offset);
 void kvm_gmem_unbind(struct kvm_memory_slot *slot);
 #else
-static inline void kvm_gmem_init(struct module *module)
+static inline int kvm_gmem_init(struct module *module)
 {
-
+	return 0;
 }

+static inline void kvm_gmem_exit(void) {};
+
 static inline int kvm_gmem_bind(struct kvm *kvm,
 					 struct kvm_memory_slot *slot,
 					 unsigned int fd, loff_t offset)
--
2.49.0.1204.g71687c7c1d-goog

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-02 19:17 ` [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode Ackerley Tng
@ 2025-06-02 20:02   ` David Hildenbrand
  2025-06-03  4:52   ` Christoph Hellwig
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: David Hildenbrand @ 2025-06-02 20:02 UTC (permalink / raw)
  To: Ackerley Tng, kvm, linux-mm, linux-kernel, x86, linux-fsdevel
  Cc: aik, ajones, akpm, amoorthy, anthony.yznaga, anup, aou, bfoster,
	binbin.wu, brauner, catalin.marinas, chao.p.peng, chenhuacai,
	dave.hansen, dmatlack, dwmw, erdemaktas, fan.du, fvdl, graf,
	haibo1.xu, hch, hughd, ira.weiny, isaku.yamahata, jack,
	james.morse, jarkko, jgg, jgowans, jhubbard, jroedel, jthoughton,
	jun.miao, kai.huang, keirf, kent.overstreet, kirill.shutemov,
	liam.merwick, maciej.wieczor-retman, mail, maz, mic, michael.roth,
	mpe, muchun.song, nikunj, nsaenz, oliver.upton, palmer,
	pankaj.gupta, paul.walmsley, pbonzini, pdurrant, peterx, pgonda,
	pvorel, qperret, quic_cvanscha, quic_eberman, quic_mnalajal,
	quic_pderrin, quic_pheragu, quic_svaddagi, quic_tsoni,
	richard.weiyang, rick.p.edgecombe, rientjes, roypat, rppt, seanjc,
	shuah, steven.price, steven.sistare, suzuki.poulose, tabba,
	thomas.lendacky, vannapurve, vbabka, viro, vkuznets, wei.w.wang,
	will, willy, xiaoyao.li, yan.y.zhao, yilun.xu, yuzenghui,
	zhiquan1.li

On 02.06.25 21:17, Ackerley Tng wrote:
> The new function, alloc_anon_secure_inode(), returns an inode after
> running checks in security_inode_init_security_anon().
> 
> Also refactor secretmem's file creation process to use the new
> function.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> ---

Not sure about the subject, "secure anon inode" is misleading, it's an 
anon inode where we gave security callbacks a chance to intervene, right?

maybe simply "fs: factor out anon inode creation + init security"

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes
  2025-06-02 19:17 ` [PATCH 2/2] KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes Ackerley Tng
@ 2025-06-02 20:13   ` David Hildenbrand
  0 siblings, 0 replies; 15+ messages in thread
From: David Hildenbrand @ 2025-06-02 20:13 UTC (permalink / raw)
  To: Ackerley Tng, kvm, linux-mm, linux-kernel, x86, linux-fsdevel
  Cc: aik, ajones, akpm, amoorthy, anthony.yznaga, anup, aou, bfoster,
	binbin.wu, brauner, catalin.marinas, chao.p.peng, chenhuacai,
	dave.hansen, dmatlack, dwmw, erdemaktas, fan.du, fvdl, graf,
	haibo1.xu, hch, hughd, ira.weiny, isaku.yamahata, jack,
	james.morse, jarkko, jgg, jgowans, jhubbard, jroedel, jthoughton,
	jun.miao, kai.huang, keirf, kent.overstreet, kirill.shutemov,
	liam.merwick, maciej.wieczor-retman, mail, maz, mic, michael.roth,
	mpe, muchun.song, nikunj, nsaenz, oliver.upton, palmer,
	pankaj.gupta, paul.walmsley, pbonzini, pdurrant, peterx, pgonda,
	pvorel, qperret, quic_cvanscha, quic_eberman, quic_mnalajal,
	quic_pderrin, quic_pheragu, quic_svaddagi, quic_tsoni,
	richard.weiyang, rick.p.edgecombe, rientjes, roypat, rppt, seanjc,
	shuah, steven.price, steven.sistare, suzuki.poulose, tabba,
	thomas.lendacky, vannapurve, vbabka, viro, vkuznets, wei.w.wang,
	will, willy, xiaoyao.li, yan.y.zhao, yilun.xu, yuzenghui,
	zhiquan1.li

On 02.06.25 21:17, Ackerley Tng wrote:
> guest_memfd's inode represents memory the guest_memfd is
> providing. guest_memfd's file represents a struct kvm's view of that
> memory.
> 
> Using a custom inode allows customization of the inode teardown
> process via callbacks. For example, ->evict_inode() allows
> customization of the truncation process on file close, and
> ->destroy_inode() and ->free_inode() allow customization of the inode
> freeing process.
> 
> Customizing the truncation process allows flexibility in management of
> guest_memfd memory and customization of the inode freeing process
> allows proper cleanup of memory metadata stored on the inode.
> 
> Memory metadata is more appropriately stored on the inode (as opposed
> to the file), since the metadata is for the memory and is not unique
> to a specific binding and struct kvm.
> 
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> Signed-off-by: Fuad Tabba <tabba@google.com>

The trailing SOB from Fuag is likely wrong. Probably you wnat

Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>


> ---
>   include/uapi/linux/magic.h |   1 +
>   virt/kvm/guest_memfd.c     | 134 +++++++++++++++++++++++++++++++------
>   virt/kvm/kvm_main.c        |   7 +-
>   virt/kvm/kvm_mm.h          |   9 ++-
>   4 files changed, 125 insertions(+), 26 deletions(-)
> 
> diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
> index bb575f3ab45e..638ca21b7a90 100644
> --- a/include/uapi/linux/magic.h
> +++ b/include/uapi/linux/magic.h
> @@ -103,5 +103,6 @@
>   #define DEVMEM_MAGIC		0x454d444d	/* "DMEM" */
>   #define SECRETMEM_MAGIC		0x5345434d	/* "SECM" */
>   #define PID_FS_MAGIC		0x50494446	/* "PIDF" */
> +#define GUEST_MEMFD_MAGIC	0x474d454d	/* "GMEM" */
> 
>   #endif /* __LINUX_MAGIC_H__ */
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index b2aa6bf24d3a..1283b85aeb44 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -1,12 +1,16 @@
>   // SPDX-License-Identifier: GPL-2.0
> +#include <linux/anon_inodes.h>
>   #include <linux/backing-dev.h>
>   #include <linux/falloc.h>
> +#include <linux/fs.h>
>   #include <linux/kvm_host.h>
> +#include <linux/pseudo_fs.h>
>   #include <linux/pagemap.h>
> -#include <linux/anon_inodes.h>
> 
>   #include "kvm_mm.h"
> 
> +static struct vfsmount *kvm_gmem_mnt;
> +
>   struct kvm_gmem {
>   	struct kvm *kvm;
>   	struct xarray bindings;
> @@ -318,9 +322,51 @@ static struct file_operations kvm_gmem_fops = {
>   	.fallocate	= kvm_gmem_fallocate,
>   };
> 
> -void kvm_gmem_init(struct module *module)
> +static const struct super_operations kvm_gmem_super_operations = {
> +	.statfs		= simple_statfs,
> +};
> +
> +static int kvm_gmem_init_fs_context(struct fs_context *fc)
> +{
> +	struct pseudo_fs_context *ctx;
> +
> +	if (!init_pseudo(fc, GUEST_MEMFD_MAGIC))
> +		return -ENOMEM;
> +
> +	ctx = fc->fs_private;
> +	ctx->ops = &kvm_gmem_super_operations;
> +
> +	return 0;
> +}
> +
> +static struct file_system_type kvm_gmem_fs = {
> +	.name		 = "kvm_guest_memory",
> +	.init_fs_context = kvm_gmem_init_fs_context,
> +	.kill_sb	 = kill_anon_super,
> +};
> +
> +static int kvm_gmem_init_mount(void)
> +{
> +	kvm_gmem_mnt = kern_mount(&kvm_gmem_fs);
> +
> +	if (WARN_ON_ONCE(IS_ERR(kvm_gmem_mnt)))
> +		return PTR_ERR(kvm_gmem_mnt);

Hmm, is this WARN_ON_ONCE really warrented?


Nothing else jumped at me. I hope some fs experts can take a look as well.

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-02 19:17 ` [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode Ackerley Tng
  2025-06-02 20:02   ` David Hildenbrand
@ 2025-06-03  4:52   ` Christoph Hellwig
  2025-06-03 10:40     ` Shivank Garg
  2025-06-04  7:59   ` Mike Rapoport
  2025-06-04  8:02   ` Christian Brauner
  3 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2025-06-03  4:52 UTC (permalink / raw)
  To: Ackerley Tng
  Cc: kvm, linux-mm, linux-kernel, x86, linux-fsdevel, aik, ajones,
	akpm, amoorthy, anthony.yznaga, anup, aou, bfoster, binbin.wu,
	brauner, catalin.marinas, chao.p.peng, chenhuacai, dave.hansen,
	david, dmatlack, dwmw, erdemaktas, fan.du, fvdl, graf, haibo1.xu,
	hch, hughd, ira.weiny, isaku.yamahata, jack, james.morse, jarkko,
	jgg, jgowans, jhubbard, jroedel, jthoughton, jun.miao, kai.huang,
	keirf, kent.overstreet, kirill.shutemov, liam.merwick,
	maciej.wieczor-retman, mail, maz, mic, michael.roth, mpe,
	muchun.song, nikunj, nsaenz, oliver.upton, palmer, pankaj.gupta,
	paul.walmsley, pbonzini, pdurrant, peterx, pgonda, pvorel,
	qperret, quic_cvanscha, quic_eberman, quic_mnalajal, quic_pderrin,
	quic_pheragu, quic_svaddagi, quic_tsoni, richard.weiyang,
	rick.p.edgecombe, rientjes, roypat, rppt, seanjc, shuah,
	steven.price, steven.sistare, suzuki.poulose, tabba,
	thomas.lendacky, vannapurve, vbabka, viro, vkuznets, wei.w.wang,
	will, willy, xiaoyao.li, yan.y.zhao, yilun.xu, yuzenghui,
	zhiquan1.li

On Mon, Jun 02, 2025 at 12:17:54PM -0700, Ackerley Tng wrote:
> +struct inode *alloc_anon_secure_inode(struct super_block *s, const char *name)
> +{
> +	return anon_inode_make_secure_inode(s, name, NULL, true);
> +}
> +EXPORT_SYMBOL_GPL(alloc_anon_secure_inode);

What is "secure" about this inode?

A kerneldoc explaining that would probably help.

> +extern struct inode *alloc_anon_secure_inode(struct super_block *, const char *);

No need for the extern here.  Spelling out the parameter names in
protypes is nice, though. (and fix the long line while you're at it).


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-03  4:52   ` Christoph Hellwig
@ 2025-06-03 10:40     ` Shivank Garg
  0 siblings, 0 replies; 15+ messages in thread
From: Shivank Garg @ 2025-06-03 10:40 UTC (permalink / raw)
  To: Christoph Hellwig, Ackerley Tng
  Cc: kvm, linux-mm, linux-kernel, x86, linux-fsdevel, aik, ajones,
	akpm, amoorthy, anthony.yznaga, anup, aou, bfoster, binbin.wu,
	brauner, catalin.marinas, chao.p.peng, chenhuacai, dave.hansen,
	david, dmatlack, dwmw, erdemaktas, fan.du, fvdl, graf, haibo1.xu,
	hughd, ira.weiny, isaku.yamahata, jack, james.morse, jarkko, jgg,
	jgowans, jhubbard, jroedel, jthoughton, jun.miao, kai.huang,
	keirf, kent.overstreet, kirill.shutemov, liam.merwick,
	maciej.wieczor-retman, mail, maz, mic, michael.roth, mpe,
	muchun.song, nikunj, nsaenz, oliver.upton, palmer, pankaj.gupta,
	paul.walmsley, pbonzini, pdurrant, peterx, pgonda, pvorel,
	qperret, quic_cvanscha, quic_eberman, quic_mnalajal, quic_pderrin,
	quic_pheragu, quic_svaddagi, quic_tsoni, richard.weiyang,
	rick.p.edgecombe, rientjes, roypat, rppt, seanjc, shuah,
	steven.price, steven.sistare, suzuki.poulose, tabba,
	thomas.lendacky, vannapurve, vbabka, viro, vkuznets, wei.w.wang,
	will, willy, xiaoyao.li, yan.y.zhao, yilun.xu, yuzenghui,
	zhiquan1.li

On 6/3/2025 10:22 AM, Christoph Hellwig wrote:
> On Mon, Jun 02, 2025 at 12:17:54PM -0700, Ackerley Tng wrote:
>> +struct inode *alloc_anon_secure_inode(struct super_block *s, const char *name)
>> +{
>> +	return anon_inode_make_secure_inode(s, name, NULL, true);
>> +}
>> +EXPORT_SYMBOL_GPL(alloc_anon_secure_inode);
> 
> What is "secure" about this inode?
> 
> A kerneldoc explaining that would probably help.
> 
Hi Ackerley,

I had been working on the same based on David's suggestion and included kernel-doc
for the new functions.

https://lore.kernel.org/linux-mm/fc6b74e1-cbe4-4871-89d4-3855ca8f576b@amd.com

Feel free to incorporate the documentation from my patches,
Happy to send it as a follow-up patch or you can grab it from my earlier version.

Thanks,
Shivank

>> +extern struct inode *alloc_anon_secure_inode(struct super_block *, const char *);
> 
> No need for the extern here.  Spelling out the parameter names in
> protypes is nice, though. (and fix the long line while you're at it).
> 
> 





^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-02 19:17 ` [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode Ackerley Tng
  2025-06-02 20:02   ` David Hildenbrand
  2025-06-03  4:52   ` Christoph Hellwig
@ 2025-06-04  7:59   ` Mike Rapoport
  2025-06-04 21:13     ` Paul Moore
  2025-06-04  8:02   ` Christian Brauner
  3 siblings, 1 reply; 15+ messages in thread
From: Mike Rapoport @ 2025-06-04  7:59 UTC (permalink / raw)
  To: Ackerley Tng
  Cc: kvm, linux-mm, linux-kernel, x86, linux-fsdevel, aik, ajones,
	akpm, amoorthy, anthony.yznaga, anup, aou, bfoster, binbin.wu,
	brauner, catalin.marinas, chao.p.peng, chenhuacai, dave.hansen,
	david, dmatlack, dwmw, erdemaktas, fan.du, fvdl, graf, haibo1.xu,
	hch, hughd, ira.weiny, isaku.yamahata, jack, james.morse, jarkko,
	jgg, jgowans, jhubbard, jroedel, jthoughton, jun.miao, kai.huang,
	keirf, kent.overstreet, kirill.shutemov, liam.merwick,
	maciej.wieczor-retman, mail, maz, mic, michael.roth, mpe,
	muchun.song, nikunj, nsaenz, oliver.upton, palmer, pankaj.gupta,
	paul.walmsley, pbonzini, pdurrant, peterx, pgonda, pvorel,
	qperret, quic_cvanscha, quic_eberman, quic_mnalajal, quic_pderrin,
	quic_pheragu, quic_svaddagi, quic_tsoni, richard.weiyang,
	rick.p.edgecombe, rientjes, roypat, seanjc, shuah, steven.price,
	steven.sistare, suzuki.poulose, tabba, thomas.lendacky,
	vannapurve, vbabka, viro, vkuznets, wei.w.wang, will, willy,
	xiaoyao.li, yan.y.zhao, yilun.xu, yuzenghui, zhiquan1.li,
	Paul Moore

(added Paul Moore for selinux bits)

On Mon, Jun 02, 2025 at 12:17:54PM -0700, Ackerley Tng wrote:
> The new function, alloc_anon_secure_inode(), returns an inode after
> running checks in security_inode_init_security_anon().
> 
> Also refactor secretmem's file creation process to use the new
> function.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> ---
>  fs/anon_inodes.c   | 22 ++++++++++++++++------
>  include/linux/fs.h |  1 +
>  mm/secretmem.c     |  9 +--------
>  3 files changed, 18 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
> index 583ac81669c2..4c3110378647 100644
> --- a/fs/anon_inodes.c
> +++ b/fs/anon_inodes.c
> @@ -55,17 +55,20 @@ static struct file_system_type anon_inode_fs_type = {
>  	.kill_sb	= kill_anon_super,
>  };
> 
> -static struct inode *anon_inode_make_secure_inode(
> -	const char *name,
> -	const struct inode *context_inode)
> +static struct inode *anon_inode_make_secure_inode(struct super_block *s,
> +		const char *name, const struct inode *context_inode,
> +		bool fs_internal)
>  {
>  	struct inode *inode;
>  	int error;
> 
> -	inode = alloc_anon_inode(anon_inode_mnt->mnt_sb);
> +	inode = alloc_anon_inode(s);
>  	if (IS_ERR(inode))
>  		return inode;
> -	inode->i_flags &= ~S_PRIVATE;
> +
> +	if (!fs_internal)
> +		inode->i_flags &= ~S_PRIVATE;
> +
>  	error =	security_inode_init_security_anon(inode, &QSTR(name),
>  						  context_inode);
>  	if (error) {
> @@ -75,6 +78,12 @@ static struct inode *anon_inode_make_secure_inode(
>  	return inode;
>  }
> 
> +struct inode *alloc_anon_secure_inode(struct super_block *s, const char *name)
> +{
> +	return anon_inode_make_secure_inode(s, name, NULL, true);
> +}
> +EXPORT_SYMBOL_GPL(alloc_anon_secure_inode);
> +
>  static struct file *__anon_inode_getfile(const char *name,
>  					 const struct file_operations *fops,
>  					 void *priv, int flags,
> @@ -88,7 +97,8 @@ static struct file *__anon_inode_getfile(const char *name,
>  		return ERR_PTR(-ENOENT);
> 
>  	if (make_inode) {
> -		inode =	anon_inode_make_secure_inode(name, context_inode);
> +		inode = anon_inode_make_secure_inode(anon_inode_mnt->mnt_sb,
> +						     name, context_inode, false);
>  		if (IS_ERR(inode)) {
>  			file = ERR_CAST(inode);
>  			goto err;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 016b0fe1536e..0fded2e3c661 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3550,6 +3550,7 @@ extern int simple_write_begin(struct file *file, struct address_space *mapping,
>  extern const struct address_space_operations ram_aops;
>  extern int always_delete_dentry(const struct dentry *);
>  extern struct inode *alloc_anon_inode(struct super_block *);
> +extern struct inode *alloc_anon_secure_inode(struct super_block *, const char *);
>  extern int simple_nosetlease(struct file *, int, struct file_lease **, void **);
>  extern const struct dentry_operations simple_dentry_operations;
> 
> diff --git a/mm/secretmem.c b/mm/secretmem.c
> index 1b0a214ee558..c0e459e58cb6 100644
> --- a/mm/secretmem.c
> +++ b/mm/secretmem.c
> @@ -195,18 +195,11 @@ static struct file *secretmem_file_create(unsigned long flags)
>  	struct file *file;
>  	struct inode *inode;
>  	const char *anon_name = "[secretmem]";
> -	int err;
> 
> -	inode = alloc_anon_inode(secretmem_mnt->mnt_sb);
> +	inode = alloc_anon_secure_inode(secretmem_mnt->mnt_sb, anon_name);
>  	if (IS_ERR(inode))
>  		return ERR_CAST(inode);

I don't think we should not hide secretmem and guest_memfd inodes from
selinux, so clearing S_PRIVATE for them is not needed and you can just drop
fs_internal parameter in anon_inode_make_secure_inode() 

> 
> -	err = security_inode_init_security_anon(inode, &QSTR(anon_name), NULL);
> -	if (err) {
> -		file = ERR_PTR(err);
> -		goto err_free_inode;
> -	}
> -
>  	file = alloc_file_pseudo(inode, secretmem_mnt, "secretmem",
>  				 O_RDWR, &secretmem_fops);
>  	if (IS_ERR(file))
> --
> 2.49.0.1204.g71687c7c1d-goog

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-02 19:17 ` [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode Ackerley Tng
                     ` (2 preceding siblings ...)
  2025-06-04  7:59   ` Mike Rapoport
@ 2025-06-04  8:02   ` Christian Brauner
  3 siblings, 0 replies; 15+ messages in thread
From: Christian Brauner @ 2025-06-04  8:02 UTC (permalink / raw)
  To: Ackerley Tng
  Cc: kvm, linux-mm, linux-kernel, x86, linux-fsdevel, aik, ajones,
	akpm, amoorthy, anthony.yznaga, anup, aou, bfoster, binbin.wu,
	catalin.marinas, chao.p.peng, chenhuacai, dave.hansen, david,
	dmatlack, dwmw, erdemaktas, fan.du, fvdl, graf, haibo1.xu, hch,
	hughd, ira.weiny, isaku.yamahata, jack, james.morse, jarkko, jgg,
	jgowans, jhubbard, jroedel, jthoughton, jun.miao, kai.huang,
	keirf, kent.overstreet, kirill.shutemov, liam.merwick,
	maciej.wieczor-retman, mail, maz, mic, michael.roth, mpe,
	muchun.song, nikunj, nsaenz, oliver.upton, palmer, pankaj.gupta,
	paul.walmsley, pbonzini, pdurrant, peterx, pgonda, pvorel,
	qperret, quic_cvanscha, quic_eberman, quic_mnalajal, quic_pderrin,
	quic_pheragu, quic_svaddagi, quic_tsoni, richard.weiyang,
	rick.p.edgecombe, rientjes, roypat, rppt, seanjc, shuah,
	steven.price, steven.sistare, suzuki.poulose, tabba,
	thomas.lendacky, vannapurve, vbabka, viro, vkuznets, wei.w.wang,
	will, willy, xiaoyao.li, yan.y.zhao, yilun.xu, yuzenghui,
	zhiquan1.li

On Mon, Jun 02, 2025 at 12:17:54PM -0700, Ackerley Tng wrote:
> The new function, alloc_anon_secure_inode(), returns an inode after
> running checks in security_inode_init_security_anon().
> 
> Also refactor secretmem's file creation process to use the new
> function.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> ---

Once -rc1 is out I'll pull the VFS bits and provide a branch that
remains stable for the duration of v6.16 development that can be pulled.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-04  7:59   ` Mike Rapoport
@ 2025-06-04 21:13     ` Paul Moore
  2025-06-05  5:49       ` Mike Rapoport
  0 siblings, 1 reply; 15+ messages in thread
From: Paul Moore @ 2025-06-04 21:13 UTC (permalink / raw)
  To: Mike Rapoport, Ackerley Tng
  Cc: linux-security-module, selinux, kvm, linux-mm, linux-kernel, x86,
	linux-fsdevel, aik, ajones, akpm, amoorthy, anthony.yznaga, anup,
	aou, bfoster, binbin.wu, brauner, catalin.marinas, chao.p.peng,
	chenhuacai, dave.hansen, david, dmatlack, dwmw, erdemaktas,
	fan.du, fvdl, graf, haibo1.xu, hch, hughd, ira.weiny,
	isaku.yamahata, jack, james.morse, jarkko, jgg, jgowans, jhubbard,
	jroedel, jthoughton, jun.miao, kai.huang, keirf, kent.overstreet,
	kirill.shutemov, liam.merwick, maciej.wieczor-retman, mail, maz,
	mic, michael.roth, mpe, muchun.song, nikunj, nsaenz, oliver.upton,
	palmer, pankaj.gupta, paul.walmsley, pbonzini, pdurrant, peterx,
	pgonda, pvorel, qperret, quic_cvanscha, quic_eberman,
	quic_mnalajal, quic_pderrin, quic_pheragu, quic_svaddagi,
	quic_tsoni, richard.weiyang, rick.p.edgecombe, rientjes, roypat,
	seanjc, shuah, steven.price, steven.sistare, suzuki.poulose,
	tabba, thomas.lendacky, vannapurve, vbabka, viro, vkuznets,
	wei.w.wang, will, willy, xiaoyao.li, yan.y.zhao, yilun.xu,
	yuzenghui, zhiquan1.li

On Wed, Jun 4, 2025 at 3:59 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> (added Paul Moore for selinux bits)

Thanks Mike.

I'm adding the LSM and SELinux lists too since there are others that
will be interested as well.

> On Mon, Jun 02, 2025 at 12:17:54PM -0700, Ackerley Tng wrote:
> > The new function, alloc_anon_secure_inode(), returns an inode after
> > running checks in security_inode_init_security_anon().
> >
> > Also refactor secretmem's file creation process to use the new
> > function.
> >
> > Suggested-by: David Hildenbrand <david@redhat.com>
> > Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> > ---
> >  fs/anon_inodes.c   | 22 ++++++++++++++++------
> >  include/linux/fs.h |  1 +
> >  mm/secretmem.c     |  9 +--------
> >  3 files changed, 18 insertions(+), 14 deletions(-)
> >
> > diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
> > index 583ac81669c2..4c3110378647 100644
> > --- a/fs/anon_inodes.c
> > +++ b/fs/anon_inodes.c
> > @@ -55,17 +55,20 @@ static struct file_system_type anon_inode_fs_type = {
> >       .kill_sb        = kill_anon_super,
> >  };
> >
> > -static struct inode *anon_inode_make_secure_inode(
> > -     const char *name,
> > -     const struct inode *context_inode)
> > +static struct inode *anon_inode_make_secure_inode(struct super_block *s,
> > +             const char *name, const struct inode *context_inode,
> > +             bool fs_internal)
> >  {
> >       struct inode *inode;
> >       int error;
> >
> > -     inode = alloc_anon_inode(anon_inode_mnt->mnt_sb);
> > +     inode = alloc_anon_inode(s);
> >       if (IS_ERR(inode))
> >               return inode;
> > -     inode->i_flags &= ~S_PRIVATE;
> > +
> > +     if (!fs_internal)
> > +             inode->i_flags &= ~S_PRIVATE;
> > +
> >       error = security_inode_init_security_anon(inode, &QSTR(name),
> >                                                 context_inode);
> >       if (error) {
> > @@ -75,6 +78,12 @@ static struct inode *anon_inode_make_secure_inode(
> >       return inode;
> >  }
> >
> > +struct inode *alloc_anon_secure_inode(struct super_block *s, const char *name)
> > +{
> > +     return anon_inode_make_secure_inode(s, name, NULL, true);
> > +}
> > +EXPORT_SYMBOL_GPL(alloc_anon_secure_inode);
> > +
> >  static struct file *__anon_inode_getfile(const char *name,
> >                                        const struct file_operations *fops,
> >                                        void *priv, int flags,
> > @@ -88,7 +97,8 @@ static struct file *__anon_inode_getfile(const char *name,
> >               return ERR_PTR(-ENOENT);
> >
> >       if (make_inode) {
> > -             inode = anon_inode_make_secure_inode(name, context_inode);
> > +             inode = anon_inode_make_secure_inode(anon_inode_mnt->mnt_sb,
> > +                                                  name, context_inode, false);
> >               if (IS_ERR(inode)) {
> >                       file = ERR_CAST(inode);
> >                       goto err;
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 016b0fe1536e..0fded2e3c661 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -3550,6 +3550,7 @@ extern int simple_write_begin(struct file *file, struct address_space *mapping,
> >  extern const struct address_space_operations ram_aops;
> >  extern int always_delete_dentry(const struct dentry *);
> >  extern struct inode *alloc_anon_inode(struct super_block *);
> > +extern struct inode *alloc_anon_secure_inode(struct super_block *, const char *);
> >  extern int simple_nosetlease(struct file *, int, struct file_lease **, void **);
> >  extern const struct dentry_operations simple_dentry_operations;
> >
> > diff --git a/mm/secretmem.c b/mm/secretmem.c
> > index 1b0a214ee558..c0e459e58cb6 100644
> > --- a/mm/secretmem.c
> > +++ b/mm/secretmem.c
> > @@ -195,18 +195,11 @@ static struct file *secretmem_file_create(unsigned long flags)
> >       struct file *file;
> >       struct inode *inode;
> >       const char *anon_name = "[secretmem]";
> > -     int err;
> >
> > -     inode = alloc_anon_inode(secretmem_mnt->mnt_sb);
> > +     inode = alloc_anon_secure_inode(secretmem_mnt->mnt_sb, anon_name);
> >       if (IS_ERR(inode))
> >               return ERR_CAST(inode);
>
> I don't think we should not hide secretmem and guest_memfd inodes from
> selinux, so clearing S_PRIVATE for them is not needed and you can just drop
> fs_internal parameter in anon_inode_make_secure_inode()

It's especially odd since I don't see any comments or descriptions
about why this is being done.  The secretmem change is concerning as
this is user accessible and marking the inode with S_PRIVATE will
bypass a number of LSM/SELinux access controls, possibly resulting in
a security regression (one would need to dig a bit deeper to see what
is possible with secretmem and which LSM/SELinux code paths would be
affected).

I'm less familiar with guest_memfd, but generally speaking if
userspace can act on the inode/fd then we likely don't want the
S_PRIVATE flag stripped from the anon_inode.

Ackerley can you provide an explanation about why the change in
S_PRIVATE was necessary?

> > -     err = security_inode_init_security_anon(inode, &QSTR(anon_name), NULL);
> > -     if (err) {
> > -             file = ERR_PTR(err);
> > -             goto err_free_inode;
> > -     }
> > -
> >       file = alloc_file_pseudo(inode, secretmem_mnt, "secretmem",
> >                                O_RDWR, &secretmem_fops);
> >       if (IS_ERR(file))
> > --
> > 2.49.0.1204.g71687c7c1d-goog

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-04 21:13     ` Paul Moore
@ 2025-06-05  5:49       ` Mike Rapoport
  2025-06-05 18:23         ` Paul Moore
  0 siblings, 1 reply; 15+ messages in thread
From: Mike Rapoport @ 2025-06-05  5:49 UTC (permalink / raw)
  To: Paul Moore
  Cc: Ackerley Tng, linux-security-module, selinux, kvm, linux-mm,
	linux-kernel, x86, linux-fsdevel, aik, ajones, akpm, amoorthy,
	anthony.yznaga, anup, aou, bfoster, binbin.wu, brauner,
	catalin.marinas, chao.p.peng, chenhuacai, dave.hansen, david,
	dmatlack, dwmw, erdemaktas, fan.du, fvdl, graf, haibo1.xu, hch,
	hughd, ira.weiny, isaku.yamahata, jack, james.morse, jarkko, jgg,
	jgowans, jhubbard, jroedel, jthoughton, jun.miao, kai.huang,
	keirf, kent.overstreet, kirill.shutemov, liam.merwick,
	maciej.wieczor-retman, mail, maz, mic, michael.roth, mpe,
	muchun.song, nikunj, nsaenz, oliver.upton, palmer, pankaj.gupta,
	paul.walmsley, pbonzini, pdurrant, peterx, pgonda, pvorel,
	qperret, quic_cvanscha, quic_eberman, quic_mnalajal, quic_pderrin,
	quic_pheragu, quic_svaddagi, quic_tsoni, richard.weiyang,
	rick.p.edgecombe, rientjes, roypat, seanjc, shuah, steven.price,
	steven.sistare, suzuki.poulose, tabba, thomas.lendacky,
	vannapurve, vbabka, viro, vkuznets, wei.w.wang, will, willy,
	xiaoyao.li, yan.y.zhao, yilun.xu, yuzenghui, zhiquan1.li

On Wed, Jun 04, 2025 at 05:13:35PM -0400, Paul Moore wrote:
> On Wed, Jun 4, 2025 at 3:59 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > (added Paul Moore for selinux bits)
> 
> Thanks Mike.
> 
> I'm adding the LSM and SELinux lists too since there are others that
> will be interested as well.
> 
> > On Mon, Jun 02, 2025 at 12:17:54PM -0700, Ackerley Tng wrote:
> > > The new function, alloc_anon_secure_inode(), returns an inode after
> > > running checks in security_inode_init_security_anon().
> > >
> > > Also refactor secretmem's file creation process to use the new
> > > function.
> > >
> > > Suggested-by: David Hildenbrand <david@redhat.com>
> > > Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> > > ---
> > >  fs/anon_inodes.c   | 22 ++++++++++++++++------
> > >  include/linux/fs.h |  1 +
> > >  mm/secretmem.c     |  9 +--------
> > >  3 files changed, 18 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
> > > index 583ac81669c2..4c3110378647 100644
> > > --- a/fs/anon_inodes.c
> > > +++ b/fs/anon_inodes.c
> > > @@ -55,17 +55,20 @@ static struct file_system_type anon_inode_fs_type = {
> > >       .kill_sb        = kill_anon_super,
> > >  };
> > >
> > > -static struct inode *anon_inode_make_secure_inode(
> > > -     const char *name,
> > > -     const struct inode *context_inode)
> > > +static struct inode *anon_inode_make_secure_inode(struct super_block *s,
> > > +             const char *name, const struct inode *context_inode,
> > > +             bool fs_internal)
> > >  {
> > >       struct inode *inode;
> > >       int error;
> > >
> > > -     inode = alloc_anon_inode(anon_inode_mnt->mnt_sb);
> > > +     inode = alloc_anon_inode(s);
> > >       if (IS_ERR(inode))
> > >               return inode;
> > > -     inode->i_flags &= ~S_PRIVATE;
> > > +
> > > +     if (!fs_internal)
> > > +             inode->i_flags &= ~S_PRIVATE;
> > > +
> > >       error = security_inode_init_security_anon(inode, &QSTR(name),
> > >                                                 context_inode);
> > >       if (error) {
> > > @@ -75,6 +78,12 @@ static struct inode *anon_inode_make_secure_inode(
> > >       return inode;
> > >  }
> > >
> > > +struct inode *alloc_anon_secure_inode(struct super_block *s, const char *name)
> > > +{
> > > +     return anon_inode_make_secure_inode(s, name, NULL, true);
> > > +}
> > > +EXPORT_SYMBOL_GPL(alloc_anon_secure_inode);
> > > +
> > >  static struct file *__anon_inode_getfile(const char *name,
> > >                                        const struct file_operations *fops,
> > >                                        void *priv, int flags,
> > > @@ -88,7 +97,8 @@ static struct file *__anon_inode_getfile(const char *name,
> > >               return ERR_PTR(-ENOENT);
> > >
> > >       if (make_inode) {
> > > -             inode = anon_inode_make_secure_inode(name, context_inode);
> > > +             inode = anon_inode_make_secure_inode(anon_inode_mnt->mnt_sb,
> > > +                                                  name, context_inode, false);
> > >               if (IS_ERR(inode)) {
> > >                       file = ERR_CAST(inode);
> > >                       goto err;
> > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > index 016b0fe1536e..0fded2e3c661 100644
> > > --- a/include/linux/fs.h
> > > +++ b/include/linux/fs.h
> > > @@ -3550,6 +3550,7 @@ extern int simple_write_begin(struct file *file, struct address_space *mapping,
> > >  extern const struct address_space_operations ram_aops;
> > >  extern int always_delete_dentry(const struct dentry *);
> > >  extern struct inode *alloc_anon_inode(struct super_block *);
> > > +extern struct inode *alloc_anon_secure_inode(struct super_block *, const char *);
> > >  extern int simple_nosetlease(struct file *, int, struct file_lease **, void **);
> > >  extern const struct dentry_operations simple_dentry_operations;
> > >
> > > diff --git a/mm/secretmem.c b/mm/secretmem.c
> > > index 1b0a214ee558..c0e459e58cb6 100644
> > > --- a/mm/secretmem.c
> > > +++ b/mm/secretmem.c
> > > @@ -195,18 +195,11 @@ static struct file *secretmem_file_create(unsigned long flags)
> > >       struct file *file;
> > >       struct inode *inode;
> > >       const char *anon_name = "[secretmem]";
> > > -     int err;
> > >
> > > -     inode = alloc_anon_inode(secretmem_mnt->mnt_sb);
> > > +     inode = alloc_anon_secure_inode(secretmem_mnt->mnt_sb, anon_name);
> > >       if (IS_ERR(inode))
> > >               return ERR_CAST(inode);
> >
> > I don't think we should not hide secretmem and guest_memfd inodes from
> > selinux, so clearing S_PRIVATE for them is not needed and you can just drop
> > fs_internal parameter in anon_inode_make_secure_inode()
> 
> It's especially odd since I don't see any comments or descriptions
> about why this is being done.  The secretmem change is concerning as
> this is user accessible and marking the inode with S_PRIVATE will
> bypass a number of LSM/SELinux access controls, possibly resulting in
> a security regression (one would need to dig a bit deeper to see what
> is possible with secretmem and which LSM/SELinux code paths would be
> affected).

secretmem always had S_PRIVATE set because alloc_anon_inode() clears it
anyway and this patch does not change it.
I'm just thinking that it makes sense to actually allow LSM/SELinux
controls that S_PRIVATE bypasses for both secretmem and guest_memfd.
 
-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-05  5:49       ` Mike Rapoport
@ 2025-06-05 18:23         ` Paul Moore
  2025-06-06 15:09           ` Ira Weiny
  0 siblings, 1 reply; 15+ messages in thread
From: Paul Moore @ 2025-06-05 18:23 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Ackerley Tng, linux-security-module, selinux, kvm, linux-mm,
	linux-kernel, x86, linux-fsdevel, aik, ajones, akpm, amoorthy,
	anthony.yznaga, anup, aou, bfoster, binbin.wu, brauner,
	catalin.marinas, chao.p.peng, chenhuacai, dave.hansen, david,
	dmatlack, dwmw, erdemaktas, fan.du, fvdl, graf, haibo1.xu, hch,
	hughd, ira.weiny, isaku.yamahata, jack, james.morse, jarkko, jgg,
	jgowans, jhubbard, jroedel, jthoughton, jun.miao, kai.huang,
	keirf, kent.overstreet, kirill.shutemov, liam.merwick,
	maciej.wieczor-retman, mail, maz, mic, michael.roth, mpe,
	muchun.song, nikunj, nsaenz, oliver.upton, palmer, pankaj.gupta,
	paul.walmsley, pbonzini, pdurrant, peterx, pgonda, pvorel,
	qperret, quic_cvanscha, quic_eberman, quic_mnalajal, quic_pderrin,
	quic_pheragu, quic_svaddagi, quic_tsoni, richard.weiyang,
	rick.p.edgecombe, rientjes, roypat, seanjc, shuah, steven.price,
	steven.sistare, suzuki.poulose, tabba, thomas.lendacky,
	vannapurve, vbabka, viro, vkuznets, wei.w.wang, will, willy,
	xiaoyao.li, yan.y.zhao, yilun.xu, yuzenghui, zhiquan1.li

On Thu, Jun 5, 2025 at 1:50 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> secretmem always had S_PRIVATE set because alloc_anon_inode() clears it
> anyway and this patch does not change it.

Yes, my apologies, I didn't look closely enough at the code.

> I'm just thinking that it makes sense to actually allow LSM/SELinux
> controls that S_PRIVATE bypasses for both secretmem and guest_memfd.

It's been a while since we added the anon_inode hooks so I'd have to
go dig through the old thread to understand the logic behind marking
secretmem S_PRIVATE, especially when the
anon_inode_make_secure_inode() function cleared it.  It's entirely
possible it may have just been an oversight.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-05 18:23         ` Paul Moore
@ 2025-06-06 15:09           ` Ira Weiny
  2025-06-16 13:00             ` Shivank Garg
  0 siblings, 1 reply; 15+ messages in thread
From: Ira Weiny @ 2025-06-06 15:09 UTC (permalink / raw)
  To: Paul Moore, Mike Rapoport
  Cc: Ackerley Tng, linux-security-module, selinux, kvm, linux-mm,
	linux-kernel, x86, linux-fsdevel, aik, ajones, akpm, amoorthy,
	anthony.yznaga, anup, aou, bfoster, binbin.wu, brauner,
	catalin.marinas, chao.p.peng, chenhuacai, dave.hansen, david,
	dmatlack, dwmw, erdemaktas, fan.du, fvdl, graf, haibo1.xu, hch,
	hughd, ira.weiny, isaku.yamahata, jack, james.morse, jarkko, jgg,
	jgowans, jhubbard, jroedel, jthoughton, jun.miao, kai.huang,
	keirf, kent.overstreet, kirill.shutemov, liam.merwick,
	maciej.wieczor-retman, mail, maz, mic, michael.roth, mpe,
	muchun.song, nikunj, nsaenz, oliver.upton, palmer, pankaj.gupta,
	paul.walmsley, pbonzini, pdurrant, peterx, pgonda, pvorel,
	qperret, quic_cvanscha, quic_eberman, quic_mnalajal, quic_pderrin,
	quic_pheragu, quic_svaddagi, quic_tsoni, richard.weiyang,
	rick.p.edgecombe, rientjes, roypat, seanjc, shuah, steven.price,
	steven.sistare, suzuki.poulose, tabba, thomas.lendacky,
	vannapurve, vbabka, viro, vkuznets, wei.w.wang, will, willy,
	xiaoyao.li, yan.y.zhao, yilun.xu, yuzenghui, zhiquan1.li

Paul Moore wrote:
> On Thu, Jun 5, 2025 at 1:50 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > secretmem always had S_PRIVATE set because alloc_anon_inode() clears it
> > anyway and this patch does not change it.
> 
> Yes, my apologies, I didn't look closely enough at the code.
> 
> > I'm just thinking that it makes sense to actually allow LSM/SELinux
> > controls that S_PRIVATE bypasses for both secretmem and guest_memfd.
> 
> It's been a while since we added the anon_inode hooks so I'd have to
> go dig through the old thread to understand the logic behind marking
> secretmem S_PRIVATE, especially when the
> anon_inode_make_secure_inode() function cleared it.  It's entirely
> possible it may have just been an oversight.

I'm jumping in where I don't know what I'm talking about...

But my reading of the S_PRIVATE flag is that the memory can't be mapped by
user space.  So for guest_memfd() we need !S_PRIVATE because it is
intended to be mapped by user space.  So we want the secure checks.

I think secretmem is the same.

Do I have that right?

Ira

[snip]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-06 15:09           ` Ira Weiny
@ 2025-06-16 13:00             ` Shivank Garg
  2025-06-19  5:36               ` Mike Rapoport
  0 siblings, 1 reply; 15+ messages in thread
From: Shivank Garg @ 2025-06-16 13:00 UTC (permalink / raw)
  To: Ira Weiny, Paul Moore, Mike Rapoport
  Cc: Ackerley Tng, linux-security-module, selinux, kvm, linux-mm,
	linux-kernel, x86, linux-fsdevel, aik, ajones, akpm, amoorthy,
	anthony.yznaga, anup, aou, bfoster, binbin.wu, brauner,
	catalin.marinas, chao.p.peng, chenhuacai, dave.hansen, david,
	dmatlack, dwmw, erdemaktas, fan.du, fvdl, graf, haibo1.xu, hch,
	hughd, isaku.yamahata, jack, james.morse, jarkko, jgg, jgowans,
	jhubbard, jroedel, jthoughton, jun.miao, kai.huang, keirf,
	kent.overstreet, kirill.shutemov, liam.merwick,
	maciej.wieczor-retman, mail, maz, mic, michael.roth, mpe,
	muchun.song, nikunj, nsaenz, oliver.upton, palmer, pankaj.gupta,
	paul.walmsley, pbonzini, pdurrant, peterx, pgonda, pvorel,
	qperret, quic_cvanscha, quic_eberman, quic_mnalajal, quic_pderrin,
	quic_pheragu, quic_svaddagi, quic_tsoni, richard.weiyang,
	rick.p.edgecombe, rientjes, roypat, seanjc, shuah, steven.price,
	steven.sistare, suzuki.poulose, tabba, thomas.lendacky,
	vannapurve, vbabka, viro, vkuznets, wei.w.wang, will, willy,
	xiaoyao.li, yan.y.zhao, yilun.xu, yuzenghui, zhiquan1.li



On 6/6/2025 8:39 PM, Ira Weiny wrote:
> Paul Moore wrote:
>> On Thu, Jun 5, 2025 at 1:50 AM Mike Rapoport <rppt@kernel.org> wrote:
>>>
>>> secretmem always had S_PRIVATE set because alloc_anon_inode() clears it
>>> anyway and this patch does not change it.
>>
>> Yes, my apologies, I didn't look closely enough at the code.
>>
>>> I'm just thinking that it makes sense to actually allow LSM/SELinux
>>> controls that S_PRIVATE bypasses for both secretmem and guest_memfd.
>>
>> It's been a while since we added the anon_inode hooks so I'd have to
>> go dig through the old thread to understand the logic behind marking
>> secretmem S_PRIVATE, especially when the
>> anon_inode_make_secure_inode() function cleared it.  It's entirely
>> possible it may have just been an oversight.
> 
> I'm jumping in where I don't know what I'm talking about...
> 
> But my reading of the S_PRIVATE flag is that the memory can't be mapped by
> user space.  So for guest_memfd() we need !S_PRIVATE because it is
> intended to be mapped by user space.  So we want the secure checks.
> 
> I think secretmem is the same.
> 
> Do I have that right?


Hi Mike, Paul,

If I understand correctly,
we need to clear the S_PRIVATE flag for all secure inodes. The S_PRIVATE flag was previously
set for  secretmem (via alloc_anon_inode()), which caused security checks to be 
bypassed - this was unintentional since the original anon_inode_make_secure_inode() 
was already clearing it.

Both secretmem and guest_memfd create file descriptors
(memfd_create/kvm_create_guest_memfd)
so they should be subject to LSM/SELinux security policies rather than bypassing them with S_PRIVATE?

static struct inode *anon_inode_make_secure_inode(struct super_block *s,
		const char *name, const struct inode *context_inode)
{
...
	/* Clear S_PRIVATE for all inodes*/
	inode->i_flags &= ~S_PRIVATE;
...
}

Please let me know if this conclusion makes sense?

Thanks,
Shivank

> 
> Ira
> 
> [snip]
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode
  2025-06-16 13:00             ` Shivank Garg
@ 2025-06-19  5:36               ` Mike Rapoport
  0 siblings, 0 replies; 15+ messages in thread
From: Mike Rapoport @ 2025-06-19  5:36 UTC (permalink / raw)
  To: Shivank Garg
  Cc: Ira Weiny, Paul Moore, Ackerley Tng, linux-security-module,
	selinux, kvm, linux-mm, linux-kernel, x86, linux-fsdevel, aik,
	ajones, akpm, amoorthy, anthony.yznaga, anup, aou, bfoster,
	binbin.wu, brauner, catalin.marinas, chao.p.peng, chenhuacai,
	dave.hansen, david, dmatlack, dwmw, erdemaktas, fan.du, fvdl,
	graf, haibo1.xu, hch, hughd, isaku.yamahata, jack, james.morse,
	jarkko, jgg, jgowans, jhubbard, jroedel, jthoughton, jun.miao,
	kai.huang, keirf, kent.overstreet, kirill.shutemov, liam.merwick,
	maciej.wieczor-retman, mail, maz, mic, michael.roth, mpe,
	muchun.song, nikunj, nsaenz, oliver.upton, palmer, pankaj.gupta,
	paul.walmsley, pbonzini, pdurrant, peterx, pgonda, pvorel,
	qperret, quic_cvanscha, quic_eberman, quic_mnalajal, quic_pderrin,
	quic_pheragu, quic_svaddagi, quic_tsoni, richard.weiyang,
	rick.p.edgecombe, rientjes, roypat, seanjc, shuah, steven.price,
	steven.sistare, suzuki.poulose, tabba, thomas.lendacky,
	vannapurve, vbabka, viro, vkuznets, wei.w.wang, will, willy,
	xiaoyao.li, yan.y.zhao, yilun.xu, yuzenghui, zhiquan1.li

On Mon, Jun 16, 2025 at 06:30:09PM +0530, Shivank Garg wrote:
> 
> 
> On 6/6/2025 8:39 PM, Ira Weiny wrote:
> > Paul Moore wrote:
> >> On Thu, Jun 5, 2025 at 1:50 AM Mike Rapoport <rppt@kernel.org> wrote:
> >>>
> >>> secretmem always had S_PRIVATE set because alloc_anon_inode() clears it
> >>> anyway and this patch does not change it.
> >>
> >> Yes, my apologies, I didn't look closely enough at the code.
> >>
> >>> I'm just thinking that it makes sense to actually allow LSM/SELinux
> >>> controls that S_PRIVATE bypasses for both secretmem and guest_memfd.
> >>
> >> It's been a while since we added the anon_inode hooks so I'd have to
> >> go dig through the old thread to understand the logic behind marking
> >> secretmem S_PRIVATE, especially when the
> >> anon_inode_make_secure_inode() function cleared it.  It's entirely
> >> possible it may have just been an oversight.

anon_inode_make_secure_inode() was introduced when more than 10 versions of
secretmem already were posted so it didn't jump at me to replace
alloc_anon_inode() with anon_inode_make_secure_inode().
 
> > I'm jumping in where I don't know what I'm talking about...
> > 
> > But my reading of the S_PRIVATE flag is that the memory can't be mapped by
> > user space.  So for guest_memfd() we need !S_PRIVATE because it is
> > intended to be mapped by user space.  So we want the secure checks.
> > 
> > I think secretmem is the same.

Agree.

> > Do I have that right?
> 
> 
> Hi Mike, Paul,
> 
> If I understand correctly,
> we need to clear the S_PRIVATE flag for all secure inodes. The S_PRIVATE flag was previously
> set for  secretmem (via alloc_anon_inode()), which caused security checks to be 
> bypassed - this was unintentional since the original anon_inode_make_secure_inode() 
> was already clearing it.
> 
> Both secretmem and guest_memfd create file descriptors
> (memfd_create/kvm_create_guest_memfd)
> so they should be subject to LSM/SELinux security policies rather than bypassing them with S_PRIVATE?
> 
> static struct inode *anon_inode_make_secure_inode(struct super_block *s,
> 		const char *name, const struct inode *context_inode)
> {
> ...
> 	/* Clear S_PRIVATE for all inodes*/
> 	inode->i_flags &= ~S_PRIVATE;
> ...
> }
> 
> Please let me know if this conclusion makes sense?

Yes, makes sense to me.
 
> Thanks,
> Shivank

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-06-19  5:37 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-02 19:17 [PATCH 0/2] Use guest mem inodes instead of anonymous inodes Ackerley Tng
2025-06-02 19:17 ` [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode Ackerley Tng
2025-06-02 20:02   ` David Hildenbrand
2025-06-03  4:52   ` Christoph Hellwig
2025-06-03 10:40     ` Shivank Garg
2025-06-04  7:59   ` Mike Rapoport
2025-06-04 21:13     ` Paul Moore
2025-06-05  5:49       ` Mike Rapoport
2025-06-05 18:23         ` Paul Moore
2025-06-06 15:09           ` Ira Weiny
2025-06-16 13:00             ` Shivank Garg
2025-06-19  5:36               ` Mike Rapoport
2025-06-04  8:02   ` Christian Brauner
2025-06-02 19:17 ` [PATCH 2/2] KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes Ackerley Tng
2025-06-02 20:13   ` David Hildenbrand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).