From: "Garg, Shivank" <shivankg@amd.com>
To: Sean Christopherson <seanjc@google.com>,
Ackerley Tng <ackerleytng@google.com>
Cc: willy@infradead.org, akpm@linux-foundation.org, david@redhat.com,
pbonzini@redhat.com, shuah@kernel.org, vbabka@suse.cz,
brauner@kernel.org, viro@zeniv.linux.org.uk, dsterba@suse.com,
xiang@kernel.org, chao@kernel.org, jaegeuk@kernel.org,
clm@fb.com, josef@toxicpanda.com, kent.overstreet@linux.dev,
zbestahu@gmail.com, jefflexu@linux.alibaba.com,
dhavale@google.com, lihongbo22@huawei.com,
lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
rppt@kernel.org, surenb@google.com, mhocko@suse.com,
ziy@nvidia.com, matthew.brost@intel.com, joshua.hahnjy@gmail.com,
rakie.kim@sk.com, byungchul@sk.com, gourry@gourry.net,
ying.huang@linux.alibaba.com, apopple@nvidia.com,
tabba@google.com, paul@paul-moore.com, jmorris@namei.org,
serge@hallyn.com, pvorel@suse.cz, bfoster@redhat.com,
vannapurve@google.com, chao.gao@intel.com, bharata@amd.com,
nikunj@amd.com, michael.day@amd.com, shdhiman@amd.com,
yan.y.zhao@intel.com, Neeraj.Upadhyay@amd.com,
thomas.lendacky@amd.com, michael.roth@amd.com, aik@amd.com,
jgg@nvidia.com, kalyazin@amazon.com, peterx@redhat.com,
jack@suse.cz, hch@infradead.org, cgzones@googlemail.com,
ira.weiny@intel.com, rientjes@google.com, roypat@amazon.co.uk,
chao.p.peng@intel.com, amit@infradead.org, ddutile@redhat.com,
dan.j.williams@intel.com, ashish.kalra@amd.com, gshan@redhat.com,
jgowans@amazon.com, pankaj.gupta@amd.com, papaluri@amd.com,
yuzhao@google.com, suzuki.poulose@arm.com,
quic_eberman@quicinc.com, linux-bcachefs@vger.kernel.org,
linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
linux-f2fs-devel@lists.sourceforge.net,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org, kvm@vger.kernel.org,
linux-kselftest@vger.kernel.org, linux-coco@lists.linux.dev
Subject: Re: [PATCH kvm-next V11 4/7] KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes
Date: Thu, 25 Sep 2025 17:14:15 +0530
Message-ID: <dc6eb85f-87b6-43a1-b1f7-4727c0b834cc@amd.com>
In-Reply-To: <aNSt9QT8dmpDK1eE@google.com>
On 9/25/2025 8:20 AM, Sean Christopherson wrote:
> My apologies for the super late feedback. None of this is critical (mechanical
> things that can be cleaned up after the fact), so if there's any urgency to
> getting this series into 6.18, just ignore it.
>
> On Wed, Aug 27, 2025, Ackerley Tng wrote:
>> Shivank Garg <shivankg@amd.com> writes:
>> @@ -463,11 +502,70 @@ bool __weak kvm_arch_supports_gmem_mmap(struct kvm *kvm)
>> 	return true;
>> }
>>
>> +static struct inode *kvm_gmem_inode_create(const char *name, loff_t size,
>> +					    u64 flags)
>> +{
>> +	struct inode *inode;
>> +
>> +	inode = anon_inode_make_secure_inode(kvm_gmem_mnt->mnt_sb, name, NULL);
>> +	if (IS_ERR(inode))
>> +		return inode;
>> +
>> +	inode->i_private = (void *)(unsigned long)flags;
>> +	inode->i_op = &kvm_gmem_iops;
>> +	inode->i_mapping->a_ops = &kvm_gmem_aops;
>> +	inode->i_mode |= S_IFREG;
>> +	inode->i_size = size;
>> +	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
>> +	mapping_set_inaccessible(inode->i_mapping);
>> +	/* Unmovable mappings are supposed to be marked unevictable as well. */
>> +	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
>> +
>> +	return inode;
>> +}
>> +
>> +static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
>> +						  u64 flags)
>> +{
>> +	static const char *name = "[kvm-gmem]";
>> +	struct inode *inode;
>> +	struct file *file;
>> +	int err;
>> +
>> +	err = -ENOENT;
>> +	/* __fput() will take care of fops_put(). */
>> +	if (!fops_get(&kvm_gmem_fops))
>> +		goto err;
>> +
>> +	inode = kvm_gmem_inode_create(name, size, flags);
>> +	if (IS_ERR(inode)) {
>> +		err = PTR_ERR(inode);
>> +		goto err_fops_put;
>> +	}
>> +
>> +	file = alloc_file_pseudo(inode, kvm_gmem_mnt, name, O_RDWR,
>> +				 &kvm_gmem_fops);
>> +	if (IS_ERR(file)) {
>> +		err = PTR_ERR(file);
>> +		goto err_put_inode;
>> +	}
>> +
>> +	file->f_flags |= O_LARGEFILE;
>> +	file->private_data = priv;
>> +
>> +	return file;
>> +
>> +err_put_inode:
>> +	iput(inode);
>> +err_fops_put:
>> +	fops_put(&kvm_gmem_fops);
>> +err:
>> +	return ERR_PTR(err);
>> +}
>
> I don't see any reason to add two helpers. It requires quite a bit more lines
> of code due to adding more error paths and local variables, and IMO doesn't make
> the code any easier to read.
>
> Passing in "gmem" as @priv is especially ridiculous, as it adds code and
> obfuscates what file->private_data is set to.
>
> I get the sense that the code was written to be a "replacement" for common APIs,
> but that is nonsensical (no pun intended).
>
>> static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
>> {
>> -	const char *anon_name = "[kvm-gmem]";
>> 	struct kvm_gmem *gmem;
>> -	struct inode *inode;
>> 	struct file *file;
>> 	int fd, err;
>>
>> @@ -481,32 +579,16 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
>> 		goto err_fd;
>> 	}
>>
>> -	file = anon_inode_create_getfile(anon_name, &kvm_gmem_fops, gmem,
>> -					 O_RDWR, NULL);
>> +	file = kvm_gmem_inode_create_getfile(gmem, size, flags);
>> 	if (IS_ERR(file)) {
>> 		err = PTR_ERR(file);
>> 		goto err_gmem;
>> 	}
>>
>> -	file->f_flags |= O_LARGEFILE;
>> -
>> -	inode = file->f_inode;
>> -	WARN_ON(file->f_mapping != inode->i_mapping);
>> -
>> -	inode->i_private = (void *)(unsigned long)flags;
>> -	inode->i_op = &kvm_gmem_iops;
>> -	inode->i_mapping->a_ops = &kvm_gmem_aops;
>> -	inode->i_mode |= S_IFREG;
>> -	inode->i_size = size;
>> -	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
>> -	mapping_set_inaccessible(inode->i_mapping);
>> -	/* Unmovable mappings are supposed to be marked unevictable as well. */
>> -	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
>> -
>> 	kvm_get_kvm(kvm);
>> 	gmem->kvm = kvm;
>> 	xa_init(&gmem->bindings);
>> -	list_add(&gmem->entry, &inode->i_mapping->i_private_list);
>> +	list_add(&gmem->entry, &file_inode(file)->i_mapping->i_private_list);
>
> I don't understand this change? Isn't file_inode(file) == inode?
>
> Compile tested only, and again not critical, but it's -40 LoC...
>
>
Thanks. I ran functional tests on this patch and it works fine.
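
For anyone skimming the thread, the folded-together __kvm_gmem_create() in the patch below relies on the usual goto-unwind ordering: the fops reference is taken first, then the inode, then the file, and each error label undoes only what was already acquired. A minimal userspace sketch of that ordering follows. It is only a model: enum fail_at, struct res_count and model_gmem_create() are hypothetical names, the counters stand in for the real references, and the -ENOMEM values for the inode and file stages are assumed (the patch propagates PTR_ERR() instead).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical failure-injection points; not part of the patch. */
enum fail_at { FAIL_NONE, FAIL_FOPS, FAIL_INODE, FAIL_FILE };

/* Counters standing in for the real references (assumed model). */
struct res_count {
	int fops_refs;	/* models fops_get() / fops_put() */
	int inodes;	/* models inode creation / iput() */
	int files;	/* models alloc_file_pseudo() */
};

/* Acquire three resources in order; on failure, release in reverse order. */
static int model_gmem_create(struct res_count *rc, enum fail_at fail)
{
	int err;

	assert(rc != NULL);

	if (fail == FAIL_FOPS) {
		err = -ENOENT;
		goto err_gmem;
	}
	rc->fops_refs++;

	if (fail == FAIL_INODE) {
		err = -ENOMEM;	/* assumed error code for the model */
		goto err_fops;
	}
	rc->inodes++;

	if (fail == FAIL_FILE) {
		err = -ENOMEM;	/* assumed error code for the model */
		goto err_inode;
	}
	rc->files++;

	return 0;

err_inode:
	rc->inodes--;		/* iput(inode) in the patch */
err_fops:
	rc->fops_refs--;	/* fops_put(&kvm_gmem_fops) in the patch */
err_gmem:
	return err;		/* the patch also frees gmem at this label */
}
```

Calling model_gmem_create() with any of the FAIL_* injection points should leave every counter at zero, which is the property the three error labels are meant to guarantee.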
> ---
> include/uapi/linux/magic.h | 1 +
> virt/kvm/guest_memfd.c | 75 ++++++++++++++++++++++++++++++++------
> virt/kvm/kvm_main.c | 7 +++-
> virt/kvm/kvm_mm.h | 9 +++--
> 4 files changed, 76 insertions(+), 16 deletions(-)
>
> diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
> index bb575f3ab45e..638ca21b7a90 100644
> --- a/include/uapi/linux/magic.h
> +++ b/include/uapi/linux/magic.h
> @@ -103,5 +103,6 @@
> #define DEVMEM_MAGIC 0x454d444d /* "DMEM" */
> #define SECRETMEM_MAGIC 0x5345434d /* "SECM" */
> #define PID_FS_MAGIC 0x50494446 /* "PIDF" */
> +#define GUEST_MEMFD_MAGIC 0x474d454d /* "GMEM" */
>
> #endif /* __LINUX_MAGIC_H__ */
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 08a6bc7d25b6..73c9791879d5 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -1,12 +1,16 @@
> // SPDX-License-Identifier: GPL-2.0
> +#include <linux/anon_inodes.h>
> #include <linux/backing-dev.h>
> #include <linux/falloc.h>
> +#include <linux/fs.h>
> #include <linux/kvm_host.h>
> +#include <linux/pseudo_fs.h>
> #include <linux/pagemap.h>
> -#include <linux/anon_inodes.h>
>
> #include "kvm_mm.h"
>
> +static struct vfsmount *kvm_gmem_mnt;
> +
> struct kvm_gmem {
> 	struct kvm *kvm;
> 	struct xarray bindings;
> @@ -385,9 +389,45 @@ static struct file_operations kvm_gmem_fops = {
> 	.fallocate = kvm_gmem_fallocate,
> };
>
> -void kvm_gmem_init(struct module *module)
> +static int kvm_gmem_init_fs_context(struct fs_context *fc)
> +{
> +	if (!init_pseudo(fc, GUEST_MEMFD_MAGIC))
> +		return -ENOMEM;
> +
> +	fc->s_iflags |= SB_I_NOEXEC;
> +	fc->s_iflags |= SB_I_NODEV;
> +
> +	return 0;
> +}
> +
> +static struct file_system_type kvm_gmem_fs = {
> +	.name = "guest_memfd",
> +	.init_fs_context = kvm_gmem_init_fs_context,
> +	.kill_sb = kill_anon_super,
> +};
> +
> +static int kvm_gmem_init_mount(void)
> +{
> +	kvm_gmem_mnt = kern_mount(&kvm_gmem_fs);
> +
> +	if (IS_ERR(kvm_gmem_mnt))
> +		return PTR_ERR(kvm_gmem_mnt);
> +
> +	kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
> +	return 0;
> +}
> +
> +int kvm_gmem_init(struct module *module)
> {
> 	kvm_gmem_fops.owner = module;
> +
> +	return kvm_gmem_init_mount();
> +}
> +
> +void kvm_gmem_exit(void)
> +{
> +	kern_unmount(kvm_gmem_mnt);
> +	kvm_gmem_mnt = NULL;
> }
>
> static int kvm_gmem_migrate_folio(struct address_space *mapping,
> @@ -465,7 +505,7 @@ bool __weak kvm_arch_supports_gmem_mmap(struct kvm *kvm)
>
> static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
> {
> -	const char *anon_name = "[kvm-gmem]";
> +	static const char *name = "[kvm-gmem]";
> 	struct kvm_gmem *gmem;
> 	struct inode *inode;
> 	struct file *file;
> @@ -481,17 +521,17 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
> 		goto err_fd;
> 	}
>
> -	file = anon_inode_create_getfile(anon_name, &kvm_gmem_fops, gmem,
> -					 O_RDWR, NULL);
> -	if (IS_ERR(file)) {
> -		err = PTR_ERR(file);
> +	/* __fput() will take care of fops_put(). */
> +	if (!fops_get(&kvm_gmem_fops)) {
> +		err = -ENOENT;
> 		goto err_gmem;
> 	}
>
> -	file->f_flags |= O_LARGEFILE;
> -
> -	inode = file->f_inode;
> -	WARN_ON(file->f_mapping != inode->i_mapping);
> +	inode = anon_inode_make_secure_inode(kvm_gmem_mnt->mnt_sb, name, NULL);
> +	if (IS_ERR(inode)) {
> +		err = PTR_ERR(inode);
> +		goto err_fops;
> +	}
>
> 	inode->i_private = (void *)(unsigned long)flags;
> 	inode->i_op = &kvm_gmem_iops;
> @@ -503,6 +543,15 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
> 	/* Unmovable mappings are supposed to be marked unevictable as well. */
> 	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
>
> +	file = alloc_file_pseudo(inode, kvm_gmem_mnt, name, O_RDWR, &kvm_gmem_fops);
> +	if (IS_ERR(file)) {
> +		err = PTR_ERR(file);
> +		goto err_inode;
> +	}
> +
> +	file->f_flags |= O_LARGEFILE;
> +	file->private_data = gmem;
> +
> 	kvm_get_kvm(kvm);
> 	gmem->kvm = kvm;
> 	xa_init(&gmem->bindings);
> @@ -511,6 +560,10 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
> 	fd_install(fd, file);
> 	return fd;
>
> +err_inode:
> +	iput(inode);
> +err_fops:
> +	fops_put(&kvm_gmem_fops);
> err_gmem:
> 	kfree(gmem);
> err_fd:
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 18f29ef93543..301d48d6e00d 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -6489,7 +6489,9 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
> 	if (WARN_ON_ONCE(r))
> 		goto err_vfio;
>
> -	kvm_gmem_init(module);
> +	r = kvm_gmem_init(module);
> +	if (r)
> +		goto err_gmem;
>
> 	r = kvm_init_virtualization();
> 	if (r)
> @@ -6510,6 +6512,8 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
> err_register:
> 	kvm_uninit_virtualization();
> err_virt:
> +	kvm_gmem_exit();
> +err_gmem:
> 	kvm_vfio_ops_exit();
> err_vfio:
> 	kvm_async_pf_deinit();
> @@ -6541,6 +6545,7 @@ void kvm_exit(void)
> 	for_each_possible_cpu(cpu)
> 		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
> 	kmem_cache_destroy(kvm_vcpu_cache);
> +	kvm_gmem_exit();
> 	kvm_vfio_ops_exit();
> 	kvm_async_pf_deinit();
> 	kvm_irqfd_exit();
> diff --git a/virt/kvm/kvm_mm.h b/virt/kvm/kvm_mm.h
> index 31defb08ccba..9fcc5d5b7f8d 100644
> --- a/virt/kvm/kvm_mm.h
> +++ b/virt/kvm/kvm_mm.h
> @@ -68,17 +68,18 @@ static inline void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm,
> #endif /* HAVE_KVM_PFNCACHE */
>
> #ifdef CONFIG_KVM_GUEST_MEMFD
> -void kvm_gmem_init(struct module *module);
> +int kvm_gmem_init(struct module *module);
> +void kvm_gmem_exit(void);
> int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args);
> int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
> 		  unsigned int fd, loff_t offset);
> void kvm_gmem_unbind(struct kvm_memory_slot *slot);
> #else
> -static inline void kvm_gmem_init(struct module *module)
> +static inline int kvm_gmem_init(struct module *module)
> {
> -
> +	return 0;
> }
> -
> +static inline void kvm_gmem_exit(void) {};
> static inline int kvm_gmem_bind(struct kvm *kvm,
> 				struct kvm_memory_slot *slot,
> 				unsigned int fd, loff_t offset)
>
> base-commit: d133892dddd6607de651b7e32510359a6af97c4c
> --
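
As a side note on the magic.h hunk above: the comments next to the constants suggest each magic number is just four ASCII bytes packed most-significant byte first. A small userspace sketch of that reading, assuming only the values quoted in the hunk; magic_to_tag() and magic_matches_tag() are hypothetical helpers for illustration, not kernel or libc APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: unpack a 32-bit magic number into its four
 * bytes, most significant first, as a NUL-terminated string.
 */
static void magic_to_tag(uint32_t magic, char tag[5])
{
	assert(tag != NULL);

	for (int i = 0; i < 4; i++)
		tag[i] = (char)((magic >> (24 - 8 * i)) & 0xff);
	tag[4] = '\0';
}

/* Hypothetical helper: return 1 if the magic decodes to @expected. */
static int magic_matches_tag(uint32_t magic, const char *expected)
{
	char tag[5];

	magic_to_tag(magic, tag);
	return strcmp(tag, expected) == 0;
}
```

With the values from the hunk, magic_matches_tag(0x474d454d, "GMEM") holds for the proposed GUEST_MEMFD_MAGIC, and the same check holds for SECRETMEM_MAGIC against "SECM" and PID_FS_MAGIC against "PIDF".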