linux-arm-msm.vger.kernel.org archive mirror
From: Sean Christopherson <seanjc@google.com>
To: Vishal Annapurve <vannapurve@google.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>, Fuad Tabba <tabba@google.com>,
	kvm@vger.kernel.org,  linux-arm-msm@vger.kernel.org,
	linux-mm@kvack.org, kvmarm@lists.linux.dev,  pbonzini@redhat.com,
	chenhuacai@kernel.org, mpe@ellerman.id.au,  anup@brainfault.org,
	paul.walmsley@sifive.com, palmer@dabbelt.com,
	 aou@eecs.berkeley.edu, viro@zeniv.linux.org.uk,
	brauner@kernel.org,  willy@infradead.org,
	akpm@linux-foundation.org, yilun.xu@intel.com,
	 chao.p.peng@linux.intel.com, jarkko@kernel.org,
	amoorthy@google.com,  dmatlack@google.com,
	isaku.yamahata@intel.com, mic@digikod.net,  vbabka@suse.cz,
	ackerleytng@google.com, mail@maciej.szmigiero.name,
	 david@redhat.com, michael.roth@amd.com, wei.w.wang@intel.com,
	 liam.merwick@oracle.com, isaku.yamahata@gmail.com,
	 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
	steven.price@arm.com,  quic_eberman@quicinc.com,
	quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
	 quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
	 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
	catalin.marinas@arm.com,  james.morse@arm.com,
	yuzenghui@huawei.com, oliver.upton@linux.dev,  maz@kernel.org,
	will@kernel.org, qperret@google.com, keirf@google.com,
	 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org,
	jgg@nvidia.com,  rientjes@google.com, jhubbard@nvidia.com,
	fvdl@google.com, hughd@google.com,  jthoughton@google.com,
	peterx@redhat.com, pankaj.gupta@amd.com,  ira.weiny@intel.com
Subject: Re: [PATCH v15 14/21] KVM: x86: Enable guest_memfd mmap for default VM type
Date: Mon, 21 Jul 2025 07:42:14 -0700
Message-ID: <aH5RxqcTXRnQbP5R@google.com>
In-Reply-To: <CAGtprH8swz6GjM57DBryDRD2c6VP=Ayg+dUh5MBK9cg1-YKCDg@mail.gmail.com>

On Mon, Jul 21, 2025, Vishal Annapurve wrote:
> On Mon, Jul 21, 2025 at 5:22 AM Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> >
> > On 7/18/2025 12:27 AM, Fuad Tabba wrote:
> > > +/*
> > > + * CoCo VMs with hardware support that use guest_memfd only for backing private
> > > + * memory, e.g., TDX, cannot use guest_memfd with userspace mapping enabled.
> > > + */
> > > +#define kvm_arch_supports_gmem_mmap(kvm)             \
> > > +     (IS_ENABLED(CONFIG_KVM_GMEM_SUPPORTS_MMAP) &&   \
> > > +      (kvm)->arch.vm_type == KVM_X86_DEFAULT_VM)
> >
> > I want to share the findings when I do the POC to enable gmem mmap in QEMU.
> >
> > Actually, QEMU can use gmem with mmap support as the normal memory even
> > without passing the gmem fd to kvm_userspace_memory_region2.guest_memfd
> > on KVM_SET_USER_MEMORY_REGION2.
> >
> > Since the gmem is mmapable, QEMU can pass the userspace addr got from
> > mmap() on gmem fd to kvm_userspace_memory_region(2).userspace_addr. It
> > works well for non-coco VMs on x86.
> >
> > Then it seems feasible to use gmem with mmap for the shared memory of
> > TDX, and an additional gmem without mmap for the private memory. i.e.,
> > For struct kvm_userspace_memory_region, the @userspace_addr is passed
> > with the uaddr returned from gmem0 with mmap, while @guest_memfd is
> > passed with another gmem1 fd without mmap.
> >
> > However, it fails actually, because the kvm_arch_supports_gmem_mmap()
> > returns false for TDX VMs, which means userspace cannot allocate gmem
> > with mmap just for shared memory for TDX.
> 
> Why do you want such a usecase to work?

I'm guessing Xiaoyao was asking an honest question in response to finding a
perceived flaw when trying to get this all working in QEMU.

> If kvm allows mappable guest_memfd files for TDX VMs without
> conversion support, userspace will be able to use those for backing

s/able/unable?

> private memory unless:
> 1) KVM checks at binding time if the guest_memfd passed during memslot
> creation is not a mappable one and doesn't enforce "not mappable"
> requirement for TDX VMs at creation time.

Xiaoyao's question is about "just for shared memory", so this is irrelevant for
the question at hand.

> 2) KVM fetches shared faults through userspace page tables and not
> guest_memfd directly.

This is also irrelevant.  KVM _already_ supports resolving shared faults through
userspace page tables.  That support won't go away as KVM will always need/want
to support mapping VM_IO and/or VM_PFNMAP memory into the guest (even for TDX).

> I don't see value in trying to go out of way to support such a usecase.

But if/when KVM gains support for tracking shared vs. private in guest_memfd
itself, i.e. when TDX _does_ support mmap() on guest_memfd, KVM won't have to go
out of its way to support using guest_memfd for the @userspace_addr backing store.
Unless I'm missing something, the only thing needed to "support" this scenario is:

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index d01bd7a2c2bd..34403d2f1eeb 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -533,7 +533,7 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
        u64 flags = args->flags;
        u64 valid_flags = 0;
 
-       if (kvm_arch_supports_gmem_mmap(kvm))
+       // if (kvm_arch_supports_gmem_mmap(kvm))
                valid_flags |= GUEST_MEMFD_FLAG_MMAP;
 
        if (flags & ~valid_flags)
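
To make the effect of that hack concrete, here is a minimal standalone model of
the flag check, not the kernel code itself.  The model_* names, the VM type
enum, and the gate_on_arch parameter are hypothetical, the Kconfig check is
left out, the GUEST_MEMFD_FLAG_MMAP value is assumed for illustration, and the
-EINVAL return is an assumption since the diff context ends before the return.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration; the real definition is in the KVM uAPI header. */
#define GUEST_MEMFD_FLAG_MMAP (1ULL << 0)

/* Hypothetical stand-ins for the x86 VM types; only "default or not" matters. */
enum model_vm_type { MODEL_DEFAULT_VM, MODEL_TDX_VM };

/* Models kvm_arch_supports_gmem_mmap() from the patch, minus the Kconfig check. */
static bool model_supports_gmem_mmap(enum model_vm_type type)
{
	return type == MODEL_DEFAULT_VM;
}

/*
 * Models the flag validation in kvm_gmem_create().  With @gate_on_arch false,
 * the arch check is skipped, as in the diff above.  Returning -EINVAL for an
 * unsupported flag is an assumption.
 */
static int model_gmem_check_flags(enum model_vm_type type, uint64_t flags,
				  bool gate_on_arch)
{
	uint64_t valid_flags = 0;

	if (!gate_on_arch || model_supports_gmem_mmap(type))
		valid_flags |= GUEST_MEMFD_FLAG_MMAP;

	if (flags & ~valid_flags)
		return -EINVAL;

	return 0;
}
```

With gate_on_arch set, a TDX-like VM that passes GUEST_MEMFD_FLAG_MMAP is
rejected; with it cleared, the same request is accepted.  Nothing else about
creation changes.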

I think the question we actually want to answer is: do we want to go out of our
way to *prevent* such a usecase.  E.g. is there any risk/danger that we need to
mitigate, and would the cost of the mitigation be acceptable?

I think the answer is "no", because preventing userspace from using guest_memfd
as shared-only memory would require resolving the VMA during hva_to_pfn() in order
to fully prevent such behavior, and I definitely don't want to take mmap_lock
around hva_to_pfn_fast().

I don't see any obvious danger lurking.  KVM's pre-guest_memfd memory management
scheme is all about effectively making KVM behave like "just another" userspace
agent.  E.g. if/when TDX/SNP support comes along, guest_memfd must not allow mapping
private memory into userspace regardless of what KVM supports for page faults.

So unless I'm missing something, for now we do nothing, and let this support come
along naturally once TDX supports mmap() on guest_memfd.
