qemu-devel.nongnu.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: "Michael Roth" <michael.roth@amd.com>,
	qemu-devel@nongnu.org, kvm@vger.kernel.org,
	"Tom Lendacky" <thomas.lendacky@amd.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Pankaj Gupta" <pankaj.gupta@amd.com>,
	"Xiaoyao Li" <xiaoyao.li@intel.com>,
	"Isaku Yamahata" <isaku.yamahata@linux.intel.com>
Subject: Re: [PATCH v3 07/49] HostMem: Add mechanism to opt in kvm guest memfd via MachineState
Date: Tue, 21 Jan 2025 15:21:49 -0500
Message-ID: <Z5AB3SlwRYo19dOa@x1n>
In-Reply-To: <fa29f4ef-f67d-44d7-93f0-753437cf12cb@redhat.com>

On Tue, Jan 21, 2025 at 07:24:29PM +0100, David Hildenbrand wrote:
> On 21.01.25 18:39, Peter Xu wrote:
> > On Wed, Mar 20, 2024 at 03:39:03AM -0500, Michael Roth wrote:
> > > From: Xiaoyao Li <xiaoyao.li@intel.com>
> > > 
> > > Add a new member "guest_memfd" to memory backends. When it's set
> > > to true, it enables RAM_GUEST_MEMFD in ram_flags, thus private kvm
> > > guest_memfd will be allocated during RAMBlock allocation.
> > > 
> > > Memory backend's @guest_memfd is wired with @require_guest_memfd
> > > field of MachineState. It avoids looking up the machine in physmem.c.
> > > 
> > > MachineState::require_guest_memfd is supposed to be set by any VM
> > > that requires KVM guest memfd as private memory, e.g., TDX VM.
> > > 
> > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> > > Reviewed-by: David Hildenbrand <david@redhat.com>
> > > ---
> > > Changes in v4:
> > >   - rename "require_guest_memfd" to "guest_memfd" in struct
> > >     HostMemoryBackend;	(David Hildenbrand)
> > > Signed-off-by: Michael Roth <michael.roth@amd.com>
> > > ---
> > >   backends/hostmem-file.c  | 1 +
> > >   backends/hostmem-memfd.c | 1 +
> > >   backends/hostmem-ram.c   | 1 +
> > >   backends/hostmem.c       | 1 +
> > >   hw/core/machine.c        | 5 +++++
> > >   include/hw/boards.h      | 2 ++
> > >   include/sysemu/hostmem.h | 1 +
> > >   7 files changed, 12 insertions(+)
> > > 
> > > diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> > > index ac3e433cbd..3c69db7946 100644
> > > --- a/backends/hostmem-file.c
> > > +++ b/backends/hostmem-file.c
> > > @@ -85,6 +85,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
> > >       ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
> > >       ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
> > >       ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> > > +    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
> > >       ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
> > >       ram_flags |= RAM_NAMED_FILE;
> > >       return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
> > > diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
> > > index 3923ea9364..745ead0034 100644
> > > --- a/backends/hostmem-memfd.c
> > > +++ b/backends/hostmem-memfd.c
> > > @@ -55,6 +55,7 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
> > >       name = host_memory_backend_get_name(backend);
> > >       ram_flags = backend->share ? RAM_SHARED : 0;
> > >       ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> > > +    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
> > >       return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
> > >                                             backend->size, ram_flags, fd, 0, errp);
> > >   }
> > > diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
> > > index d121249f0f..f7d81af783 100644
> > > --- a/backends/hostmem-ram.c
> > > +++ b/backends/hostmem-ram.c
> > > @@ -30,6 +30,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
> > >       name = host_memory_backend_get_name(backend);
> > >       ram_flags = backend->share ? RAM_SHARED : 0;
> > >       ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> > > +    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
> > >       return memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend),
> > >                                                     name, backend->size,
> > >                                                     ram_flags, errp);
> > 
> > These changes look a bit confusing to me, as I don't see how gmemfd can be
> > used with either file or ram typed memory backends..
> 
> I recall that the following should work:
> 
> "private" memory will come from guest_memfd, "shared" (as in, accessible by
> the host) will come from anonymous memory.
> 
> This "anon" memory cannot be "shared" with other processes, but
> virtio-kernel etc. can just use it.
> 
> To "share" the memory with other processes, we'd need memfd/file.

Ah OK, thanks David.  Is this the planned long term solution for
vhost-kernel?

I wonder what happens if vhost tries to DMA to a region that is private
with this setup.

AFAIU, it'll try to DMA to the fake address in ramblock->host that is
provided by the memory backend (either anon, shmem, file, etc.).  Ideally,
IIUC, that should crash QEMU because it's trying to access an illegal page
which is private.  But with this model it won't crash; it will silently
populate some page in the non-gmemfd backend.

Is that expected?
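
To make the concern concrete, here is a minimal sketch of the "silent
populate" effect on plain anonymous memory.  It is not QEMU or vhost code:
the function names are made up for illustration, the anonymous mapping only
stands in for ramblock->host, and the single store stands in for a vhost
write.  It assumes Linux, where mincore(2) reports page residency.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the page at addr is resident, 0 if not, -1 on error. */
static int page_resident(void *addr, size_t len)
{
    unsigned char vec = 0;   /* one status byte is enough for one page */

    if (mincore(addr, len, &vec) != 0) {
        return -1;
    }
    return vec & 1;
}

/*
 * Hypothetical helper: maps one untouched anonymous page, writes to it,
 * and reports whether the write is what made the page resident.
 * Returns 1 if so, 0 if not, -1 on error.
 */
int write_silently_populates(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    unsigned char *host;
    int before, after;

    if (psz <= 0) {
        return -1;
    }
    host = mmap(NULL, (size_t)psz, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (host == MAP_FAILED) {
        return -1;
    }

    before = page_resident(host, (size_t)psz);
    host[0] = 0xab;          /* no fault is reported; a page just appears */
    after = page_resident(host, (size_t)psz);

    munmap(host, (size_t)psz);
    if (before < 0 || after < 0) {
        return -1;
    }
    return before == 0 && after == 1;
}
```

On a typical Linux host I'd expect write_silently_populates() to return 1,
i.e. nothing stops the write and the backend simply gains a page.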

> 
> > 
> > When gmemfd=on is specified with those, IIUC it'll allocate both the memory
> > (ramblock->host) and gmemfd, but without using ->host.  Meanwhile AFAIU the
> > ramblock->host will start to conflict with gmemfd in the future when it
> > might become mappable (having valid ->host).
> 
> These will require a new guest_memfd memory backend (I recall that was
> discussed a couple of times).

Do you know if anyone is working on this one?

> 
> > 
> > I have a local fix for this (and actually more than below.. but starting
> > from it), I'm not sure whether I overlooked something, but from reading the
> > cover letter it's only using memfd backend which makes perfect sense to me
> > so far.
> 
> Does the anon+guest_memfd combination not work or are you speculating about
> the usability (which I hopefully addressed above)?

IIUC, with the above solution and with how QEMU handles memory conversions
right now, at least hugetlb pages will suffer from double allocation, as
kvm_convert_memory() won't free hugetlb pages even when they are converted
to private.
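
A toy model of that accounting, to show where the double allocation comes
from.  This is not the kvm_convert_memory() implementation: the struct and
function names are hypothetical, and it assumes that a backend using the
base host page size gets its ->host pages discarded on conversion while a
hugetlb backend is left alone.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one guest range; not a QEMU structure. */
struct toy_range {
    size_t backend_page_size;   /* page size of the ->host backend */
    bool host_allocated;        /* pages present behind ->host */
    bool gmemfd_allocated;      /* pages present in guest_memfd */
};

/* Model a shared -> private conversion of the whole range. */
void toy_convert_to_private(struct toy_range *r, size_t host_page_size)
{
    /* Private memory now lives in guest_memfd. */
    r->gmemfd_allocated = true;

    /*
     * Assumption: only base-page backends have their ->host pages
     * discarded; hugetlb pages stay allocated.
     */
    if (r->backend_page_size == host_page_size) {
        r->host_allocated = false;
    }
}

/* True when the same guest range is backed twice. */
bool toy_double_allocated(const struct toy_range *r)
{
    return r->host_allocated && r->gmemfd_allocated;
}
```

With a 4 KiB host page size, a range backed by 4 KiB pages ends up
allocated once after conversion, while a range backed by 2 MiB hugetlb
pages ends up allocated in both places.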

It also sounds doable (and preferable..) that for each VM we always stick
with pages in the gmemfd page cache, no matter whether they are shared or
private.  For private, we could zap all pgtables and SIGBUS any faults
afterwards.  I thought that was always the plan, but I may have missed a
lot of the latest information..
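
As a rough userspace analogy for "zap pgtables and SIGBUS any faults",
here is a sketch using an ordinary temporary file rather than guest_memfd
(which has no such interface in this thread).  Truncating a shared file
mapping to zero makes the kernel drop the mapped pages, and the next access
faults with SIGBUS.  The function names are made up; it assumes Linux/POSIX
semantics for mmap(2) and ftruncate(2).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf toy_env;

/* Jump back out of the faulting access instead of dying. */
static void toy_on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(toy_env, 1);
}

/*
 * Hypothetical helper: returns 1 if touching the mapping after the file
 * was truncated raised SIGBUS, 0 if it did not, -1 on setup error.
 */
int fault_after_truncate_raises_sigbus(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    struct sigaction sa, old;
    volatile unsigned char *p;
    volatile int got = 0;
    FILE *f = tmpfile();
    int fd;

    if (!f || psz <= 0) {
        if (f) {
            fclose(f);
        }
        return -1;
    }
    fd = fileno(f);
    if (ftruncate(fd, psz) != 0) {
        fclose(f);
        return -1;
    }
    p = mmap(NULL, (size_t)psz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        fclose(f);
        return -1;
    }
    p[0] = 1;                       /* page is populated and mapped */

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = toy_on_sigbus;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old);

    if (ftruncate(fd, 0) == 0) {    /* kernel zaps the mapped page */
        if (sigsetjmp(toy_env, 1) == 0) {
            p[0] = 2;               /* faults: no page behind it anymore */
        } else {
            got = 1;                /* we came back via the handler */
        }
    }

    sigaction(SIGBUS, &old, NULL);
    munmap((void *)p, (size_t)psz);
    fclose(f);
    return got;
}
```

The point of the analogy is only that the access fails loudly instead of
quietly allocating a new page somewhere else.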

Thanks,

-- 
Peter Xu



