qemu-devel.nongnu.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Michael Roth" <michael.roth@amd.com>,
	qemu-devel@nongnu.org, kvm@vger.kernel.org,
	"Tom Lendacky" <thomas.lendacky@amd.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Pankaj Gupta" <pankaj.gupta@amd.com>,
	"Xiaoyao Li" <xiaoyao.li@intel.com>,
	"Isaku Yamahata" <isaku.yamahata@linux.intel.com>
Subject: Re: [PATCH v3 07/49] HostMem: Add mechanism to opt in kvm guest memfd via MachineState
Date: Tue, 21 Jan 2025 21:41:55 +0100
Message-ID: <bc0b4372-d8ca-4d5c-aee8-6e2521ebb2ec@redhat.com>
In-Reply-To: <Z5AB3SlwRYo19dOa@x1n>

>> This "anon" memory cannot be "shared" with other processes, but
>> virtio-kernel etc. can just use it.
>>
>> To "share" the memory with other processes, we'd need memfd/file.
> 
> Ah OK, thanks David.  Is this the planned long term solution for
> vhost-kernel?

I think the basic idea was that the memory backend defines how the
"non-private" memory is backed, just like for any other non-CC VM.

The "private" memory always comes from guest_memfd.

So for the time being, using anon+guest_memfd corresponds to "just a
simple VM".

Long-term I expect that we use guest_memfd for shared+private, and use
in-place conversion. Access to "private" memory through the mmap() will
result in a SIGBUS.
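
As a rough illustration of what "SIGBUS on access" looks like from user space (this is not guest_memfd, which needs KVM): a plain memfd that is mapped and then truncated behaves in a comparable way, because touching a mapped page that no longer has backing raises SIGBUS. All function names below are made up for this sketch, and the truncated memfd is only a stand-in for a page that was converted to private.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_env;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(sigbus_env, 1);      /* unwind out of the faulting read */
}

/* Returns 1 if reading *addr raised SIGBUS, 0 if the read succeeded. */
static int access_raises_sigbus(const volatile char *addr)
{
    struct sigaction sa, old;
    int faulted;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old);

    if (sigsetjmp(sigbus_env, 1) == 0) {
        (void)*addr;                /* the access under test */
        faulted = 0;
    } else {
        faulted = 1;                /* we got here via siglongjmp() */
    }

    sigaction(SIGBUS, &old, NULL);
    return faulted;
}

/*
 * Stand-in for a shared->private conversion: map one page of a memfd,
 * then truncate the file so the mapping loses its backing.
 * Returns 1 if the access worked before and raised SIGBUS after,
 * 0 if not, -1 on setup errors.
 */
static int demo_sigbus_on_truncated_mapping(void)
{
    long page = sysconf(_SC_PAGESIZE);
    int fd, before, after;
    char *p;

    fd = memfd_create("demo", 0);
    if (fd < 0 || ftruncate(fd, page) != 0) {
        return -1;
    }
    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = 1;                               /* backed: populates fine */
    before = access_raises_sigbus(p);

    if (ftruncate(fd, 0) != 0) {            /* drop the backing */
        munmap(p, (size_t)page);
        close(fd);
        return -1;
    }
    after = access_raises_sigbus(p);

    munmap(p, (size_t)page);
    close(fd);
    return (before == 0 && after == 1) ? 1 : 0;
}
```

Calling demo_sigbus_on_truncated_mapping() from a small main() is enough to try it. A real consumer would of course not want to rely on catching the signal; the point is only that the access fails loudly instead of silently allocating memory.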

> I wonder what happens if vhost tries to DMA to a region that is private
> with this setup.
>
> AFAIU, it'll try to DMA to the fake address of ramblock->host that is
> pointed to by the memory backend (either anon, shmem, file, etc.).  The
> ideal case IIUC is it should crash QEMU because it's trying to access an
> illegal page which is private. But with this model, it won't crash but
> silently populate some page in the non-gmemfd backend.
> 
> Is that expected?

Yes, it's all just a big mmap() which will populate memory on access -- 
independent of using anon/file/memfd.
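
To make "populate on access" concrete, here is a small sketch using an anonymous mapping and mincore(2) to look at residency before and after the first write. The helper names are invented for this example, and how mincore() reports an untouched page can depend on the environment, so treat the "before" value as informational only.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the page at addr is resident, 0 if not, -1 on error. */
static int page_is_resident(void *addr, size_t page)
{
    unsigned char vec = 0;

    if (mincore(addr, page, &vec) != 0) {
        return -1;
    }
    return vec & 1;
}

/*
 * Map one anonymous page, record residency, touch it, record again.
 * Nothing is allocated by mmap() itself; the write fault does that.
 * Returns 0 on success, -1 on error.
 */
static int demo_populate_on_access(int *before, int *after)
{
    long page = sysconf(_SC_PAGESIZE);
    char *p;

    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }

    *before = page_is_resident(p, (size_t)page);
    p[0] = 42;                      /* first write faults the page in */
    *after = page_is_resident(p, (size_t)page);

    munmap(p, (size_t)page);
    return 0;
}
```

The same thing happens if the write comes from a vhost worker or any other user of the mapping: nobody checks whether the page "should" exist, it simply gets allocated.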

Similar to virtio-mem, long-term we'd want a mechanism to check/enforce 
that some memory in there will not be populated on access from QEMU 
(well, and vhost-user processes ...).

In memory_get_xlat_addr() we perform such checks, but it's only used for
iommu. vhost-kernel likely has no such checks, and neither does
vhost-user etc.
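
Purely as a sketch of what such a check could look like (this is not QEMU code, and none of these names exist in QEMU): track the private/shared state per page and refuse any access whose range touches a private page. The 4 KiB page size and the fixed 64-page region are assumptions to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12          /* assumed 4 KiB pages */
#define DEMO_NR_PAGES   64          /* one uint64_t worth of pages */

/* Hypothetical per-region state: bit N set means page N is private. */
struct demo_region {
    uint64_t private_bitmap;
};

static void demo_set_private(struct demo_region *r, size_t page, bool priv)
{
    assert(page < DEMO_NR_PAGES);
    if (priv) {
        r->private_bitmap |= UINT64_C(1) << page;
    } else {
        r->private_bitmap &= ~(UINT64_C(1) << page);
    }
}

/*
 * Returns true only if every page in [offset, offset + len) is shared
 * and the range lies inside the region. A caller would reject the
 * access (or DMA mapping) when this returns false.
 */
static bool demo_range_is_shared(const struct demo_region *r,
                                 uint64_t offset, uint64_t len)
{
    uint64_t first, last, i;

    if (len == 0) {
        return true;
    }
    if (offset + len < offset) {    /* wrap-around */
        return false;
    }
    first = offset >> DEMO_PAGE_SHIFT;
    last = (offset + len - 1) >> DEMO_PAGE_SHIFT;
    if (last >= DEMO_NR_PAGES) {
        return false;
    }
    for (i = first; i <= last; i++) {
        if (r->private_bitmap & (UINT64_C(1) << i)) {
            return false;
        }
    }
    return true;
}
```

The hard part is not the check itself but getting every accessor, including other processes, to actually perform it.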

> 
>>
>>>
>>> When gmemfd=on is specified with those, IIUC it'll allocate both the memory
>>> (ramblock->host) and gmemfd, but without using ->host.  Meanwhile AFAIU the
>>> ramblock->host will start to conflict with gmemfd in the future when it
>>> might become mappable (having a valid ->host).
>>
>> These will require a new guest_memfd memory backend (I recall that was
>> discussed a couple of times).
> 
> Do you know if anyone is working on this one?

So far my understanding is that Google, which is doing the shared+private
guest_memfd kernel part, won't be working on QEMU patches. I recently
raised with our management that this would be a good project for RH to
focus on.

I am not aware of real implementations of the guest_memfd backend (yet).

> 
>>
>>>
>>> I have a local fix for this (and actually more than below.. but starting
>>> from it), I'm not sure whether I overlooked something, but from reading the
>>> cover letter it's only using memfd backend which makes perfect sense to me
>>> so far.
>>
>> Does the anon+guest_memfd combination not work or are you speculating about
>> the usability (which I hopefully addressed above)?
> 
> IIUC, with the above solution and with how QEMU handles memory conversions
> right now, at least hugetlb pages will suffer from double allocation, as
> kvm_convert_memory() won't free hugetlb pages even if converted to private.

Yes, that's why I'm invested in teaching guest_memfd in-place conversion 
alongside huge page support (which fortunately Google engineers are 
doing great work on).
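
For context on the double allocation: with two separate backings, the shared copy only goes away if someone discards it on conversion. For ordinary anonymous memory that can be done with madvise(MADV_DONTNEED), sketched below. This is a standalone illustration with an invented function name, not the QEMU conversion path, and it deliberately does not cover hugetlb.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Populate one anonymous page, discard it, and read it back.
 * After MADV_DONTNEED on a private anonymous mapping the old page is
 * freed and the next access gets a fresh zero-filled page.
 * Returns the byte read back after the discard, or -1 on error.
 */
static int demo_discard_shared_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    int value;

    assert(page > 0);
    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }

    p[0] = 42;                              /* "shared" copy is populated */

    if (madvise(p, (size_t)page, MADV_DONTNEED) != 0) {
        munmap(p, (size_t)page);
        return -1;
    }

    value = p[0];                           /* refault: old contents are gone */
    munmap(p, (size_t)page);
    return value;
}
```

With in-place conversion inside a single guest_memfd there is only one backing page per guest page, so there is nothing left to discard.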

> 
> It also sounds doable (and also preferable..) that for each VM
> we always stick with pages in the gmemfd page cache, no matter if it's
> shared or private.  For private, we could zap all pgtables and sigbus any
> faults afterwards.  I thought that was always the plan, but I may be
> missing a lot of the latest information..

Yes, with the guest_memfd backend (shared+private) that's the plan: 
SIGBUS on invalid access.


-- 
Cheers,

David / dhildenb


