From: Ackerley Tng via B4 Relay <devnull+ackerleytng.google.com@kernel.org>
To: aik@amd.com, andrew.jones@linux.dev, binbin.wu@linux.intel.com,
brauner@kernel.org, chao.p.peng@linux.intel.com,
david@kernel.org, ira.weiny@intel.com, jmattson@google.com,
jthoughton@google.com, michael.roth@amd.com, oupton@kernel.org,
pankaj.gupta@amd.com, qperret@google.com,
rick.p.edgecombe@intel.com, rientjes@google.com,
shivankg@amd.com, steven.price@arm.com, tabba@google.com,
willy@infradead.org, wyihan@google.com, yan.y.zhao@intel.com,
forkloop@google.com, pratyush@kernel.org,
suzuki.poulose@arm.com, aneesh.kumar@kernel.org,
Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Thomas Gleixner <tglx@kernel.org>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Shuah Khan <shuah@kernel.org>,
Vishal Annapurve <vannapurve@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
Barry Song <baohua@kernel.org>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
Youngjun Park <youngjun.park@lge.com>,
Qi Zheng <qi.zheng@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
Kiryl Shutsemau <kas@kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>,
Vlastimil Babka <vbabka@kernel.org>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kselftest@vger.kernel.org, linux-mm@kvack.org,
linux-coco@lists.linux.dev,
Ackerley Tng <ackerleytng@google.com>
Subject: [PATCH RFC v5 26/53] KVM: x86: Support SNP and TDX applying content modes
Date: Tue, 28 Apr 2026 16:25:21 -0700
Message-ID: <20260428-gmem-inplace-conversion-v5-26-d8608ccfca22@google.com>
In-Reply-To: <20260428-gmem-inplace-conversion-v5-0-d8608ccfca22@google.com>
From: Ackerley Tng <ackerleytng@google.com>
Define supported content modes for TDX and SNP.
For now, content preservation is not generally supported for conversions.
Allow content-preserving conversion only from shared to private, and only
before the VM is finalized, to support this VM setup flow from userspace:
1. Set up guest_memfd as shared.
2. Write directly to guest_memfd.
3. Set memory attributes to private with the PRESERVE flag.
4. Call KVM_TDX_INIT_MEM_REGION/KVM_SEV_SNP_LAUNCH_UPDATE to load and
encrypt memory.
An alternative would be to fold the work done by the kernel in step 3 into
step 4, but the process of conversion is complicated (it needs to check
refcounts, handle failures, etc.), and plumbing the errors out through the
platform-specific ioctls is complex and pollutes those ioctls.
Allow conversion with content preservation only for to_private conversions,
since preserving content on a to-shared conversion after population cannot
be supported.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
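For reviewers, the decision made by kvm_arch_gmem_supported_content_modes()
in the hunk below can be modeled outside the kernel. The following is only
a minimal sketch under stated assumptions: MODE_ZERO, MODE_PRESERVE,
enum vm_type and the "finalized" parameter are hypothetical stand-ins for
the KVM_SET_MEMORY_ATTRIBUTES2_* flags, kvm->arch.vm_type and
kvm->arch.pre_fault_allowed, and their values are not the real uapi values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real flag values live in the KVM uapi headers. */
#define MODE_ZERO     (1ull << 0)
#define MODE_PRESERVE (1ull << 1)

/* Hypothetical stand-in for kvm->arch.vm_type. */
enum vm_type { VM_DEFAULT, VM_SW_PROTECTED, VM_SNP, VM_TDX };

/*
 * Model of the per-VM-type decision in this patch. "finalized" stands in
 * for kvm->arch.pre_fault_allowed, which the patch uses as the signal that
 * the VM has been finalized.
 */
static uint64_t supported_content_modes(enum vm_type type, bool to_private,
					bool finalized)
{
	switch (type) {
	case VM_SW_PROTECTED:
		/* No hardware protection: both modes, in either direction. */
		return MODE_ZERO | MODE_PRESERVE;
	case VM_SNP:
	case VM_TDX: {
		uint64_t supported = MODE_ZERO;

		/* PRESERVE only for shared->private, and only pre-finalize. */
		if (to_private && !finalized)
			supported |= MODE_PRESERVE;

		return supported;
	}
	default:
		return 0;
	}
}
```

With this model, a TDX or SNP VM that has not been finalized reports both
modes for a to-private conversion and only the zeroing mode otherwise.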
Documentation/virt/kvm/api.rst | 3 +++
arch/x86/kvm/x86.c | 38 ++++++++++++++++++++++++++++++++++++++
2 files changed, 41 insertions(+)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 61b9974ba52e9..aaa4a82f0b75d 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6659,6 +6659,9 @@ The content modes available are as follows:
converts the memory to shared, the host (and guest) will read
``0xbeef`` (if the memory is accessible).
+ For TDX and SNP, content preservation is only supported before the
+ VM is finalized, and only on conversion to private.
+
Note: These content modes apply to the entire requested range, not
just the parts of the range that underwent conversion. For example, if
this was the initial state:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e8abff71001eb..296ed3b8ace6c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -14206,6 +14206,32 @@ u64 kvm_arch_gmem_supported_content_modes(struct kvm *kvm, bool to_private)
case KVM_X86_SW_PROTECTED_VM:
return KVM_SET_MEMORY_ATTRIBUTES2_ZERO |
KVM_SET_MEMORY_ATTRIBUTES2_PRESERVE;
+ case KVM_X86_SNP_VM:
+ case KVM_X86_TDX_VM: {
+ u64 supported = KVM_SET_MEMORY_ATTRIBUTES2_ZERO;
+
+ /*
+ * Preservation is only supported for VMs with
+ * protected state up until the guest is launched and
+ * vCPUs become capable of generating KVM MMU faults,
+ * since those faults can be destructive to the
+ * initial memory contents from the guest point of
+ * view, i.e. plaintext data will become random data,
+ * or zeroed, after a shared->private conversion.
+ *
+ * Use pre_fault_allowed to guard PRESERVE support,
+ * since that is set to true when VMs are finalized.
+ *
+ * Along the same lines, only support PRESERVE for
+ * to_private conversions, since when converting to
+ * shared, memory contents for pages that had already
+ * been faulted could be zeroed.
+ */
+ if (to_private && !kvm->arch.pre_fault_allowed)
+ supported |= KVM_SET_MEMORY_ATTRIBUTES2_PRESERVE;
+
+ return supported;
+ }
default:
return 0;
}
@@ -14216,6 +14242,16 @@ int kvm_arch_gmem_apply_content_mode_zero(struct kvm *kvm, struct inode *inode,
{
switch (kvm->arch.vm_type) {
case KVM_X86_SW_PROTECTED_VM:
+ case KVM_X86_SNP_VM:
+ case KVM_X86_TDX_VM:
+ /*
+ * TDX firmware will zero on unmapping from the
+ * Secure-EPTs, but suppose a shared page with
+ * contents was converted to private, and then
+ * converted back without ever being mapped into
+ * Secure-EPTs: guest_memfd can't rely on TDX firmware
+ * for zeroing then.
+ */
return kvm_gmem_apply_content_mode_zero(inode, start, end);
default:
return 0;
@@ -14228,6 +14264,8 @@ int kvm_arch_gmem_apply_content_mode_preserve(struct kvm *kvm,
{
switch (kvm->arch.vm_type) {
case KVM_X86_SW_PROTECTED_VM:
+ case KVM_X86_SNP_VM:
+ case KVM_X86_TDX_VM:
/* Do nothing to preserve content. */
return 0;
default:
--
2.54.0.545.g6539524ca2-goog