From: Xiaoyao Li <xiaoyao.li@intel.com>
To: "Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"David Hildenbrand" <david@kernel.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
	"Eugenio Pérez" <eperezma@redhat.com>
Cc: virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
	Chenyi Qiang <chenyi.qiang@intel.com>
Subject: Re: [PATCH RFC] virtio-mem: support Confidential Computing (CoCo) environments
Date: Wed, 27 May 2026 18:22:02 +0800
Message-ID: <523b346d-4f7d-4d89-9839-42a5c167fed3@intel.com>
In-Reply-To: <20260401-coco-v1-1-b9c3072e2d9c@redhat.com>

On 4/1/2026 7:12 PM, Marc-André Lureau wrote:
> In Confidential Computing (CoCo) environments such as Intel TDX or AMD
> SEV-SNP, hotplugged memory must be explicitly "accepted" (transitioned to
> a private/encrypted state) before it can be safely used by the guest.
> Conversely, before returning memory to the hypervisor during an unplug
> operation, it must be converted back to a shared/decrypted state.

Converting it back to shared is not a must. The memory is going to be
unplugged, so the guest doesn't need to care about its state unless
there is a restriction that private memory cannot be unplugged, and we
have no such restriction.

As I explained in the QEMU thread[1], the VMM needs to discard the
memory (both shared and private) on unplug. If the VMM fails to do so,
the memory is not actually unplugged and the guest is still able to
access it.

If the VMM fails to discard/remove the private memory, whether
unintentionally or intentionally, that is a VMM bug. For TDX, this kind
of VMM bug can lead to a re-accept error. To make the TDX guest more
robust, we can let the guest release the memory itself on unplug, as
suggested by Paolo[2] and Kiryl[3], so that it survives even a buggy
VMM. Converting the memory to shared is another way for the guest to
proactively "release" the private memory. But the justification for it
is not "the guest must do so".

[1] https://lore.kernel.org/qemu-devel/7a9fe710-679e-4366-9eeb-3aba148773d7@intel.com/
[2] https://lore.kernel.org/lkml/CABgObfZ7_w8Q-dW=Sd4YA3P==BuN1edPv7Ty4EpPyU8ctW6RLg@mail.gmail.com/
[3] https://lore.kernel.org/lkml/acprNlPP7J_ttMrz@thinkstation/
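
To make the argument above concrete, here is a toy state model of one
plug/unplug/re-plug cycle. It is only a sketch, not TDX or KVM code:
every toy_* name is made up for illustration, and the assumption that a
guest-side "release" puts the page back into a state that can be
accepted again is mine, not something taken from the TDX module spec.

```c
#include <stdbool.h>

/* Simplified page states; real TDX has more states than this. */
enum toy_page_state { TOY_ABSENT, TOY_PENDING, TOY_ACCEPTED };

struct toy_page { enum toy_page_state state; };

/* Host plugs memory. If the VMM never discarded the old page, nothing changes. */
static void toy_host_plug(struct toy_page *p)
{
	if (p->state == TOY_ABSENT)
		p->state = TOY_PENDING;
}

/* Guest accept only works on a pending page; accepting twice is an error. */
static int toy_guest_accept(struct toy_page *p)
{
	if (p->state != TOY_PENDING)
		return -1;
	p->state = TOY_ACCEPTED;
	return 0;
}

/* Assumed guest-side release: accepted -> pending (hypothetical semantics). */
static void toy_guest_release(struct toy_page *p)
{
	if (p->state == TOY_ACCEPTED)
		p->state = TOY_PENDING;
}

/* A correct VMM discards the page on unplug; a buggy one leaves it in place. */
static void toy_host_unplug(struct toy_page *p, bool vmm_discards)
{
	if (vmm_discards)
		p->state = TOY_ABSENT;
}

/* Returns the result of the accept after re-plug (0 ok, -1 re-accept error). */
static int toy_replug_cycle(bool vmm_discards, bool guest_releases)
{
	struct toy_page p = { TOY_ABSENT };

	toy_host_plug(&p);
	if (toy_guest_accept(&p))
		return -2; /* first accept should never fail in this model */
	if (guest_releases)
		toy_guest_release(&p);
	toy_host_unplug(&p, vmm_discards);
	toy_host_plug(&p);
	return toy_guest_accept(&p);
}
```

In this model a correct VMM works with or without the guest-side
release, and only the combination "buggy VMM, no guest release" hits
the re-accept error. That is the sense in which the release is a
robustness measure rather than a requirement.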

> Attempting to handle memory acceptance automatically using generic
> architecture-level memory hotplug notifiers (e.g., MEM_GOING_ONLINE)
> is not viable for devices like virtio-mem:
> 
> 1. Granularity Mismatch: virtio-mem can dynamically hot(un)plug memory
>     at a subblock granularity (e.g., 2MB chunks within a 128MB memory
>     block). Generic memory notifiers operate on the entire memory block.
> 2. Lifecycle Control: Memory must be explicitly accepted *before* it is
>     handed to the core memory management subsystem (the buddy allocator),
>     and it must be decrypted *before* being handed back to the device.
> 3. State Tracking (Offline -> Re-online): If memory is offlined and
>     re-onlined without proper state transitions, TDX will panic on
>     attempting to accept an already-accepted page (TDX_EPT_ENTRY_STATE_INCORRECT).
> 
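
For readers following the granularity point in the quoted list: with
the example sizes given there (2MB subblocks in a 128MB block), one
memory block holds 64 subblocks, so a per-block notifier cannot tell
which of them are actually plugged. A rough sketch of accepting only
the plugged subblocks, assuming a 64-bit "plugged" bitmap per block.
The toy_* names and the callback signature are hypothetical and are
not the virtio-mem driver's data structures.

```c
#include <stdint.h>

#define TOY_BLOCK_SIZE    ((uint64_t)128 << 20)            /* 128MB memory block */
#define TOY_SB_SIZE       ((uint64_t)2 << 20)              /* 2MB subblock */
#define TOY_SBS_PER_BLOCK (TOY_BLOCK_SIZE / TOY_SB_SIZE)   /* 64 subblocks */

/* Hypothetical accept hook: returns 0 on success. */
typedef int (*toy_accept_fn)(uint64_t start, uint64_t size, void *ctx);

/* Example hook that only adds up how many bytes it was asked to accept. */
static int toy_count_bytes(uint64_t start, uint64_t size, void *ctx)
{
	(void)start;
	*(uint64_t *)ctx += size;
	return 0;
}

/*
 * Walk the bitmap and accept each run of adjacent plugged subblocks as
 * one range. Returns the number of ranges accepted, or -1 on the first
 * failure reported by the hook.
 */
static int toy_accept_plugged(uint64_t block_start, uint64_t plugged_mask,
			      toy_accept_fn accept, void *ctx)
{
	unsigned int sb = 0;
	int ranges = 0;

	while (sb < TOY_SBS_PER_BLOCK) {
		unsigned int first;

		if (!((plugged_mask >> sb) & 1)) {
			sb++;
			continue;
		}
		first = sb;
		while (sb < TOY_SBS_PER_BLOCK && ((plugged_mask >> sb) & 1))
			sb++;
		if (accept(block_start + first * TOY_SB_SIZE,
			   (sb - first) * TOY_SB_SIZE, ctx))
			return -1;
		ranges++;
	}
	return ranges;
}
```

With subblocks 0-3 and 63 plugged, this issues two accept calls
covering five subblocks in total, instead of touching the whole 128MB
block.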
> To address this, this patch implements explicit CoCo memory conversions
> directly within the virtio-mem driver using set_memory_encrypted() and
> set_memory_decrypted():
> 
> - During hotplug, explicitly accepts only the physically plugged subblocks
>    right before fake-onlining them into the buddy allocator.
> - During unplug, memory is explicitly transitioned to the shared state
>    before being handed back to the host. If the unplug operation fails,
>    the driver attempts to re-accept (encrypt) the memory. If this
>    re-acceptance fails, the memory is intentionally leaked to prevent
>    confidentiality breaches or fatal hypervisor faults.
> 
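
For reference, the unplug path described in the quoted bullets can be
sketched roughly as below. This is a simplified illustration, not the
code from the patch: the three hooks are hypothetical stand-ins for
set_memory_decrypted(), set_memory_encrypted() and the virtio-mem
unplug request, and treating a failed conversion to shared as "leak"
is my assumption (the quoted text only specifies the leak for a failed
re-accept).

```c
#include <stddef.h>

enum toy_unplug_result {
	TOY_UNPLUGGED,	/* memory was handed back to the host */
	TOY_KEPT,	/* unplug failed; memory is private again and usable */
	TOY_LEAKED,	/* state uncertain; never reuse or hand back */
};

/* Hypothetical hooks; each returns 0 on success. */
struct toy_unplug_ops {
	int (*make_shared)(void *ctx);		/* stand-in for set_memory_decrypted() */
	int (*make_private)(void *ctx);		/* stand-in for set_memory_encrypted() */
	int (*request_unplug)(void *ctx);	/* stand-in for the device unplug request */
};

/* Trivial hooks so the flow can be exercised without any hardware. */
static int toy_ok(void *ctx)   { (void)ctx; return 0; }
static int toy_fail(void *ctx) { (void)ctx; return -1; }

static enum toy_unplug_result
toy_unplug_subblock(const struct toy_unplug_ops *ops, void *ctx)
{
	/* Step 1: convert to shared before handing the memory back. */
	if (ops->make_shared(ctx))
		return TOY_LEAKED;	/* assumption: page state is unknown here */

	/* Step 2: ask the device to unplug. */
	if (!ops->request_unplug(ctx))
		return TOY_UNPLUGGED;

	/* Step 3: unplug failed, so try to make the memory private again. */
	if (ops->make_private(ctx))
		return TOY_LEAKED;	/* re-accept failed: leak on purpose */

	return TOY_KEPT;
}
```

Note that step 1 is exactly the part questioned above: the flow still
works as a robustness measure, but the conversion to shared is one
possible way to release the memory, not a hard requirement.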
> This was discovered while testing virtio-mem resize with TDX guests.
> The associated QEMU virtio-mem + TDX patch series is under review at:
> https://patchew.org/QEMU/20260226140001.3622334-1-marcandre.lureau@redhat.com/
> 
> Note that QEMU punches the guest_memfd on KVM_HC_MAP_GPA_RANGE, when the
> guest memory is decrypted. There is thus no need to discard the guest_memfd
> in the virtio-mem device.
> 
> This patch is a follow-up and supersedes "[PATCH 0/2] x86/tdx: Fix
> memory hotplug in TDX guests".
> 


