* [PATCH 0/2] x86/tdx: Fix memory hotplug in TDX guests
@ 2026-03-24 15:21 Marc-André Lureau
2026-03-24 15:21 ` [PATCH 1/2] x86/tdx: Handle TDG.MEM.PAGE.ACCEPT success-with-warning returns Marc-André Lureau
2026-03-24 15:21 ` [PATCH 2/2] x86/tdx: Accept hotplugged memory before online Marc-André Lureau
From: Marc-André Lureau @ 2026-03-24 15:21 UTC
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Kiryl Shutsemau, Rick Edgecombe, Chenyi Qiang
Cc: linux-kernel, linux-coco, kvm, Marc-André Lureau

In TDX guests, hotplugged memory (e.g., via virtio-mem) must be accepted
via TDG.MEM.PAGE.ACCEPT before use. The first access to an unaccepted
page triggers a fatal "SEPT entry in PENDING state" EPT violation and
KVM terminates the guest.

This was discovered while testing virtio-mem resize with TDX guests.
The associated QEMU virtio-mem + TDX patch series is under review at:
https://patchew.org/QEMU/20260226140001.3622334-1-marcandre.lureau@redhat.com/

The fix has two parts:

1. Handle TDG.MEM.PAGE.ACCEPT "success-with-warning" returns for pages
   that are already in MAPPED state (e.g., after offline/re-online
   cycles), instead of treating them as fatal errors.

2. Register a MEM_GOING_ONLINE memory hotplug notifier that calls
   tdx_accept_memory() before pages are freed to the buddy allocator.
   The TDCALL transparently triggers KVM-side page augmentation (AUG)
   followed by acceptance, avoiding the fatal EPT violation path.

The solution was suggested by Claude Code (Anthropic) and has been
tested with virtio-mem hot-add on a TDX guest. I did my best to review
the produced code and comments. Apologies if the agent did hallucinate.
Let me know if I need to check or correct something.

Thanks,

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
Marc-André Lureau (2):
      x86/tdx: Handle TDG.MEM.PAGE.ACCEPT success-with-warning returns
      x86/tdx: Accept hotplugged memory before online

 arch/x86/coco/tdx/tdx-shared.c |  2 +-
 arch/x86/coco/tdx/tdx.c        | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 1 deletion(-)
---
base-commit: c369299895a591d96745d6492d4888259b004a9e
change-id: 20260324-tdx-hotplug-fixes-644d009dad63
Best regards,
--
Marc-André Lureau <marcandre.lureau@redhat.com>

* [PATCH 1/2] x86/tdx: Handle TDG.MEM.PAGE.ACCEPT success-with-warning returns
From: Marc-André Lureau @ 2026-03-24 15:21 UTC
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Kiryl Shutsemau, Rick Edgecombe, Chenyi Qiang
Cc: linux-kernel, linux-coco, kvm, Marc-André Lureau

try_accept_one() treats any non-zero return from __tdcall() as a
failure. However, per the TDX Module Base Spec (Table SEPT Walk Cases),
TDG.MEM.PAGE.ACCEPT returns a non-zero status code with bit 63 clear
when the target page is already in MAPPED state (i.e., already
accepted). This is a "success-with-warning" -- the page is usable and no
action is needed.

Check only bit 63 (TDX_ERROR) to distinguish real errors from
success-with-warning returns, rather than treating all non-zero values
as failures.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 arch/x86/coco/tdx/tdx-shared.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
index 1655aa56a0a51..24983601a2ded 100644
--- a/arch/x86/coco/tdx/tdx-shared.c
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -35,7 +35,7 @@ static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
 	}
 
 	args.rcx = start | page_size;
-	if (__tdcall(TDG_MEM_PAGE_ACCEPT, &args))
+	if (__tdcall(TDG_MEM_PAGE_ACCEPT, &args) & TDX_ERROR)
 		return 0;
 
 	return accept_size;
--
2.53.0

* Re: [PATCH 1/2] x86/tdx: Handle TDG.MEM.PAGE.ACCEPT success-with-warning returns
From: Edgecombe, Rick P @ 2026-03-24 22:02 UTC
To: bp@alien8.de, marcandre.lureau@redhat.com, kas@kernel.org,
hpa@zytor.com, mingo@redhat.com, x86@kernel.org, tglx@kernel.org,
dave.hansen@linux.intel.com, Qiang, Chenyi
Cc: kvm@vger.kernel.org, linux-coco@lists.linux.dev,
linux-kernel@vger.kernel.org

On Tue, 2026-03-24 at 19:21 +0400, Marc-André Lureau wrote:
> try_accept_one() treats any non-zero return from __tdcall() as a
> failure. However, per the TDX Module Base Spec (Table SEPT Walk Cases),
> TDG.MEM.PAGE.ACCEPT returns a non-zero status code with bit 63 clear
> when the target page is already in MAPPED state (i.e., already
> accepted). This is a "success-with-warning" -- the page is usable and no
> action is needed.
>
> Check only bit 63 (TDX_ERROR) to distinguish real errors from
> success-with-warning returns, rather than treating all non-zero values
> as failures.
>
> Assisted-by: Claude:claude-opus-4-6
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>

Hmm. Accepting private memory is a security sensitive operation, so I
think it is probably bad to silently hide the detection of re-accepting.

For example, if the kernel accepts a page and sets some values in it,
the VMM could reset the data to zero by re-adding the page and letting
the second accept zero it. It allows the VMM to have some limited
ability to mess with guest data.

If we detect a re-accept we should probably warn on it actually. Not
sure on if the specific case in this series is problematic, but this
patch changes the behavior generally.
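
The middle ground raised in this reply (report a re-accept instead of treating it as either a hard failure or a silent success) can be pictured with a small standalone C sketch. This is not kernel code: `TDX_ERROR_BIT` only mirrors the bit-63 test from the patch, and the enum, the function name, and any status values used to exercise it are hypothetical rather than taken from the TDX specification.

```c
#include <stdint.h>

/* Bit 63 set means a real error, as in the patch's "& TDX_ERROR" test. */
#define TDX_ERROR_BIT (1ULL << 63)

/* Hypothetical three-way outcome; not a kernel type. */
enum accept_result {
	ACCEPT_OK,	/* status == 0: page newly accepted */
	ACCEPT_WARNING,	/* non-zero, bit 63 clear: e.g. page already MAPPED */
	ACCEPT_ERROR,	/* bit 63 set: acceptance failed */
};

/* Classify a raw TDCALL status into one of the three outcomes. */
static enum accept_result classify_accept_status(uint64_t status)
{
	if (status & TDX_ERROR_BIT)
		return ACCEPT_ERROR;
	if (status != 0)
		return ACCEPT_WARNING;
	return ACCEPT_OK;
}
```

With a split like this, a caller could log or warn on `ACCEPT_WARNING` while still deciding separately whether the page may be used; whether that is the right policy is exactly what the thread is debating.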

* [PATCH 2/2] x86/tdx: Accept hotplugged memory before online
From: Marc-André Lureau @ 2026-03-24 15:21 UTC
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Kiryl Shutsemau, Rick Edgecombe, Chenyi Qiang
Cc: linux-kernel, linux-coco, kvm, Marc-André Lureau

In TDX guests, hotplugged memory (e.g., via virtio-mem) is never
accepted before use. The first access triggers a fatal "SEPT entry in
PENDING state" EPT violation and KVM terminates the guest.

Fix this by registering a MEM_GOING_ONLINE memory hotplug notifier that
calls tdx_accept_memory() for the range being onlined.

The notifier returns NOTIFY_BAD on acceptance failure, preventing the
memory from going online.

Assisted-by: Claude:claude-opus-4-6
Reported-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 arch/x86/coco/tdx/tdx.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 7b2833705d475..89f90bc303258 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -8,6 +8,7 @@
 #include <linux/export.h>
 #include <linux/io.h>
 #include <linux/kexec.h>
+#include <linux/memory.h>
 #include <asm/coco.h>
 #include <asm/tdx.h>
 #include <asm/vmx.h>
@@ -1194,3 +1195,40 @@ void __init tdx_early_init(void)
 
 	tdx_announce();
 }
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+static int tdx_guest_memory_notifier(struct notifier_block *nb,
+				     unsigned long action, void *v)
+{
+	struct memory_notify *mn = v;
+	phys_addr_t start, end;
+
+	if (action != MEM_GOING_ONLINE)
+		return NOTIFY_OK;
+
+	start = PFN_PHYS(mn->start_pfn);
+	end = start + PFN_PHYS(mn->nr_pages);
+
+	if (!tdx_accept_memory(start, end)) {
+		pr_err("Failed to accept memory [0x%llx, 0x%llx)\n",
+		       (unsigned long long)start,
+		       (unsigned long long)end);
+		return NOTIFY_BAD;
+	}
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block tdx_guest_memory_nb = {
+	.notifier_call = tdx_guest_memory_notifier,
+};
+
+static int __init tdx_guest_memory_init(void)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
+		return 0;
+
+	return register_memory_notifier(&tdx_guest_memory_nb);
+}
+core_initcall(tdx_guest_memory_init);
+#endif
--
2.53.0
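
To make the notifier's address arithmetic concrete, here is a minimal userspace sketch of how a (start_pfn, nr_pages) pair becomes the half-open physical range handed to tdx_accept_memory(). It assumes 4 KiB base pages; `pfn_to_phys()` is a stand-in for the kernel's PFN_PHYS(), and `struct phys_range`, `notifier_range()`, and `SKETCH_PAGE_SHIFT` are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Half-open physical address range [start, end). */
struct phys_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/* Userspace stand-in for PFN_PHYS(): page frame number to byte address. */
static uint64_t pfn_to_phys(uint64_t pfn)
{
	return pfn << SKETCH_PAGE_SHIFT;
}

/* Same arithmetic as the notifier: end = start + size of nr_pages pages. */
static struct phys_range notifier_range(uint64_t start_pfn, uint64_t nr_pages)
{
	struct phys_range r;

	r.start = pfn_to_phys(start_pfn);
	r.end = r.start + pfn_to_phys(nr_pages);
	return r;
}
```

For instance, a 128 MiB block (0x8000 pages) starting at PFN 0x108000 maps to [0x108000000, 0x110000000), which has the same shape as the ranges printed by the patch's pr_err() format string.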

* Re: [PATCH 2/2] x86/tdx: Accept hotplugged memory before online
From: Edgecombe, Rick P @ 2026-03-24 22:03 UTC
To: bp@alien8.de, marcandre.lureau@redhat.com, kas@kernel.org,
hpa@zytor.com, mingo@redhat.com, x86@kernel.org, tglx@kernel.org,
dave.hansen@linux.intel.com, Qiang, Chenyi
Cc: kvm@vger.kernel.org, linux-coco@lists.linux.dev,
linux-kernel@vger.kernel.org

On Tue, 2026-03-24 at 19:21 +0400, Marc-André Lureau wrote:
> In TDX guests, hotplugged memory (e.g., via virtio-mem) is never
> accepted before use. The first access triggers a fatal "SEPT entry in
> PENDING state" EPT violation and KVM terminates the guest.
>
> Fix this by registering a MEM_GOING_ONLINE memory hotplug notifier that
> calls tdx_accept_memory() for the range being onlined.
>
> The notifier returns NOTIFY_BAD on acceptance failure, preventing the
> memory from going online.

Does this depend on patch 1 somehow?

* Re: [PATCH 2/2] x86/tdx: Accept hotplugged memory before online
From: Marc-André Lureau @ 2026-03-25 10:29 UTC
To: Edgecombe, Rick P
Cc: bp@alien8.de, kas@kernel.org, hpa@zytor.com, mingo@redhat.com,
x86@kernel.org, tglx@kernel.org, dave.hansen@linux.intel.com,
Qiang, Chenyi, kvm@vger.kernel.org, linux-coco@lists.linux.dev,
linux-kernel@vger.kernel.org, Bonzini, Paolo

Hi

On Wed, Mar 25, 2026 at 2:04 AM Edgecombe, Rick P
<rick.p.edgecombe@intel.com> wrote:
>
> On Tue, 2026-03-24 at 19:21 +0400, Marc-André Lureau wrote:
> > In TDX guests, hotplugged memory (e.g., via virtio-mem) is never
> > accepted before use. The first access triggers a fatal "SEPT entry in
> > PENDING state" EPT violation and KVM terminates the guest.
> >
> > Fix this by registering a MEM_GOING_ONLINE memory hotplug notifier that
> > calls tdx_accept_memory() for the range being onlined.
> >
> > The notifier returns NOTIFY_BAD on acceptance failure, preventing the
> > memory from going online.
>
> Does this depend on patch 1 somehow?

Yes, if I plug, unplug and plug again I get this without PATCH 1:
[root@rhel10-server ~]# [ 5707.392231] virtio_mem virtio5: plugged size: 0x80000000
[ 5707.395583] virtio_mem virtio5: requested size: 0x0

[root@rhel10-server ~]# [ 5714.648501] virtio_mem virtio5: plugged size: 0x2e00000
[ 5714.651808] virtio_mem virtio5: requested size: 0x80000000
[ 5714.676296] tdx: Failed to accept memory [0x108000000, 0x110000000)
[ 5714.683980] tdx: Failed to accept memory [0x110000000, 0x118000000)
[ 5714.686997] tdx: Failed to accept memory [0x140000000, 0x148000000)
[ 5714.689989] tdx: Failed to accept memory [0x128000000, 0x130000000)
[ 5714.694981] tdx: Failed to accept memory [0x148000000, 0x150000000)
[ 5714.704064] tdx: Failed to accept memory [0x138000000, 0x140000000)
[ 5714.710144] tdx: Failed to accept memory [0x118000000, 0x120000000)
[ 5714.722532] tdx: Failed to accept memory [0x130000000, 0x138000000)

My understanding is that QEMU should eventually unplug the memory and
PUNCH_HOLE then KVM should TDH.MEM.PAGE.REMOVE, but that doesn't seem
to happen. Is this strictly required? According to the specification,
it may not be.

* Re: [PATCH 2/2] x86/tdx: Accept hotplugged memory before online
From: Edgecombe, Rick P @ 2026-03-25 17:21 UTC
To: marcandre.lureau@redhat.com
Cc: pbonzini@redhat.com, bp@alien8.de, kas@kernel.org, Qiang, Chenyi,
hpa@zytor.com, mingo@redhat.com, linux-kernel@vger.kernel.org,
dave.hansen@linux.intel.com, tglx@kernel.org, kvm@vger.kernel.org,
linux-coco@lists.linux.dev, x86@kernel.org

On Wed, 2026-03-25 at 14:29 +0400, Marc-André Lureau wrote:
> > Does this depend on patch 1 somehow?
>
> Yes, if I plug, unplug and plug again I get this without PATCH 1:
> [root@rhel10-server ~]# [ 5707.392231] virtio_mem virtio5: plugged
> size: 0x80000000
> [ 5707.395583] virtio_mem virtio5: requested size: 0x0
>
> [root@rhel10-server ~]# [ 5714.648501] virtio_mem virtio5: plugged
> size: 0x2e00000
> [ 5714.651808] virtio_mem virtio5: requested size: 0x80000000
> [ 5714.676296] tdx: Failed to accept memory [0x108000000,
> 0x110000000)
> [ 5714.683980] tdx: Failed to accept memory [0x110000000,
> 0x118000000)
> [ 5714.686997] tdx: Failed to accept memory [0x140000000,
> 0x148000000)
> [ 5714.689989] tdx: Failed to accept memory [0x128000000,
> 0x130000000)
> [ 5714.694981] tdx: Failed to accept memory [0x148000000,
> 0x150000000)
> [ 5714.704064] tdx: Failed to accept memory [0x138000000,
> 0x140000000)
> [ 5714.710144] tdx: Failed to accept memory [0x118000000,
> 0x120000000)
> [ 5714.722532] tdx: Failed to accept memory [0x130000000,
> 0x138000000)
>
> My understanding is that QEMU should eventually unplug the memory and
> PUNCH_HOLE then KVM should TDH.MEM.PAGE.REMOVE, but that doesn't seem
> to happen. Is this strictly required? According to the specification,
> it may not be.

Ah, I see now! So the problem is not that the kernel is accidentally
re-accepting the memory. It's that host userspace is not actually
removing the memory during unplug. Hmm. Why not fix userspace then? If
the memory is unplugged it should not be usable anymore by the guest.
If it is still accessible then it seems kind of like a bug, no?

And! This totally justifies the warning. If the error is ignored, the
guest would think the memory is zeroed, but it could have old data in
it. It's exactly the kind of tricks a VMM could play to attack the
guest.

Another option could be to perform a TDG.MEM.PAGE.RELEASE TDCALL from
the guest when it unplugs the memory, to put it in an unaccepted state.
This would be more robust to buggy VMM behavior. But working around
buggy VM behavior would need a high bar.
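
The stale-data concern and the RELEASE idea from this reply can be illustrated with a toy userspace model of a single page. It assumes, as the thread describes, that accepting a PENDING page zeroes it and moves it to MAPPED, that accepting an already-MAPPED page leaves its contents alone and returns a non-zero warning, and that a guest-side release returns the page to the unaccepted state. All names and the warning value are hypothetical; this models the argument, not the TDX module.

```c
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 16			/* tiny page keeps the model readable */
#define TOY_WARN_ALREADY_ACCEPTED 1ULL	/* hypothetical status, bit 63 clear */

enum toy_state { TOY_PENDING, TOY_MAPPED };

struct toy_page {
	enum toy_state state;
	uint8_t data[TOY_PAGE_SIZE];
};

/*
 * Accept: a PENDING page is zeroed and becomes MAPPED. A page that is
 * already MAPPED is left untouched and a warning status is returned.
 */
static uint64_t toy_accept(struct toy_page *p)
{
	if (p->state == TOY_MAPPED)
		return TOY_WARN_ALREADY_ACCEPTED;
	memset(p->data, 0, sizeof(p->data));
	p->state = TOY_MAPPED;
	return 0;
}

/* Release: the guest puts the page back into the unaccepted state. */
static void toy_release(struct toy_page *p)
{
	p->state = TOY_PENDING;
}
```

In this model, a plug, write, unplug-without-release, re-plug sequence ends with the second accept returning the warning while the old bytes are still present, which is the case where ignoring the status would hide stale contents. Calling `toy_release()` at unplug time makes the next accept zero the page again.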

* Re: [PATCH 2/2] x86/tdx: Accept hotplugged memory before online
From: Paolo Bonzini @ 2026-03-26 18:25 UTC
To: Edgecombe, Rick P
Cc: Marc-André Lureau, Borislav Petkov, kas, Qiang, Chenyi,
Anvin, H. Peter, Ingo Molnar, Kernel Mailing List, Linux, Dave Hansen,
Thomas Gleixner, kvm, linux-coco, the arch/x86 maintainers

On Wed, 25 Mar 2026, 18:21 Edgecombe, Rick P
<rick.p.edgecombe@intel.com> wrote:
>
> Ah, I see now! So the problem is not that the kernel is accidentally
> re-accepting the memory. It's that host userspace is not actually
> removing the memory during unplug. Hmm. Why not fix userspace then? If
> the memory is unplugged it should not be usable anymore by the guest.
> If it is still accessible then it seems kind of like a bug, no?
>
> And! This totally justifies the warning. If the error is ignored, the
> guest would think the memory is zeroed, but it could have old data in
> it. It's exactly the kind of tricks a VMM could play to attack the
> guest.
>
> Another option could be to perform a TDG.MEM.PAGE.RELEASE TDCALL from
> the guest when it unplugs the memory, to put it in an unaccepted state.
> This would be more robust to buggy VMM behavior. But working around
> buggy VM behavior would need a high bar.

Wouldn't it actually be a very low bar? Just from these two paragraphs
of yours, it's clear that the line between buggy and malicious is
fine, in fact I think userspace should not care at all about removing
the memory. Only the guest cares about acceptance state.

Doing a RELEASE TDCALL seems more robust and not hard.

Paolo

* Re: [PATCH 2/2] x86/tdx: Accept hotplugged memory before online
From: Edgecombe, Rick P @ 2026-03-26 20:40 UTC
To: pbonzini@redhat.com
Cc: x86@kernel.org, dave.hansen@linux.intel.com,
marcandre.lureau@redhat.com, kas@kernel.org, hpa@zytor.com,
linux-kernel@vger.kernel.org, mingo@redhat.com, bp@alien8.de,
Qiang, Chenyi, tglx@kernel.org, linux-coco@lists.linux.dev,
kvm@vger.kernel.org

Hi Paolo!

On Thu, 2026-03-26 at 19:25 +0100, Paolo Bonzini wrote:
> > Another option could be to perform a TDG.MEM.PAGE.RELEASE TDCALL from
> > the guest when it unplugs the memory, to put it in an unaccepted state.
> > This would be more robust to buggy VMM behavior. But working around
> > buggy VM behavior would need a high bar.
>
> Wouldn't it actually be a very low bar? Just from these two paragraphs
> of yours, it's clear that the line between buggy and malicious is
> fine, in fact I think userspace should not care at all about removing
> the memory. Only the guest cares about acceptance state.
>
> Doing a RELEASE TDCALL seems more robust and not hard.

I mean I guess the contract is a bit fuzzy. The reason why I was
thinking it was a host userspace bug is because the conventional bare
metal behavior of unplugging memory should be that it is no longer
accessible, right? If the guest could still use the unplugged memory, it
could be surprising for userspace and the guest. Also, ideally I'd think
the behavior wouldn't cover up guest bugs where it tried to keep using
the memory.

So forgetting about TDX, isn't it better behavior in general for
unplugging memory, to actually pull it from the guest? Did I look at
that wrong?

As for the bar to change the guest, I was first imagining it would be
the size of the accept memory plumbing. Which was not a small effort and
has had a steady stream of bugs to squash where the accept was missed.
But I didn't actually POC anything to check the scope so maybe that was
a bit hasty. Should we do a POC?

But considering the scope, I wonder if SNP has the same problem.

* Re: [PATCH 2/2] x86/tdx: Accept hotplugged memory before online
From: Chenyi Qiang @ 2026-03-27 3:05 UTC
To: Marc-André Lureau, Edgecombe, Rick P
Cc: bp@alien8.de, kas@kernel.org, hpa@zytor.com, mingo@redhat.com,
x86@kernel.org, tglx@kernel.org, dave.hansen@linux.intel.com,
kvm@vger.kernel.org, linux-coco@lists.linux.dev,
linux-kernel@vger.kernel.org, Bonzini, Paolo, David Hildenbrand (Arm)

On 3/25/2026 6:29 PM, Marc-André Lureau wrote:
> Hi
>
> On Wed, Mar 25, 2026 at 2:04 AM Edgecombe, Rick P
> <rick.p.edgecombe@intel.com> wrote:
>>
>> On Tue, 2026-03-24 at 19:21 +0400, Marc-André Lureau wrote:
>>> In TDX guests, hotplugged memory (e.g., via virtio-mem) is never
>>> accepted before use. The first access triggers a fatal "SEPT entry in
>>> PENDING state" EPT violation and KVM terminates the guest.
>>>
>>> Fix this by registering a MEM_GOING_ONLINE memory hotplug notifier that
>>> calls tdx_accept_memory() for the range being onlined.
>>>
>>> The notifier returns NOTIFY_BAD on acceptance failure, preventing the
>>> memory from going online.
>>
>> Does this depend on patch 1 somehow?
>
> Yes, if I plug, unplug and plug again I get this without PATCH 1:
> [root@rhel10-server ~]# [ 5707.392231] virtio_mem virtio5: plugged
> size: 0x80000000
> [ 5707.395583] virtio_mem virtio5: requested size: 0x0
>
> [root@rhel10-server ~]# [ 5714.648501] virtio_mem virtio5: plugged
> size: 0x2e00000
> [ 5714.651808] virtio_mem virtio5: requested size: 0x80000000
> [ 5714.676296] tdx: Failed to accept memory [0x108000000, 0x110000000)
> [ 5714.683980] tdx: Failed to accept memory [0x110000000, 0x118000000)
> [ 5714.686997] tdx: Failed to accept memory [0x140000000, 0x148000000)
> [ 5714.689989] tdx: Failed to accept memory [0x128000000, 0x130000000)
> [ 5714.694981] tdx: Failed to accept memory [0x148000000, 0x150000000)
> [ 5714.704064] tdx: Failed to accept memory [0x138000000, 0x140000000)
> [ 5714.710144] tdx: Failed to accept memory [0x118000000, 0x120000000)
> [ 5714.722532] tdx: Failed to accept memory [0x130000000, 0x138000000)
>
> My understanding is that QEMU should eventually unplug the memory and
> PUNCH_HOLE then KVM should TDH.MEM.PAGE.REMOVE, but that doesn't seem
> to happen.

I guess it doesn't happen because virtio-mem in QEMU only PUNCH_HOLE the
shared memory by ram_block_discard_range() but it doesn't touch the private
memory which should be discarded by ram_block_discard_guest_memfd_range().

> Is this strictly required? According to the specification,
> it may not be.
>
>

* Re: [PATCH 2/2] x86/tdx: Accept hotplugged memory before online
From: David Hildenbrand (Arm) @ 2026-03-27 8:49 UTC
To: Chenyi Qiang, Marc-André Lureau, Edgecombe, Rick P
Cc: bp@alien8.de, kas@kernel.org, hpa@zytor.com, mingo@redhat.com,
x86@kernel.org, tglx@kernel.org, dave.hansen@linux.intel.com,
kvm@vger.kernel.org, linux-coco@lists.linux.dev,
linux-kernel@vger.kernel.org, Bonzini, Paolo

On 3/27/26 04:05, Chenyi Qiang wrote:
>
>
> On 3/25/2026 6:29 PM, Marc-André Lureau wrote:
>> Hi
>>
>> On Wed, Mar 25, 2026 at 2:04 AM Edgecombe, Rick P
>> <rick.p.edgecombe@intel.com> wrote:
>>>
>>>
>>> Does this depend on patch 1 somehow?
>>
>> Yes, if I plug, unplug and plug again I get this without PATCH 1:
>> [root@rhel10-server ~]# [ 5707.392231] virtio_mem virtio5: plugged
>> size: 0x80000000
>> [ 5707.395583] virtio_mem virtio5: requested size: 0x0
>>
>> [root@rhel10-server ~]# [ 5714.648501] virtio_mem virtio5: plugged
>> size: 0x2e00000
>> [ 5714.651808] virtio_mem virtio5: requested size: 0x80000000
>> [ 5714.676296] tdx: Failed to accept memory [0x108000000, 0x110000000)
>> [ 5714.683980] tdx: Failed to accept memory [0x110000000, 0x118000000)
>> [ 5714.686997] tdx: Failed to accept memory [0x140000000, 0x148000000)
>> [ 5714.689989] tdx: Failed to accept memory [0x128000000, 0x130000000)
>> [ 5714.694981] tdx: Failed to accept memory [0x148000000, 0x150000000)
>> [ 5714.704064] tdx: Failed to accept memory [0x138000000, 0x140000000)
>> [ 5714.710144] tdx: Failed to accept memory [0x118000000, 0x120000000)
>> [ 5714.722532] tdx: Failed to accept memory [0x130000000, 0x138000000)
>>
>> My understanding is that QEMU should eventually unplug the memory and
>> PUNCH_HOLE then KVM should TDH.MEM.PAGE.REMOVE, but that doesn't seem
>> to happen.
>
> I guess it doesn't happen because virtio-mem in QEMU only PUNCH_HOLE the
> shared memory by ram_block_discard_range() but it doesn't touch the private
> memory which should be discarded by ram_block_discard_guest_memfd_range().
>
>> Is this strictly required? According to the specification,

So far nobody specified how virtio-mem should behave in a CoCo
environment.

I assume that we need enhancements on the driver and the device side.

In Linux, we should not be accepting memory during memory
onlining/offlining through notifiers, as we might only hot(un)plug parts
of a memory block etc. We need some explicit calls into the core before
we hand hotplugged memory to the core, and before we hand back unplugged
memory to the device.

In QEMU, I would similarly assume that we might have to perform some
additional work when converting memory blocks. *maybe* that would just
be done by the guest that converts memory from private to shared before
unplug etc.

--
Cheers,

David

* Re: [PATCH 2/2] x86/tdx: Accept hotplugged memory before online
From: Yan Zhao @ 2026-03-27 8:28 UTC
To: Marc-André Lureau
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Kiryl Shutsemau, Rick Edgecombe, Chenyi Qiang,
linux-kernel, linux-coco, kvm

On Tue, Mar 24, 2026 at 07:21:48PM +0400, Marc-André Lureau wrote:
> In TDX guests, hotplugged memory (e.g., via virtio-mem) is never
> accepted before use. The first access triggers a fatal "SEPT entry in
> PENDING state" EPT violation and KVM terminates the guest.
>
> Fix this by registering a MEM_GOING_ONLINE memory hotplug notifier that
> calls tdx_accept_memory() for the range being onlined.
>
> The notifier returns NOTIFY_BAD on acceptance failure, preventing the
> memory from going online.
>
> Assisted-by: Claude:claude-opus-4-6
> Reported-by: Chenyi Qiang <chenyi.qiang@intel.com>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  arch/x86/coco/tdx/tdx.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
>
> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 7b2833705d475..89f90bc303258 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -8,6 +8,7 @@
>  #include <linux/export.h>
>  #include <linux/io.h>
>  #include <linux/kexec.h>
> +#include <linux/memory.h>
>  #include <asm/coco.h>
>  #include <asm/tdx.h>
>  #include <asm/vmx.h>
> @@ -1194,3 +1195,40 @@ void __init tdx_early_init(void)
>
>  	tdx_announce();
>  }
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +static int tdx_guest_memory_notifier(struct notifier_block *nb,
> +				     unsigned long action, void *v)
> +{
> +	struct memory_notify *mn = v;
> +	phys_addr_t start, end;
> +
> +	if (action != MEM_GOING_ONLINE)
> +		return NOTIFY_OK;
> +
> +	start = PFN_PHYS(mn->start_pfn);
> +	end = start + PFN_PHYS(mn->nr_pages);
> +
> +	if (!tdx_accept_memory(start, end)) {
> +		pr_err("Failed to accept memory [0x%llx, 0x%llx)\n",
> +		       (unsigned long long)start,
> +		       (unsigned long long)end);
> +		return NOTIFY_BAD;
> +	}
> +
> +	return NOTIFY_OK;
> +}
> +
> +static struct notifier_block tdx_guest_memory_nb = {
> +	.notifier_call = tdx_guest_memory_notifier,
> +};
> +
> +static int __init tdx_guest_memory_init(void)
> +{
> +	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
> +		return 0;
> +
> +	return register_memory_notifier(&tdx_guest_memory_nb);
> +}

If I read the code correctly,

online_pages
  1. memory_notify(MEM_GOING_ONLINE, &mem_arg);
  2. online_pages_range(pfn, nr_pages);
       (*online_page_callback)(page, order);
         generic_online_page
           __free_pages_core(page, order, MEMINIT_HOTPLUG);

In __free_pages_core(), there's accept_memory() already:

	if (page_contains_unaccepted(page, order)) {
		if (order == MAX_PAGE_ORDER && __free_unaccepted(page))
			return;

		accept_memory(page_to_phys(page), PAGE_SIZE << order);
	}

__free_unaccepted() also adds the pages to the unaccepted_pages list, so
cond_accept_memory() will accept the memory later:

So, is it because the virtio mem sets online_page_callback to
virtio_mem_online_page_cb, which doesn't invoke __free_pages_core()
properly?

Or am I missing something that makes the memory notifier approach
necessary?

Thanks
Yan

> +core_initcall(tdx_guest_memory_init);
> +#endif
>
> --
> 2.53.0
>
>