public inbox for kvm@vger.kernel.org
* TDX/non-ACT: failed TDH.PHYMEM.PAGE.WBINVD after successful page remove can leave a page unreset
@ 2026-04-01 11:51 裴辰举 
  2026-04-01 13:59 ` Edgecombe, Rick P
  0 siblings, 1 reply; 3+ messages in thread
From: 裴辰举  @ 2026-04-01 11:51 UTC (permalink / raw)
  To: seanjc, pbonzini, kvm, linux-kernel, xiaoyao.li


On non-ACT platforms, TDH.MEM.PAGE.REMOVE does not flush cachelines or
initialize the removed page. KVM handles that by calling
TDH.PHYMEM.PAGE.WBINVD after a private page is removed.

The problem is the failure path after a successful remove:

  1. KVM drops a private page.
  2. TDH.MEM.PAGE.REMOVE succeeds, so the page is no longer assigned to
     the TD.
  3. KVM then calls TDH.PHYMEM.PAGE.WBINVD.
  4. If TDH.PHYMEM.PAGE.WBINVD fails, KVM marks the VM/TD dead and
     teardown follows.

At that point, TDH.PHYMEM.PAGE.RECLAIM will not process the page that hit
the WBINVD failure, because that page has already been removed from the
TD. Normally TDH.PHYMEM.PAGE.RECLAIM clears/reinitializes TD pages during
teardown, but this page is no longer in that set.

This seems to leave a gap in the page's lifecycle: the page has been
removed from the TD, but if the WBINVD step failed it may never be fully
reset/cleared for safe host reuse. Depending on later host-side handling,
this can become either a leaked page or an unsafe page reuse issue.
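
To make the sequence concrete, here is a minimal userspace sketch of the state
transitions described above. It is a toy model only: every name in it
(toy_td, toy_drop_private_page, toy_teardown, the PAGE_* states) is
hypothetical and does not correspond to any KVM or TDX module interface, and
the assumption that teardown visits only pages still assigned to the TD comes
from the description in this mail, not from the code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 4

/* Hypothetical page states for this toy model; not TDX module states. */
enum toy_page_state {
	PAGE_IN_TD = 0,		/* assigned to the TD, visible to teardown */
	PAGE_REMOVED_UNRESET,	/* removed from the TD, not flushed/reset */
	PAGE_CLEAN,		/* flushed/reset, safe for host reuse */
};

struct toy_td {
	enum toy_page_state pages[TOY_NPAGES];
	bool dead;
};

/*
 * Models "remove, then flush". The remove step is assumed to succeed;
 * wbinvd_ok says whether the follow-up flush step succeeds.
 */
static void toy_drop_private_page(struct toy_td *td, size_t i, bool wbinvd_ok)
{
	assert(i < TOY_NPAGES);
	if (td->dead || td->pages[i] != PAGE_IN_TD)
		return;

	td->pages[i] = PAGE_REMOVED_UNRESET;	/* remove succeeded */
	if (!wbinvd_ok) {
		td->dead = true;		/* VM/TD marked dead */
		return;
	}
	td->pages[i] = PAGE_CLEAN;
}

/* Models teardown: only pages still assigned to the TD are reclaimed. */
static void toy_teardown(struct toy_td *td)
{
	for (size_t i = 0; i < TOY_NPAGES; i++) {
		if (td->pages[i] == PAGE_IN_TD)
			td->pages[i] = PAGE_CLEAN;
	}
}

/* Counts pages that were removed but never reset. */
static size_t toy_count_unreset(const struct toy_td *td)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_NPAGES; i++) {
		if (td->pages[i] == PAGE_REMOVED_UNRESET)
			n++;
	}
	return n;
}
```

In this model, dropping one page with wbinvd_ok set to false and then calling
toy_teardown() leaves exactly that page in PAGE_REMOVED_UNRESET while every
other page ends up PAGE_CLEAN. That is the state in question.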



* Re: TDX/non-ACT: failed TDH.PHYMEM.PAGE.WBINVD after successful page remove can leave a page unreset
  2026-04-01 11:51 TDX/non-ACT: failed TDH.PHYMEM.PAGE.WBINVD after successful page remove can leave a page unreset 裴辰举 
@ 2026-04-01 13:59 ` Edgecombe, Rick P
  2026-04-01 16:59   ` Sean Christopherson
  0 siblings, 1 reply; 3+ messages in thread
From: Edgecombe, Rick P @ 2026-04-01 13:59 UTC (permalink / raw)
  To: Li, Xiaoyao, pcj3195161583@163.com, pbonzini@redhat.com,
	seanjc@google.com, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org

On Wed, 2026-04-01 at 19:51 +0800, 裴辰举 wrote:
> 
> On non-ACT platforms, TDH.MEM.PAGE.REMOVE does not flush cachelines
> or initialize the removed page. KVM handles that by calling
> TDH.PHYMEM.PAGE.WBINVD after a private page is removed.
>
> The problem is the failure path after a successful remove:
>
>   1. KVM drops a private page.
>   2. TDH.MEM.PAGE.REMOVE succeeds, so the page is no longer assigned
>      to the TD.
>   3. KVM then calls TDH.PHYMEM.PAGE.WBINVD.
>   4. If TDH.PHYMEM.PAGE.WBINVD fails, KVM marks the VM/TD dead and
>      teardown follows.
>
> At that point, TDH.PHYMEM.PAGE.RECLAIM will not process the page
> that hit the WBINVD failure, because that page has already been
> removed from the TD. Normally TDH.PHYMEM.PAGE.RECLAIM
> clears/reinitializes TD pages during teardown, but this page is no
> longer in that set. This seems to create a state hole: the page has
> been removed from the TD, but it may never be fully reset/cleared for
> safe host reuse if the WBINVD step failed. Depending on later
> host-side handling, this can become either a leaked page or an unsafe
> page reuse issue.

Not every SEAMCALL error is expected, based on the constraints in the
code. So the code deliberately does not handle all documented errors.
As in, the code is written in a way to guarantee some operations will
succeed. If the code sees any weird behavior it does a KVM_BUG_ON(), as
a best effort kind of thing. It is not intended to be part of a system
to cleanly handle all possible bugs.

Instead, if the kernel does allow a specific KVM_BUG_ON() scenario to
trigger, the kernel should be fixed. If the TDX module starts to return
an unexpected error, then the TDX module should be fixed.
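
As a rough illustration of that pattern (check, warn, mark the VM dead, and
do not attempt recovery), here is a userspace sketch. It is not the kernel's
KVM_BUG_ON() definition: toy_vm, toy_bug_on() and toy_flush_removed_page()
are hypothetical names made up for this example, and the error value is just
an opaque non-zero number.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a VM; not the kernel's struct kvm. */
struct toy_vm {
	bool bugged;	/* once set, the VM is treated as dead */
	int warnings;	/* number of warnings printed so far */
};

/*
 * Toy analogue of a "bug on" check: warn the first time the condition
 * fires, mark the VM dead, and report whether the condition was true.
 * Nothing here tries to repair the state that led to the failure.
 */
static bool toy_bug_on(struct toy_vm *vm, bool cond, const char *what)
{
	assert(vm != NULL);
	if (!cond)
		return false;

	if (!vm->bugged) {
		fprintf(stderr, "toy_bug_on: %s\n", what);
		vm->warnings++;
		vm->bugged = true;
	}
	return true;
}

/*
 * A caller whose operation is expected to always succeed. A non-zero
 * err is treated as a bug in the caller or the callee, not as a case
 * to handle gracefully.
 */
static int toy_flush_removed_page(struct toy_vm *vm, unsigned long long err)
{
	if (toy_bug_on(vm, err != 0, "flush of removed page failed"))
		return -EIO;
	return 0;
}
```

The point of the sketch is what it leaves out: there is no cleanup branch for
the failing case, because the surrounding code is supposed to make that case
impossible.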


* Re: TDX/non-ACT: failed TDH.PHYMEM.PAGE.WBINVD after successful page remove can leave a page unreset
  2026-04-01 13:59 ` Edgecombe, Rick P
@ 2026-04-01 16:59   ` Sean Christopherson
  0 siblings, 0 replies; 3+ messages in thread
From: Sean Christopherson @ 2026-04-01 16:59 UTC (permalink / raw)
  To: Rick P Edgecombe
  Cc: Xiaoyao Li, pcj3195161583@163.com, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org

On Wed, Apr 01, 2026, Rick P Edgecombe wrote:
> On Wed, 2026-04-01 at 19:51 +0800, 裴辰举 wrote:
> > 
> > On non-ACT platforms, TDH.MEM.PAGE.REMOVE does not flush cachelines
> > or initialize the removed page. KVM handles that by calling
> > TDH.PHYMEM.PAGE.WBINVD after a private page is removed.
> >
> > The problem is the failure path after a successful remove:
> >
> >   1. KVM drops a private page.
> >   2. TDH.MEM.PAGE.REMOVE succeeds, so the page is no longer assigned
> >      to the TD.
> >   3. KVM then calls TDH.PHYMEM.PAGE.WBINVD.
> >   4. If TDH.PHYMEM.PAGE.WBINVD fails, KVM marks the VM/TD dead and
> >      teardown follows.
> >
> > At that point, TDH.PHYMEM.PAGE.RECLAIM will not process the page
> > that hit the WBINVD failure, because that page has already been
> > removed from the TD. Normally TDH.PHYMEM.PAGE.RECLAIM
> > clears/reinitializes TD pages during teardown, but this page is no
> > longer in that set. This seems to create a state hole: the page has
> > been removed from the TD, but it may never be fully reset/cleared for
> > safe host reuse if the WBINVD step failed. Depending on later
> > host-side handling, this can become either a leaked page or an unsafe
> > page reuse issue.
> 
> Not every SEAMCALL error is expected, based on the constraints in the
> code. So the code deliberately does not handle all documented errors.
> As in, the code is written in a way to guarantee some operations will
> succeed. If the code sees any weird behavior it does a KVM_BUG_ON(), as
> a best effort kind of thing. It is not intended to be part of a system
> to cleanly handle all possible bugs.
> 
> Instead, if the kernel does allow a specific KVM_BUG_ON() scenario to
> trigger, the kernel should be fixed. If the TDX module starts to return
> an unexpected error, then the TDX module should be fixed.

+1, the right answer here is to not screw up in the first place.

