All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	 Yuan Yao <yuan.yao@intel.com>, Kai Huang <kai.huang@intel.com>,
	 "isaku.yamahata@gmail.com" <isaku.yamahata@gmail.com>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	 "dmatlack@google.com" <dmatlack@google.com>,
	"nik.borisov@suse.com" <nik.borisov@suse.com>
Subject: Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT
Date: Tue, 8 Oct 2024 07:51:13 -0700	[thread overview]
Message-ID: <ZwVG4bQ4g5Tm2jrt@google.com> (raw)
In-Reply-To: <ZvPrqMj1BWrkkwqN@yzhao56-desk.sh.intel.com>

On Wed, Sep 25, 2024, Yan Zhao wrote:
> On Sat, Sep 14, 2024 at 05:27:32PM +0800, Yan Zhao wrote:
> > On Fri, Sep 13, 2024 at 10:23:00AM -0700, Sean Christopherson wrote:
> > > On Fri, Sep 13, 2024, Yan Zhao wrote:
> > > > This is a lock status report of TDX module for current SEAMCALL retry issue
> > > > based on code in TDX module public repo https://github.com/intel/tdx-module.git
> > > > branch TDX_1.5.05.
> > > > 
> > > > TL;DR:
> > > > - tdh_mem_track() can contend with tdh_vp_enter().
> > > > - tdh_vp_enter() contends with tdh_mem*() when 0-stepping is suspected.
> > > 
> > > The zero-step logic seems to be the most problematic.  E.g. if KVM is trying to
> > > install a page on behalf of two vCPUs, and KVM resumes the guest if it encounters
> > > a FROZEN_SPTE when building the non-leaf SPTEs, then one of the vCPUs could
> > > trigger the zero-step mitigation if the vCPU that "wins" and gets delayed for
> > > whatever reason.
> > > 
> > > Since FROZEN_SPTE is essentially bit-spinlock with a reaaaaaly slow slow-path,
> > > what if instead of resuming the guest if a page fault hits FROZEN_SPTE, KVM retries
> > > the fault "locally", i.e. _without_ redoing tdh_vp_enter() to see if the vCPU still
> > > hits the fault?
> > > 
> > > For non-TDX, resuming the guest and letting the vCPU retry the instruction is
> > > desirable because in many cases, the winning task will install a valid mapping
> > > before KVM can re-run the vCPU, i.e. the fault will be fixed before the
> > > instruction is re-executed.  In the happy case, that provides optimal performance
> > > as KVM doesn't introduce any extra delay/latency.
> > > 
> > > But for TDX, the math is different as the cost of a re-hitting a fault is much,
> > > much higher, especially in light of the zero-step issues.
> > > 
> > > E.g. if the TDP MMU returns a unique error code for the frozen case, and
> > > kvm_mmu_page_fault() is modified to return the raw return code instead of '1',
> > > then the TDX EPT violation path can safely retry locally, similar to the do-while
> > > loop in kvm_tdp_map_page().
> > > 
> > > The only part I don't like about this idea is having two "retry" return values,
> > > which creates the potential for bugs due to checking one but not the other.
> > > 
> > > Hmm, that could be avoided by passing a bool pointer as an out-param to communicate
> > > to the TDX S-EPT fault handler that the SPTE is frozen.  I think I like that
> > > option better even though the out-param is a bit gross, because it makes it more
> > > obvious that the "frozen_spte" is a special case that doesn't need attention for
> > > most paths.
> > Good idea.
> > But could we extend it a bit more to allow TDX's EPT violation handler to also
> > retry directly when tdh_mem_sept_add()/tdh_mem_page_aug() returns BUSY?
> I'm asking this because merely avoiding invoking tdh_vp_enter() in vCPUs seeing
> FROZEN_SPTE might not be enough to prevent zero step mitigation.

The goal isn't to make it completely impossible for zero-step to fire, it's to
make it so that _if_ zero-step fires, KVM can report the error to userspace without
having to retry, because KVM _knows_ that advancing past the zero-step isn't
something KVM can solve.

 : I'm not worried about any performance hit with zero-step, I'm worried about KVM
 : not being able to differentiate between a KVM bug and guest interference.  The
 : goal with a local retry is to make it so that KVM _never_ triggers zero-step,
 : unless there is a bug somewhere.  At that point, if zero-step fires, KVM can
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 : report the error to userspace instead of trying to suppress guest activity, and
 : potentially from other KVM tasks too.

In other words, for the selftest you crafted, KVM reporting an error to userspace
due to zero-step would be working as intended.  

> E.g. in below selftest with a TD configured with pending_ve_disable=N,
> zero step mitigation can be triggered on a vCPU that is stuck in EPT violation
> vm exit for more than 6 times (due to that user space does not do memslot
> conversion correctly).
> 
> So, if vCPU A wins the chance to call tdh_mem_page_aug(), the SEAMCALL may
> contend with zero step mitigation code in tdh_vp_enter() in vCPU B stuck
> in EPT violation vm exits.

  reply	other threads:[~2024-10-08 14:51 UTC|newest]

Thread overview: 139+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-04  3:07 [PATCH 00/21] TDX MMU Part 2 Rick Edgecombe
2024-09-04  3:07 ` [PATCH 01/21] KVM: x86/mmu: Implement memslot deletion for TDX Rick Edgecombe
2024-09-09 13:44   ` Paolo Bonzini
2024-09-09 21:06     ` Edgecombe, Rick P
2024-09-04  3:07 ` [PATCH 02/21] KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU Rick Edgecombe
2024-09-09 13:51   ` Paolo Bonzini
2024-09-04  3:07 ` [PATCH 03/21] KVM: x86/mmu: Do not enable page track for TD guest Rick Edgecombe
2024-09-09 13:53   ` Paolo Bonzini
2024-09-09 21:07     ` Edgecombe, Rick P
2024-09-04  3:07 ` [PATCH 04/21] KVM: VMX: Split out guts of EPT violation to common/exposed function Rick Edgecombe
2024-09-09 13:57   ` Paolo Bonzini
2024-09-09 16:07   ` Sean Christopherson
2024-09-10  7:36     ` Paolo Bonzini
2024-09-04  3:07 ` [PATCH 05/21] KVM: VMX: Teach EPT violation helper about private mem Rick Edgecombe
2024-09-09 13:59   ` Paolo Bonzini
2024-09-11  8:52   ` Chao Gao
2024-09-11 16:29     ` Edgecombe, Rick P
2024-09-12  0:39   ` Huang, Kai
2024-09-12 13:58     ` Sean Christopherson
2024-09-12 14:43       ` Edgecombe, Rick P
2024-09-12 14:46         ` Paolo Bonzini
2024-09-12  1:19   ` Huang, Kai
2024-09-04  3:07 ` [PATCH 06/21] KVM: TDX: Add accessors VMX VMCS helpers Rick Edgecombe
2024-09-09 14:19   ` Paolo Bonzini
2024-09-09 21:29     ` Edgecombe, Rick P
2024-09-10 10:48       ` Paolo Bonzini
2024-09-04  3:07 ` [PATCH 07/21] KVM: TDX: Add load_mmu_pgd method for TDX Rick Edgecombe
2024-09-11  2:48   ` Chao Gao
2024-09-11  2:49     ` Edgecombe, Rick P
2024-09-04  3:07 ` [PATCH 08/21] KVM: TDX: Set gfn_direct_bits to shared bit Rick Edgecombe
2024-09-09 15:21   ` Paolo Bonzini
2024-09-04  3:07 ` [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT Rick Edgecombe
2024-09-06  1:41   ` Huang, Kai
2024-09-09 20:25     ` Edgecombe, Rick P
2024-09-09 15:25   ` Paolo Bonzini
2024-09-09 20:22     ` Edgecombe, Rick P
2024-09-09 21:11       ` Sean Christopherson
2024-09-09 21:23         ` Sean Christopherson
2024-09-09 22:34           ` Edgecombe, Rick P
2024-09-09 23:58             ` Sean Christopherson
2024-09-10  0:50               ` Edgecombe, Rick P
2024-09-10  1:46                 ` Sean Christopherson
2024-09-11  1:17               ` Huang, Kai
2024-09-11  2:48                 ` Edgecombe, Rick P
2024-09-11 22:55                   ` Huang, Kai
2024-09-10 13:15         ` Paolo Bonzini
2024-09-10 13:57           ` Sean Christopherson
2024-09-10 15:16             ` Paolo Bonzini
2024-09-10 15:57               ` Sean Christopherson
2024-09-10 16:28                 ` Edgecombe, Rick P
2024-09-10 17:42                   ` Sean Christopherson
2024-09-13  8:36                     ` Yan Zhao
2024-09-13 17:23                       ` Sean Christopherson
2024-09-13 19:19                         ` Edgecombe, Rick P
2024-09-13 22:18                           ` Sean Christopherson
2024-09-14  9:27                         ` Yan Zhao
2024-09-15  9:53                           ` Yan Zhao
2024-09-17  1:31                             ` Huang, Kai
2024-09-25 10:53                           ` Yan Zhao
2024-10-08 14:51                             ` Sean Christopherson [this message]
2024-10-10  5:23                               ` Yan Zhao
2024-10-10 17:33                                 ` Sean Christopherson
2024-10-10 21:53                                   ` Edgecombe, Rick P
2024-10-11  2:30                                     ` Yan Zhao
2024-10-14 10:54                                     ` Huang, Kai
2024-10-14 17:36                                       ` Edgecombe, Rick P
2024-10-14 23:03                                         ` Huang, Kai
2024-10-15  1:24                                           ` Edgecombe, Rick P
2024-10-11  2:06                                   ` Yan Zhao
2024-10-16 14:13                                   ` Yan Zhao
2024-09-17  2:11                         ` Huang, Kai
2024-09-13 19:19                       ` Edgecombe, Rick P
2024-09-14 10:00                         ` Yan Zhao
2024-09-04  3:07 ` [PATCH 10/21] KVM: TDX: Require TDP MMU and mmio caching for TDX Rick Edgecombe
2024-09-09 15:26   ` Paolo Bonzini
2024-09-12  0:15   ` Huang, Kai
2024-09-04  3:07 ` [PATCH 11/21] KVM: x86/mmu: Add setter for shadow_mmio_value Rick Edgecombe
2024-09-09 15:33   ` Paolo Bonzini
2024-09-04  3:07 ` [PATCH 12/21] KVM: TDX: Set per-VM shadow_mmio_value to 0 Rick Edgecombe
2024-09-09 15:33   ` Paolo Bonzini
2024-09-04  3:07 ` [PATCH 13/21] KVM: TDX: Handle TLB tracking for TDX Rick Edgecombe
2024-09-10  8:16   ` Paolo Bonzini
2024-09-10 23:49     ` Edgecombe, Rick P
2024-10-14  6:34     ` Yan Zhao
2024-09-11  6:25   ` Xu Yilun
2024-09-11 17:28     ` Edgecombe, Rick P
2024-09-12  4:54       ` Yan Zhao
2024-09-12 14:44         ` Edgecombe, Rick P
2024-09-12  7:47       ` Xu Yilun
2024-09-04  3:07 ` [PATCH 14/21] KVM: TDX: Implement hooks to propagate changes of TDP MMU mirror page table Rick Edgecombe
2024-09-06  2:10   ` Huang, Kai
2024-09-09 21:03     ` Edgecombe, Rick P
2024-09-10  1:52       ` Yan Zhao
2024-09-10  9:33       ` Paolo Bonzini
2024-09-10 23:58         ` Edgecombe, Rick P
2024-09-11  1:05           ` Yan Zhao
2024-10-30  3:03   ` Binbin Wu
2024-11-04  9:09     ` Yan Zhao
2024-09-04  3:07 ` [PATCH 15/21] KVM: TDX: Implement hook to get max mapping level of private pages Rick Edgecombe
2024-09-10 10:17   ` Paolo Bonzini
2024-09-04  3:07 ` [PATCH 16/21] KVM: TDX: Premap initial guest memory Rick Edgecombe
2024-09-10 10:24   ` Paolo Bonzini
2024-09-11  0:19     ` Edgecombe, Rick P
2024-09-13 13:33       ` Adrian Hunter
2024-09-13 19:49         ` Edgecombe, Rick P
2024-09-10 10:49   ` Paolo Bonzini
2024-09-11  0:30     ` Edgecombe, Rick P
2024-09-11 10:39       ` Paolo Bonzini
2024-09-11 16:36         ` Edgecombe, Rick P
2024-09-04  3:07 ` [PATCH 17/21] KVM: TDX: MTRR: implement get_mt_mask() for TDX Rick Edgecombe
2024-09-10 10:04   ` Paolo Bonzini
2024-09-10 14:05     ` Sean Christopherson
2024-09-04  3:07 ` [PATCH 18/21] KVM: x86/mmu: Export kvm_tdp_map_page() Rick Edgecombe
2024-09-10 10:02   ` Paolo Bonzini
2024-09-04  3:07 ` [PATCH 19/21] KVM: TDX: Add an ioctl to create initial guest memory Rick Edgecombe
2024-09-04  4:53   ` Yan Zhao
2024-09-04 14:01     ` Edgecombe, Rick P
2024-09-06 16:30       ` Edgecombe, Rick P
2024-09-09  1:29         ` Yan Zhao
2024-09-10 10:13         ` Paolo Bonzini
2024-09-11  0:11           ` Edgecombe, Rick P
2024-09-04 13:56   ` Edgecombe, Rick P
2024-09-10 10:16   ` Paolo Bonzini
2024-09-11  0:12     ` Edgecombe, Rick P
2024-09-04  3:07 ` [PATCH 20/21] KVM: TDX: Finalize VM initialization Rick Edgecombe
2024-09-04 15:37   ` Adrian Hunter
2024-09-04 16:09     ` Edgecombe, Rick P
2024-09-10 10:33     ` Paolo Bonzini
2024-09-10 11:15       ` Adrian Hunter
2024-09-10 11:28         ` Paolo Bonzini
2024-09-10 11:31         ` Adrian Hunter
2024-09-10 10:25   ` Paolo Bonzini
2024-09-10 11:54     ` Adrian Hunter
2024-09-04  3:07 ` [PATCH 21/21] KVM: TDX: Handle vCPU dissociation Rick Edgecombe
2024-09-09 15:41   ` Paolo Bonzini
2024-09-09 23:30     ` Edgecombe, Rick P
2024-09-10 10:45   ` Paolo Bonzini
2024-09-11  0:17     ` Edgecombe, Rick P
2024-11-04  9:45     ` Yan Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZwVG4bQ4g5Tm2jrt@google.com \
    --to=seanjc@google.com \
    --cc=dmatlack@google.com \
    --cc=isaku.yamahata@gmail.com \
    --cc=kai.huang@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nik.borisov@suse.com \
    --cc=pbonzini@redhat.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=yan.y.zhao@intel.com \
    --cc=yuan.yao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.