From: "Huang, Kai" <kai.huang@intel.com>
To: Sean Christopherson <seanjc@google.com>, Yan Zhao <yan.y.zhao@intel.com>
Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
Yuan Yao <yuan.yao@intel.com>,
"isaku.yamahata@gmail.com" <isaku.yamahata@gmail.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"dmatlack@google.com" <dmatlack@google.com>,
"nik.borisov@suse.com" <nik.borisov@suse.com>
Subject: Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT
Date: Tue, 17 Sep 2024 14:11:01 +1200 [thread overview]
Message-ID: <b50bfb56-c2f2-493e-bd87-1c5aaa8bfb59@intel.com> (raw)
In-Reply-To: <ZuR09EqzU1WbQYGd@google.com>
On 14/09/2024 5:23 am, Sean Christopherson wrote:
> On Fri, Sep 13, 2024, Yan Zhao wrote:
>> This is a lock status report of TDX module for current SEAMCALL retry issue
>> based on code in TDX module public repo https://github.com/intel/tdx-module.git
>> branch TDX_1.5.05.
>>
>> TL;DR:
>> - tdh_mem_track() can contend with tdh_vp_enter().
>> - tdh_vp_enter() contends with tdh_mem*() when 0-stepping is suspected.
>
> The zero-step logic seems to be the most problematic. E.g. if KVM is trying to
> install a page on behalf of two vCPUs, and KVM resumes the guest if it encounters
> a FROZEN_SPTE when building the non-leaf SPTEs, then one of the vCPUs could
> trigger the zero-step mitigation if the vCPU that "wins" and gets delayed for
> whatever reason.
>
> Since FROZEN_SPTE is essentially bit-spinlock with a reaaaaaly slow slow-path,
> what if instead of resuming the guest if a page fault hits FROZEN_SPTE, KVM retries
> the fault "locally", i.e. _without_ redoing tdh_vp_enter() to see if the vCPU still
> hits the fault?
>
> For non-TDX, resuming the guest and letting the vCPU retry the instruction is
> desirable because in many cases, the winning task will install a valid mapping
> before KVM can re-run the vCPU, i.e. the fault will be fixed before the
> instruction is re-executed. In the happy case, that provides optimal performance
> as KVM doesn't introduce any extra delay/latency.
>
> But for TDX, the math is different as the cost of a re-hitting a fault is much,
> much higher, especially in light of the zero-step issues.
>
> E.g. if the TDP MMU returns a unique error code for the frozen case, and
> kvm_mmu_page_fault() is modified to return the raw return code instead of '1',
> then the TDX EPT violation path can safely retry locally, similar to the do-while
> loop in kvm_tdp_map_page().
>
> The only part I don't like about this idea is having two "retry" return values,
> which creates the potential for bugs due to checking one but not the other.
>
> Hmm, that could be avoided by passing a bool pointer as an out-param to communicate
> to the TDX S-EPT fault handler that the SPTE is frozen. I think I like that
> option better even though the out-param is a bit gross, because it makes it more
> obvious that the "frozen_spte" is a special case that doesn't need attention for
> most paths.
>
[...]
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 5a475a6456d4..cbf9e46203f3 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1174,6 +1174,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>
> retry:
> rcu_read_unlock();
> + if (ret == RET_PF_RETRY && is_frozen_spte(iter.old_spte))
> + return RET_PF_RETRY_FOZEN;
Ack the whole "retry on frozen" approach, either with RET_PF_RETRY_FOZEN
or fault->frozen_spte.
One minor side effect:
For normal VMs, the fault handler can also see a frozen spte, e.g, when
kvm_tdp_mmu_map() checks the middle level SPTE:
/*
* If SPTE has been frozen by another thread, just give up and
* retry, avoiding unnecessary page table allocation and free.
*/
if (is_frozen_spte(iter.old_spte))
goto retry;
So for normal VM this RET_PF_RETRY_FOZEN will change "go back to guest
to retry" to "retry in KVM internally".
As you mentioned above for normal VMs we probably always want to "go
back to guest to retry" even for FROZEN SPTE, but I guess this is a
minor issue that we can even notice.
Or we can additionally add:
if (ret == RET_PF_RETRY && is_frozen_spte(iter.old_spte)
&& is_mirrored_sptep(iter.sptep))
return RET_PF_RETRY_FOZEN;
So it only applies to TDX.
next prev parent reply other threads:[~2024-09-17 2:11 UTC|newest]
Thread overview: 139+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-04 3:07 [PATCH 00/21] TDX MMU Part 2 Rick Edgecombe
2024-09-04 3:07 ` [PATCH 01/21] KVM: x86/mmu: Implement memslot deletion for TDX Rick Edgecombe
2024-09-09 13:44 ` Paolo Bonzini
2024-09-09 21:06 ` Edgecombe, Rick P
2024-09-04 3:07 ` [PATCH 02/21] KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU Rick Edgecombe
2024-09-09 13:51 ` Paolo Bonzini
2024-09-04 3:07 ` [PATCH 03/21] KVM: x86/mmu: Do not enable page track for TD guest Rick Edgecombe
2024-09-09 13:53 ` Paolo Bonzini
2024-09-09 21:07 ` Edgecombe, Rick P
2024-09-04 3:07 ` [PATCH 04/21] KVM: VMX: Split out guts of EPT violation to common/exposed function Rick Edgecombe
2024-09-09 13:57 ` Paolo Bonzini
2024-09-09 16:07 ` Sean Christopherson
2024-09-10 7:36 ` Paolo Bonzini
2024-09-04 3:07 ` [PATCH 05/21] KVM: VMX: Teach EPT violation helper about private mem Rick Edgecombe
2024-09-09 13:59 ` Paolo Bonzini
2024-09-11 8:52 ` Chao Gao
2024-09-11 16:29 ` Edgecombe, Rick P
2024-09-12 0:39 ` Huang, Kai
2024-09-12 13:58 ` Sean Christopherson
2024-09-12 14:43 ` Edgecombe, Rick P
2024-09-12 14:46 ` Paolo Bonzini
2024-09-12 1:19 ` Huang, Kai
2024-09-04 3:07 ` [PATCH 06/21] KVM: TDX: Add accessors VMX VMCS helpers Rick Edgecombe
2024-09-09 14:19 ` Paolo Bonzini
2024-09-09 21:29 ` Edgecombe, Rick P
2024-09-10 10:48 ` Paolo Bonzini
2024-09-04 3:07 ` [PATCH 07/21] KVM: TDX: Add load_mmu_pgd method for TDX Rick Edgecombe
2024-09-11 2:48 ` Chao Gao
2024-09-11 2:49 ` Edgecombe, Rick P
2024-09-04 3:07 ` [PATCH 08/21] KVM: TDX: Set gfn_direct_bits to shared bit Rick Edgecombe
2024-09-09 15:21 ` Paolo Bonzini
2024-09-04 3:07 ` [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT Rick Edgecombe
2024-09-06 1:41 ` Huang, Kai
2024-09-09 20:25 ` Edgecombe, Rick P
2024-09-09 15:25 ` Paolo Bonzini
2024-09-09 20:22 ` Edgecombe, Rick P
2024-09-09 21:11 ` Sean Christopherson
2024-09-09 21:23 ` Sean Christopherson
2024-09-09 22:34 ` Edgecombe, Rick P
2024-09-09 23:58 ` Sean Christopherson
2024-09-10 0:50 ` Edgecombe, Rick P
2024-09-10 1:46 ` Sean Christopherson
2024-09-11 1:17 ` Huang, Kai
2024-09-11 2:48 ` Edgecombe, Rick P
2024-09-11 22:55 ` Huang, Kai
2024-09-10 13:15 ` Paolo Bonzini
2024-09-10 13:57 ` Sean Christopherson
2024-09-10 15:16 ` Paolo Bonzini
2024-09-10 15:57 ` Sean Christopherson
2024-09-10 16:28 ` Edgecombe, Rick P
2024-09-10 17:42 ` Sean Christopherson
2024-09-13 8:36 ` Yan Zhao
2024-09-13 17:23 ` Sean Christopherson
2024-09-13 19:19 ` Edgecombe, Rick P
2024-09-13 22:18 ` Sean Christopherson
2024-09-14 9:27 ` Yan Zhao
2024-09-15 9:53 ` Yan Zhao
2024-09-17 1:31 ` Huang, Kai
2024-09-25 10:53 ` Yan Zhao
2024-10-08 14:51 ` Sean Christopherson
2024-10-10 5:23 ` Yan Zhao
2024-10-10 17:33 ` Sean Christopherson
2024-10-10 21:53 ` Edgecombe, Rick P
2024-10-11 2:30 ` Yan Zhao
2024-10-14 10:54 ` Huang, Kai
2024-10-14 17:36 ` Edgecombe, Rick P
2024-10-14 23:03 ` Huang, Kai
2024-10-15 1:24 ` Edgecombe, Rick P
2024-10-11 2:06 ` Yan Zhao
2024-10-16 14:13 ` Yan Zhao
2024-09-17 2:11 ` Huang, Kai [this message]
2024-09-13 19:19 ` Edgecombe, Rick P
2024-09-14 10:00 ` Yan Zhao
2024-09-04 3:07 ` [PATCH 10/21] KVM: TDX: Require TDP MMU and mmio caching for TDX Rick Edgecombe
2024-09-09 15:26 ` Paolo Bonzini
2024-09-12 0:15 ` Huang, Kai
2024-09-04 3:07 ` [PATCH 11/21] KVM: x86/mmu: Add setter for shadow_mmio_value Rick Edgecombe
2024-09-09 15:33 ` Paolo Bonzini
2024-09-04 3:07 ` [PATCH 12/21] KVM: TDX: Set per-VM shadow_mmio_value to 0 Rick Edgecombe
2024-09-09 15:33 ` Paolo Bonzini
2024-09-04 3:07 ` [PATCH 13/21] KVM: TDX: Handle TLB tracking for TDX Rick Edgecombe
2024-09-10 8:16 ` Paolo Bonzini
2024-09-10 23:49 ` Edgecombe, Rick P
2024-10-14 6:34 ` Yan Zhao
2024-09-11 6:25 ` Xu Yilun
2024-09-11 17:28 ` Edgecombe, Rick P
2024-09-12 4:54 ` Yan Zhao
2024-09-12 14:44 ` Edgecombe, Rick P
2024-09-12 7:47 ` Xu Yilun
2024-09-04 3:07 ` [PATCH 14/21] KVM: TDX: Implement hooks to propagate changes of TDP MMU mirror page table Rick Edgecombe
2024-09-06 2:10 ` Huang, Kai
2024-09-09 21:03 ` Edgecombe, Rick P
2024-09-10 1:52 ` Yan Zhao
2024-09-10 9:33 ` Paolo Bonzini
2024-09-10 23:58 ` Edgecombe, Rick P
2024-09-11 1:05 ` Yan Zhao
2024-10-30 3:03 ` Binbin Wu
2024-11-04 9:09 ` Yan Zhao
2024-09-04 3:07 ` [PATCH 15/21] KVM: TDX: Implement hook to get max mapping level of private pages Rick Edgecombe
2024-09-10 10:17 ` Paolo Bonzini
2024-09-04 3:07 ` [PATCH 16/21] KVM: TDX: Premap initial guest memory Rick Edgecombe
2024-09-10 10:24 ` Paolo Bonzini
2024-09-11 0:19 ` Edgecombe, Rick P
2024-09-13 13:33 ` Adrian Hunter
2024-09-13 19:49 ` Edgecombe, Rick P
2024-09-10 10:49 ` Paolo Bonzini
2024-09-11 0:30 ` Edgecombe, Rick P
2024-09-11 10:39 ` Paolo Bonzini
2024-09-11 16:36 ` Edgecombe, Rick P
2024-09-04 3:07 ` [PATCH 17/21] KVM: TDX: MTRR: implement get_mt_mask() for TDX Rick Edgecombe
2024-09-10 10:04 ` Paolo Bonzini
2024-09-10 14:05 ` Sean Christopherson
2024-09-04 3:07 ` [PATCH 18/21] KVM: x86/mmu: Export kvm_tdp_map_page() Rick Edgecombe
2024-09-10 10:02 ` Paolo Bonzini
2024-09-04 3:07 ` [PATCH 19/21] KVM: TDX: Add an ioctl to create initial guest memory Rick Edgecombe
2024-09-04 4:53 ` Yan Zhao
2024-09-04 14:01 ` Edgecombe, Rick P
2024-09-06 16:30 ` Edgecombe, Rick P
2024-09-09 1:29 ` Yan Zhao
2024-09-10 10:13 ` Paolo Bonzini
2024-09-11 0:11 ` Edgecombe, Rick P
2024-09-04 13:56 ` Edgecombe, Rick P
2024-09-10 10:16 ` Paolo Bonzini
2024-09-11 0:12 ` Edgecombe, Rick P
2024-09-04 3:07 ` [PATCH 20/21] KVM: TDX: Finalize VM initialization Rick Edgecombe
2024-09-04 15:37 ` Adrian Hunter
2024-09-04 16:09 ` Edgecombe, Rick P
2024-09-10 10:33 ` Paolo Bonzini
2024-09-10 11:15 ` Adrian Hunter
2024-09-10 11:28 ` Paolo Bonzini
2024-09-10 11:31 ` Adrian Hunter
2024-09-10 10:25 ` Paolo Bonzini
2024-09-10 11:54 ` Adrian Hunter
2024-09-04 3:07 ` [PATCH 21/21] KVM: TDX: Handle vCPU dissociation Rick Edgecombe
2024-09-09 15:41 ` Paolo Bonzini
2024-09-09 23:30 ` Edgecombe, Rick P
2024-09-10 10:45 ` Paolo Bonzini
2024-09-11 0:17 ` Edgecombe, Rick P
2024-11-04 9:45 ` Yan Zhao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b50bfb56-c2f2-493e-bd87-1c5aaa8bfb59@intel.com \
--to=kai.huang@intel.com \
--cc=dmatlack@google.com \
--cc=isaku.yamahata@gmail.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=nik.borisov@suse.com \
--cc=pbonzini@redhat.com \
--cc=rick.p.edgecombe@intel.com \
--cc=seanjc@google.com \
--cc=yan.y.zhao@intel.com \
--cc=yuan.yao@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox