public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "Zhao, Yan Y" <yan.y.zhao@intel.com>,
	"sean.j.christopherson@intel.com"
	<sean.j.christopherson@intel.com>,
	"Huang, Kai" <kai.huang@intel.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"Yamahata, Isaku" <isaku.yamahata@intel.com>
Subject: Re: [PATCH 09/13] x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking
Date: Fri, 3 Jan 2025 01:28:16 +0000	[thread overview]
Message-ID: <fb17ae7ec64fa1eb884ab87b1729cdeb79c32871.camel@intel.com> (raw)
In-Reply-To: <20250101074959.412696-10-pbonzini@redhat.com>

On Wed, 2025-01-01 at 02:49 -0500, Paolo Bonzini wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> TDX module defines a TLB tracking protocol to make sure that no logical
> processor holds any stale Secure EPT (S-EPT or SEPT) TLB translations for a
> given TD private GPA range. After a successful TDH.MEM.RANGE.BLOCK,
> TDH.MEM.TRACK, and kicking off all vCPUs, TDX module ensures that the
> subsequent TDH.VP.ENTER on each vCPU will flush all stale TLB entries for
> the specified GPA ranges in TDH.MEM.RANGE.BLOCK. Wrap the
> TDH.MEM.RANGE.BLOCK with tdh_mem_range_block() and TDH.MEM.TRACK with
> tdh_mem_track() to enable the kernel to assist the TDX module in TLB
> tracking management.
> 
> The caller of tdh_mem_range_block() needs to specify "GPA" and "level" to
> request the TDX module to block the subsequent creation of TLB translation
> for a GPA range. This GPA range can correspond to a SEPT page or a TD
> private page at any level.
> 
> Contentions and errors are possible with the SEAMCALL TDH.MEM.RANGE.BLOCK.
> Therefore, the caller of tdh_mem_range_block() needs to check the function
> return value and retrieve extended error info from the function output
> params.
> 
> Upon TDH.MEM.RANGE.BLOCK success, no new TLB entries will be created for
> the specified private GPA range, though the existing TLB translations may
> still persist.
> 
> Call tdh_mem_track() after tdh_mem_range_block(). No extra info is required
> except the TDR HPA to denote the TD. TDH.MEM.TRACK will advance the TD's
> epoch counter to ensure TDX module will flush TLBs in all vCPUs once the
> vCPUs re-enter the TD. TDH.MEM.TRACK will fail to advance TD's epoch
> counter if there are vCPUs still running in non-root mode at the previous
> TD epoch counter. Therefore, send IPIs to kick off vCPUs after
> tdh_mem_track() to avoid the failure by forcing all vCPUs to re-enter the
> TD.

This patch doesn't implement the functionality described in the above paragraph,
the caller does it. It also doesn't explain why it leaves it all for the callers
instead of implementing the described flow on the x86 side. I think Dave will
want to understand this, so how about this instead:

So to ensure private GPA translations are flushed, callers must first call
tdh_mem_range_block(), then tdh_mem_track(), and lastly send IPIs to kick all
the vCPUs and trigger a TLB flush forcing them to re-enter. Don't export a
single operation and instead export functions that just expose the block and
track operations. Do this for a couple reasons:
1. The vCPU kick should use KVM's functionality for doing this, which can better
target sending IPIs to only the minimum required pCPUs.
2. tdh_mem_track() doesn't need to be executed if a vCPU has not entered a TD,
which is information only KVM knows.
3. Leaving the operations separate will allow for batching many
tdh_mem_range_block() calls before a tdh_mem_track(). While this batching will
not be done initially by KVM, it demonstrates that keeping mem block and track
as separate operations is a generally good design.


The above also loses the bit about ensuring all vCPUs are kicked to avoid epoch
related errors. I didn't think it was needed to justify the wrappers.

> 
> Contentions are also possible in TDH.MEM.TRACK. For example, TDH.MEM.TRACK
> may contend with TDH.VP.ENTER when advancing the TD epoch counter.
> tdh_mem_track() does not provide the retries for the caller. Callers can
> choose to avoid contentions or retry on their own.
> 
> [Kai: Switched from generic seamcall export]
> [Yan: Re-wrote the changelog]
> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> Message-ID: <20241112073648.22143-1-yan.y.zhao@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/include/asm/tdx.h  |  2 ++
>  arch/x86/virt/vmx/tdx/tdx.c | 27 +++++++++++++++++++++++++++
>  arch/x86/virt/vmx/tdx/tdx.h |  2 ++
>  3 files changed, 31 insertions(+)
> 
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index 32f3981d56c5..f0b7b7b7d506 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -142,6 +142,7 @@ u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, u64 hpa, u64 source, u64 *rcx,
>  u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, u64 level, u64 hpa, u64 *rcx, u64 *rdx);
>  u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
>  u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, u64 hpa, u64 *rcx, u64 *rdx);
> +u64 tdh_mem_range_block(struct tdx_td *td, u64 gpa, u64 level, u64 *rcx, u64 *rdx);
>  u64 tdh_mng_key_config(struct tdx_td *td);
>  u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
>  u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
> @@ -155,6 +156,7 @@ u64 tdh_vp_rd(struct tdx_vp *vp, u64 field, u64 *data);
>  u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask);
>  u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
>  u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
> +u64 tdh_mem_track(struct tdx_td *tdr);
>  u64 tdh_phymem_cache_wb(bool resume);
>  u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
>  #else
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index f39197d4eafc..c7e6f30d0a14 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -1561,6 +1561,23 @@ u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, u64 hpa, u64 *rcx, u64 *rdx)
>  }
>  EXPORT_SYMBOL_GPL(tdh_mem_page_aug);
>  
> +u64 tdh_mem_range_block(struct tdx_td *td, u64 gpa, u64 level, u64 *rcx, u64 *rdx)
> +{
> +	struct tdx_module_args args = {
> +		.rcx = gpa | level,
> +		.rdx = tdx_tdr_pa(td),
> +	};
> +	u64 ret;
> +
> +	ret = seamcall_ret(TDH_MEM_RANGE_BLOCK, &args);
> +
> +	*rcx = args.rcx;
> +	*rdx = args.rdx;

Similar to the others, these could be called extended_err1, extended_err2.

> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(tdh_mem_range_block);
> +
>  u64 tdh_mng_key_config(struct tdx_td *td)
>  {
>  	struct tdx_module_args args = {
> @@ -1734,6 +1751,16 @@ u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64
>  }
>  EXPORT_SYMBOL_GPL(tdh_phymem_page_reclaim);
>  
> +u64 tdh_mem_track(struct tdx_td *td)
> +{
> +	struct tdx_module_args args = {
> +		.rcx = tdx_tdr_pa(td),
> +	};
> +
> +	return seamcall(TDH_MEM_TRACK, &args);
> +}
> +EXPORT_SYMBOL_GPL(tdh_mem_track);
> +
>  u64 tdh_phymem_cache_wb(bool resume)
>  {
>  	struct tdx_module_args args = {
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index 80e6ef006085..4b0ad536afd9 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -20,6 +20,7 @@
>  #define TDH_MEM_SEPT_ADD		3
>  #define TDH_VP_ADDCX			4
>  #define TDH_MEM_PAGE_AUG		6
> +#define TDH_MEM_RANGE_BLOCK		7
>  #define TDH_MNG_KEY_CONFIG		8
>  #define TDH_MNG_CREATE			9
>  #define TDH_MNG_RD			11
> @@ -35,6 +36,7 @@
>  #define TDH_SYS_KEY_CONFIG		31
>  #define TDH_SYS_INIT			33
>  #define TDH_SYS_RD			34
> +#define TDH_MEM_TRACK			38
>  #define TDH_SYS_LP_INIT			35
>  #define TDH_SYS_TDMR_INIT		36
>  #define TDH_PHYMEM_CACHE_WB		40


  reply	other threads:[~2025-01-03  1:28 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-01  7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
2025-01-01  7:49 ` [PATCH 01/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management Paolo Bonzini
2025-01-02 19:44   ` Edgecombe, Rick P
2025-01-01  7:49 ` [PATCH 02/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation Paolo Bonzini
2025-01-03 17:51   ` Edgecombe, Rick P
2025-01-01  7:49 ` [PATCH 03/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation Paolo Bonzini
2025-01-01  7:49 ` [PATCH 04/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management Paolo Bonzini
2025-01-01  7:49 ` [PATCH 05/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX VM/vCPU field access Paolo Bonzini
2025-01-01  7:49 ` [PATCH 06/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX flush operations Paolo Bonzini
2025-01-01  7:49 ` [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages Paolo Bonzini
2025-01-02 21:59   ` Edgecombe, Rick P
2025-01-07  5:27     ` Yan Zhao
2025-01-07 19:48     ` Dave Hansen
2025-01-08  1:12       ` Yan Zhao
2025-01-08  1:20         ` Dave Hansen
2025-01-08  1:43           ` Edgecombe, Rick P
2025-01-08  2:14             ` Yan Zhao
2025-01-01  7:49 ` [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages Paolo Bonzini
2025-01-02 23:32   ` Edgecombe, Rick P
2025-01-06  5:50     ` Yan Zhao
2025-01-06 19:21       ` Edgecombe, Rick P
2025-01-07  6:37         ` Yan Zhao
2025-01-14 23:32       ` Paolo Bonzini
2025-01-15  0:49         ` Edgecombe, Rick P
2025-01-15  2:02           ` Edgecombe, Rick P
2025-01-15  5:49         ` Yan Zhao
2025-01-01  7:49 ` [PATCH 09/13] x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking Paolo Bonzini
2025-01-03  1:28   ` Edgecombe, Rick P [this message]
2025-01-07  6:40     ` Yan Zhao
2025-01-01  7:49 ` [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page Paolo Bonzini
2025-01-03  1:36   ` Edgecombe, Rick P
2025-01-07  6:43     ` Yan Zhao
2025-01-07  6:52       ` Yan Zhao
2025-01-07 22:13       ` Dave Hansen
2025-01-08  0:41         ` Yan Zhao
2025-01-08 16:31           ` Dave Hansen
2025-01-09  2:19             ` Yan Zhao
2025-01-01  7:49 ` [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents Paolo Bonzini
2025-01-03 18:02   ` Edgecombe, Rick P
2025-01-14 22:03     ` Paolo Bonzini
2025-01-14 22:10       ` Dave Hansen
2025-01-15  1:24       ` Yan Zhao
2025-01-07  7:01   ` Yan Zhao
2025-01-07 22:05     ` Dave Hansen
2025-01-01  7:49 ` [PATCH 12/13] x86/virt/tdx: Read essential global metadata for KVM Paolo Bonzini
2025-01-03 18:26   ` Edgecombe, Rick P
2025-01-01  7:49 ` [PATCH 13/13] x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX guest KeyID Paolo Bonzini
2025-01-02 19:43 ` [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Edgecombe, Rick P

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fb17ae7ec64fa1eb884ab87b1729cdeb79c32871.camel@intel.com \
    --to=rick.p.edgecombe@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=isaku.yamahata@intel.com \
    --cc=kai.huang@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox