* [PATCH V4 0/2] x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present
@ 2025-07-23 12:05 Adrian Hunter
2025-07-23 12:05 ` [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page() Adrian Hunter
2025-07-23 12:05 ` [PATCH V4 2/2] x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present Adrian Hunter
0 siblings, 2 replies; 15+ messages in thread
From: Adrian Hunter @ 2025-07-23 12:05 UTC
To: Dave Hansen, pbonzini, seanjc, vannapurve
Cc: Tony Luck, Borislav Petkov, Thomas Gleixner, Ingo Molnar, x86,
H Peter Anvin, linux-kernel, kvm, rick.p.edgecombe, kas,
kai.huang, reinette.chatre, xiaoyao.li, tony.lindgren, binbin.wu,
isaku.yamahata, yan.y.zhao, chao.gao
Hi
Here are 2 small self-explanatory patches related to clearing TDX private
pages.
Patch 1 is a minor tidy-up.
In patch 2, by skipping the clearing step, shutdown time can improve by
up to 40%.
Changes in V4:
x86/tdx: Eliminate duplicate code in tdx_clear_page()
Add and use tdx_quirk_reset_page() for KVM (Sean)
x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present
Add TDX Module Base spec. version (Rick)
Add Rick's Rev'd-by
Changes in V3:
x86/tdx: Eliminate duplicate code in tdx_clear_page()
Explain "quirk" rename in commit message (Rick)
Explain mb() change in commit message (Rick)
Add Rev'd-by, Ack'd-by tags
x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present
Remove "flush cache" comments (Rick)
Update function comment to better relate to "quirk" naming (Rick)
Add "via MOVDIR64B" to comment (Xiaoyao)
Add Rev'd-by, Ack'd-by tags
Changes in V2 (as requested by Dave):
x86/tdx: Eliminate duplicate code in tdx_clear_page()
Rename reset_tdx_pages() to tdx_quirk_reset_paddr()
Call tdx_quirk_reset_paddr() directly
x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present
Improve the comment
Adrian Hunter (2):
x86/tdx: Eliminate duplicate code in tdx_clear_page()
x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/kvm/vmx/tdx.c | 25 +++----------------------
arch/x86/virt/vmx/tdx/tdx.c | 20 +++++++++++++++-----
3 files changed, 20 insertions(+), 27 deletions(-)
Regards
Adrian
* [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 12:05 [PATCH V4 0/2] x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present Adrian Hunter
@ 2025-07-23 12:05 ` Adrian Hunter
2025-07-23 14:06 ` Edgecombe, Rick P
2025-07-23 15:57 ` Sean Christopherson
2025-07-23 12:05 ` [PATCH V4 2/2] x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present Adrian Hunter
1 sibling, 2 replies; 15+ messages in thread
From: Adrian Hunter @ 2025-07-23 12:05 UTC
To: Dave Hansen, pbonzini, seanjc, vannapurve
Cc: Tony Luck, Borislav Petkov, Thomas Gleixner, Ingo Molnar, x86,
H Peter Anvin, linux-kernel, kvm, rick.p.edgecombe, kas,
kai.huang, reinette.chatre, xiaoyao.li, tony.lindgren, binbin.wu,
isaku.yamahata, yan.y.zhao, chao.gao
tdx_clear_page() and reset_tdx_pages() duplicate the TDX page clearing
logic. Rename reset_tdx_pages() to tdx_quirk_reset_paddr() and create
tdx_quirk_reset_page() to call tdx_quirk_reset_paddr() and be used in
place of tdx_clear_page().
The new name reflects that, in fact, the clearing is necessary only for
hardware with a certain quirk. That is dealt with in a subsequent patch
but doing the rename here avoids additional churn.
Note reset_tdx_pages() is slightly different from tdx_clear_page() because,
more appropriately, it uses mb() in place of __mb(). Except when extra
debugging is enabled (kcsan at present), mb() just calls __mb().
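
As a rough userspace illustration of the relationship described above (not
kernel code), mb() can be thought of as the architecture barrier plus an
optional instrumentation hook. The names model_mb(), model_arch_mb(),
model_kcsan_mb() and MODEL_KCSAN are made up for this sketch, and the
GCC/Clang builtin fence only stands in for the real barrier instruction:

```c
#include <assert.h>

/* Hypothetical counters, present only so the behaviour can be observed. */
static int model_hw_barriers;
static int model_kcsan_hooks;

/* Stand-in for __mb(): the plain architecture barrier. */
static void model_arch_mb(void)
{
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
	model_hw_barriers++;
}

/* Stand-in for the debug hook; it does nothing unless MODEL_KCSAN is defined. */
static void model_kcsan_mb(void)
{
#ifdef MODEL_KCSAN
	model_kcsan_hooks++;
#endif
}

/* Stand-in for mb(): optional instrumentation, then the same barrier. */
static void model_mb(void)
{
	model_kcsan_mb();
	model_arch_mb();
}
```

Built without MODEL_KCSAN, model_mb() does exactly what model_arch_mb() does,
which is why the switch from __mb() to mb() is not a functional change in
normal builds.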
Reviewed-by: Kirill A. Shutemov <kas@kernel.org>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
Changes in V4:
Add and use tdx_quirk_reset_page() for KVM (Sean)
Changes in V3:
Explain "quirk" rename in commit message (Rick)
Explain mb() change in commit message (Rick)
Add Rev'd-by, Ack'd-by tags
Changes in V2:
Rename reset_tdx_pages() to tdx_quirk_reset_paddr()
Call tdx_quirk_reset_paddr() directly
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/kvm/vmx/tdx.c | 25 +++----------------------
arch/x86/virt/vmx/tdx/tdx.c | 10 ++++++++--
3 files changed, 13 insertions(+), 24 deletions(-)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 7ddef3a69866..57b46f05ff97 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -131,6 +131,8 @@ int tdx_guest_keyid_alloc(void);
u32 tdx_get_nr_guest_keyids(void);
void tdx_guest_keyid_free(unsigned int keyid);
+void tdx_quirk_reset_page(struct page *page);
+
struct tdx_td {
/* TD root structure: */
struct page *tdr_page;
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 573d6f7d1694..ebb36229c7c8 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -283,25 +283,6 @@ static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu)
vcpu->cpu = -1;
}
-static void tdx_clear_page(struct page *page)
-{
- const void *zero_page = (const void *) page_to_virt(ZERO_PAGE(0));
- void *dest = page_to_virt(page);
- unsigned long i;
-
- /*
- * The page could have been poisoned. MOVDIR64B also clears
- * the poison bit so the kernel can safely use the page again.
- */
- for (i = 0; i < PAGE_SIZE; i += 64)
- movdir64b(dest + i, zero_page);
- /*
- * MOVDIR64B store uses WC buffer. Prevent following memory reads
- * from seeing potentially poisoned cache.
- */
- __mb();
-}
-
static void tdx_no_vcpus_enter_start(struct kvm *kvm)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
@@ -347,7 +328,7 @@ static int tdx_reclaim_page(struct page *page)
r = __tdx_reclaim_page(page);
if (!r)
- tdx_clear_page(page);
+ tdx_quirk_reset_page(page);
return r;
}
@@ -596,7 +577,7 @@ static void tdx_reclaim_td_control_pages(struct kvm *kvm)
pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err);
return;
}
- tdx_clear_page(kvm_tdx->td.tdr_page);
+ tdx_quirk_reset_page(kvm_tdx->td.tdr_page);
__free_page(kvm_tdx->td.tdr_page);
kvm_tdx->td.tdr_page = NULL;
@@ -1717,7 +1698,7 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err);
return -EIO;
}
- tdx_clear_page(page);
+ tdx_quirk_reset_page(page);
tdx_unpin(kvm, page);
return 0;
}
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index c7a9a087ccaf..fc8d8e444f15 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -637,7 +637,7 @@ static int tdmrs_set_up_pamt_all(struct tdmr_info_list *tdmr_list,
* clear these pages. Note this function doesn't flush cache of
* these TDX private pages. The caller should make sure of that.
*/
-static void reset_tdx_pages(unsigned long base, unsigned long size)
+static void tdx_quirk_reset_paddr(unsigned long base, unsigned long size)
{
const void *zero_page = (const void *)page_address(ZERO_PAGE(0));
unsigned long phys, end;
@@ -654,9 +654,15 @@ static void reset_tdx_pages(unsigned long base, unsigned long size)
mb();
}
+void tdx_quirk_reset_page(struct page *page)
+{
+ tdx_quirk_reset_paddr(page_to_phys(page), PAGE_SIZE);
+}
+EXPORT_SYMBOL_GPL(tdx_quirk_reset_page);
+
static void tdmr_reset_pamt(struct tdmr_info *tdmr)
{
- tdmr_do_pamt_func(tdmr, reset_tdx_pages);
+ tdmr_do_pamt_func(tdmr, tdx_quirk_reset_paddr);
}
static void tdmrs_reset_pamt_all(struct tdmr_info_list *tdmr_list)
--
2.48.1
* [PATCH V4 2/2] x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present
2025-07-23 12:05 [PATCH V4 0/2] x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present Adrian Hunter
2025-07-23 12:05 ` [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page() Adrian Hunter
@ 2025-07-23 12:05 ` Adrian Hunter
2025-07-23 12:31 ` Xiaoyao Li
1 sibling, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2025-07-23 12:05 UTC
To: Dave Hansen, pbonzini, seanjc, vannapurve
Cc: Tony Luck, Borislav Petkov, Thomas Gleixner, Ingo Molnar, x86,
H Peter Anvin, linux-kernel, kvm, rick.p.edgecombe, kas,
kai.huang, reinette.chatre, xiaoyao.li, tony.lindgren, binbin.wu,
isaku.yamahata, yan.y.zhao, chao.gao
Avoid clearing reclaimed TDX private pages unless the platform is affected
by the X86_BUG_TDX_PW_MCE erratum. This significantly reduces VM shutdown
time on unaffected systems.
Background
KVM currently clears reclaimed TDX private pages using MOVDIR64B, which:
- Clears the TD Owner bit (which identifies TDX private memory) and
integrity metadata without triggering integrity violations.
- Clears poison from cache lines without consuming it, avoiding MCEs on
access (refer TDX Module Base spec. 1348549-006US section 6.5.
Handling Machine Check Events during Guest TD Operation).
The TDX module also uses MOVDIR64B to initialize private pages before use.
If cache flushing is needed, it sets TDX_FEATURES.CLFLUSH_BEFORE_ALLOC.
However, KVM currently flushes unconditionally, refer commit 94c477a751c7b
("x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages")
In contrast, when private pages are reclaimed, the TDX Module handles
flushing via the TDH.PHYMEM.CACHE.WB SEAMCALL.
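
For readers unfamiliar with the clearing loop referred to in this Background
section, here is a minimal userspace sketch of its shape. It assumes a 4 KiB
page and models each MOVDIR64B as a 64-byte copy from a zeroed source; the
names model_movdir64b() and model_clear_range() are hypothetical, and none of
the TD Owner bit, integrity or poison semantics are modelled:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_CHUNK     64u	/* one MOVDIR64B store covers 64 bytes */

/* Userspace stand-in for movdir64b(dst, src): a single 64-byte store. */
static void model_movdir64b(void *dst, const void *src)
{
	memcpy(dst, src, MODEL_CHUNK);
}

/*
 * Zero [buf, buf + size) in 64-byte steps, always reading from the same
 * zeroed source chunk. Returns the number of stores issued.
 */
static size_t model_clear_range(unsigned char *buf, size_t size)
{
	static const unsigned char zero_chunk[MODEL_CHUNK];
	size_t off, stores = 0;

	assert(size % MODEL_CHUNK == 0);

	for (off = 0; off < size; off += MODEL_CHUNK) {
		model_movdir64b(buf + off, zero_chunk);
		stores++;
	}

	return stores;
}
```

Under these assumptions one page costs 4096 / 64 = 64 stores, and the cost
scales linearly with the amount of guest memory being reclaimed.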
Problem
Clearing all private pages during VM shutdown is costly. For guests
with a large amount of memory it can take minutes.
Solution
TDX Module Base Architecture spec. documents that private pages reclaimed
from a TD should be initialized using MOVDIR64B, in order to avoid
integrity violation or TD bit mismatch detection when later being read
using a shared HKID, refer April 2025 spec. "Page Initialization" in
section "8.6.2. Platforms not Using ACT: Required Cache Flush and
Initialization by the Host VMM"
That is an overstatement and will be clarified in coming versions of the
spec. In fact, as outlined in "Table 16.2: Non-ACT Platforms Checks on
Memory" and "Table 16.3: Non-ACT Platforms Checks on Memory Reads in Li
Mode" in the same spec, there is no issue accessing such reclaimed pages
using a shared key that does not have integrity enabled. Linux always uses
KeyID 0 which never has integrity enabled. KeyID 0 is also the TME KeyID
which disallows integrity, refer "TME Policy/Encryption Algorithm" bit
description in "Intel Architecture Memory Encryption Technologies" spec
version 1.6 April 2025. So there is no need to clear pages to avoid
integrity violations.
There remains a risk of poison consumption. However, in the context of
TDX, it is expected that there would be a machine check associated with the
original poisoning. On some platforms that results in a panic. However
platforms may support "SEAM_NR" Machine Check capability, in which case
Linux machine check handler marks the page as poisoned, which prevents it
from being allocated anymore, refer commit 7911f145de5fe ("x86/mce:
Implement recovery for errors in TDX/SEAM non-root mode")
Improvement
By skipping the clearing step on unaffected platforms, shutdown time
can improve by up to 40%.
On platforms with the X86_BUG_TDX_PW_MCE erratum (SPR and EMR), continue
clearing because these platforms may trigger poison on partial writes to
previously-private pages, even with KeyID 0, refer commit 1e536e1068970
("x86/cpu: Detect TDX partial write machine check erratum")
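
The resulting behaviour can be modelled in userspace as follows. This is only
a sketch: model_has_pw_mce_erratum is a hypothetical stand-in for
boot_cpu_has_bug(X86_BUG_TDX_PW_MCE), model_quirk_reset() is a made-up name,
and memset() stands in for the MOVDIR64B stores:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_CHUNK 64u

/* Hypothetical stand-in for boot_cpu_has_bug(X86_BUG_TDX_PW_MCE). */
static bool model_has_pw_mce_erratum;

/*
 * Clear the range only on platforms with the erratum.
 * Returns the number of bytes actually cleared.
 */
static size_t model_quirk_reset(unsigned char *buf, size_t size)
{
	size_t off;

	assert(size % MODEL_CHUNK == 0);

	/* Unaffected platform: leave the contents alone and skip the cost. */
	if (!model_has_pw_mce_erratum)
		return 0;

	for (off = 0; off < size; off += MODEL_CHUNK)
		memset(buf + off, 0, MODEL_CHUNK);

	return off;
}
```

On an unaffected platform the call returns immediately and the buffer is
untouched; with the flag set, every 64-byte chunk is zeroed as before.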
Reviewed-by: Kirill A. Shutemov <kas@kernel.org>
Acked-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
Changes in V4:
Add TDX Module Base spec. version (Rick)
Add Rick's Rev'd-by
Changes in V3:
Remove "flush cache" comments (Rick)
Update function comment to better relate to "quirk" naming (Rick)
Add "via MOVDIR64B" to comment (Xiaoyao)
Add Rev'd-by, Ack'd-by tags
Changes in V2:
Improve the comment
arch/x86/virt/vmx/tdx/tdx.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index fc8d8e444f15..ef22fc2b9af0 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -633,15 +633,19 @@ static int tdmrs_set_up_pamt_all(struct tdmr_info_list *tdmr_list,
}
/*
- * Convert TDX private pages back to normal by using MOVDIR64B to
- * clear these pages. Note this function doesn't flush cache of
- * these TDX private pages. The caller should make sure of that.
+ * Convert TDX private pages back to normal by using MOVDIR64B to clear these
+ * pages. Typically, any write to the page will convert it from TDX private back
+ * to normal kernel memory. Systems with the X86_BUG_TDX_PW_MCE erratum need to
+ * do the conversion explicitly via MOVDIR64B.
*/
static void tdx_quirk_reset_paddr(unsigned long base, unsigned long size)
{
const void *zero_page = (const void *)page_address(ZERO_PAGE(0));
unsigned long phys, end;
+ if (!boot_cpu_has_bug(X86_BUG_TDX_PW_MCE))
+ return;
+
end = base + size;
for (phys = base; phys < end; phys += 64)
movdir64b(__va(phys), zero_page);
--
2.48.1
* Re: [PATCH V4 2/2] x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present
2025-07-23 12:05 ` [PATCH V4 2/2] x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present Adrian Hunter
@ 2025-07-23 12:31 ` Xiaoyao Li
0 siblings, 0 replies; 15+ messages in thread
From: Xiaoyao Li @ 2025-07-23 12:31 UTC
To: Adrian Hunter, Dave Hansen, pbonzini, seanjc, vannapurve
Cc: Tony Luck, Borislav Petkov, Thomas Gleixner, Ingo Molnar, x86,
H Peter Anvin, linux-kernel, kvm, rick.p.edgecombe, kas,
kai.huang, reinette.chatre, tony.lindgren, binbin.wu,
isaku.yamahata, yan.y.zhao, chao.gao
On 7/23/2025 8:05 PM, Adrian Hunter wrote:
> Avoid clearing reclaimed TDX private pages unless the platform is affected
> by the X86_BUG_TDX_PW_MCE erratum. This significantly reduces VM shutdown
> time on unaffected systems.
>
> Background
>
> KVM currently clears reclaimed TDX private pages using MOVDIR64B, which:
>
> - Clears the TD Owner bit (which identifies TDX private memory) and
> integrity metadata without triggering integrity violations.
> - Clears poison from cache lines without consuming it, avoiding MCEs on
> access (refer TDX Module Base spec. 1348549-006US section 6.5.
> Handling Machine Check Events during Guest TD Operation).
>
> The TDX module also uses MOVDIR64B to initialize private pages before use.
> If cache flushing is needed, it sets TDX_FEATURES.CLFLUSH_BEFORE_ALLOC.
> However, KVM currently flushes unconditionally, refer commit 94c477a751c7b
> ("x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages")
>
> In contrast, when private pages are reclaimed, the TDX Module handles
> flushing via the TDH.PHYMEM.CACHE.WB SEAMCALL.
>
> Problem
>
> Clearing all private pages during VM shutdown is costly. For guests
> with a large amount of memory it can take minutes.
>
> Solution
>
> TDX Module Base Architecture spec. documents that private pages reclaimed
> from a TD should be initialized using MOVDIR64B, in order to avoid
> integrity violation or TD bit mismatch detection when later being read
> using a shared HKID, refer April 2025 spec. "Page Initialization" in
> section "8.6.2. Platforms not Using ACT: Required Cache Flush and
> Initialization by the Host VMM"
>
> That is an overstatement and will be clarified in coming versions of the
> spec. In fact, as outlined in "Table 16.2: Non-ACT Platforms Checks on
> Memory" and "Table 16.3: Non-ACT Platforms Checks on Memory Reads in Li
> Mode" in the same spec, there is no issue accessing such reclaimed pages
> using a shared key that does not have integrity enabled. Linux always uses
> KeyID 0 which never has integrity enabled. KeyID 0 is also the TME KeyID
> which disallows integrity, refer "TME Policy/Encryption Algorithm" bit
> description in "Intel Architecture Memory Encryption Technologies" spec
> version 1.6 April 2025. So there is no need to clear pages to avoid
> integrity violations.
>
> There remains a risk of poison consumption. However, in the context of
> TDX, it is expected that there would be a machine check associated with the
> original poisoning. On some platforms that results in a panic. However
> platforms may support "SEAM_NR" Machine Check capability, in which case
> Linux machine check handler marks the page as poisoned, which prevents it
> from being allocated anymore, refer commit 7911f145de5fe ("x86/mce:
> Implement recovery for errors in TDX/SEAM non-root mode")
>
> Improvement
>
> By skipping the clearing step on unaffected platforms, shutdown time
> can improve by up to 40%.
>
> On platforms with the X86_BUG_TDX_PW_MCE erratum (SPR and EMR), continue
> clearing because these platforms may trigger poison on partial writes to
> previously-private pages, even with KeyID 0, refer commit 1e536e1068970
> ("x86/cpu: Detect TDX partial write machine check erratum")
>
> Reviewed-by: Kirill A. Shutemov <kas@kernel.org>
> Acked-by: Kai Huang <kai.huang@intel.com>
> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>
>
> Changes in V4:
>
> Add TDX Module Base spec. version (Rick)
> Add Rick's Rev'd-by
>
> Changes in V3:
>
> Remove "flush cache" comments (Rick)
> Update function comment to better relate to "quirk" naming (Rick)
> Add "via MOVDIR64B" to comment (Xiaoyao)
> Add Rev'd-by, Ack'd-by tags
>
> Changes in V2:
>
> Improve the comment
>
>
> arch/x86/virt/vmx/tdx/tdx.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index fc8d8e444f15..ef22fc2b9af0 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -633,15 +633,19 @@ static int tdmrs_set_up_pamt_all(struct tdmr_info_list *tdmr_list,
> }
>
> /*
> - * Convert TDX private pages back to normal by using MOVDIR64B to
> - * clear these pages. Note this function doesn't flush cache of
> - * these TDX private pages. The caller should make sure of that.
> + * Convert TDX private pages back to normal by using MOVDIR64B to clear these
> + * pages. Typically, any write to the page will convert it from TDX private back
> + * to normal kernel memory. Systems with the X86_BUG_TDX_PW_MCE erratum need to
> + * do the conversion explicitly via MOVDIR64B.
> */
> static void tdx_quirk_reset_paddr(unsigned long base, unsigned long size)
> {
> const void *zero_page = (const void *)page_address(ZERO_PAGE(0));
> unsigned long phys, end;
>
> + if (!boot_cpu_has_bug(X86_BUG_TDX_PW_MCE))
> + return;
> +
> end = base + size;
> for (phys = base; phys < end; phys += 64)
> movdir64b(__va(phys), zero_page);
* Re: [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 12:05 ` [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page() Adrian Hunter
@ 2025-07-23 14:06 ` Edgecombe, Rick P
2025-07-23 14:37 ` Adrian Hunter
2025-07-23 15:57 ` Sean Christopherson
1 sibling, 1 reply; 15+ messages in thread
From: Edgecombe, Rick P @ 2025-07-23 14:06 UTC
To: Annapurve, Vishal, pbonzini@redhat.com, Hunter, Adrian,
seanjc@google.com, dave.hansen@linux.intel.com
Cc: kvm@vger.kernel.org, Li, Xiaoyao, Huang, Kai, Zhao, Yan Y,
Luck, Tony, kas@kernel.org, mingo@redhat.com, Chatre, Reinette,
tony.lindgren@linux.intel.com, tglx@linutronix.de,
Yamahata, Isaku, linux-kernel@vger.kernel.org,
binbin.wu@linux.intel.com, hpa@zytor.com, bp@alien8.de, Gao, Chao,
x86@kernel.org
On Wed, 2025-07-23 at 15:05 +0300, Adrian Hunter wrote:
>
> +void tdx_quirk_reset_page(struct page *page)
> +{
> + tdx_quirk_reset_paddr(page_to_phys(page), PAGE_SIZE);
> +}
> +EXPORT_SYMBOL_GPL(tdx_quirk_reset_page);
> +
> static void tdmr_reset_pamt(struct tdmr_info *tdmr)
> {
> - tdmr_do_pamt_func(tdmr, reset_tdx_pages);
> + tdmr_do_pamt_func(tdmr, tdx_quirk_reset_paddr);
> }
>
Up the call chain there is:
/*
* According to the TDX hardware spec, if the platform
* doesn't have the "partial write machine check"
* erratum, any kernel read/write will never cause #MC
* in kernel space, thus it's OK to not convert PAMTs
* back to normal. But do the conversion anyway here
* as suggested by the TDX spec.
*/
tdmrs_reset_pamt_all(&tdx_tdmr_list);
So the comment says it's going to clear it even if partial write machine check
is not present. Then the call chain goes through a bunch of functions not named
"quirk", then finally calls "tdx_quirk_reset_paddr" which actually skips the
page clearing.
I think you need to either fix the comment and rename the whole stack to
"tdx_quirk_...", or make tdx_quirk_reset_page() be the one that has the errata
check, and the error path above call the PA version reset_tdx_pages() without
the errata check.
The latter seems better to me for the sake of less churn.
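
To make the second ("latter") option concrete, here is a userspace sketch of
that split, assuming the erratum check lives only in the page-level wrapper
while the range helper keeps clearing unconditionally. All names are
hypothetical and memset() stands in for the MOVDIR64B loop; this only
illustrates the alternative, not what the series ends up doing:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for boot_cpu_has_bug(X86_BUG_TDX_PW_MCE). */
static bool model_has_pw_mce_erratum;

/* Counts how many ranges were really cleared, for observation only. */
static unsigned int model_clears;

/* Range helper without any erratum check: always clears. */
static void model_reset_range(unsigned char *buf, size_t size)
{
	memset(buf, 0, size);
	model_clears++;
}

/* Page-level wrapper for the reclaim path: clears only on affected CPUs. */
static void model_quirk_reset_page(unsigned char *page)
{
	if (!model_has_pw_mce_erratum)
		return;

	model_reset_range(page, MODEL_PAGE_SIZE);
}

/* Error-path caller: PAMT is always reset, as the existing comment says. */
static void model_reset_pamt(unsigned char *pamt, size_t size)
{
	model_reset_range(pamt, size);
}
```

With this arrangement the init error path behaves as its comment promises,
and only the function with "quirk" in its name is conditional.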
* Re: [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 14:06 ` Edgecombe, Rick P
@ 2025-07-23 14:37 ` Adrian Hunter
2025-07-23 14:44 ` Edgecombe, Rick P
0 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2025-07-23 14:37 UTC
To: Edgecombe, Rick P, Annapurve, Vishal, pbonzini@redhat.com,
seanjc@google.com, dave.hansen@linux.intel.com
Cc: kvm@vger.kernel.org, Li, Xiaoyao, Huang, Kai, Zhao, Yan Y,
Luck, Tony, kas@kernel.org, mingo@redhat.com, Chatre, Reinette,
tony.lindgren@linux.intel.com, tglx@linutronix.de,
Yamahata, Isaku, linux-kernel@vger.kernel.org,
binbin.wu@linux.intel.com, hpa@zytor.com, bp@alien8.de, Gao, Chao,
x86@kernel.org
On 23/07/2025 17:06, Edgecombe, Rick P wrote:
> On Wed, 2025-07-23 at 15:05 +0300, Adrian Hunter wrote:
>>
>> +void tdx_quirk_reset_page(struct page *page)
>> +{
>> + tdx_quirk_reset_paddr(page_to_phys(page), PAGE_SIZE);
>> +}
>> +EXPORT_SYMBOL_GPL(tdx_quirk_reset_page);
>> +
>> static void tdmr_reset_pamt(struct tdmr_info *tdmr)
>> {
>> - tdmr_do_pamt_func(tdmr, reset_tdx_pages);
>> + tdmr_do_pamt_func(tdmr, tdx_quirk_reset_paddr);
>> }
>>
>
> Up the call chain there is:
> /*
> * According to the TDX hardware spec, if the platform
> * doesn't have the "partial write machine check"
> * erratum, any kernel read/write will never cause #MC
> * in kernel space, thus it's OK to not convert PAMTs
> * back to normal. But do the conversion anyway here
> * as suggested by the TDX spec.
> */
> tdmrs_reset_pamt_all(&tdx_tdmr_list);
>
>
> So the comment says it's going to clear it even if partial write machine check
> is not present. Then the call chain goes through a bunch of functions not named
> "quirk", then finally calls "tdx_quirk_reset_paddr" which actually skips the
> page clearing.
>
> I think you need to either fix the comment and rename the whole stack to
> "tdx_quirk_...", or make tdx_quirk_reset_page() be the one that has the errata
> check, and the error path above call the PA version reset_tdx_pages() without
> the errata check.
>
> The latter seems better to me for the sake of less churn.
Why make tdx_quirk_reset_page() and tdx_quirk_reset_paddr() follow
different rules?
How about this:
From: Adrian Hunter <adrian.hunter@intel.com>
Subject: [PATCH] x86/tdx: Tidy reset_pamt functions
Rename reset_pamt functions to contain "quirk" to reflect the new
functionality, and remove the now misleading comment.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
arch/x86/virt/vmx/tdx/tdx.c | 16 ++++------------
1 file changed, 4 insertions(+), 12 deletions(-)
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index ef22fc2b9af0..823850399bb7 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -664,17 +664,17 @@ void tdx_quirk_reset_page(struct page *page)
}
EXPORT_SYMBOL_GPL(tdx_quirk_reset_page);
-static void tdmr_reset_pamt(struct tdmr_info *tdmr)
+static void tdmr_quirk_reset_pamt(struct tdmr_info *tdmr)
{
tdmr_do_pamt_func(tdmr, tdx_quirk_reset_paddr);
}
-static void tdmrs_reset_pamt_all(struct tdmr_info_list *tdmr_list)
+static void tdmrs_quirk_reset_pamt_all(struct tdmr_info_list *tdmr_list)
{
int i;
for (i = 0; i < tdmr_list->nr_consumed_tdmrs; i++)
- tdmr_reset_pamt(tdmr_entry(tdmr_list, i));
+ tdmr_quirk_reset_pamt(tdmr_entry(tdmr_list, i));
}
static unsigned long tdmrs_count_pamt_kb(struct tdmr_info_list *tdmr_list)
@@ -1146,15 +1146,7 @@ static int init_tdx_module(void)
* to the kernel.
*/
wbinvd_on_all_cpus();
- /*
- * According to the TDX hardware spec, if the platform
- * doesn't have the "partial write machine check"
- * erratum, any kernel read/write will never cause #MC
- * in kernel space, thus it's OK to not convert PAMTs
- * back to normal. But do the conversion anyway here
- * as suggested by the TDX spec.
- */
- tdmrs_reset_pamt_all(&tdx_tdmr_list);
+ tdmrs_quirk_reset_pamt_all(&tdx_tdmr_list);
err_free_pamts:
tdmrs_free_pamt_all(&tdx_tdmr_list);
err_free_tdmrs:
--
2.48.1
* Re: [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 14:37 ` Adrian Hunter
@ 2025-07-23 14:44 ` Edgecombe, Rick P
2025-07-23 15:30 ` Adrian Hunter
0 siblings, 1 reply; 15+ messages in thread
From: Edgecombe, Rick P @ 2025-07-23 14:44 UTC
To: pbonzini@redhat.com, Hunter, Adrian, Annapurve, Vishal,
seanjc@google.com, dave.hansen@linux.intel.com
Cc: kvm@vger.kernel.org, Li, Xiaoyao, Huang, Kai, Zhao, Yan Y,
Luck, Tony, linux-kernel@vger.kernel.org,
tony.lindgren@linux.intel.com, Chatre, Reinette, kas@kernel.org,
tglx@linutronix.de, Yamahata, Isaku, binbin.wu@linux.intel.com,
hpa@zytor.com, mingo@redhat.com, bp@alien8.de, Gao, Chao,
x86@kernel.org
On Wed, 2025-07-23 at 17:37 +0300, Adrian Hunter wrote:
> > The latter seems better to me for the sake of less churn.
>
> Why make tdx_quirk_reset_page() and tdx_quirk_reset_paddr() follow
> different rules.
>
> How about this:
>
> From: Adrian Hunter <adrian.hunter@intel.com>
> Subject: [PATCH] x86/tdx: Tidy reset_pamt functions
>
> Rename reset_pamt functions to contain "quirk" to reflect the new
> functionality, and remove the now misleading comment.
This looks like the "former" option. Churn is not too bad, and it has the
benefit of clear code vs long comment. I'm ok either way. But the cleanup needs
to go first in the patch order.
The log should explain why it's ok to change now, with respect to the reasoning
in the comment that is being removed.
* Re: [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 14:44 ` Edgecombe, Rick P
@ 2025-07-23 15:30 ` Adrian Hunter
2025-07-23 15:33 ` Edgecombe, Rick P
2025-07-23 23:01 ` Huang, Kai
0 siblings, 2 replies; 15+ messages in thread
From: Adrian Hunter @ 2025-07-23 15:30 UTC
To: Edgecombe, Rick P, pbonzini@redhat.com, Annapurve, Vishal,
seanjc@google.com, dave.hansen@linux.intel.com
Cc: kvm@vger.kernel.org, Li, Xiaoyao, Huang, Kai, Zhao, Yan Y,
Luck, Tony, linux-kernel@vger.kernel.org,
tony.lindgren@linux.intel.com, Chatre, Reinette, kas@kernel.org,
tglx@linutronix.de, Yamahata, Isaku, binbin.wu@linux.intel.com,
hpa@zytor.com, mingo@redhat.com, bp@alien8.de, Gao, Chao,
x86@kernel.org
On 23/07/2025 17:44, Edgecombe, Rick P wrote:
> On Wed, 2025-07-23 at 17:37 +0300, Adrian Hunter wrote:
>>> The latter seems better to me for the sake of less churn.
>>
>> Why make tdx_quirk_reset_page() and tdx_quirk_reset_paddr() follow
>> different rules.
>>
>> How about this:
>>
>> From: Adrian Hunter <adrian.hunter@intel.com>
>> Subject: [PATCH] x86/tdx: Tidy reset_pamt functions
>>
>> Rename reset_pamt functions to contain "quirk" to reflect the new
>> functionality, and remove the now misleading comment.
>
> This looks like the "former" option. Churn is not too bad, and it has the
> benefit of clear code vs long comment. I'm ok either way. But it needs to go
> cleanup first in the patch order.
>
> The log should explain why it's ok to change now, with respect to the reasoning
> in the comment that is being removed.
It makes more sense afterwards because then it can refer to the
functional change:
From: Adrian Hunter <adrian.hunter@intel.com>
Subject: [PATCH] x86/tdx: Tidy reset_pamt functions
tdx_quirk_reset_paddr() has been made to reflect that, in fact, the
clearing is necessary only for hardware with a certain quirk. Refer
patch "x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE
is present" for details.
Rename reset_pamt functions to contain "quirk" to reflect the new
functionality, and remove the now misleading comment.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
arch/x86/virt/vmx/tdx/tdx.c | 16 ++++------------
1 file changed, 4 insertions(+), 12 deletions(-)
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index ef22fc2b9af0..823850399bb7 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -664,17 +664,17 @@ void tdx_quirk_reset_page(struct page *page)
}
EXPORT_SYMBOL_GPL(tdx_quirk_reset_page);
-static void tdmr_reset_pamt(struct tdmr_info *tdmr)
+static void tdmr_quirk_reset_pamt(struct tdmr_info *tdmr)
{
tdmr_do_pamt_func(tdmr, tdx_quirk_reset_paddr);
}
-static void tdmrs_reset_pamt_all(struct tdmr_info_list *tdmr_list)
+static void tdmrs_quirk_reset_pamt_all(struct tdmr_info_list *tdmr_list)
{
int i;
for (i = 0; i < tdmr_list->nr_consumed_tdmrs; i++)
- tdmr_reset_pamt(tdmr_entry(tdmr_list, i));
+ tdmr_quirk_reset_pamt(tdmr_entry(tdmr_list, i));
}
static unsigned long tdmrs_count_pamt_kb(struct tdmr_info_list *tdmr_list)
@@ -1146,15 +1146,7 @@ static int init_tdx_module(void)
* to the kernel.
*/
wbinvd_on_all_cpus();
- /*
- * According to the TDX hardware spec, if the platform
- * doesn't have the "partial write machine check"
- * erratum, any kernel read/write will never cause #MC
- * in kernel space, thus it's OK to not convert PAMTs
- * back to normal. But do the conversion anyway here
- * as suggested by the TDX spec.
- */
- tdmrs_reset_pamt_all(&tdx_tdmr_list);
+ tdmrs_quirk_reset_pamt_all(&tdx_tdmr_list);
err_free_pamts:
tdmrs_free_pamt_all(&tdx_tdmr_list);
err_free_tdmrs:
--
2.48.1
* Re: [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 15:30 ` Adrian Hunter
@ 2025-07-23 15:33 ` Edgecombe, Rick P
2025-07-23 15:41 ` Adrian Hunter
2025-07-23 23:01 ` Huang, Kai
1 sibling, 1 reply; 15+ messages in thread
From: Edgecombe, Rick P @ 2025-07-23 15:33 UTC
To: pbonzini@redhat.com, Hunter, Adrian, Annapurve, Vishal,
seanjc@google.com, dave.hansen@linux.intel.com
Cc: kvm@vger.kernel.org, Li, Xiaoyao, Huang, Kai, Zhao, Yan Y,
Luck, Tony, tony.lindgren@linux.intel.com, mingo@redhat.com,
binbin.wu@linux.intel.com, kas@kernel.org, tglx@linutronix.de,
Yamahata, Isaku, linux-kernel@vger.kernel.org, Chatre, Reinette,
hpa@zytor.com, bp@alien8.de, Gao, Chao, x86@kernel.org
On Wed, 2025-07-23 at 18:30 +0300, Adrian Hunter wrote:
> > The log should explain why it's ok to change now, with respect to the
> > reasoning
> > in the comment that is being removed.
>
> It makes more sense afterwards because then it can refer to the
> functional change:
Cleanups first is the norm. This doesn't seem like a special situation. Did you
try to re-arrange it?
* Re: [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 15:33 ` Edgecombe, Rick P
@ 2025-07-23 15:41 ` Adrian Hunter
2025-07-23 16:03 ` Edgecombe, Rick P
0 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2025-07-23 15:41 UTC
To: Edgecombe, Rick P, pbonzini@redhat.com, Annapurve, Vishal,
seanjc@google.com, dave.hansen@linux.intel.com
Cc: kvm@vger.kernel.org, Li, Xiaoyao, Huang, Kai, Zhao, Yan Y,
Luck, Tony, tony.lindgren@linux.intel.com, mingo@redhat.com,
binbin.wu@linux.intel.com, kas@kernel.org, tglx@linutronix.de,
Yamahata, Isaku, linux-kernel@vger.kernel.org, Chatre, Reinette,
hpa@zytor.com, bp@alien8.de, Gao, Chao, x86@kernel.org
On 23/07/2025 18:33, Edgecombe, Rick P wrote:
> On Wed, 2025-07-23 at 18:30 +0300, Adrian Hunter wrote:
>>> The log should explain why it's ok to change now, with respect to the
>>> reasoning
>>> in the comment that is being removed.
>>
>> It makes more sense afterwards because then it can refer to the
>> functional change:
>
> Cleanups first is the norm. This doesn't seem like a special situation. Did you
> try to re-arrange it?
Patch 1 only introduced "quirk" terminology to save touching the
same lines of code in patch 2 and distracting from its main purpose,
but the quirk functionality is not added until patch 2, so the
tidy-up only really makes sense afterwards.
* Re: [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 12:05 ` [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page() Adrian Hunter
2025-07-23 14:06 ` Edgecombe, Rick P
@ 2025-07-23 15:57 ` Sean Christopherson
1 sibling, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2025-07-23 15:57 UTC (permalink / raw)
To: Adrian Hunter
Cc: Dave Hansen, pbonzini, vannapurve, Tony Luck, Borislav Petkov,
Thomas Gleixner, Ingo Molnar, x86, H Peter Anvin, linux-kernel,
kvm, rick.p.edgecombe, kas, kai.huang, reinette.chatre,
xiaoyao.li, tony.lindgren, binbin.wu, isaku.yamahata, yan.y.zhao,
chao.gao
On Wed, Jul 23, 2025, Adrian Hunter wrote:
> tdx_clear_page() and reset_tdx_pages() duplicate the TDX page clearing
> logic. Rename reset_tdx_pages() to tdx_quirk_reset_paddr() and create
> tdx_quirk_reset_page() to call tdx_quirk_reset_paddr() and be used in
> place of tdx_clear_page().
>
> The new name reflects that, in fact, the clearing is necessary only for
> hardware with a certain quirk. That is dealt with in a subsequent patch
> but doing the rename here avoids additional churn.
>
> Note reset_tdx_pages() is slightly different from tdx_clear_page() because,
> more appropriately, it uses mb() in place of __mb(). Except when extra
> debugging is enabled (kcsan at present), mb() just calls __mb().
>
> Reviewed-by: Kirill A. Shutemov <kas@kernel.org>
> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
...
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index 7ddef3a69866..57b46f05ff97 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -131,6 +131,8 @@ int tdx_guest_keyid_alloc(void);
> u32 tdx_get_nr_guest_keyids(void);
> void tdx_guest_keyid_free(unsigned int keyid);
>
> +void tdx_quirk_reset_page(struct page *page);
Might make sense to have this be a static inline so as to avoid two exports if
KVM ever needs/wants the inner helper, but either way is a-ok by me.
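A rough userspace sketch of the static inline variant, for illustration only. It assumes the inner helper tdx_quirk_reset_paddr() would become the single exported symbol (in the posted patch it is not exported), and it uses hypothetical stand-ins for struct page and page_to_phys(). The MOVDIR64B loop is elided and replaced by bookkeeping variables (last_base, last_size) that exist only so the call can be observed.

```c
#include <assert.h>

#define PAGE_SIZE 4096UL

/* Userspace stand-in for the kernel's struct page (hypothetical layout). */
struct page {
	unsigned long phys;
};

/* Demo bookkeeping only; not part of any real API. */
static unsigned long last_base, last_size;

/* Stand-in for the kernel's page_to_phys(). */
static inline unsigned long page_to_phys(const struct page *page)
{
	return page->phys;
}

/* Prototype as it might appear in the header. */
void tdx_quirk_reset_paddr(unsigned long base, unsigned long size);

/*
 * Header-only wrapper: because it is static inline, callers such as KVM
 * would only need the inner helper to be exported.
 */
static inline void tdx_quirk_reset_page(struct page *page)
{
	tdx_quirk_reset_paddr(page_to_phys(page), PAGE_SIZE);
}

/*
 * Inner helper. In the kernel this would carry the one
 * EXPORT_SYMBOL_GPL(); the actual clearing is elided here.
 */
void tdx_quirk_reset_paddr(unsigned long base, unsigned long size)
{
	assert(size % 64 == 0);	/* MOVDIR64B writes 64 bytes at a time */
	last_base = base;
	last_size = size;
}
```

With this shape, a later KVM user of the paddr helper would not require a second export, which appears to be the point of the suggestion.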
Acked-by: Sean Christopherson <seanjc@google.com>
* Re: [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 15:41 ` Adrian Hunter
@ 2025-07-23 16:03 ` Edgecombe, Rick P
0 siblings, 0 replies; 15+ messages in thread
From: Edgecombe, Rick P @ 2025-07-23 16:03 UTC (permalink / raw)
To: pbonzini@redhat.com, Hunter, Adrian, Annapurve, Vishal,
seanjc@google.com, dave.hansen@linux.intel.com
Cc: kvm@vger.kernel.org, Li, Xiaoyao, Huang, Kai, Zhao, Yan Y,
Luck, Tony, tony.lindgren@linux.intel.com, Chatre, Reinette,
binbin.wu@linux.intel.com, linux-kernel@vger.kernel.org,
tglx@linutronix.de, Yamahata, Isaku, kas@kernel.org,
mingo@redhat.com, hpa@zytor.com, bp@alien8.de, Gao, Chao,
x86@kernel.org
On Wed, 2025-07-23 at 18:41 +0300, Adrian Hunter wrote:
> On 23/07/2025 18:33, Edgecombe, Rick P wrote:
> > On Wed, 2025-07-23 at 18:30 +0300, Adrian Hunter wrote:
> > > > The log should explain why it's ok to change now, with respect to the
> > > > reasoning
> > > > in the comment that is being removed.
> > >
> > > It makes more sense afterwards because then it can refer to the
> > > functional change:
> >
> > Cleanups first is the norm. This doesn't seem like a special situation. Did you
> > try to re-arrange it?
>
> Patch 1 only introduced "quirk" terminology to save touching the
> same lines of code in patch 2 and distracting from its main purpose,
> but the quirk functionality is not added until patch 2, so the
> tidy-up only really makes sense afterwards.
No. It could be easily done upfront. Just rename everything and remove the
comment if you want to go with the rename option. Justification: Make code
readable instead of having comments to explain confusing code. Then put a little
bit saying that future changes will make it optional so it's nice to have the
name.
* Re: [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 15:30 ` Adrian Hunter
2025-07-23 15:33 ` Edgecombe, Rick P
@ 2025-07-23 23:01 ` Huang, Kai
2025-07-23 23:26 ` Edgecombe, Rick P
1 sibling, 1 reply; 15+ messages in thread
From: Huang, Kai @ 2025-07-23 23:01 UTC (permalink / raw)
To: pbonzini@redhat.com, Hunter, Adrian, Annapurve, Vishal,
Edgecombe, Rick P, dave.hansen@linux.intel.com, seanjc@google.com
Cc: kvm@vger.kernel.org, Li, Xiaoyao, Luck, Tony, Zhao, Yan Y,
kas@kernel.org, Chatre, Reinette, binbin.wu@linux.intel.com,
linux-kernel@vger.kernel.org, mingo@redhat.com, Yamahata, Isaku,
tony.lindgren@linux.intel.com, tglx@linutronix.de, hpa@zytor.com,
Gao, Chao, bp@alien8.de, x86@kernel.org
On Wed, 2025-07-23 at 18:30 +0300, Hunter, Adrian wrote:
> On 23/07/2025 17:44, Edgecombe, Rick P wrote:
> > On Wed, 2025-07-23 at 17:37 +0300, Adrian Hunter wrote:
> > > > The latter seems better to me for the sake of less churn.
> > >
> > > Why make tdx_quirk_reset_page() and tdx_quirk_reset_paddr() follow
> > > different rules?
> > >
> > > How about this:
> > >
> > > From: Adrian Hunter <adrian.hunter@intel.com>
> > > Subject: [PATCH] x86/tdx: Tidy reset_pamt functions
> > >
> > > Rename reset_pamt functions to contain "quirk" to reflect the new
> > > functionality, and remove the now misleading comment.
> >
> > This looks like the "former" option. Churn is not too bad, and it has the
> > benefit of clear code vs long comment. I'm ok either way. But it needs to go
> > cleanup first in the patch order.
> >
> > The log should explain why it's ok to change now, with respect to the reasoning
> > in the comment that is being removed.
>
> It makes more sense afterwards because then it can refer to the
> functional change:
>
> From: Adrian Hunter <adrian.hunter@intel.com>
> Subject: [PATCH] x86/tdx: Tidy reset_pamt functions
>
> tdx_quirk_reset_paddr() has been made to reflect that, in fact, the
> clearing is necessary only for hardware with a certain quirk. Refer to
> patch "x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE
> is present" for details.
>
> Rename reset_pamt functions to contain "quirk" to reflect the new
> functionality, and remove the now misleading comment.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
> arch/x86/virt/vmx/tdx/tdx.c | 16 ++++------------
> 1 file changed, 4 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index ef22fc2b9af0..823850399bb7 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -664,17 +664,17 @@ void tdx_quirk_reset_page(struct page *page)
> }
> EXPORT_SYMBOL_GPL(tdx_quirk_reset_page);
>
> -static void tdmr_reset_pamt(struct tdmr_info *tdmr)
> +static void tdmr_quirk_reset_pamt(struct tdmr_info *tdmr)
> {
> tdmr_do_pamt_func(tdmr, tdx_quirk_reset_paddr);
> }
>
> -static void tdmrs_reset_pamt_all(struct tdmr_info_list *tdmr_list)
> +static void tdmrs_quirk_reset_pamt_all(struct tdmr_info_list *tdmr_list)
> {
> int i;
>
> for (i = 0; i < tdmr_list->nr_consumed_tdmrs; i++)
> - tdmr_reset_pamt(tdmr_entry(tdmr_list, i));
> + tdmr_quirk_reset_pamt(tdmr_entry(tdmr_list, i));
> }
>
> static unsigned long tdmrs_count_pamt_kb(struct tdmr_info_list *tdmr_list)
> @@ -1146,15 +1146,7 @@ static int init_tdx_module(void)
> * to the kernel.
> */
> wbinvd_on_all_cpus();
> - /*
> - * According to the TDX hardware spec, if the platform
> - * doesn't have the "partial write machine check"
> - * erratum, any kernel read/write will never cause #MC
> - * in kernel space, thus it's OK to not convert PAMTs
> - * back to normal. But do the conversion anyway here
> - * as suggested by the TDX spec.
> - */
> - tdmrs_reset_pamt_all(&tdx_tdmr_list);
> + tdmrs_quirk_reset_pamt_all(&tdx_tdmr_list);
> err_free_pamts:
> tdmrs_free_pamt_all(&tdx_tdmr_list);
> err_free_tdmrs:
> --
> 2.48.1
Such renaming goes a little bit too far IMHO. I respect the value of having
"quirk" in the name, but it also seems quite reasonable to me to hide the
"quirk" at the lowest level and keep just the "reset TDX pages" concept at the
higher levels.
E.g.,:
static void tdx_quirk_reset_paddr(unsigned long base, unsigned long size)
{
/* doing MOVDIR64B ... */
}
static void tdx_reset_paddr(unsigned long base, unsigned long size)
{
if (!boot_cpu_has_bug(X86_BUG_TDX_PW_MCE))
return;
tdx_quirk_reset_paddr(base, size);
}
void tdx_reset_page(struct page *page)
{
tdx_reset_paddr(page_to_phys(page), PAGE_SIZE);
}
EXPORT_SYMBOL_GPL(tdx_reset_page);
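To make the effect of this layering concrete, here is a small userspace model of the two lower levels. It is only a sketch: fake_has_pw_mce_bug stands in for boot_cpu_has_bug(X86_BUG_TDX_PW_MCE), fake_mem stands in for physical memory, and memset() replaces the MOVDIR64B loop. All three are assumptions made for the demo, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE 4096UL

/* Hypothetical stand-in for boot_cpu_has_bug(X86_BUG_TDX_PW_MCE). */
static bool fake_has_pw_mce_bug;

/* Fake "physical memory": a single page, addressed by offset. */
static unsigned char fake_mem[PAGE_SIZE];

/* Lowest level: always clears. The real code would use MOVDIR64B. */
static void tdx_quirk_reset_paddr(unsigned long base, unsigned long size)
{
	assert(base + size <= sizeof(fake_mem));
	memset(fake_mem + base, 0, size);
}

/* Higher level: plain "reset" name, with the erratum check hidden inside. */
static void tdx_reset_paddr(unsigned long base, unsigned long size)
{
	if (!fake_has_pw_mce_bug)
		return;		/* no erratum: leave the page untouched */

	tdx_quirk_reset_paddr(base, size);
}
```

In this model a caller of tdx_reset_paddr() cannot tell from the name whether the page was actually cleared, which is the naming concern raised in the replies.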
* Re: [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 23:01 ` Huang, Kai
@ 2025-07-23 23:26 ` Edgecombe, Rick P
2025-07-23 23:56 ` Huang, Kai
0 siblings, 1 reply; 15+ messages in thread
From: Edgecombe, Rick P @ 2025-07-23 23:26 UTC (permalink / raw)
To: pbonzini@redhat.com, Hunter, Adrian, Annapurve, Vishal,
Huang, Kai, dave.hansen@linux.intel.com, seanjc@google.com
Cc: kvm@vger.kernel.org, Li, Xiaoyao, Zhao, Yan Y, Luck, Tony,
kas@kernel.org, binbin.wu@linux.intel.com, Chatre, Reinette,
tony.lindgren@linux.intel.com, mingo@redhat.com, Yamahata, Isaku,
linux-kernel@vger.kernel.org, tglx@linutronix.de, hpa@zytor.com,
Gao, Chao, bp@alien8.de, x86@kernel.org
On Wed, 2025-07-23 at 23:01 +0000, Huang, Kai wrote:
> Such renaming goes a little bit too far IMHO.
>
I agree the churn is not really necessary.
> I respect the value of having
> "quirk" in the name, but it also seems quite reasonable to me to hide the
> "quirk" at the lowest level and keep just the "reset TDX pages" concept at the
> higher levels.
Assuming all the comments get corrected, this still leaves "reset" as an
operation that sometimes eagerly resets the page, or sometimes leaves it to be
lazily done later by a random access. Maybe, instead of "reset", which is an
action that is sometimes skipped, use a name that says what state we want the
page to be in at the end: ready to use.
tdx_make_page_ready()
tdx_make_page_usable()
...or something in that direction.
But this is still churn. Kai, what do you think about the other option of just
putting the X86_BUG_TDX_PW_MCE check in tdx_reset_page() and letting the
initialization error path (tdmrs_reset_pamt_all()) keep always zeroing the
pages? So:
static void tdx_reset_paddr(unsigned long base, unsigned long size)
{
/* doing MOVDIR64B ... */
}
static void tdmr_reset_pamt(struct tdmr_info *tdmr)
{
tdmr_do_pamt_func(tdmr, tdx_reset_paddr);
}
void tdx_quirk_reset_page(struct page *page)
{
if (!boot_cpu_has_bug(X86_BUG_TDX_PW_MCE))
return;
tdx_reset_paddr(page_to_phys(page), PAGE_SIZE);
}
EXPORT_SYMBOL_GPL(tdx_quirk_reset_page);
>
> E.g.,:
>
> static void tdx_quirk_reset_paddr(unsigned long base, unsigned long size)
> {
> /* doing MOVDIR64B ... */
> }
>
> static void tdx_reset_paddr(unsigned long base, unsigned long size)
> {
> if (!boot_cpu_has_bug(X86_BUG_TDX_PW_MCE))
> return;
>
> tdx_quirk_reset_paddr(base, size);
> }
>
> void tdx_reset_page(struct page *page)
> {
> tdx_reset_paddr(page_to_phys(page), PAGE_SIZE);
> }
> EXPORT_SYMBOL_GPL(tdx_reset_page);
* Re: [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page()
2025-07-23 23:26 ` Edgecombe, Rick P
@ 2025-07-23 23:56 ` Huang, Kai
0 siblings, 0 replies; 15+ messages in thread
From: Huang, Kai @ 2025-07-23 23:56 UTC (permalink / raw)
To: pbonzini@redhat.com, Hunter, Adrian, Annapurve, Vishal,
Edgecombe, Rick P, dave.hansen@linux.intel.com, seanjc@google.com
Cc: kvm@vger.kernel.org, Li, Xiaoyao, Luck, Tony, Zhao, Yan Y,
linux-kernel@vger.kernel.org, binbin.wu@linux.intel.com,
Chatre, Reinette, kas@kernel.org, tglx@linutronix.de,
Yamahata, Isaku, tony.lindgren@linux.intel.com, mingo@redhat.com,
hpa@zytor.com, bp@alien8.de, Gao, Chao, x86@kernel.org
On Wed, 2025-07-23 at 23:26 +0000, Edgecombe, Rick P wrote:
> On Wed, 2025-07-23 at 23:01 +0000, Huang, Kai wrote:
> > Such renaming goes a little bit too far IMHO.
> >
>
> I agree the churn is not really necessary.
>
> > I respect the value of having
> > "quirk" in the name, but it also seems quite reasonable to me to hide the
> > "quirk" at the lowest level and keep just the "reset TDX pages" concept at the
> > higher levels.
>
> Assuming all the comments get corrected, this still leaves "reset" as an
> operation that sometimes eagerly resets the page, or sometimes leaves it to be
> lazily done later by a random access.
>
Thanks for the point.
Yeah I agree it's better to convey such information in the function name.
> Maybe, instead of "reset", which is an action that is sometimes skipped, use
> a name that says what state we want the page to be in at the end: ready to
> use.
>
> tdx_make_page_ready()
> tdx_make_page_usable()
> ...or something in that direction.
>
> But this is still churn. Kai, what do you think about the other option of just
> putting the X86_BUG_TDX_PW_MCE check in tdx_reset_page() and letting the
> initialization error path (tdmrs_reset_pamt_all()) keep always zeroing the
> pages? So:
>
> static void tdx_reset_paddr(unsigned long base, unsigned long size)
> {
> /* doing MOVDIR64B ... */
> }
>
> static void tdmr_reset_pamt(struct tdmr_info *tdmr)
> {
> tdmr_do_pamt_func(tdmr, tdx_reset_paddr);
> }
>
> void tdx_quirk_reset_page(struct page *page)
> {
> if (!boot_cpu_has_bug(X86_BUG_TDX_PW_MCE))
> return;
>
> tdx_reset_paddr(page_to_phys(page), PAGE_SIZE);
> }
> EXPORT_SYMBOL_GPL(tdx_quirk_reset_page);
I don't think it's a good idea to treat PAMT and other types of TDX memory
differently. I would rather go with the renaming as shown in Adrian's
patch.
So no objection from me. :-)
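The difference between the two options can be shown with a toy userspace model. This is only a sketch under stated assumptions: fake_has_pw_mce_bug stands in for boot_cpu_has_bug(X86_BUG_TDX_PW_MCE), the counters replace the real clearing, and every function name below is hypothetical. The "uniform" variant assumes the erratum check sits in one shared place that both the PAMT path and the page path go through.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for boot_cpu_has_bug(X86_BUG_TDX_PW_MCE). */
static bool fake_has_pw_mce_bug;

/* Counters replace the actual clearing so the behaviour can be observed. */
static unsigned int pamt_clears, page_clears;

/* Alternative option: the PAMT path always clears ... */
static void alt_reset_pamt(void)
{
	pamt_clears++;
}

/* ... while only the page path checks the erratum. */
static void alt_reset_page(void)
{
	if (!fake_has_pw_mce_bug)
		return;
	page_clears++;
}

/* Uniform option: one shared check for every kind of TDX memory. */
static bool quirk_reset_needed(void)
{
	return fake_has_pw_mce_bug;
}

static void uniform_reset_pamt(void)
{
	if (quirk_reset_needed())
		pamt_clears++;
}

static void uniform_reset_page(void)
{
	if (quirk_reset_needed())
		page_clears++;
}
```

On a platform without the erratum, the alternative still clears PAMT memory but skips ordinary pages, whereas the uniform variant skips both. That asymmetry is what the objection above is about.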
Thread overview: 15+ messages
2025-07-23 12:05 [PATCH V4 0/2] x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present Adrian Hunter
2025-07-23 12:05 ` [PATCH V4 1/2] x86/tdx: Eliminate duplicate code in tdx_clear_page() Adrian Hunter
2025-07-23 14:06 ` Edgecombe, Rick P
2025-07-23 14:37 ` Adrian Hunter
2025-07-23 14:44 ` Edgecombe, Rick P
2025-07-23 15:30 ` Adrian Hunter
2025-07-23 15:33 ` Edgecombe, Rick P
2025-07-23 15:41 ` Adrian Hunter
2025-07-23 16:03 ` Edgecombe, Rick P
2025-07-23 23:01 ` Huang, Kai
2025-07-23 23:26 ` Edgecombe, Rick P
2025-07-23 23:56 ` Huang, Kai
2025-07-23 15:57 ` Sean Christopherson
2025-07-23 12:05 ` [PATCH V4 2/2] x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present Adrian Hunter
2025-07-23 12:31 ` Xiaoyao Li