* [PATCH RESEND V2 0/2] Fixes for recovery for machine check in TDX/SEAM non-root mode
@ 2025-08-19 16:24 Adrian Hunter
2025-08-19 16:24 ` [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors " Adrian Hunter
2025-08-19 16:24 ` [PATCH RESEND V2 2/2] x86/mce: Remove MCI_ADDR_PHYSADDR Adrian Hunter
0 siblings, 2 replies; 19+ messages in thread
From: Adrian Hunter @ 2025-08-19 16:24 UTC (permalink / raw)
To: Dave Hansen, Tony Luck, pbonzini, seanjc
Cc: vannapurve, Borislav Petkov, Thomas Gleixner, Ingo Molnar, x86,
H Peter Anvin, linux-edac, linux-kernel, kvm, rick.p.edgecombe,
kai.huang, reinette.chatre, xiaoyao.li, tony.lindgren, binbin.wu,
ira.weiny, isaku.yamahata, Fan Du, Yazen Ghannam, yan.y.zhao,
chao.gao
Hi
Here is V2 of a small fix related to recovery for machine check in TDX/SEAM
non-root mode, and a small tidy-up.
Changes in V2:
x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM
non-root mode
Mask address when it is read
Amend struct mce addr description
KVM: TDX: Do not clear poisoned pages
Patch dropped
x86/mce: Remove MCI_ADDR_PHYSADDR
New patch
The issue was noticed as part of work to determine the conditions under
which TDX private memory needs to be cleared after being reclaimed.
For guests with a large amount of memory, clearing all private pages during
VM shutdown can take minutes, so we are looking at when that can be
skipped. A future patch will deal with that.
One thing that was investigated was the effect of deliberately corrupting a
TDX guest private page by writing to it on the host, and then reading it
on the guest, which results in a machine check as expected, but revealed
the issue addressed in patch 1.
There are 2 outstanding issues:
1. It is assumed that once the TDX VM is shutdown that the memory is
returned to the allocator. That is true at present, but may not be in the
future. Consider, for example, patch set "New KVM ioctl to link a gmem
inode to a new gmem file" :
https://lore.kernel.org/r/cover.1747368092.git.afranji@google.com/
2. Currently, KVM TDX does not cater for the TDX VM to enter a FATAL error
state, where the only operation permitted is to tear down the VM. KVM just
carries on, hitting various errors, but in particular, memory reclaim fails
because it is not following the teardown procedure, and all guest private
memory is leaked.
Adrian Hunter (2):
x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
x86/mce: Remove MCI_ADDR_PHYSADDR
arch/x86/include/asm/mce.h | 3 ---
arch/x86/include/uapi/asm/mce.h | 2 +-
arch/x86/kernel/cpu/mce/core.c | 9 ++++++---
drivers/cxl/core/mce.c | 2 +-
drivers/edac/skx_common.c | 2 +-
5 files changed, 9 insertions(+), 9 deletions(-)
Regards
Adrian
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-19 16:24 [PATCH RESEND V2 0/2] Fixes for recovery for machine check in TDX/SEAM non-root mode Adrian Hunter
@ 2025-08-19 16:24 ` Adrian Hunter
2025-08-19 17:28 ` Yazen Ghannam
2025-08-19 21:32 ` Borislav Petkov
2025-08-19 16:24 ` [PATCH RESEND V2 2/2] x86/mce: Remove MCI_ADDR_PHYSADDR Adrian Hunter
1 sibling, 2 replies; 19+ messages in thread
From: Adrian Hunter @ 2025-08-19 16:24 UTC (permalink / raw)
To: Dave Hansen, Tony Luck, pbonzini, seanjc
Cc: vannapurve, Borislav Petkov, Thomas Gleixner, Ingo Molnar, x86,
H Peter Anvin, linux-edac, linux-kernel, kvm, rick.p.edgecombe,
kai.huang, reinette.chatre, xiaoyao.li, tony.lindgren, binbin.wu,
ira.weiny, isaku.yamahata, Fan Du, Yazen Ghannam, yan.y.zhao,
chao.gao
Commit 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine
check bank") introduced a new #define MCI_ADDR_PHYSADDR for the mask of
valid physical address bits within the machine check bank address register.
This is particularly needed in the case of errors in TDX/SEAM non-root mode
because the reported address contains the TDX KeyID. Refer to TDX and
TME-MK documentation for more information about KeyIDs.
Commit 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM
non-root mode") uses the address to mark the affected page as poisoned, but
omits to use the aforementioned mask.
Investigation of user space expectations has concluded it would be more
correct for the address to contain only address bits in the first place.
Refer https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com
Mask the address when it is read from the machine check bank address
register. Do not use MCI_ADDR_PHYSADDR because that will be removed in a
later patch.
It is assumed __log_error() in arch/x86/kernel/cpu/mce/amd.c does not need
similar treatment.
Amend struct mce addr member description slightly to reflect that it is
not, and never has been, an exact copy of the bank's MCi_ADDR MSR.
Fixes: 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine check bank")
Fixes: 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM non-root mode")
Link: https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
Changes in V2:
Mask address when it is read
Amend struct mce addr description
arch/x86/include/uapi/asm/mce.h | 2 +-
arch/x86/kernel/cpu/mce/core.c | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index cb6b48a7c22b..abf6ee43f5f8 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -14,7 +14,7 @@
struct mce {
__u64 status; /* Bank's MCi_STATUS MSR */
__u64 misc; /* Bank's MCi_MISC MSR */
- __u64 addr; /* Bank's MCi_ADDR MSR */
+ __u64 addr; /* Address from bank's MCi_ADDR MSR */
__u64 mcgstatus; /* Machine Check Global Status MSR */
__u64 ip; /* Instruction Pointer when the error happened */
__u64 tsc; /* CPU time stamp counter */
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 4da4eab56c81..deb47463a75d 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -699,6 +699,9 @@ static noinstr void mce_read_aux(struct mce_hw_err *err, int i)
}
smca_extract_err_addr(m);
+
+ /* Mask out non-address bits, such as TDX KeyID */
+ m->addr &= GENMASK_ULL(boot_cpu_data.x86_phys_bits - 1, 0);
}
if (mce_flags.smca) {
--
2.48.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH RESEND V2 2/2] x86/mce: Remove MCI_ADDR_PHYSADDR
2025-08-19 16:24 [PATCH RESEND V2 0/2] Fixes for recovery for machine check in TDX/SEAM non-root mode Adrian Hunter
2025-08-19 16:24 ` [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors " Adrian Hunter
@ 2025-08-19 16:24 ` Adrian Hunter
1 sibling, 0 replies; 19+ messages in thread
From: Adrian Hunter @ 2025-08-19 16:24 UTC (permalink / raw)
To: Dave Hansen, Tony Luck, pbonzini, seanjc
Cc: vannapurve, Borislav Petkov, Thomas Gleixner, Ingo Molnar, x86,
H Peter Anvin, linux-edac, linux-kernel, kvm, rick.p.edgecombe,
kai.huang, reinette.chatre, xiaoyao.li, tony.lindgren, binbin.wu,
ira.weiny, isaku.yamahata, Fan Du, Yazen Ghannam, yan.y.zhao,
chao.gao
Now that the address is masked when it is read from the machine check bank
address register (refer patch "x86/mce: Fix missing address mask in
recovery for errors in TDX/SEAM non-root mode"), the MCI_ADDR_PHYSADDR
macro is no longer needed. Remove it.
Note MCE address information also enters the kernel from APEI via the
Common Platform Error Record (CPER) Memory Error Section "Physical Address"
field (struct cper_sec_mem_err physical_addr), refer the UEFI
specification. It is assumed that field contains only the physical
address.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
Changes in V2:
New patch
arch/x86/include/asm/mce.h | 3 ---
arch/x86/kernel/cpu/mce/core.c | 6 +++---
drivers/cxl/core/mce.c | 2 +-
drivers/edac/skx_common.c | 2 +-
4 files changed, 5 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 6c77c03139f7..0cf8017dcae9 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -91,9 +91,6 @@
#define MCI_MISC_ADDR_MEM 3 /* memory address */
#define MCI_MISC_ADDR_GENERIC 7 /* generic */
-/* MCi_ADDR register defines */
-#define MCI_ADDR_PHYSADDR GENMASK_ULL(boot_cpu_data.x86_phys_bits - 1, 0)
-
/* CTL2 register defines */
#define MCI_CTL2_CMCI_EN BIT_ULL(30)
#define MCI_CTL2_CMCI_THRESHOLD_MASK 0x7fffULL
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index deb47463a75d..80e06d6728a7 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -642,7 +642,7 @@ static int uc_decode_notifier(struct notifier_block *nb, unsigned long val,
mce->severity != MCE_DEFERRED_SEVERITY)
return NOTIFY_DONE;
- pfn = (mce->addr & MCI_ADDR_PHYSADDR) >> PAGE_SHIFT;
+ pfn = mce->addr >> PAGE_SHIFT;
if (!memory_failure(pfn, 0)) {
set_mce_nospec(pfn);
mce->kflags |= MCE_HANDLED_UC;
@@ -1415,7 +1415,7 @@ static void kill_me_maybe(struct callback_head *cb)
if (!p->mce_ripv)
flags |= MF_MUST_KILL;
- pfn = (p->mce_addr & MCI_ADDR_PHYSADDR) >> PAGE_SHIFT;
+ pfn = p->mce_addr >> PAGE_SHIFT;
ret = memory_failure(pfn, flags);
if (!ret) {
set_mce_nospec(pfn);
@@ -1444,7 +1444,7 @@ static void kill_me_never(struct callback_head *cb)
p->mce_count = 0;
pr_err("Kernel accessed poison in user space at %llx\n", p->mce_addr);
- pfn = (p->mce_addr & MCI_ADDR_PHYSADDR) >> PAGE_SHIFT;
+ pfn = p->mce_addr >> PAGE_SHIFT;
if (!memory_failure(pfn, 0))
set_mce_nospec(pfn);
}
diff --git a/drivers/cxl/core/mce.c b/drivers/cxl/core/mce.c
index ff8d078c6ca1..4ba8b7ae3de7 100644
--- a/drivers/cxl/core/mce.c
+++ b/drivers/cxl/core/mce.c
@@ -24,7 +24,7 @@ static int cxl_handle_mce(struct notifier_block *nb, unsigned long val,
if (!endpoint)
return NOTIFY_DONE;
- spa = mce->addr & MCI_ADDR_PHYSADDR;
+ spa = mce->addr;
pfn = spa >> PAGE_SHIFT;
if (!pfn_valid(pfn))
diff --git a/drivers/edac/skx_common.c b/drivers/edac/skx_common.c
index 39c733dbc5b9..2de675958560 100644
--- a/drivers/edac/skx_common.c
+++ b/drivers/edac/skx_common.c
@@ -732,7 +732,7 @@ int skx_mce_check_error(struct notifier_block *nb, unsigned long val,
memset(&res, 0, sizeof(res));
res.mce = mce;
- res.addr = mce->addr & MCI_ADDR_PHYSADDR;
+ res.addr = mce->addr;
if (!pfn_to_online_page(res.addr >> PAGE_SHIFT) && !arch_is_platform_page(res.addr)) {
pr_err("Invalid address 0x%llx in IA32_MC%d_ADDR\n", mce->addr, mce->bank);
return NOTIFY_DONE;
--
2.48.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-19 16:24 ` [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors " Adrian Hunter
@ 2025-08-19 17:28 ` Yazen Ghannam
2025-08-19 17:51 ` Luck, Tony
2025-08-19 21:32 ` Borislav Petkov
1 sibling, 1 reply; 19+ messages in thread
From: Yazen Ghannam @ 2025-08-19 17:28 UTC (permalink / raw)
To: Adrian Hunter
Cc: Dave Hansen, Tony Luck, pbonzini, seanjc, vannapurve,
Borislav Petkov, Thomas Gleixner, Ingo Molnar, x86, H Peter Anvin,
linux-edac, linux-kernel, kvm, rick.p.edgecombe, kai.huang,
reinette.chatre, xiaoyao.li, tony.lindgren, binbin.wu, ira.weiny,
isaku.yamahata, Fan Du, yan.y.zhao, chao.gao
On Tue, Aug 19, 2025 at 07:24:34PM +0300, Adrian Hunter wrote:
> Commit 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine
> check bank") introduced a new #define MCI_ADDR_PHYSADDR for the mask of
> valid physical address bits within the machine check bank address register.
>
> This is particularly needed in the case of errors in TDX/SEAM non-root mode
> because the reported address contains the TDX KeyID. Refer to TDX and
> TME-MK documentation for more information about KeyIDs.
>
> Commit 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM
> non-root mode") uses the address to mark the affected page as poisoned, but
> omits to use the aforementioned mask.
>
> Investigation of user space expectations has concluded it would be more
> correct for the address to contain only address bits in the first place.
> Refer https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com
>
> Mask the address when it is read from the machine check bank address
> register. Do not use MCI_ADDR_PHYSADDR because that will be removed in a
> later patch.
>
> It is assumed __log_error() in arch/x86/kernel/cpu/mce/amd.c does not need
> similar treatment.
>
> Amend struct mce addr member description slightly to reflect that it is
> not, and never has been, an exact copy of the bank's MCi_ADDR MSR.
>
I think it would be more accurate to say that the MCi_ADDR MSR is not,
and never has been, guaranteed to be a system physical address.
We could introduce a new field that represents the system physical
address, if one exists for the error type. This way we can operate on a
value without assumption or additional checks. And we can keep the raw
MCi_ADDR MSR value in case it is of value to debug folks or hardware
designers. In my experience, they seem to appreciate having the full,
unfiltered data. We don't give them that today, but we can work towards
that goal.
I have some old work in this area:
https://github.com/AMDESE/linux/commit/76732c67cbf96c14f55ed1061804db9ff1505ea3
This isn't a quick fix, so maybe we can come back to it if folks are
happy with your current solution.
But I do think there's value in sharing the data as given to us by
hardware. And providing new interfaces to users if we need to modify
something for them to take action.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-19 17:28 ` Yazen Ghannam
@ 2025-08-19 17:51 ` Luck, Tony
2025-08-19 17:58 ` Adrian Hunter
0 siblings, 1 reply; 19+ messages in thread
From: Luck, Tony @ 2025-08-19 17:51 UTC (permalink / raw)
To: Yazen Ghannam
Cc: Adrian Hunter, Dave Hansen, pbonzini, seanjc, vannapurve,
Borislav Petkov, Thomas Gleixner, Ingo Molnar, x86, H Peter Anvin,
linux-edac, linux-kernel, kvm, rick.p.edgecombe, kai.huang,
reinette.chatre, xiaoyao.li, tony.lindgren, binbin.wu, ira.weiny,
isaku.yamahata, Fan Du, yan.y.zhao, chao.gao
On Tue, Aug 19, 2025 at 01:28:46PM -0400, Yazen Ghannam wrote:
> On Tue, Aug 19, 2025 at 07:24:34PM +0300, Adrian Hunter wrote:
> > Commit 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine
> > check bank") introduced a new #define MCI_ADDR_PHYSADDR for the mask of
> > valid physical address bits within the machine check bank address register.
> >
> > This is particularly needed in the case of errors in TDX/SEAM non-root mode
> > because the reported address contains the TDX KeyID. Refer to TDX and
> > TME-MK documentation for more information about KeyIDs.
> >
> > Commit 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM
> > non-root mode") uses the address to mark the affected page as poisoned, but
> > omits to use the aforementioned mask.
> >
> > Investigation of user space expectations has concluded it would be more
> > correct for the address to contain only address bits in the first place.
> > Refer https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com
> >
> > Mask the address when it is read from the machine check bank address
> > register. Do not use MCI_ADDR_PHYSADDR because that will be removed in a
> > later patch.
> >
> > It is assumed __log_error() in arch/x86/kernel/cpu/mce/amd.c does not need
> > similar treatment.
> >
> > Amend struct mce addr member description slightly to reflect that it is
> > not, and never has been, an exact copy of the bank's MCi_ADDR MSR.
> >
>
> I think it would be more accurate to say that the MCi_ADDR MSR is not,
> and never has been, guaranteed to be a system physical address.
>
> We could introduce a new field that represents the system physical
> address, if one exists for the error type. This way we can operate on a
> value without assumption or additional checks. And we can keep the raw
> MCi_ADDR MSR value in case it is of value to debug folks or hardware
> designers. In my experience, they seem to appreciate having the full,
> unfiltered data. We don't give them that today, but we can work towards
> that goal.
Having and exact copy of MCi_ADDR might be useful. I recall some angst
about this code masking off low order bits:
m->addr = mce_rdmsrq(mca_msr_reg(i, MCA_ADDR));
/*
* Mask the reported address by the reported granularity.
*/
if (mca_cfg.ser && (m->status & MCI_STATUS_MISCV)) {
u8 shift = MCI_MISC_ADDR_LSB(m->misc);
m->addr >>= shift;
m->addr <<= shift;
}
this proposal masks some high order bits too.
I second Yazen's suggestion of a new field. One for the raw value,
another for the massaged phsical address derived from the MSR.
-Tony
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-19 17:51 ` Luck, Tony
@ 2025-08-19 17:58 ` Adrian Hunter
2025-08-19 18:03 ` Luck, Tony
0 siblings, 1 reply; 19+ messages in thread
From: Adrian Hunter @ 2025-08-19 17:58 UTC (permalink / raw)
To: Luck, Tony, Yazen Ghannam
Cc: Dave Hansen, pbonzini, seanjc, vannapurve, Borislav Petkov,
Thomas Gleixner, Ingo Molnar, x86, H Peter Anvin, linux-edac,
linux-kernel, kvm, rick.p.edgecombe, kai.huang, reinette.chatre,
xiaoyao.li, tony.lindgren, binbin.wu, ira.weiny, isaku.yamahata,
Fan Du, yan.y.zhao, chao.gao
On 19/08/2025 20:51, Luck, Tony wrote:
> On Tue, Aug 19, 2025 at 01:28:46PM -0400, Yazen Ghannam wrote:
>> On Tue, Aug 19, 2025 at 07:24:34PM +0300, Adrian Hunter wrote:
>>> Commit 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine
>>> check bank") introduced a new #define MCI_ADDR_PHYSADDR for the mask of
>>> valid physical address bits within the machine check bank address register.
>>>
>>> This is particularly needed in the case of errors in TDX/SEAM non-root mode
>>> because the reported address contains the TDX KeyID. Refer to TDX and
>>> TME-MK documentation for more information about KeyIDs.
>>>
>>> Commit 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM
>>> non-root mode") uses the address to mark the affected page as poisoned, but
>>> omits to use the aforementioned mask.
>>>
>>> Investigation of user space expectations has concluded it would be more
>>> correct for the address to contain only address bits in the first place.
>>> Refer https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com
>>>
>>> Mask the address when it is read from the machine check bank address
>>> register. Do not use MCI_ADDR_PHYSADDR because that will be removed in a
>>> later patch.
>>>
>>> It is assumed __log_error() in arch/x86/kernel/cpu/mce/amd.c does not need
>>> similar treatment.
>>>
>>> Amend struct mce addr member description slightly to reflect that it is
>>> not, and never has been, an exact copy of the bank's MCi_ADDR MSR.
>>>
>>
>> I think it would be more accurate to say that the MCi_ADDR MSR is not,
>> and never has been, guaranteed to be a system physical address.
>>
>> We could introduce a new field that represents the system physical
>> address, if one exists for the error type. This way we can operate on a
>> value without assumption or additional checks. And we can keep the raw
>> MCi_ADDR MSR value in case it is of value to debug folks or hardware
>> designers. In my experience, they seem to appreciate having the full,
>> unfiltered data. We don't give them that today, but we can work towards
>> that goal.
>
> Having and exact copy of MCi_ADDR might be useful. I recall some angst
> about this code masking off low order bits:
>
> m->addr = mce_rdmsrq(mca_msr_reg(i, MCA_ADDR));
>
> /*
> * Mask the reported address by the reported granularity.
> */
> if (mca_cfg.ser && (m->status & MCI_STATUS_MISCV)) {
> u8 shift = MCI_MISC_ADDR_LSB(m->misc);
> m->addr >>= shift;
> m->addr <<= shift;
> }
>
> this proposal masks some high order bits too.
>
> I second Yazen's suggestion of a new field. One for the raw value,
> another for the massaged phsical address derived from the MSR.
For struct mce? Maybe that should be 2 new fields:
__u64 addr; /* Deprecated */
...
__u64 mci_addr; /* Bank's MCi_ADDR MSR */
__u64 phys_addr; /* Physical address */
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-19 17:58 ` Adrian Hunter
@ 2025-08-19 18:03 ` Luck, Tony
2025-08-20 15:59 ` Adrian Hunter
0 siblings, 1 reply; 19+ messages in thread
From: Luck, Tony @ 2025-08-19 18:03 UTC (permalink / raw)
To: Hunter, Adrian, Yazen Ghannam
Cc: Dave Hansen, pbonzini@redhat.com, seanjc@google.com,
Annapurve, Vishal, Borislav Petkov, Thomas Gleixner, Ingo Molnar,
x86@kernel.org, H Peter Anvin, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Edgecombe, Rick P, Huang, Kai, Chatre, Reinette, Li, Xiaoyao,
tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com,
Weiny, Ira, Yamahata, Isaku, Du, Fan, Zhao, Yan Y, Gao, Chao
> For struct mce? Maybe that should be 2 new fields:
>
> __u64 addr; /* Deprecated */
> ...
> __u64 mci_addr; /* Bank's MCi_ADDR MSR */
> __u64 phys_addr; /* Physical address */
Would "addr" keep the current (low bits masked, high bits preserved) value?
-Tony
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-19 16:24 ` [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors " Adrian Hunter
2025-08-19 17:28 ` Yazen Ghannam
@ 2025-08-19 21:32 ` Borislav Petkov
2025-08-21 7:24 ` Adrian Hunter
2025-08-27 8:22 ` Adrian Hunter
1 sibling, 2 replies; 19+ messages in thread
From: Borislav Petkov @ 2025-08-19 21:32 UTC (permalink / raw)
To: Adrian Hunter
Cc: Dave Hansen, Tony Luck, pbonzini, seanjc, vannapurve,
Thomas Gleixner, Ingo Molnar, x86, H Peter Anvin, linux-edac,
linux-kernel, kvm, rick.p.edgecombe, kai.huang, reinette.chatre,
xiaoyao.li, tony.lindgren, binbin.wu, ira.weiny, isaku.yamahata,
Fan Du, Yazen Ghannam, yan.y.zhao, chao.gao
On Tue, Aug 19, 2025 at 07:24:34PM +0300, Adrian Hunter wrote:
> Commit 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine
> check bank") introduced a new #define MCI_ADDR_PHYSADDR for the mask of
> valid physical address bits within the machine check bank address register.
>
> This is particularly needed in the case of errors in TDX/SEAM non-root mode
> because the reported address contains the TDX KeyID. Refer to TDX and
> TME-MK documentation for more information about KeyIDs.
>
> Commit 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM
> non-root mode") uses the address to mark the affected page as poisoned, but
> omits to use the aforementioned mask.
>
> Investigation of user space expectations has concluded it would be more
> correct for the address to contain only address bits in the first place.
> Refer https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com
>
> Mask the address when it is read from the machine check bank address
> register. Do not use MCI_ADDR_PHYSADDR because that will be removed in a
> later patch.
Why is this patch talking about TDX-something but doing "global" changes to
mce.addr?
Why don't you simply do a TDX-specific masking out when you're running on
in TDX env and leave the rest as is?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-19 18:03 ` Luck, Tony
@ 2025-08-20 15:59 ` Adrian Hunter
2025-08-20 16:12 ` Luck, Tony
0 siblings, 1 reply; 19+ messages in thread
From: Adrian Hunter @ 2025-08-20 15:59 UTC (permalink / raw)
To: Luck, Tony, Yazen Ghannam
Cc: Dave Hansen, pbonzini@redhat.com, seanjc@google.com,
Annapurve, Vishal, Borislav Petkov, Thomas Gleixner, Ingo Molnar,
x86@kernel.org, H Peter Anvin, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Edgecombe, Rick P, Huang, Kai, Chatre, Reinette, Li, Xiaoyao,
tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com,
Weiny, Ira, Yamahata, Isaku, Du, Fan, Zhao, Yan Y, Gao, Chao
On 19/08/2025 21:03, Luck, Tony wrote:
>> For struct mce? Maybe that should be 2 new fields:
>>
>> __u64 addr; /* Deprecated */
>> ...
>> __u64 mci_addr; /* Bank's MCi_ADDR MSR */
>> __u64 phys_addr; /* Physical address */
>
> Would "addr" keep the current (low bits masked, high bits preserved) value?
Yeah, it wouldn't make much sense if phys_addr was the same as addr anyway.
Not really thinking
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-20 15:59 ` Adrian Hunter
@ 2025-08-20 16:12 ` Luck, Tony
2025-08-20 17:56 ` Yazen Ghannam
0 siblings, 1 reply; 19+ messages in thread
From: Luck, Tony @ 2025-08-20 16:12 UTC (permalink / raw)
To: Hunter, Adrian, Yazen Ghannam
Cc: Dave Hansen, pbonzini@redhat.com, seanjc@google.com,
Annapurve, Vishal, Borislav Petkov, Thomas Gleixner, Ingo Molnar,
x86@kernel.org, H Peter Anvin, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Edgecombe, Rick P, Huang, Kai, Chatre, Reinette, Li, Xiaoyao,
tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com,
Weiny, Ira, Yamahata, Isaku, Du, Fan, Zhao, Yan Y, Gao, Chao
> >> For struct mce? Maybe that should be 2 new fields:
> >>
> >> __u64 addr; /* Deprecated */
> >> ...
> >> __u64 mci_addr; /* Bank's MCi_ADDR MSR */
> >> __u64 phys_addr; /* Physical address */
> >
> > Would "addr" keep the current (low bits masked, high bits preserved) value?
>
> Yeah, it wouldn't make much sense if phys_addr was the same as addr anyway.
> Not really thinking
The other option (but a bad one) would be:
__u64 deprecated; /* was "addr" */
...
__u64 mci_addr; /* Bank's MCi_ADDR MSR */
__u64 phys_addr; /* Physical address */
which would be good to force cleanup in the kernel, but bad for preserving
ABI (since "struct mce" is visible to user space via /dev/mcelog).
-Tony
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-20 16:12 ` Luck, Tony
@ 2025-08-20 17:56 ` Yazen Ghannam
2025-08-21 6:49 ` Adrian Hunter
0 siblings, 1 reply; 19+ messages in thread
From: Yazen Ghannam @ 2025-08-20 17:56 UTC (permalink / raw)
To: Luck, Tony
Cc: Hunter, Adrian, Dave Hansen, pbonzini@redhat.com,
seanjc@google.com, Annapurve, Vishal, Borislav Petkov,
Thomas Gleixner, Ingo Molnar, x86@kernel.org, H Peter Anvin,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, Edgecombe, Rick P, Huang, Kai,
Chatre, Reinette, Li, Xiaoyao, tony.lindgren@linux.intel.com,
binbin.wu@linux.intel.com, Weiny, Ira, Yamahata, Isaku, Du, Fan,
Zhao, Yan Y, Gao, Chao
On Wed, Aug 20, 2025 at 04:12:28PM +0000, Luck, Tony wrote:
> > >> For struct mce? Maybe that should be 2 new fields:
> > >>
> > >> __u64 addr; /* Deprecated */
> > >> ...
> > >> __u64 mci_addr; /* Bank's MCi_ADDR MSR */
> > >> __u64 phys_addr; /* Physical address */
> > >
> > > Would "addr" keep the current (low bits masked, high bits preserved) value?
> >
> > Yeah, it wouldn't make much sense if phys_addr was the same as addr anyway.
> > Not really thinking
>
> The other option (but a bad one) would be:
>
> __u64 deprecated; /* was "addr" */
> ...
> __u64 mci_addr; /* Bank's MCi_ADDR MSR */
> __u64 phys_addr; /* Physical address */
>
> which would be good to force cleanup in the kernel, but bad for preserving
> ABI (since "struct mce" is visible to user space via /dev/mcelog).
>
/dev/mcelog has been deprecated for a while.
Is the mcelog app still in active development? Could it be updated to
use trace events for MCE info?
You could also just fix up the address value in the mcelog notifier's
copy. I believe it has its own cache separate from the MCE genpool.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-20 17:56 ` Yazen Ghannam
@ 2025-08-21 6:49 ` Adrian Hunter
0 siblings, 0 replies; 19+ messages in thread
From: Adrian Hunter @ 2025-08-21 6:49 UTC (permalink / raw)
To: Yazen Ghannam, Luck, Tony
Cc: Dave Hansen, pbonzini@redhat.com, seanjc@google.com,
Annapurve, Vishal, Borislav Petkov, Thomas Gleixner, Ingo Molnar,
x86@kernel.org, H Peter Anvin, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Edgecombe, Rick P, Huang, Kai, Chatre, Reinette, Li, Xiaoyao,
tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com,
Weiny, Ira, Yamahata, Isaku, Du, Fan, Zhao, Yan Y, Gao, Chao
On 20/08/2025 20:56, Yazen Ghannam wrote:
> On Wed, Aug 20, 2025 at 04:12:28PM +0000, Luck, Tony wrote:
>>>>> For struct mce? Maybe that should be 2 new fields:
>>>>>
>>>>> __u64 addr; /* Deprecated */
>>>>> ...
>>>>> __u64 mci_addr; /* Bank's MCi_ADDR MSR */
>>>>> __u64 phys_addr; /* Physical address */
>>>>
>>>> Would "addr" keep the current (low bits masked, high bits preserved) value?
>>>
>>> Yeah, it wouldn't make much sense if phys_addr was the same as addr anyway.
>>> Not really thinking
>>
>> The other option (but a bad one) would be:
>>
>> __u64 deprecated; /* was "addr" */
>> ...
>> __u64 mci_addr; /* Bank's MCi_ADDR MSR */
>> __u64 phys_addr; /* Physical address */
>>
>> which would be good to force cleanup in the kernel, but bad for preserving
>> ABI (since "struct mce" is visible to user space via /dev/mcelog).
>>
>
> /dev/mcelog has been deprecated for a while.
There is also mce_record tracepoint
>
> Is the mcelog app still in active development? Could it be updated to
> use trace events for MCE info?
>
> You could also just fix up the address value in the mcelog notifier's
> copy. I believe it has its own cache separate from the MCE genpool.
Is there an advantage to fixing up later rather than when addr is
initially assigned?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-19 21:32 ` Borislav Petkov
@ 2025-08-21 7:24 ` Adrian Hunter
2025-08-21 13:25 ` Borislav Petkov
2025-08-27 8:22 ` Adrian Hunter
1 sibling, 1 reply; 19+ messages in thread
From: Adrian Hunter @ 2025-08-21 7:24 UTC (permalink / raw)
To: Borislav Petkov
Cc: Dave Hansen, Tony Luck, pbonzini, seanjc, vannapurve,
Thomas Gleixner, Ingo Molnar, x86, H Peter Anvin, linux-edac,
linux-kernel, kvm, rick.p.edgecombe, kai.huang, reinette.chatre,
xiaoyao.li, tony.lindgren, binbin.wu, ira.weiny, isaku.yamahata,
Fan Du, Yazen Ghannam, yan.y.zhao, chao.gao
On 20/08/2025 00:32, Borislav Petkov wrote:
> On Tue, Aug 19, 2025 at 07:24:34PM +0300, Adrian Hunter wrote:
>> Commit 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine
>> check bank") introduced a new #define MCI_ADDR_PHYSADDR for the mask of
>> valid physical address bits within the machine check bank address register.
>>
>> This is particularly needed in the case of errors in TDX/SEAM non-root mode
>> because the reported address contains the TDX KeyID. Refer to TDX and
>> TME-MK documentation for more information about KeyIDs.
>>
>> Commit 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM
>> non-root mode") uses the address to mark the affected page as poisoned, but
>> omits to use the aforementioned mask.
>>
>> Investigation of user space expectations has concluded it would be more
>> correct for the address to contain only address bits in the first place.
>> Refer https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com
>>
>> Mask the address when it is read from the machine check bank address
>> register. Do not use MCI_ADDR_PHYSADDR because that will be removed in a
>> later patch.
>
> Why is this patch talking about TDX-something but doing "global" changes to
> mce.addr?
It falls a bit into the category of: easier to maintain a
global way of doing things than have lots of special-cases.
>
> Why don't you simply do a TDX-specific masking out when you're running on
> in TDX env and leave the rest as is?
>
It was kinda like that in V1:
https://lore.kernel.org/r/20250618120806.113884-2-adrian.hunter@intel.com/
where the code change was dealing with SEAM_NR in the block starting:
} else if (m->mcgstatus & MCG_STATUS_SEAM_NR) {
Then Dave asked about changing addr itself:
https://lore.kernel.org/all/487c5e63-07d3-41ad-bfc0-bda14b3c435e@intel.com/
https://lore.kernel.org/all/79eca29a-8ba4-4ad9-b2e0-54d8e668f731@intel.com/
And it seems like user space does expect addr to be a physical address:
https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com
Something like below would work, but doesn't answer Dave's question
of why not do it in mce_read_aux()
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 4da4eab56c81..53c7ea3d0464 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1655,28 +1655,30 @@ noinstr void do_machine_check(struct pt_regs *regs)
} else if (m->mcgstatus & MCG_STATUS_SEAM_NR) {
/*
* Saved RIP on stack makes it look like the machine check
* was taken in the kernel on the instruction following
* the entry to SEAM mode. But MCG_STATUS_SEAM_NR indicates
* that the machine check was taken inside SEAM non-root
* mode. CPU core has already marked that guest as dead.
* It is OK for the kernel to resume execution at the
* apparent point of the machine check as the fault did
* not occur there. Mark the page as poisoned so it won't
* be added to free list when the guest is terminated.
*/
if (mce_usable_address(m)) {
- struct page *p = pfn_to_online_page(m->addr >> PAGE_SHIFT);
+ struct page *p;
+ m->addr &= MCI_ADDR_PHYSADDR;
+ p = pfn_to_online_page(m->addr >> PAGE_SHIFT);
if (p)
SetPageHWPoison(p);
}
} else {
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-21 7:24 ` Adrian Hunter
@ 2025-08-21 13:25 ` Borislav Petkov
2025-08-22 7:57 ` Adrian Hunter
0 siblings, 1 reply; 19+ messages in thread
From: Borislav Petkov @ 2025-08-21 13:25 UTC (permalink / raw)
To: Adrian Hunter
Cc: Dave Hansen, Tony Luck, pbonzini, seanjc, vannapurve,
Thomas Gleixner, Ingo Molnar, x86, H Peter Anvin, linux-edac,
linux-kernel, kvm, rick.p.edgecombe, kai.huang, reinette.chatre,
xiaoyao.li, tony.lindgren, binbin.wu, ira.weiny, isaku.yamahata,
Fan Du, Yazen Ghannam, yan.y.zhao, chao.gao
On Thu, Aug 21, 2025 at 10:24:22AM +0300, Adrian Hunter wrote:
> Something like below would work, but doesn't answer Dave's question
> of why not do it in mce_read_aux()
So, let me see what I understand from all this bla: you want to zap the KeyID
from mci_addr because it is completely useless there. So zap it.
You can't make any other changes to mci_addr because that goes to luserspace.
So far so good.
Now, all that other bla leads me to believe that there might be some need to
dump the raw mci_addr value after all.
If so, your patch is not needed.
Which makes me think, all yall folks need to make up your mind here.
And you need to get rid of all that extraneous information in your commit
message:
"Investigation of user space expectations has concluded it..."
No investigation needed - this is exported to userspace so you can't touch it.
The one and only question you need to answer is, do you really need KeyID in
it or not. And whatever you do, once you do it, we're stuck with it because it
goes out to userspace.
Especially if you want this backported to stable.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-21 13:25 ` Borislav Petkov
@ 2025-08-22 7:57 ` Adrian Hunter
2025-08-22 13:54 ` Borislav Petkov
0 siblings, 1 reply; 19+ messages in thread
From: Adrian Hunter @ 2025-08-22 7:57 UTC (permalink / raw)
To: Borislav Petkov
Cc: Dave Hansen, Tony Luck, pbonzini, seanjc, vannapurve,
Thomas Gleixner, Ingo Molnar, x86, H Peter Anvin, linux-edac,
linux-kernel, kvm, rick.p.edgecombe, kai.huang, reinette.chatre,
xiaoyao.li, tony.lindgren, binbin.wu, ira.weiny, isaku.yamahata,
Fan Du, Yazen Ghannam, yan.y.zhao, chao.gao
On 21/08/2025 16:25, Borislav Petkov wrote:
> On Thu, Aug 21, 2025 at 10:24:22AM +0300, Adrian Hunter wrote:
>> Something like below would work, but doesn't answer Dave's question
>> of why not do it in mce_read_aux()
Thanks for looking at this.
> So, let me see what I understand from all this bla: you want to zap the KeyID
> from mci_addr because it is completely useless there. So zap it.
Not exactly. I just want to fix the bug whereby the mce handler fails
to mark the affected page as poisoned because it does not remove the KeyID
from the address before looking-up the page.
> You can't make any other changes to mci_addr because that goes to luserspace.
>
> So far so good.
>
> Now, all that other bla leads me to believe that there might be some need to
> dump the raw mci_addr value after all.
>
> If so, your patch is not needed.
>
> Which makes me think, all yall folks need to make up your mind here.
>
> And you need to get rid of all that extraneous information in your commit
> message:
>
> "Investigation of user space expectations has concluded it..."
>
> No investigation needed - this is exported to userspace so you can't touch it.
It is a bit of a grey area. No one expects to find non-address bits in
struct mce addr, and the kernel already strips low-order bits, and in the
case of SMCA, high-order bits too, refer smca_extract_err_addr().
> The one and only question you need to answer is, do you really need KeyID in
> it or not. And whatever you do, once you do it, we're stuck with it because it
> goes out to userspace.
No, the KeyID is almost useless. It is just a dynamically allocated
number. It might indicate which TDX VM encountered the error, except the
MCE is fatal to the VM, so a fatal error is reported to the VMM anyway.
However, it is allowed to extend struct mce, so adding KeyID or
raw MCI ADDR later is quite possible.
> Especially if you want this backported to stable.
The bug exists in stable. Ideally it would get fixed there too somehow.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-22 7:57 ` Adrian Hunter
@ 2025-08-22 13:54 ` Borislav Petkov
2025-08-22 14:54 ` Adrian Hunter
0 siblings, 1 reply; 19+ messages in thread
From: Borislav Petkov @ 2025-08-22 13:54 UTC (permalink / raw)
To: Adrian Hunter
Cc: Dave Hansen, Tony Luck, pbonzini, seanjc, vannapurve,
Thomas Gleixner, Ingo Molnar, x86, H Peter Anvin, linux-edac,
linux-kernel, kvm, rick.p.edgecombe, kai.huang, reinette.chatre,
xiaoyao.li, tony.lindgren, binbin.wu, ira.weiny, isaku.yamahata,
Fan Du, Yazen Ghannam, yan.y.zhao, chao.gao
On Fri, Aug 22, 2025 at 10:57:36AM +0300, Adrian Hunter wrote:
> Not exactly. I just want to fix the bug whereby the mce handler fails
> to mark the affected page as poisoned because it does not remove the KeyID
> from the address before looking-up the page.
Lemme ask this differently then: are you ever going to need KeyID in mci_addr?
> No one expects to find non-address bits in struct mce addr,
You're preaching to the choir - I don't know whose idea it was to shove
a key ID in an address value... it sure sounds silly.
> However, it is allowed to extend struct mce, so adding KeyID or raw MCI ADDR
> later is quite possible.
Why would you want to do that? Do you have a use case?
If not, you can drop that whole angle about adding KeyID later. If yes, let's
hear it.
Just this hypothetically, maybe we'd need it, maybe not, it might be a good
idea ... bla is muddying the water unnecessarily. So let's focus pls and
address *only* the issue(s) at hand.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-22 13:54 ` Borislav Petkov
@ 2025-08-22 14:54 ` Adrian Hunter
0 siblings, 0 replies; 19+ messages in thread
From: Adrian Hunter @ 2025-08-22 14:54 UTC (permalink / raw)
To: Borislav Petkov
Cc: Dave Hansen, Tony Luck, pbonzini, seanjc, vannapurve,
Thomas Gleixner, Ingo Molnar, x86, H Peter Anvin, linux-edac,
linux-kernel, kvm, rick.p.edgecombe, kai.huang, reinette.chatre,
xiaoyao.li, tony.lindgren, binbin.wu, ira.weiny, isaku.yamahata,
Fan Du, Yazen Ghannam, yan.y.zhao, chao.gao
On 22/08/2025 16:54, Borislav Petkov wrote:
> On Fri, Aug 22, 2025 at 10:57:36AM +0300, Adrian Hunter wrote:
>> Not exactly. I just want to fix the bug whereby the mce handler fails
>> to mark the affected page as poisoned because it does not remove the KeyID
>> from the address before looking-up the page.
>
> Lemme ask this differently then: are you ever going to need KeyID in mci_addr?
No
>
>> No one expects to find non-address bits in struct mce addr,
>
> You're preaching to the choir - I don't know whose idea it was to shove
> a key ID in an address value... it sure sounds silly.
>
>> However, it is allowed to extend struct mce, so adding KeyID or raw MCI ADDR
>> later is quite possible.
>
> Why would you want to do that? Do you have a use case?
>
> If not, you can drop that whole angle about adding KeyID later
Droppin' it.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-19 21:32 ` Borislav Petkov
2025-08-21 7:24 ` Adrian Hunter
@ 2025-08-27 8:22 ` Adrian Hunter
2025-08-27 8:29 ` Borislav Petkov
1 sibling, 1 reply; 19+ messages in thread
From: Adrian Hunter @ 2025-08-27 8:22 UTC (permalink / raw)
To: Borislav Petkov, Dave Hansen, Tony Luck
Cc: pbonzini, seanjc, vannapurve, Thomas Gleixner, Ingo Molnar, x86,
H Peter Anvin, linux-edac, linux-kernel, kvm, rick.p.edgecombe,
kai.huang, reinette.chatre, xiaoyao.li, tony.lindgren, binbin.wu,
ira.weiny, isaku.yamahata, Fan Du, Yazen Ghannam, yan.y.zhao,
chao.gao
On 20/08/2025 00:32, Borislav Petkov wrote:
> On Tue, Aug 19, 2025 at 07:24:34PM +0300, Adrian Hunter wrote:
>> Commit 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine
>> check bank") introduced a new #define MCI_ADDR_PHYSADDR for the mask of
>> valid physical address bits within the machine check bank address register.
>>
>> This is particularly needed in the case of errors in TDX/SEAM non-root mode
>> because the reported address contains the TDX KeyID. Refer to TDX and
>> TME-MK documentation for more information about KeyIDs.
>>
>> Commit 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM
>> non-root mode") uses the address to mark the affected page as poisoned, but
>> omits to use the aforementioned mask.
>>
>> Investigation of user space expectations has concluded it would be more
>> correct for the address to contain only address bits in the first place.
>> Refer https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com
>>
>> Mask the address when it is read from the machine check bank address
>> register. Do not use MCI_ADDR_PHYSADDR because that will be removed in a
>> later patch.
>
> Why is this patch talking about TDX-something but doing "global" changes to
> mce.addr?
>
> Why don't you simply do a TDX-specific masking out when you're running on
> in TDX env and leave the rest as is?
>
How about this?
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 4da4eab56c81..3963d4cd8fc1 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -699,6 +699,8 @@ static noinstr void mce_read_aux(struct mce_hw_err *err, int i)
}
smca_extract_err_addr(m);
+
+ tdx_extract_err_addr(m);
}
if (mce_flags.smca) {
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index b5ba598e54cb..fcf0b84a7c98 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -298,6 +298,16 @@ static inline bool amd_mce_usable_address(struct mce *m) { return false; }
static inline void smca_extract_err_addr(struct mce *m) { }
#endif
+#ifdef CONFIG_X86_MCE_INTEL
+static __always_inline void tdx_extract_err_addr(struct mce *m)
+{
+ if (boot_cpu_has(X86_FEATURE_TDX_HOST_PLATFORM))
+ m->addr &= GENMASK_ULL(boot_cpu_data.x86_phys_bits - 1, 0);
+}
+#else
+static inline void tdx_extract_err_addr(struct mce *m) { }
+#endif
+
#ifdef CONFIG_X86_ANCIENT_MCE
void intel_p5_mcheck_init(struct cpuinfo_x86 *c);
void winchip_mcheck_init(struct cpuinfo_x86 *c);
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode
2025-08-27 8:22 ` Adrian Hunter
@ 2025-08-27 8:29 ` Borislav Petkov
0 siblings, 0 replies; 19+ messages in thread
From: Borislav Petkov @ 2025-08-27 8:29 UTC (permalink / raw)
To: Adrian Hunter
Cc: Dave Hansen, Tony Luck, pbonzini, seanjc, vannapurve,
Thomas Gleixner, Ingo Molnar, x86, H Peter Anvin, linux-edac,
linux-kernel, kvm, rick.p.edgecombe, kai.huang, reinette.chatre,
xiaoyao.li, tony.lindgren, binbin.wu, ira.weiny, isaku.yamahata,
Fan Du, Yazen Ghannam, yan.y.zhao, chao.gao
On Wed, Aug 27, 2025 at 11:22:07AM +0300, Adrian Hunter wrote:
> +#ifdef CONFIG_X86_MCE_INTEL
> +static __always_inline void tdx_extract_err_addr(struct mce *m)
> +{
> + if (boot_cpu_has(X86_FEATURE_TDX_HOST_PLATFORM))
> + m->addr &= GENMASK_ULL(boot_cpu_data.x86_phys_bits - 1, 0);
Right, you can stick that thing straight into mce_read_aux() since it is
simple enough and drop the ifdeffery and use cpu_feature_enabled():
mce_read_aux:
...
/* Remove TDX KeyID from the address */
if (cpu_feature_enabled(X86_FEATURE_TDX_HOST_PLATFORM))
m->addr &= GENMASK_ULL(boot_cpu_data.x86_phys_bits - 1, 0);
Something like that...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2025-08-27 8:30 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-19 16:24 [PATCH RESEND V2 0/2] Fixes for recovery for machine check in TDX/SEAM non-root mode Adrian Hunter
2025-08-19 16:24 ` [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors " Adrian Hunter
2025-08-19 17:28 ` Yazen Ghannam
2025-08-19 17:51 ` Luck, Tony
2025-08-19 17:58 ` Adrian Hunter
2025-08-19 18:03 ` Luck, Tony
2025-08-20 15:59 ` Adrian Hunter
2025-08-20 16:12 ` Luck, Tony
2025-08-20 17:56 ` Yazen Ghannam
2025-08-21 6:49 ` Adrian Hunter
2025-08-19 21:32 ` Borislav Petkov
2025-08-21 7:24 ` Adrian Hunter
2025-08-21 13:25 ` Borislav Petkov
2025-08-22 7:57 ` Adrian Hunter
2025-08-22 13:54 ` Borislav Petkov
2025-08-22 14:54 ` Adrian Hunter
2025-08-27 8:22 ` Adrian Hunter
2025-08-27 8:29 ` Borislav Petkov
2025-08-19 16:24 ` [PATCH RESEND V2 2/2] x86/mce: Remove MCI_ADDR_PHYSADDR Adrian Hunter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).