* [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED
@ 2025-05-07 1:21 Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM Sohil Mehta
` (8 more replies)
0 siblings, 9 replies; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 1:21 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Sohil Mehta, Brian Gerst, Andrew Cooper, Kirill A . Shutemov,
Jacob Pan, Andi Kleen, Kai Huang, Nikolay Borisov,
linux-perf-users, linux-edac, kvm, linux-pm, linux-trace-kernel
Introduction
============
NMI-source reporting with FRED [1] provides a new mechanism for
identifying the source of NMIs. As part of the FRED event delivery
framework, a 16-bit vector bitmap is provided that identifies one or
more sources that caused the NMI.
Using the source bitmap, the kernel can precisely run the relevant NMI
handlers instead of polling the entire NMI handler list. Additionally,
the source information would be invaluable for debugging misbehaving
handlers and unknown NMIs.
Changes since the last version
==============================
v4: https://lore.kernel.org/lkml/20240709143906.1040477-1-jacob.jun.pan@linux.intel.com/
Apart from the change of personnel, the patches include the following major
changes:
* Reorder the patches to have the infrastructure changes precede the
feature addition. (Sean)
* Use a simplified encoding mechanism for NMI-source vectors. (Sean)
* Get rid of the alternate NMI vector priority scheme. (below)
* Simplify NMI handling logic with source bitmap. (below)
Existing NMI handling code already has a priority mechanism for the NMI
handlers, with CPU-specific (NMI_LOCAL) handlers executed first followed
by platform NMI handlers and unknown NMI (NMI_UNKNOWN) handlers being
last. Within each of these NMI types, the handlers registered with
NMI_FLAG_FIRST are given priority.
It is essential that new NMI-source handling follows the same scheme to
maintain consistent behavior with and without NMI-source. If there is a
need for a more granular priority scheme, it should be introduced at the
generic NMI handler level instead of assigning priorities to NMI-source
vectors.
This design choice leads to a simplification in the NMI handling logic
as well. It is now possible to get rid of the complexity introduced by a
new handler lookup table as well as the partial bitmap handling logic.
The updated code (patch 5) is significantly less intrusive and easier to
maintain.
Day in the life of an NMI-source vector
=======================================
A brief overview of how NMI-source vectors are used:
// Allocate a static source vector at compile time
#define NMIS_VECTOR_TEST 1
// Register an NMI handler with the vector
register_nmi_handler(NMI_LOCAL, test_handler, 0, "nmi_test", NMIS_VECTOR_TEST);
// Generate an NMI with the source vector using NMI encoded delivery
__apic_send_IPI_mask(cpumask, APIC_DM_NMI | NMIS_VECTOR_TEST);
// Handle an NMI with or without the source information (oversimplified)
source_bitmap = fred_event_data(regs);
if (!source_bitmap || (source_bitmap & BIT(NMIS_VECTOR_TEST)))
test_handler();
// Unregister handler along with the vector
unregister_nmi_handler(NMI_LOCAL, "nmi_test");
Patch structure
===============
The patches are based on tip:x86/nmi because they depend on the NMI
cleanup series merged earlier [2].
Patch 1-2: Prepare FRED/KVM and enumerate NMI-source reporting
Patch 3-5: Register and handle NMI-source vectors
Patch 6-8: APIC changes to generate NMIs with vectors
Patch 9: Improve trace and debug with NMI-source information
Many thanks to Sean Christopherson, Xin Li, H. Peter Anvin, Andi Kleen,
Tony Luck, Kan Liang, Jacob Pan Jun, Zeng Guang and others for their
contributions, reviews and feedback.
Future work / Opens
===================
I am considering a few additional changes that would be valuable for
enhancing NMI handling support. Any feedback, preferences or suggestions
on the following would be helpful.
Assigning more NMI-source vectors
---------------------------------
The current patches assign NMI vectors to a limited number of sources.
The microcode rendezvous and crash reboot code use NMI but do not go
through the typical register_nmi_handler() path. Their handling is
special-cased in exc_nmi(). To isolate blame and improve debugging, it
would be useful to assign vectors to them, even if the vectors are
ignored during handling.
Other NMI sources, such as GHES and Platform NMIs, can also be assigned
vectors to speed up their NMI handling and improve isolation. However,
this would require a software/hardware agreement on vector reservation
and usage. Such an endeavor is likely not worth the effort.
Explicitly enabling NMIs
------------------------
HPA brought up the idea [3] of explicitly enabling NMIs only when the
kernel is ready to take them. With FRED, if we enter the kernel with
NMIs disabled, they could remain disabled until returning back to
userspace.
DebugFS support
---------------
Currently, the kernel has counters for unknown NMIs, swallowed NMIs and
other NMI handling data. However, there is no easy way to access that.
To identify issues that happen over a longer timeframe, it might be
useful to add DebugFS support for NMI statistics.
KVM support
-----------
The NMI-source feature can be useful for perf users and other NMI use
cases in guest VMs. Exposing NMI-source to guests once FRED support is
in place should be relatively easier. The prototype code for this is
under evaluation.
Links
=====
[1]: Chapter 9, https://www.intel.com/content/www/us/en/content-details/819481/flexible-return-and-event-delivery-fred-specification.html
[2]: https://lore.kernel.org/lkml/20250327234629.3953536-1-sohil.mehta@intel.com/
[3]: https://lore.kernel.org/lkml/F5D36889-A868-46D2-A678-8EE26E28556D@zytor.com/
Jacob Pan (1):
perf/x86: Enable NMI-source reporting for perfmon
Sohil Mehta (7):
x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting
x86/nmi: Extend the registration interface to include the NMI-source
vector
x86/nmi: Assign and register NMI-source vectors
x86/nmi: Add support to handle NMIs with source information
x86/nmi: Prepare for the new NMI-source vector encoding
x86/nmi: Enable NMI-source for IPIs delivered as NMIs
x86/nmi: Include NMI-source information in tracepoint and debug prints
Zeng Guang (1):
x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM
arch/x86/entry/entry_64_fred.S | 2 +-
arch/x86/events/amd/ibs.c | 2 +-
arch/x86/events/core.c | 6 ++--
arch/x86/events/intel/core.c | 6 ++--
arch/x86/include/asm/apic.h | 38 ++++++++++++++++++++++
arch/x86/include/asm/apicdef.h | 2 +-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/fred.h | 9 +++---
arch/x86/include/asm/nmi.h | 37 ++++++++++++++++++++-
arch/x86/kernel/apic/hw_nmi.c | 5 ++-
arch/x86/kernel/apic/ipi.c | 4 +--
arch/x86/kernel/apic/local.h | 24 +++++++-------
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
arch/x86/kernel/cpu/mce/inject.c | 4 +--
arch/x86/kernel/cpu/mshyperv.c | 3 +-
arch/x86/kernel/kgdb.c | 8 ++---
arch/x86/kernel/kvm.c | 9 +-----
arch/x86/kernel/nmi.c | 50 ++++++++++++++++++++++++++++-
arch/x86/kernel/nmi_selftest.c | 9 +++---
arch/x86/kernel/smp.c | 6 ++--
arch/x86/kvm/vmx/vmx.c | 5 +--
arch/x86/platform/uv/uv_nmi.c | 4 +--
drivers/acpi/apei/ghes.c | 2 +-
drivers/char/ipmi/ipmi_watchdog.c | 3 +-
drivers/edac/igen6_edac.c | 3 +-
drivers/thermal/intel/therm_throt.c | 2 +-
drivers/watchdog/hpwdt.c | 6 ++--
include/trace/events/nmi.h | 13 +++++---
28 files changed, 190 insertions(+), 74 deletions(-)
base-commit: f2e01dcf6df2d12e86c363ea9c37d53994d89dd6
--
2.43.0
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v5 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM
2025-05-07 1:21 [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
@ 2025-05-07 1:21 ` Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 2/9] x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting Sohil Mehta
` (7 subsequent siblings)
8 siblings, 0 replies; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 1:21 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Sohil Mehta, Brian Gerst, Andrew Cooper, Kirill A . Shutemov,
Jacob Pan, Andi Kleen, Kai Huang, Nikolay Borisov,
linux-perf-users, linux-edac, kvm, linux-pm, linux-trace-kernel
From: Zeng Guang <guang.zeng@intel.com>
Extend the FRED entry point from KVM to take an extra argument to allow
KVM to invoke the FRED event dispatch framework with event data.
The first use of this extended API is to pass the NMI-source bitmap for
NMI-induced VM exits. Read the VMCS exit qualification field to get the
NMI-source information and store it as event data precisely in the
format expected by the FRED event framework.
Read the VMCS exit qualification unconditionally since almost all
upcoming CPUs are expected to enable FRED and NMI-source together. In
the rare case that NMI-source isn't enabled, the extra VMREAD would be
harmless since the exit qualification is expected to be zero.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v5: Read the VMCS exit qualification unconditionally. (Sean)
Combine related patches into one.
---
arch/x86/entry/entry_64_fred.S | 2 +-
arch/x86/include/asm/fred.h | 9 +++++----
arch/x86/kvm/vmx/vmx.c | 5 +++--
3 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S
index 29c5c32c16c3..a61256be9703 100644
--- a/arch/x86/entry/entry_64_fred.S
+++ b/arch/x86/entry/entry_64_fred.S
@@ -93,7 +93,7 @@ SYM_FUNC_START(asm_fred_entry_from_kvm)
* +--------+-----------------+
*/
push $0 /* Reserved, must be 0 */
- push $0 /* Event data, 0 for IRQ/NMI */
+ push %rsi /* Event data for IRQ/NMI */
push %rdi /* fred_ss handed in by the caller */
push %rbp
pushf
diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
index 2a29e5216881..a4de57e578c4 100644
--- a/arch/x86/include/asm/fred.h
+++ b/arch/x86/include/asm/fred.h
@@ -64,14 +64,15 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs)
void asm_fred_entrypoint_user(void);
void asm_fred_entrypoint_kernel(void);
-void asm_fred_entry_from_kvm(struct fred_ss);
+void asm_fred_entry_from_kvm(struct fred_ss ss, unsigned long edata);
__visible void fred_entry_from_user(struct pt_regs *regs);
__visible void fred_entry_from_kernel(struct pt_regs *regs);
__visible void __fred_entry_from_kvm(struct pt_regs *regs);
/* Can be called from noinstr code, thus __always_inline */
-static __always_inline void fred_entry_from_kvm(unsigned int type, unsigned int vector)
+static __always_inline void fred_entry_from_kvm(unsigned int type, unsigned int vector,
+ unsigned long edata)
{
struct fred_ss ss = {
.ss =__KERNEL_DS,
@@ -81,7 +82,7 @@ static __always_inline void fred_entry_from_kvm(unsigned int type, unsigned int
.lm = 1,
};
- asm_fred_entry_from_kvm(ss);
+ asm_fred_entry_from_kvm(ss, edata);
}
void cpu_init_fred_exceptions(void);
@@ -109,7 +110,7 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs) { ret
static inline void cpu_init_fred_exceptions(void) { }
static inline void cpu_init_fred_rsps(void) { }
static inline void fred_complete_exception_setup(void) { }
-static inline void fred_entry_from_kvm(unsigned int type, unsigned int vector) { }
+static inline void fred_entry_from_kvm(unsigned int type, unsigned int vector, unsigned long edata) { }
static inline void fred_sync_rsp0(unsigned long rsp0) { }
static inline void fred_update_rsp0(void) { }
#endif /* CONFIG_X86_FRED */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5c5766467a61..1d43d4a2f6b6 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7079,7 +7079,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu,
kvm_before_interrupt(vcpu, KVM_HANDLING_IRQ);
if (cpu_feature_enabled(X86_FEATURE_FRED))
- fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector);
+ fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector, 0);
else
vmx_do_interrupt_irqoff(gate_offset((gate_desc *)host_idt_base + vector));
kvm_after_interrupt(vcpu);
@@ -7393,7 +7393,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
is_nmi(vmx_get_intr_info(vcpu))) {
kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
if (cpu_feature_enabled(X86_FEATURE_FRED))
- fred_entry_from_kvm(EVENT_TYPE_NMI, NMI_VECTOR);
+ fred_entry_from_kvm(EVENT_TYPE_NMI, NMI_VECTOR,
+ vmx_get_exit_qual(vcpu));
else
vmx_do_nmi_irqoff();
kvm_after_interrupt(vcpu);
--
2.43.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 2/9] x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting
2025-05-07 1:21 [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM Sohil Mehta
@ 2025-05-07 1:21 ` Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector Sohil Mehta
` (6 subsequent siblings)
8 siblings, 0 replies; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 1:21 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Sohil Mehta, Brian Gerst, Andrew Cooper, Kirill A . Shutemov,
Jacob Pan, Andi Kleen, Kai Huang, Nikolay Borisov,
linux-perf-users, linux-edac, kvm, linux-pm, linux-trace-kernel
NMI-source reporting is introduced to report the sources of NMIs with
FRED event delivery based on vectors in NMI interrupt messages or the
local APIC. This enables the kernel to avoid the latency incurred by
going over the entire NMI handler list and reduces ambiguity about the
source of an NMI.
Enumerate NMI-source reporting in cpufeatures. Also, since NMI-source
reporting uses the FRED event dispatch framework, make it dependent on
FRED in the CPUID dependency table. This ensures that NMI-source
reporting gets disabled when FRED is disabled.
NMI-source reporting is intended as a kernel feature and does not need
userspace enumeration or configuration. There is no need to expose it to
userspace through /proc/cpuinfo.
Originally-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v5: Add NMI-source to the CPUID dependency table.
Do not expose NMI-source feature through /proc/cpuinfo.
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 6c2c152d8a67..2ced1bc64548 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -321,6 +321,7 @@
#define X86_FEATURE_FRED (12*32+17) /* "fred" Flexible Return and Event Delivery */
#define X86_FEATURE_LKGS (12*32+18) /* Load "kernel" (userspace) GS */
#define X86_FEATURE_WRMSRNS (12*32+19) /* Non-serializing WRMSR */
+#define X86_FEATURE_NMI_SOURCE (12*32+20) /* NMI-Source reporting with FRED */
#define X86_FEATURE_AMX_FP16 (12*32+21) /* AMX fp16 Support */
#define X86_FEATURE_AVX_IFMA (12*32+23) /* Support for VPMADD52[H,L]UQ */
#define X86_FEATURE_LAM (12*32+26) /* "lam" Linear Address Masking */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index a2fbea0be535..ed0fd35c9290 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -84,6 +84,7 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_AMX_TILE, X86_FEATURE_XFD },
{ X86_FEATURE_SHSTK, X86_FEATURE_XSAVES },
{ X86_FEATURE_FRED, X86_FEATURE_LKGS },
+ { X86_FEATURE_NMI_SOURCE, X86_FEATURE_FRED },
{}
};
--
2.43.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector
2025-05-07 1:21 [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 2/9] x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting Sohil Mehta
@ 2025-05-07 1:21 ` Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 4/9] x86/nmi: Assign and register NMI-source vectors Sohil Mehta
` (5 subsequent siblings)
8 siblings, 0 replies; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 1:21 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Sohil Mehta, Brian Gerst, Andrew Cooper, Kirill A . Shutemov,
Jacob Pan, Andi Kleen, Kai Huang, Nikolay Borisov,
linux-perf-users, linux-edac, kvm, linux-pm, linux-trace-kernel
To prepare for NMI-source reporting, add a source vector argument to the
NMI handler registration interface. Later, this will be used to
register NMI handlers with a unique source vector that can be used to
identify the originator of the NMI.
For now, just extend the interface and pass zero as the source vector
for all handlers. No functional change intended.
Originally-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v5: Split the patch into two parts. This one only extends the interface.
---
arch/x86/events/amd/ibs.c | 2 +-
arch/x86/events/core.c | 2 +-
arch/x86/include/asm/nmi.h | 5 ++++-
arch/x86/kernel/apic/hw_nmi.c | 3 +--
arch/x86/kernel/cpu/mce/inject.c | 2 +-
arch/x86/kernel/cpu/mshyperv.c | 3 +--
arch/x86/kernel/kgdb.c | 6 ++----
arch/x86/kernel/nmi_selftest.c | 7 +++----
arch/x86/kernel/smp.c | 4 ++--
arch/x86/platform/uv/uv_nmi.c | 4 ++--
drivers/acpi/apei/ghes.c | 2 +-
drivers/char/ipmi/ipmi_watchdog.c | 3 +--
drivers/edac/igen6_edac.c | 3 +--
drivers/watchdog/hpwdt.c | 6 +++---
14 files changed, 24 insertions(+), 28 deletions(-)
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 0252b7ea8bca..45dece8bee84 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1486,7 +1486,7 @@ static __init int perf_event_ibs_init(void)
if (ret)
goto err_op;
- ret = register_nmi_handler(NMI_LOCAL, perf_ibs_nmi_handler, 0, "perf_ibs");
+ ret = register_nmi_handler(NMI_LOCAL, perf_ibs_nmi_handler, 0, "perf_ibs", 0);
if (ret)
goto err_nmi;
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 6866cc5acb0b..b84b8be1f075 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2115,7 +2115,7 @@ static int __init init_hw_perf_events(void)
x86_pmu.config_mask = X86_RAW_EVENT_MASK;
perf_events_lapic_init();
- register_nmi_handler(NMI_LOCAL, perf_event_nmi_handler, 0, "PMI");
+ register_nmi_handler(NMI_LOCAL, perf_event_nmi_handler, 0, "PMI", 0);
unconstrained = (struct event_constraint)
__EVENT_CONSTRAINT(0, x86_pmu.cntr_mask64,
diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
index 79d88d12c8fb..f0a577bf7bba 100644
--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -54,6 +54,7 @@ struct nmiaction {
u64 max_duration;
unsigned long flags;
const char *name;
+ u8 source_vector;
};
/**
@@ -62,6 +63,7 @@ struct nmiaction {
* @fn: The NMI handler
* @fg: Flags associated with the NMI handler
* @n: Name of the NMI handler
+ * @src: NMI-source based vector for the NMI handler
* @init: Optional __init* attributes for struct nmiaction
*
* Adds the provided handler to the list of handlers for the specified
@@ -75,13 +77,14 @@ struct nmiaction {
*
* Return: 0 on success, or an error code on failure.
*/
-#define register_nmi_handler(t, fn, fg, n, init...) \
+#define register_nmi_handler(t, fn, fg, n, src, init...) \
({ \
static struct nmiaction init fn##_na = { \
.list = LIST_HEAD_INIT(fn##_na.list), \
.handler = (fn), \
.name = (n), \
.flags = (fg), \
+ .source_vector = (src), \
}; \
__register_nmi_handler((t), &fn##_na); \
})
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 45af535c44a0..612b77660d05 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -53,8 +53,7 @@ NOKPROBE_SYMBOL(nmi_cpu_backtrace_handler);
static int __init register_nmi_cpu_backtrace_handler(void)
{
- register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler,
- 0, "arch_bt");
+ register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", 0);
return 0;
}
early_initcall(register_nmi_cpu_backtrace_handler);
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 06e3cf7229ce..17804ba0b02f 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -774,7 +774,7 @@ static int __init inject_init(void)
debugfs_init();
- register_nmi_handler(NMI_LOCAL, mce_raise_notify, 0, "mce_notify");
+ register_nmi_handler(NMI_LOCAL, mce_raise_notify, 0, "mce_notify", 0);
mce_register_injector_chain(&inject_nb);
setup_inj_struct(&i_mce);
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 3e2533954675..d643d6fb3cfa 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -548,8 +548,7 @@ static void __init ms_hyperv_init_platform(void)
lapic_timer_period);
}
- register_nmi_handler(NMI_UNKNOWN, hv_nmi_unknown, NMI_FLAG_FIRST,
- "hv_nmi_unknown");
+ register_nmi_handler(NMI_UNKNOWN, hv_nmi_unknown, NMI_FLAG_FIRST, "hv_nmi_unknown", 0);
#endif
#ifdef CONFIG_X86_IO_APIC
diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index 9c9faa1634fb..ab2d1b79b79e 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -602,13 +602,11 @@ int kgdb_arch_init(void)
if (retval)
goto out;
- retval = register_nmi_handler(NMI_LOCAL, kgdb_nmi_handler,
- 0, "kgdb");
+ retval = register_nmi_handler(NMI_LOCAL, kgdb_nmi_handler, 0, "kgdb", 0);
if (retval)
goto out1;
- retval = register_nmi_handler(NMI_UNKNOWN, kgdb_nmi_handler,
- 0, "kgdb");
+ retval = register_nmi_handler(NMI_UNKNOWN, kgdb_nmi_handler, 0, "kgdb", 0);
if (retval)
goto out2;
diff --git a/arch/x86/kernel/nmi_selftest.c b/arch/x86/kernel/nmi_selftest.c
index a010e9d062bf..b203e4371816 100644
--- a/arch/x86/kernel/nmi_selftest.c
+++ b/arch/x86/kernel/nmi_selftest.c
@@ -40,8 +40,7 @@ static int __init nmi_unk_cb(unsigned int val, struct pt_regs *regs)
static void __init init_nmi_testsuite(void)
{
/* trap all the unknown NMIs we may generate */
- register_nmi_handler(NMI_UNKNOWN, nmi_unk_cb, 0, "nmi_selftest_unk",
- __initdata);
+ register_nmi_handler(NMI_UNKNOWN, nmi_unk_cb, 0, "nmi_selftest_unk", 0, __initdata);
}
static void __init cleanup_nmi_testsuite(void)
@@ -63,8 +62,8 @@ static void __init test_nmi_ipi(struct cpumask *mask)
{
unsigned long timeout;
- if (register_nmi_handler(NMI_LOCAL, test_nmi_ipi_callback,
- NMI_FLAG_FIRST, "nmi_selftest", __initdata)) {
+ if (register_nmi_handler(NMI_LOCAL, test_nmi_ipi_callback, NMI_FLAG_FIRST,
+ "nmi_selftest", 0, __initdata)) {
nmi_fail = FAILURE;
return;
}
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 18266cc3d98c..b80812aa06c3 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -142,8 +142,8 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_reboot)
static int register_stop_handler(void)
{
- return register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback,
- NMI_FLAG_FIRST, "smp_stop");
+ return register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback, NMI_FLAG_FIRST, "smp_stop",
+ 0);
}
static void native_stop_other_cpus(int wait)
diff --git a/arch/x86/platform/uv/uv_nmi.c b/arch/x86/platform/uv/uv_nmi.c
index 5c50e550ab63..473c34eb264c 100644
--- a/arch/x86/platform/uv/uv_nmi.c
+++ b/arch/x86/platform/uv/uv_nmi.c
@@ -1029,10 +1029,10 @@ static int uv_handle_nmi_ping(unsigned int reason, struct pt_regs *regs)
static void uv_register_nmi_notifier(void)
{
- if (register_nmi_handler(NMI_UNKNOWN, uv_handle_nmi, 0, "uv"))
+ if (register_nmi_handler(NMI_UNKNOWN, uv_handle_nmi, 0, "uv", 0))
pr_warn("UV: NMI handler failed to register\n");
- if (register_nmi_handler(NMI_LOCAL, uv_handle_nmi_ping, 0, "uvping"))
+ if (register_nmi_handler(NMI_LOCAL, uv_handle_nmi_ping, 0, "uvping", 0))
pr_warn("UV: PING NMI handler failed to register\n");
}
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index b72772494655..95bd3a64608f 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -1318,7 +1318,7 @@ static void ghes_nmi_add(struct ghes *ghes)
{
mutex_lock(&ghes_list_mutex);
if (list_empty(&ghes_nmi))
- register_nmi_handler(NMI_LOCAL, ghes_notify_nmi, 0, "ghes");
+ register_nmi_handler(NMI_LOCAL, ghes_notify_nmi, 0, "ghes", 0);
list_add_rcu(&ghes->list, &ghes_nmi);
mutex_unlock(&ghes_list_mutex);
}
diff --git a/drivers/char/ipmi/ipmi_watchdog.c b/drivers/char/ipmi/ipmi_watchdog.c
index f1875b2bebbc..5db402c4b9e7 100644
--- a/drivers/char/ipmi/ipmi_watchdog.c
+++ b/drivers/char/ipmi/ipmi_watchdog.c
@@ -1267,8 +1267,7 @@ static void check_parms(void)
}
}
if (do_nmi && !nmi_handler_registered) {
- rv = register_nmi_handler(NMI_UNKNOWN, ipmi_nmi, 0,
- "ipmi");
+ rv = register_nmi_handler(NMI_UNKNOWN, ipmi_nmi, 0, "ipmi", 0);
if (rv) {
pr_warn("Can't register nmi handler\n");
return;
diff --git a/drivers/edac/igen6_edac.c b/drivers/edac/igen6_edac.c
index 5807517ee32d..3a6e7334e94c 100644
--- a/drivers/edac/igen6_edac.c
+++ b/drivers/edac/igen6_edac.c
@@ -1363,8 +1363,7 @@ static int register_err_handler(void)
return 0;
}
- rc = register_nmi_handler(NMI_SERR, ecclog_nmi_handler,
- 0, IGEN6_NMI_NAME);
+ rc = register_nmi_handler(NMI_SERR, ecclog_nmi_handler, 0, IGEN6_NMI_NAME, 0);
if (rc) {
igen6_printk(KERN_ERR, "Failed to register NMI handler\n");
return rc;
diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index ae30e394d176..5246706afcf6 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -242,13 +242,13 @@ static int hpwdt_init_nmi_decoding(struct pci_dev *dev)
/*
* Only one function can register for NMI_UNKNOWN
*/
- retval = register_nmi_handler(NMI_UNKNOWN, hpwdt_pretimeout, 0, "hpwdt");
+ retval = register_nmi_handler(NMI_UNKNOWN, hpwdt_pretimeout, 0, "hpwdt", 0);
if (retval)
goto error;
- retval = register_nmi_handler(NMI_SERR, hpwdt_pretimeout, 0, "hpwdt");
+ retval = register_nmi_handler(NMI_SERR, hpwdt_pretimeout, 0, "hpwdt", 0);
if (retval)
goto error1;
- retval = register_nmi_handler(NMI_IO_CHECK, hpwdt_pretimeout, 0, "hpwdt");
+ retval = register_nmi_handler(NMI_IO_CHECK, hpwdt_pretimeout, 0, "hpwdt", 0);
if (retval)
goto error2;
--
2.43.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 4/9] x86/nmi: Assign and register NMI-source vectors
2025-05-07 1:21 [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
` (2 preceding siblings ...)
2025-05-07 1:21 ` [PATCH v5 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector Sohil Mehta
@ 2025-05-07 1:21 ` Sohil Mehta
2025-05-07 8:22 ` Peter Zijlstra
2025-05-07 1:21 ` [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information Sohil Mehta
` (4 subsequent siblings)
8 siblings, 1 reply; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 1:21 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Sohil Mehta, Brian Gerst, Andrew Cooper, Kirill A . Shutemov,
Jacob Pan, Andi Kleen, Kai Huang, Nikolay Borisov,
linux-perf-users, linux-edac, kvm, linux-pm, linux-trace-kernel
Prior to NMI-source support, the vector information was ignored by the
hardware while delivering NMIs. With NMI-source, the architecture
currently supports a 16-bit source bitmap to identify the source of the
NMI. Upon receiving an NMI, this bitmap is delivered as part of the FRED
event delivery mechanism to the kernel.
Assign a vector space of 0-15 that is specific to NMI-source and
independent of the IDT vector space of 0-255. Being a bitmap, the
NMI-source vectors do not have any inherent priority associated with
them. The order of executing the NMI handlers is up to the kernel.
Existing NMI handling already has a priority mechanism for the NMI
handlers, with CPU-specific (NMI_LOCAL) handlers executed first,
followed by platform NMI handlers and unknown NMI (NMI_UNKNOWN) handlers
being last. Within each of these NMI types, the handlers registered with
NMI_FLAG_FIRST are given priority.
NMI-source follows the same priority scheme to avoid unnecessary
complexity. Therefore, the NMI-source vectors are assigned arbitrarily,
except for vectors 0 and 2.
Vector 0 is set by the hardware whenever a source vector was not used
while generating an NMI or the originator could not be reliably
identified. Do not assign it to any handler.
Vector 2 is reserved for external NMIs corresponding to Local APIC -
LINT1. Some third-party chipsets may send NMI messages with a hardcoded
vector of 2, which would result in vector 2 being set in the NMI-source
bitmap. To avoid confusion, do not assign vector 2 to any handler.
NMI-source vectors are only assigned for NMI_LOCAL type handlers.
Platform NMI handlers have a single handler registered per type. They
don't need additional source information to differentiate among them.
Use the assigned vectors to register the respective NMI handlers. A
couple of NMI handlers, such as the microcode rendezvous and the crash
reboot, do not use the typical NMI registration interface. Leave them
as-is for now.
Originally-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v5: Move the vector defines to nmi.h.
Combine vector allocation and registration into one patch.
Simplify NMI vector names.
Describe usage of vector 2 for external NMIs.
Get rid of vector priorities.
---
arch/x86/events/core.c | 2 +-
arch/x86/include/asm/nmi.h | 32 ++++++++++++++++++++++++++++++++
arch/x86/kernel/apic/hw_nmi.c | 2 +-
arch/x86/kernel/cpu/mce/inject.c | 2 +-
arch/x86/kernel/kgdb.c | 2 +-
arch/x86/kernel/nmi.c | 5 +++++
arch/x86/kernel/nmi_selftest.c | 2 +-
arch/x86/kernel/smp.c | 2 +-
8 files changed, 43 insertions(+), 6 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index b84b8be1f075..031e908f0d61 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2115,7 +2115,7 @@ static int __init init_hw_perf_events(void)
x86_pmu.config_mask = X86_RAW_EVENT_MASK;
perf_events_lapic_init();
- register_nmi_handler(NMI_LOCAL, perf_event_nmi_handler, 0, "PMI", 0);
+ register_nmi_handler(NMI_LOCAL, perf_event_nmi_handler, 0, "PMI", NMIS_VECTOR_PMI);
unconstrained = (struct event_constraint)
__EVENT_CONSTRAINT(0, x86_pmu.cntr_mask64,
diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
index f0a577bf7bba..b9beb216f2d0 100644
--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -57,6 +57,38 @@ struct nmiaction {
u8 source_vector;
};
+/**
+ * NMI-source vectors are used to identify the origin of an NMI and to
+ * route the NMI directly to the appropriate handler.
+ *
+ * On CPUs that support NMI-source reporting with FRED, receiving an NMI
+ * with a valid vector sets the corresponding bit in the NMI-source
+ * bitmap. The bitmap is delivered as FRED event data on the stack.
+ * Multiple NMIs are coalesced in the NMI-source bitmap until the next
+ * NMI delivery.
+ *
+ * If an NMI is received without a vector or beyond the defined range,
+ * the CPU sets bit 0 of the NMI-source bitmap.
+ *
+ * Vector 2 is reserved for external NMIs related to the Local APIC -
+ * LINT1. Some third-party chipsets may send NMI messages with a
+ * hardcoded vector of 2, which would result in bit 2 being set in the
+ * NMI-source bitmap.
+ *
+ * The vectors are in no particular priority order. Add new vector
+ * assignments sequentially in the list below.
+ */
+#define NMIS_VECTOR_NONE 0 /* Reserved - Set for all unidentified sources */
+#define NMIS_VECTOR_TEST 1 /* NMI selftest */
+#define NMIS_VECTOR_EXTERNAL 2 /* Reserved - Match External NMI vector 2 */
+#define NMIS_VECTOR_SMP_STOP 3 /* Panic stop CPU */
+#define NMIS_VECTOR_BT 4 /* CPU backtrace */
+#define NMIS_VECTOR_KGDB 5 /* Kernel debugger */
+#define NMIS_VECTOR_MCE 6 /* MCE injection */
+#define NMIS_VECTOR_PMI 7 /* PerfMon counters */
+
+#define NMIS_VECTORS_MAX 16 /* Maximum number of NMI-source vectors */
+
/**
* register_nmi_handler - Register a handler for a specific NMI type
* @t: NMI type (e.g. NMI_LOCAL)
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 612b77660d05..4e04f13d2de9 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -53,7 +53,7 @@ NOKPROBE_SYMBOL(nmi_cpu_backtrace_handler);
static int __init register_nmi_cpu_backtrace_handler(void)
{
- register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", 0);
+ register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", NMIS_VECTOR_BT);
return 0;
}
early_initcall(register_nmi_cpu_backtrace_handler);
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 17804ba0b02f..a3c753dfce91 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -774,7 +774,7 @@ static int __init inject_init(void)
debugfs_init();
- register_nmi_handler(NMI_LOCAL, mce_raise_notify, 0, "mce_notify", 0);
+ register_nmi_handler(NMI_LOCAL, mce_raise_notify, 0, "mce_notify", NMIS_VECTOR_MCE);
mce_register_injector_chain(&inject_nb);
setup_inj_struct(&i_mce);
diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index ab2d1b79b79e..9ca4b141da0c 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -602,7 +602,7 @@ int kgdb_arch_init(void)
if (retval)
goto out;
- retval = register_nmi_handler(NMI_LOCAL, kgdb_nmi_handler, 0, "kgdb", 0);
+ retval = register_nmi_handler(NMI_LOCAL, kgdb_nmi_handler, 0, "kgdb", NMIS_VECTOR_KGDB);
if (retval)
goto out1;
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index be93ec7255bf..a1d672dcb6f0 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -184,6 +184,11 @@ int __register_nmi_handler(unsigned int type, struct nmiaction *action)
raw_spin_lock_irqsave(&desc->lock, flags);
+ WARN_ON_ONCE(action->source_vector >= NMIS_VECTORS_MAX);
+
+ if (!cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) || type != NMI_LOCAL)
+ action->source_vector = 0;
+
/*
* Indicate if there are multiple registrations on the
* internal NMI handler call chains (SERR and IO_CHECK).
diff --git a/arch/x86/kernel/nmi_selftest.c b/arch/x86/kernel/nmi_selftest.c
index b203e4371816..5196023b31dc 100644
--- a/arch/x86/kernel/nmi_selftest.c
+++ b/arch/x86/kernel/nmi_selftest.c
@@ -63,7 +63,7 @@ static void __init test_nmi_ipi(struct cpumask *mask)
unsigned long timeout;
if (register_nmi_handler(NMI_LOCAL, test_nmi_ipi_callback, NMI_FLAG_FIRST,
- "nmi_selftest", 0, __initdata)) {
+ "nmi_selftest", NMIS_VECTOR_TEST, __initdata)) {
nmi_fail = FAILURE;
return;
}
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index b80812aa06c3..5be1c0bdf901 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -143,7 +143,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_reboot)
static int register_stop_handler(void)
{
return register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback, NMI_FLAG_FIRST, "smp_stop",
- 0);
+ NMIS_VECTOR_SMP_STOP);
}
static void native_stop_other_cpus(int wait)
--
2.43.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information
2025-05-07 1:21 [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
` (3 preceding siblings ...)
2025-05-07 1:21 ` [PATCH v5 4/9] x86/nmi: Assign and register NMI-source vectors Sohil Mehta
@ 2025-05-07 1:21 ` Sohil Mehta
2025-05-07 9:14 ` Peter Zijlstra
2025-05-07 1:21 ` [PATCH v5 6/9] x86/nmi: Prepare for the new NMI-source vector encoding Sohil Mehta
` (3 subsequent siblings)
8 siblings, 1 reply; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 1:21 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Sohil Mehta, Brian Gerst, Andrew Cooper, Kirill A . Shutemov,
Jacob Pan, Andi Kleen, Kai Huang, Nikolay Borisov,
linux-perf-users, linux-edac, kvm, linux-pm, linux-trace-kernel
The NMI-source bitmap is delivered as FRED event data to the kernel.
When available, use NMI-source based filtering to determine the exact
handlers to run.
Activate NMI-source based filtering only for Local NMIs. While handling
platform NMI types (such as SERR and IOCHK), do not use the source
bitmap. They have only one handler registered per type, so there is no
need to disambiguate between multiple handlers.
Some third-party chipsets may send NMI messages with a hardcoded vector
of 2, which would result in bit 2 being set in the NMI-source bitmap.
Skip the local NMI handlers in this situation.
Bit 0 of the source bitmap is set by the hardware whenever a source
vector was not used while generating an NMI, or the originator could not
be reliably identified. Poll all the registered handlers in that case.
When multiple handlers need to be executed, adhere to the existing
priority scheme and execute the handlers registered with NMI_FLAG_FIRST
before others.
The logic for handling legacy NMIs is unaffected since the source bitmap
would always be zero.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v5: Significantly simplify NMI-source handling logic.
Get rid of a separate lookup table for NMI-source vectors.
Adhere to existing priority scheme for handling NMIs.
---
arch/x86/kernel/nmi.c | 40 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index a1d672dcb6f0..183e3e717326 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -127,9 +127,25 @@ static void nmi_check_duration(struct nmiaction *action, u64 duration)
action->handler, duration, decimal_msecs);
}
+/*
+ * Match the NMI-source vector saved during registration with the source
+ * bitmap.
+ *
+ * Always return true when bit 0 of the source bitmap is set.
+ */
+static inline bool match_nmi_source(unsigned long source_bitmap, struct nmiaction *action)
+{
+ unsigned long match_vector;
+
+ match_vector = BIT(NMIS_VECTOR_NONE) | BIT(action->source_vector);
+
+ return (source_bitmap & match_vector);
+}
+
static int nmi_handle(unsigned int type, struct pt_regs *regs)
{
struct nmi_desc *desc = nmi_to_desc(type);
+ unsigned long source_bitmap = 0;
nmi_handler_t ehandler;
struct nmiaction *a;
int handled=0;
@@ -148,16 +164,40 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
rcu_read_lock();
+ /*
+ * Activate NMI source-based filtering only for Local NMIs.
+ *
+ * Platform NMI types (such as SERR and IOCHK) have only one
+ * handler registered per type, so there is no need to
+ * disambiguate between multiple handlers.
+ *
+ * Also, if a platform source ends up setting bit 2 in the
+ * source bitmap, the local NMI handlers would be skipped since
+ * none of them use this reserved vector.
+ *
+ * For Unknown NMIs, avoid using the source bitmap to ensure all
+ * potential handlers have a chance to claim responsibility.
+ */
+ if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL)
+ source_bitmap = fred_event_data(regs);
+
/*
* NMIs are edge-triggered, which means if you have enough
* of them concurrently, you can lose some because only one
* can be latched at any given time. Walk the whole list
* to handle those situations.
+ *
+ * However, NMI-source reporting does not have this limitation.
+ * When NMI-source information is available, only run the
+ * handlers that match the reported vectors.
*/
list_for_each_entry_rcu(a, &desc->head, list) {
int thishandled;
u64 delta;
+ if (source_bitmap && !match_nmi_source(source_bitmap, a))
+ continue;
+
delta = sched_clock();
thishandled = a->handler(type, regs);
handled += thishandled;
--
2.43.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 6/9] x86/nmi: Prepare for the new NMI-source vector encoding
2025-05-07 1:21 [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
` (4 preceding siblings ...)
2025-05-07 1:21 ` [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information Sohil Mehta
@ 2025-05-07 1:21 ` Sohil Mehta
2025-05-07 9:17 ` Peter Zijlstra
2025-05-07 1:21 ` [PATCH v5 7/9] x86/nmi: Enable NMI-source for IPIs delivered as NMIs Sohil Mehta
` (2 subsequent siblings)
8 siblings, 1 reply; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 1:21 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Sohil Mehta, Brian Gerst, Andrew Cooper, Kirill A . Shutemov,
Jacob Pan, Andi Kleen, Kai Huang, Nikolay Borisov,
linux-perf-users, linux-edac, kvm, linux-pm, linux-trace-kernel
When using the send_IPI_* APIC calls, callers typically use NMI vector
0x2 to trigger NMIs. The APIC APIs convert the NMI vector to the NMI
delivery mode, which is eventually used to program the APIC.
Before FRED, the hardware would ignore the vector used with NMI delivery
mode. However, with NMI-source reporting, the vector information is
relayed to the destination CPU, which sets the corresponding bit in the
NMI-source bitmap. Unfortunately, the kernel now needs to maintain a new
set of NMI vectors and differentiate them from the IDT vectors.
Instead of creating a parallel set of send_NMI_* APIs to handle
NMI-source vectors, enhance the existing send_IPI_* APIs with a new
encoding scheme to handle the NMI delivery mode along with the
NMI-source vector.
NMI-source vectors would be encoded as:
APIC_DM_NMI (0x400) | NMI_SOURCE_VECTOR (0x1-0xF)
Also, introduce a helper to prepare the ICR value with the encoded
delivery mode and vector. Update the guest paravirtual APIC code to use
the new helper as well.
While at it, rename APIC_DM_FIXED_MASK to the more appropriate
APIC_DM_MASK.
Suggested-by: Sean Christopherson <seanjc@google.com>
Co-developed-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v5: Use a simiplified encoding scheme for NMI-source vectors.
---
arch/x86/include/asm/apic.h | 30 +++++++++++++++++++++++++++++
arch/x86/include/asm/apicdef.h | 2 +-
arch/x86/kernel/apic/ipi.c | 4 ++--
arch/x86/kernel/apic/local.h | 24 ++++++++++++-----------
arch/x86/kernel/kvm.c | 9 +--------
drivers/thermal/intel/therm_throt.c | 2 +-
6 files changed, 48 insertions(+), 23 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index c903d358405d..11a3f88518fa 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -470,6 +470,36 @@ static __always_inline bool apic_id_valid(u32 apic_id)
return apic_id <= apic->max_apic_id;
}
+/*
+ * Prepare the delivery mode and vector for the ICR.
+ *
+ * NMI-source vectors have the NMI delivery mode encoded within them to
+ * differentiate them from the IDT vectors. IDT vector 0x2 (NMI_VECTOR)
+ * is treated as an NMI request but without any NMI-source information.
+ */
+static inline u16 __prepare_ICR_DM_vector(u16 dm_vector)
+{
+ u16 vector = dm_vector & APIC_VECTOR_MASK;
+ u16 dm = dm_vector & APIC_DM_MASK;
+
+ if (dm == APIC_DM_NMI) {
+ /*
+ * Pre-FRED, the actual vector is ignored for NMIs, but
+ * zero it if NMI-source reporting is not supported to
+ * avoid breakage on misbehaving hardware or hypervisors.
+ */
+ if (!cpu_feature_enabled(X86_FEATURE_NMI_SOURCE))
+ vector = 0;
+
+ return dm | vector;
+ }
+
+ if (vector == NMI_VECTOR)
+ return APIC_DM_NMI;
+ else
+ return APIC_DM_FIXED | vector;
+}
+
#else /* CONFIG_X86_LOCAL_APIC */
static inline u32 apic_read(u32 reg) { return 0; }
diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
index 094106b6a538..3fb8fa73f6aa 100644
--- a/arch/x86/include/asm/apicdef.h
+++ b/arch/x86/include/asm/apicdef.h
@@ -87,8 +87,8 @@
#define APIC_ICR_BUSY 0x01000
#define APIC_DEST_LOGICAL 0x00800
#define APIC_DEST_PHYSICAL 0x00000
+#define APIC_DM_MASK 0x00700
#define APIC_DM_FIXED 0x00000
-#define APIC_DM_FIXED_MASK 0x00700
#define APIC_DM_LOWEST 0x00100
#define APIC_DM_SMI 0x00200
#define APIC_DM_REMRD 0x00300
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 98a57cb4aa86..4e8bc42f3bd5 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -158,7 +158,7 @@ static void __default_send_IPI_shortcut(unsigned int shortcut, int vector)
* issues where otherwise the system hangs when the panic CPU tries
* to stop the others before launching the kdump kernel.
*/
- if (unlikely(vector == NMI_VECTOR))
+ if (unlikely(is_nmi_vector(vector)))
apic_mem_wait_icr_idle_timeout();
else
apic_mem_wait_icr_idle();
@@ -175,7 +175,7 @@ void __default_send_IPI_dest_field(unsigned int dest_mask, int vector,
unsigned int dest_mode)
{
/* See comment in __default_send_IPI_shortcut() */
- if (unlikely(vector == NMI_VECTOR))
+ if (unlikely(is_nmi_vector(vector)))
apic_mem_wait_icr_idle_timeout();
else
apic_mem_wait_icr_idle();
diff --git a/arch/x86/kernel/apic/local.h b/arch/x86/kernel/apic/local.h
index bdcf609eb283..9a54c589a4bf 100644
--- a/arch/x86/kernel/apic/local.h
+++ b/arch/x86/kernel/apic/local.h
@@ -24,22 +24,24 @@ extern u32 x2apic_max_apicid;
/* IPI */
+u16 __prepare_ICR_DM_vector(u16 vector);
+
DECLARE_STATIC_KEY_FALSE(apic_use_ipi_shorthand);
+/* NMI-source vectors have the delivery mode encoded within them */
+static inline bool is_nmi_vector(u16 vector)
+{
+ if ((vector & APIC_DM_MASK) == APIC_DM_NMI)
+ return true;
+ if ((vector & APIC_VECTOR_MASK) == NMI_VECTOR)
+ return true;
+ return false;
+}
+
static inline unsigned int __prepare_ICR(unsigned int shortcut, int vector,
unsigned int dest)
{
- unsigned int icr = shortcut | dest;
-
- switch (vector) {
- default:
- icr |= APIC_DM_FIXED | vector;
- break;
- case NMI_VECTOR:
- icr |= APIC_DM_NMI;
- break;
- }
- return icr;
+ return shortcut | dest | __prepare_ICR_DM_vector(vector);
}
void default_init_apic_ldr(void);
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 3be9b3342c67..aa45fe4fecd0 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -517,14 +517,7 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
local_irq_save(flags);
- switch (vector) {
- default:
- icr = APIC_DM_FIXED | vector;
- break;
- case NMI_VECTOR:
- icr = APIC_DM_NMI;
- break;
- }
+ icr = __prepare_ICR_DM_vector(vector);
for_each_cpu(cpu, mask) {
apic_id = per_cpu(x86_cpu_to_apicid, cpu);
diff --git a/drivers/thermal/intel/therm_throt.c b/drivers/thermal/intel/therm_throt.c
index e69868e868eb..83dd53cb4fc7 100644
--- a/drivers/thermal/intel/therm_throt.c
+++ b/drivers/thermal/intel/therm_throt.c
@@ -740,7 +740,7 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
* BIOS has programmed on AP based on BSP's info we saved since BIOS
* is always setting the same value for all threads/cores.
*/
- if ((h & APIC_DM_FIXED_MASK) != APIC_DM_FIXED)
+ if ((h & APIC_DM_MASK) != APIC_DM_FIXED)
apic_write(APIC_LVTTHMR, lvtthmr_init);
--
2.43.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 7/9] x86/nmi: Enable NMI-source for IPIs delivered as NMIs
2025-05-07 1:21 [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
` (5 preceding siblings ...)
2025-05-07 1:21 ` [PATCH v5 6/9] x86/nmi: Prepare for the new NMI-source vector encoding Sohil Mehta
@ 2025-05-07 1:21 ` Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 8/9] perf/x86: Enable NMI-source reporting for perfmon Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 9/9] x86/nmi: Include NMI-source information in tracepoint and debug prints Sohil Mehta
8 siblings, 0 replies; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 1:21 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Sohil Mehta, Brian Gerst, Andrew Cooper, Kirill A . Shutemov,
Jacob Pan, Andi Kleen, Kai Huang, Nikolay Borisov,
linux-perf-users, linux-edac, kvm, linux-pm, linux-trace-kernel
With the IPI handling APIs ready to support the new NMI encoding, encode
the NMI delivery mode directly with the NMI-source vectors to trigger
NMIs.
Move most of the existing NMI-based IPIs to use the new NMI-source
vectors, except for the microcode rendezvous NMI and the crash reboot
NMI. NMI handling for them is special-cased in exc_nmi() and does not
need NMI-source reporting.
However, in the future, it might be useful to assign a source vector to
all NMI sources to improve isolation and debuggability.
Originally-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Co-developed-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v5: Encode APIC_DM_NMI directly with the NMI-source vector.
---
arch/x86/include/asm/apic.h | 7 +++++++
arch/x86/kernel/apic/hw_nmi.c | 2 +-
arch/x86/kernel/cpu/mce/inject.c | 2 +-
arch/x86/kernel/kgdb.c | 2 +-
arch/x86/kernel/nmi_selftest.c | 2 +-
arch/x86/kernel/smp.c | 2 +-
6 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 11a3f88518fa..9bade39b5feb 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -23,6 +23,13 @@
#define APIC_EXTNMI_ALL 1
#define APIC_EXTNMI_NONE 2
+/* Trigger NMIs with source information */
+#define TEST_NMI (APIC_DM_NMI | NMIS_VECTOR_TEST)
+#define SMP_STOP_NMI (APIC_DM_NMI | NMIS_VECTOR_SMP_STOP)
+#define BT_NMI (APIC_DM_NMI | NMIS_VECTOR_BT)
+#define KGDB_NMI (APIC_DM_NMI | NMIS_VECTOR_KGDB)
+#define MCE_NMI (APIC_DM_NMI | NMIS_VECTOR_MCE)
+
/*
* Debugging macros
*/
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 4e04f13d2de9..586f4b25feae 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -33,7 +33,7 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh)
#ifdef arch_trigger_cpumask_backtrace
static void nmi_raise_cpu_backtrace(cpumask_t *mask)
{
- __apic_send_IPI_mask(mask, NMI_VECTOR);
+ __apic_send_IPI_mask(mask, BT_NMI);
}
void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index a3c753dfce91..6328a607ffc4 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -269,7 +269,7 @@ static void __maybe_unused raise_mce(struct mce *m)
mce_irq_ipi, NULL, 0);
preempt_enable();
} else if (m->inject_flags & MCJ_NMI_BROADCAST)
- __apic_send_IPI_mask(mce_inject_cpumask, NMI_VECTOR);
+ __apic_send_IPI_mask(mce_inject_cpumask, MCE_NMI);
}
start = jiffies;
while (!cpumask_empty(mce_inject_cpumask)) {
diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index 9ca4b141da0c..3dedc5f57541 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -416,7 +416,7 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
*/
void kgdb_roundup_cpus(void)
{
- apic_send_IPI_allbutself(NMI_VECTOR);
+ apic_send_IPI_allbutself(KGDB_NMI);
}
#endif
diff --git a/arch/x86/kernel/nmi_selftest.c b/arch/x86/kernel/nmi_selftest.c
index 5196023b31dc..c5c91f520c69 100644
--- a/arch/x86/kernel/nmi_selftest.c
+++ b/arch/x86/kernel/nmi_selftest.c
@@ -71,7 +71,7 @@ static void __init test_nmi_ipi(struct cpumask *mask)
/* sync above data before sending NMI */
wmb();
- __apic_send_IPI_mask(mask, NMI_VECTOR);
+ __apic_send_IPI_mask(mask, TEST_NMI);
/* Don't wait longer than a second */
timeout = USEC_PER_SEC;
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 5be1c0bdf901..614acec5655f 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -217,7 +217,7 @@ static void native_stop_other_cpus(int wait)
pr_emerg("Shutting down cpus with NMI\n");
for_each_cpu(cpu, &cpus_stop_mask)
- __apic_send_IPI(cpu, NMI_VECTOR);
+ __apic_send_IPI(cpu, SMP_STOP_NMI);
}
/*
* Don't wait longer than 10 ms if the caller didn't
--
2.43.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 8/9] perf/x86: Enable NMI-source reporting for perfmon
2025-05-07 1:21 [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
` (6 preceding siblings ...)
2025-05-07 1:21 ` [PATCH v5 7/9] x86/nmi: Enable NMI-source for IPIs delivered as NMIs Sohil Mehta
@ 2025-05-07 1:21 ` Sohil Mehta
2025-05-08 11:20 ` Sandipan Das
2025-05-07 1:21 ` [PATCH v5 9/9] x86/nmi: Include NMI-source information in tracepoint and debug prints Sohil Mehta
8 siblings, 1 reply; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 1:21 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Sohil Mehta, Brian Gerst, Andrew Cooper, Kirill A . Shutemov,
Jacob Pan, Andi Kleen, Kai Huang, Nikolay Borisov,
linux-perf-users, linux-edac, kvm, linux-pm, linux-trace-kernel
From: Jacob Pan <jacob.jun.pan@linux.intel.com>
Program the designated PMI NMI-source vector into the local vector table
for the PMU. An NMI for the PMU would directly invoke the PMI handler
without polling other NMI handlers, resulting in reduced PMI delivery
latency.
Co-developed-by: Zeng Guang <guang.zeng@intel.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
---
v5: No significant change.
---
arch/x86/events/core.c | 4 ++--
arch/x86/events/intel/core.c | 6 +++---
arch/x86/include/asm/apic.h | 1 +
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 031e908f0d61..42b270526631 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1695,7 +1695,7 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
* This generic handler doesn't seem to have any issues where the
* unmasking occurs so it was left at the top.
*/
- apic_write(APIC_LVTPC, APIC_DM_NMI);
+ apic_write(APIC_LVTPC, PERF_NMI);
for_each_set_bit(idx, x86_pmu.cntr_mask, X86_PMC_IDX_MAX) {
if (!test_bit(idx, cpuc->active_mask))
@@ -1737,7 +1737,7 @@ void perf_events_lapic_init(void)
/*
* Always use NMI for PMU
*/
- apic_write(APIC_LVTPC, APIC_DM_NMI);
+ apic_write(APIC_LVTPC, PERF_NMI);
}
static int
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 09d2d66c9f21..87c624686c58 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3202,7 +3202,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
* NMI handler.
*/
if (!late_ack && !mid_ack)
- apic_write(APIC_LVTPC, APIC_DM_NMI);
+ apic_write(APIC_LVTPC, PERF_NMI);
intel_bts_disable_local();
cpuc->enabled = 0;
__intel_pmu_disable_all(true);
@@ -3239,7 +3239,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
done:
if (mid_ack)
- apic_write(APIC_LVTPC, APIC_DM_NMI);
+ apic_write(APIC_LVTPC, PERF_NMI);
/* Only restore PMU state when it's active. See x86_pmu_disable(). */
cpuc->enabled = pmu_enabled;
if (pmu_enabled)
@@ -3252,7 +3252,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
* Haswell CPUs.
*/
if (late_ack)
- apic_write(APIC_LVTPC, APIC_DM_NMI);
+ apic_write(APIC_LVTPC, PERF_NMI);
return handled;
}
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 9bade39b5feb..b2f864e77d84 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -29,6 +29,7 @@
#define BT_NMI (APIC_DM_NMI | NMIS_VECTOR_BT)
#define KGDB_NMI (APIC_DM_NMI | NMIS_VECTOR_KGDB)
#define MCE_NMI (APIC_DM_NMI | NMIS_VECTOR_MCE)
+#define PERF_NMI (APIC_DM_NMI | NMIS_VECTOR_PMI)
/*
* Debugging macros
--
2.43.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 9/9] x86/nmi: Include NMI-source information in tracepoint and debug prints
2025-05-07 1:21 [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
` (7 preceding siblings ...)
2025-05-07 1:21 ` [PATCH v5 8/9] perf/x86: Enable NMI-source reporting for perfmon Sohil Mehta
@ 2025-05-07 1:21 ` Sohil Mehta
2025-05-07 21:48 ` Steven Rostedt
8 siblings, 1 reply; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 1:21 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Sohil Mehta, Brian Gerst, Andrew Cooper, Kirill A . Shutemov,
Jacob Pan, Andi Kleen, Kai Huang, Nikolay Borisov,
linux-perf-users, linux-edac, kvm, linux-pm, linux-trace-kernel
The NMI-source bitmap is the most critical information provided by the
NMI-source reporting feature. It can help identify issues when multiple
NMIs occur simultaneously or if certain NMI handlers consistently
misbehave. It is also very useful in debugging unknown NMIs since it can
pinpoint the exact source that caused the NMI.
Add the source bitmap to the nmi_handler() tracepoint. Also, print the
bitmap along with the "Unknown NMI" kernel log message.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v5: New patch
---
arch/x86/kernel/nmi.c | 5 ++++-
include/trace/events/nmi.h | 13 ++++++++-----
2 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 183e3e717326..b9ece0b63ca7 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -202,7 +202,7 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
thishandled = a->handler(type, regs);
handled += thishandled;
delta = sched_clock() - delta;
- trace_nmi_handler(a->handler, (int)delta, thishandled);
+ trace_nmi_handler(a->handler, (int)delta, thishandled, source_bitmap);
nmi_check_duration(a, delta);
}
@@ -387,6 +387,9 @@ unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
pr_emerg_ratelimited("Uhhuh. NMI received for unknown reason %02x on CPU %d.\n",
reason, smp_processor_id());
+ if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE))
+ pr_emerg_ratelimited("NMI-source bitmap is 0x%lx\n", fred_event_data(regs));
+
if (unknown_nmi_panic || panic_on_unrecovered_nmi)
nmi_panic(regs, "NMI: Not continuing");
diff --git a/include/trace/events/nmi.h b/include/trace/events/nmi.h
index 18e0411398ba..6e4a1ff70a44 100644
--- a/include/trace/events/nmi.h
+++ b/include/trace/events/nmi.h
@@ -10,29 +10,32 @@
TRACE_EVENT(nmi_handler,
- TP_PROTO(void *handler, s64 delta_ns, int handled),
+ TP_PROTO(void *handler, s64 delta_ns, int handled, unsigned long source_bitmap),
- TP_ARGS(handler, delta_ns, handled),
+ TP_ARGS(handler, delta_ns, handled, source_bitmap),
TP_STRUCT__entry(
__field( void *, handler )
__field( s64, delta_ns)
__field( int, handled )
+ __field( unsigned long, source_bitmap)
),
TP_fast_assign(
__entry->handler = handler;
__entry->delta_ns = delta_ns;
__entry->handled = handled;
+ __entry->source_bitmap = source_bitmap;
),
- TP_printk("%ps() delta_ns: %lld handled: %d",
+ TP_printk("%ps() delta_ns: %lld handled: %d source_bitmap: 0x%lx",
__entry->handler,
__entry->delta_ns,
- __entry->handled)
+ __entry->handled,
+ __entry->source_bitmap)
);
#endif /* _TRACE_NMI_H */
-/* This part ust be outside protection */
+/* This part must be outside protection */
#include <trace/define_trace.h>
--
2.43.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v5 4/9] x86/nmi: Assign and register NMI-source vectors
2025-05-07 1:21 ` [PATCH v5 4/9] x86/nmi: Assign and register NMI-source vectors Sohil Mehta
@ 2025-05-07 8:22 ` Peter Zijlstra
2025-05-07 20:43 ` Sohil Mehta
0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2025-05-07 8:22 UTC (permalink / raw)
To: Sohil Mehta
Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brian Gerst, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
Andi Kleen, Kai Huang, Nikolay Borisov, linux-perf-users,
linux-edac, kvm, linux-pm, linux-trace-kernel
On Tue, May 06, 2025 at 06:21:40PM -0700, Sohil Mehta wrote:
> diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
> index f0a577bf7bba..b9beb216f2d0 100644
> --- a/arch/x86/include/asm/nmi.h
> +++ b/arch/x86/include/asm/nmi.h
> @@ -57,6 +57,38 @@ struct nmiaction {
> u8 source_vector;
> };
>
> +/**
> + * NMI-source vectors are used to identify the origin of an NMI and to
> + * route the NMI directly to the appropriate handler.
> + *
> + * On CPUs that support NMI-source reporting with FRED, receiving an NMI
> + * with a valid vector sets the corresponding bit in the NMI-source
> + * bitmap. The bitmap is delivered as FRED event data on the stack.
> + * Multiple NMIs are coalesced in the NMI-source bitmap until the next
> + * NMI delivery.
> + *
> + * If an NMI is received without a vector or beyond the defined range,
> + * the CPU sets bit 0 of the NMI-source bitmap.
> + *
> + * Vector 2 is reserved for external NMIs related to the Local APIC -
> + * LINT1. Some third-party chipsets may send NMI messages with a
> + * hardcoded vector of 2, which would result in bit 2 being set in the
> + * NMI-source bitmap.
> + *
> + * The vectors are in no particular priority order. Add new vector
> + * assignments sequentially in the list below.
> + */
> +#define NMIS_VECTOR_NONE 0 /* Reserved - Set for all unidentified sources */
> +#define NMIS_VECTOR_TEST 1 /* NMI selftest */
> +#define NMIS_VECTOR_EXTERNAL 2 /* Reserved - Match External NMI vector 2 */
> +#define NMIS_VECTOR_SMP_STOP 3 /* Panic stop CPU */
> +#define NMIS_VECTOR_BT 4 /* CPU backtrace */
> +#define NMIS_VECTOR_KGDB 5 /* Kernel debugger */
> +#define NMIS_VECTOR_MCE 6 /* MCE injection */
> +#define NMIS_VECTOR_PMI 7 /* PerfMon counters */
> +
> +#define NMIS_VECTORS_MAX 16 /* Maximum number of NMI-source vectors */
Are these really independent NMI vectors, or simply NMI source reporting bits?
Because if they are not NMI vectors, naming them such is confusing.
Specifically, is there a latch per source?
> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> index be93ec7255bf..a1d672dcb6f0 100644
> --- a/arch/x86/kernel/nmi.c
> +++ b/arch/x86/kernel/nmi.c
> @@ -184,6 +184,11 @@ int __register_nmi_handler(unsigned int type, struct nmiaction *action)
>
> raw_spin_lock_irqsave(&desc->lock, flags);
>
> + WARN_ON_ONCE(action->source_vector >= NMIS_VECTORS_MAX);
> +
> + if (!cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) || type != NMI_LOCAL)
> + action->source_vector = 0;
How about:
WARN_ON_ONCE(type != NMI_LOCAL && action->source_vector);
instead?
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information
2025-05-07 1:21 ` [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information Sohil Mehta
@ 2025-05-07 9:14 ` Peter Zijlstra
2025-05-07 21:48 ` Sohil Mehta
0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2025-05-07 9:14 UTC (permalink / raw)
To: Sohil Mehta
Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brian Gerst, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
Andi Kleen, Kai Huang, Nikolay Borisov, linux-perf-users,
linux-edac, kvm, linux-pm, linux-trace-kernel
On Tue, May 06, 2025 at 06:21:41PM -0700, Sohil Mehta wrote:
> The NMI-source bitmap is delivered as FRED event data to the kernel.
> When available, use NMI-source based filtering to determine the exact
> handlers to run.
>
> Activate NMI-source based filtering only for Local NMIs. While handling
> platform NMI types (such as SERR and IOCHK), do not use the source
> bitmap. They have only one handler registered per type, so there is no
> need to disambiguate between multiple handlers.
>
> Some third-party chipsets may send NMI messages with a hardcoded vector
> of 2, which would result in bit 2 being set in the NMI-source bitmap.
> Skip the local NMI handlers in this situation.
>
> Bit 0 of the source bitmap is set by the hardware whenever a source
> vector was not used while generating an NMI, or the originator could not
> be reliably identified. Poll all the registered handlers in that case.
>
> When multiple handlers need to be executed, adhere to the existing
> priority scheme and execute the handlers registered with NMI_FLAG_FIRST
> before others.
>
> The logic for handling legacy NMIs is unaffected since the source bitmap
> would always be zero.
>
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
> ---
> v5: Significantly simplify NMI-source handling logic.
> Get rid of a separate lookup table for NMI-source vectors.
> Adhere to existing priority scheme for handling NMIs.
> ---
> arch/x86/kernel/nmi.c | 40 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 40 insertions(+)
>
> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> index a1d672dcb6f0..183e3e717326 100644
> --- a/arch/x86/kernel/nmi.c
> +++ b/arch/x86/kernel/nmi.c
> static int nmi_handle(unsigned int type, struct pt_regs *regs)
> {
> struct nmi_desc *desc = nmi_to_desc(type);
> + unsigned long source_bitmap = 0;
unsigned long source = ~0UL;
> nmi_handler_t ehandler;
> struct nmiaction *a;
> int handled=0;
> @@ -148,16 +164,40 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
>
> rcu_read_lock();
>
> + /*
> + * Activate NMI source-based filtering only for Local NMIs.
> + *
> + * Platform NMI types (such as SERR and IOCHK) have only one
> + * handler registered per type, so there is no need to
> + * disambiguate between multiple handlers.
> + *
> + * Also, if a platform source ends up setting bit 2 in the
> + * source bitmap, the local NMI handlers would be skipped since
> + * none of them use this reserved vector.
> + *
> + * For Unknown NMIs, avoid using the source bitmap to ensure all
> + * potential handlers have a chance to claim responsibility.
> + */
> + if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL)
> + source_bitmap = fred_event_data(regs);
if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
source = fred_event_data(regs);
if (source & BIT(0))
source = ~0UL;
}
> /*
> * NMIs are edge-triggered, which means if you have enough
> * of them concurrently, you can lose some because only one
> * can be latched at any given time. Walk the whole list
> * to handle those situations.
> + *
> + * However, NMI-source reporting does not have this limitation.
> + * When NMI-source information is available, only run the
> + * handlers that match the reported vectors.
> */
> list_for_each_entry_rcu(a, &desc->head, list) {
> int thishandled;
> u64 delta;
>
> + if (source_bitmap && !match_nmi_source(source_bitmap, a))
> + continue;
if (!(souce & BIT(a->source_vector)))
continue;
> delta = sched_clock();
> thishandled = a->handler(type, regs);
> handled += thishandled;
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 6/9] x86/nmi: Prepare for the new NMI-source vector encoding
2025-05-07 1:21 ` [PATCH v5 6/9] x86/nmi: Prepare for the new NMI-source vector encoding Sohil Mehta
@ 2025-05-07 9:17 ` Peter Zijlstra
2025-05-07 22:10 ` Sohil Mehta
0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2025-05-07 9:17 UTC (permalink / raw)
To: Sohil Mehta
Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brian Gerst, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
Andi Kleen, Kai Huang, Nikolay Borisov, linux-perf-users,
linux-edac, kvm, linux-pm, linux-trace-kernel
On Tue, May 06, 2025 at 06:21:42PM -0700, Sohil Mehta wrote:
> +/*
> + * Prepare the delivery mode and vector for the ICR.
> + *
> + * NMI-source vectors have the NMI delivery mode encoded within them to
> + * differentiate them from the IDT vectors. IDT vector 0x2 (NMI_VECTOR)
> + * is treated as an NMI request but without any NMI-source information.
> + */
> +static inline u16 __prepare_ICR_DM_vector(u16 dm_vector)
> +{
> + u16 vector = dm_vector & APIC_VECTOR_MASK;
> + u16 dm = dm_vector & APIC_DM_MASK;
> +
> + if (dm == APIC_DM_NMI) {
> + /*
> + * Pre-FRED, the actual vector is ignored for NMIs, but
> + * zero it if NMI-source reporting is not supported to
> + * avoid breakage on misbehaving hardware or hypervisors.
> + */
> + if (!cpu_feature_enabled(X86_FEATURE_NMI_SOURCE))
> + vector = 0;
> +
> + return dm | vector;
> + }
> +
> + if (vector == NMI_VECTOR)
> + return APIC_DM_NMI;
> + else
Please drop that else, that's pointless.
> + return APIC_DM_FIXED | vector;
> +}
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 4/9] x86/nmi: Assign and register NMI-source vectors
2025-05-07 8:22 ` Peter Zijlstra
@ 2025-05-07 20:43 ` Sohil Mehta
0 siblings, 0 replies; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 20:43 UTC (permalink / raw)
To: Peter Zijlstra
Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brian Gerst, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
Andi Kleen, Kai Huang, Nikolay Borisov, linux-perf-users,
linux-edac, kvm, linux-pm, linux-trace-kernel
On 5/7/2025 1:22 AM, Peter Zijlstra wrote:
> On Tue, May 06, 2025 at 06:21:40PM -0700, Sohil Mehta wrote:
>> + */
>> +#define NMIS_VECTOR_NONE 0 /* Reserved - Set for all unidentified sources */
>> +#define NMIS_VECTOR_TEST 1 /* NMI selftest */
>> +#define NMIS_VECTOR_EXTERNAL 2 /* Reserved - Match External NMI vector 2 */
>> +#define NMIS_VECTOR_SMP_STOP 3 /* Panic stop CPU */
>> +#define NMIS_VECTOR_BT 4 /* CPU backtrace */
>> +#define NMIS_VECTOR_KGDB 5 /* Kernel debugger */
>> +#define NMIS_VECTOR_MCE 6 /* MCE injection */
>> +#define NMIS_VECTOR_PMI 7 /* PerfMon counters */
>> +
>> +#define NMIS_VECTORS_MAX 16 /* Maximum number of NMI-source vectors */
>
> Are these really independent NMI vectors, or simply NMI source reporting bits?
>
> Because if they are not NMI vectors, naming them such is confusing.
>
> Specifically, is there a latch per source?
>
Yes, they are truly vectors, confirmed with HPA that there is one latch
per source. Also, while generating the NMIs these values are used in the
APIC code to program the ICR vector field as shown.
ICR[7:0] — Vector -> NMIS_VECTOR_BT (4)
ICR[10:8] — Delivery Mode -> APIC_DM_NMI (100)
>> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
>> index be93ec7255bf..a1d672dcb6f0 100644
>> --- a/arch/x86/kernel/nmi.c
>> +++ b/arch/x86/kernel/nmi.c
>> @@ -184,6 +184,11 @@ int __register_nmi_handler(unsigned int type, struct nmiaction *action)
>>
>> raw_spin_lock_irqsave(&desc->lock, flags);
>>
>> + WARN_ON_ONCE(action->source_vector >= NMIS_VECTORS_MAX);
>> +
>> + if (!cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) || type != NMI_LOCAL)
>> + action->source_vector = 0;
>
> How about:
>
> WARN_ON_ONCE(type != NMI_LOCAL && action->source_vector);
>
This should work fine as well.
I don't see any harm in storing the source_vector unconditionally even
if X86_FEATURE_NMI_SOURCE is disabled.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 9/9] x86/nmi: Include NMI-source information in tracepoint and debug prints
2025-05-07 1:21 ` [PATCH v5 9/9] x86/nmi: Include NMI-source information in tracepoint and debug prints Sohil Mehta
@ 2025-05-07 21:48 ` Steven Rostedt
2025-05-08 0:02 ` Sohil Mehta
0 siblings, 1 reply; 24+ messages in thread
From: Steven Rostedt @ 2025-05-07 21:48 UTC (permalink / raw)
To: Sohil Mehta
Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Sean Christopherson, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Masami Hiramatsu, Mathieu Desnoyers, Brian Gerst,
Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
Kai Huang, Nikolay Borisov, linux-perf-users, linux-edac, kvm,
linux-pm, linux-trace-kernel
On Tue, 6 May 2025 18:21:45 -0700
Sohil Mehta <sohil.mehta@intel.com> wrote:
> diff --git a/include/trace/events/nmi.h b/include/trace/events/nmi.h
> index 18e0411398ba..6e4a1ff70a44 100644
> --- a/include/trace/events/nmi.h
> +++ b/include/trace/events/nmi.h
> @@ -10,29 +10,32 @@
>
> TRACE_EVENT(nmi_handler,
>
> - TP_PROTO(void *handler, s64 delta_ns, int handled),
> + TP_PROTO(void *handler, s64 delta_ns, int handled, unsigned long source_bitmap),
Even though x86 is currently the only architecture using the nmi
tracepoint, this "source_bitmap" makes it become very x86 specific.
This file should be moved into arch/x86/include/asm/trace/
And that would require adding to the Makefile:
CFLAGS_nmi.o := -I $(src)/../include/asm/trace
>
> - TP_ARGS(handler, delta_ns, handled),
> + TP_ARGS(handler, delta_ns, handled, source_bitmap),
>
> TP_STRUCT__entry(
> __field( void *, handler )
> __field( s64, delta_ns)
> __field( int, handled )
> + __field( unsigned long, source_bitmap)
> ),
>
> TP_fast_assign(
> __entry->handler = handler;
> __entry->delta_ns = delta_ns;
> __entry->handled = handled;
> + __entry->source_bitmap = source_bitmap;
> ),
>
> - TP_printk("%ps() delta_ns: %lld handled: %d",
> + TP_printk("%ps() delta_ns: %lld handled: %d source_bitmap: 0x%lx",
> __entry->handler,
> __entry->delta_ns,
> - __entry->handled)
> + __entry->handled,
> + __entry->source_bitmap)
> );
>
> #endif /* _TRACE_NMI_H */
>
> -/* This part ust be outside protection */
> +/* This part must be outside protection */
And this would need to have:
#undef TRACE_INCLUDE_PATH
#undef TRACE_INCLUDE_FILE
#define TRACE_INCLUDE_PATH .
#define TRACE_INCLUDE_FILE nmi
-- Steve
> #include <trace/define_trace.h>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information
2025-05-07 9:14 ` Peter Zijlstra
@ 2025-05-07 21:48 ` Sohil Mehta
2025-05-08 12:15 ` Peter Zijlstra
0 siblings, 1 reply; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 21:48 UTC (permalink / raw)
To: Peter Zijlstra
Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brian Gerst, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
Andi Kleen, Kai Huang, Nikolay Borisov, linux-perf-users,
linux-edac, kvm, linux-pm, linux-trace-kernel
On 5/7/2025 2:14 AM, Peter Zijlstra wrote:
> On Tue, May 06, 2025 at 06:21:41PM -0700, Sohil Mehta wrote:
>>
>> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
>> index a1d672dcb6f0..183e3e717326 100644
>> --- a/arch/x86/kernel/nmi.c
>> +++ b/arch/x86/kernel/nmi.c
>
>> static int nmi_handle(unsigned int type, struct pt_regs *regs)
>> {
>> struct nmi_desc *desc = nmi_to_desc(type);
>> + unsigned long source_bitmap = 0;
>
> unsigned long source = ~0UL;
>
Thanks! This makes the logic even simpler by getting rid of
match_nmi_source(). A minor change described further down.
Also, do you prefer "source" over "source_bitmap"? I had it as such to
avoid confusion between source_vector and source_bitmap.
>> nmi_handler_t ehandler;
>> struct nmiaction *a;
>> int handled=0;
>> @@ -148,16 +164,40 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
>>
>> rcu_read_lock();
>>
>> + /*
>> + * Activate NMI source-based filtering only for Local NMIs.
>> + *
>> + * Platform NMI types (such as SERR and IOCHK) have only one
>> + * handler registered per type, so there is no need to
>> + * disambiguate between multiple handlers.
>> + *
>> + * Also, if a platform source ends up setting bit 2 in the
>> + * source bitmap, the local NMI handlers would be skipped since
>> + * none of them use this reserved vector.
>> + *
>> + * For Unknown NMIs, avoid using the source bitmap to ensure all
>> + * potential handlers have a chance to claim responsibility.
>> + */
>> + if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL)
>> + source_bitmap = fred_event_data(regs);
>
> if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
> source = fred_event_data(regs);
> if (source & BIT(0))
> source = ~0UL;
> }
>
Looks good, except when fred_event_data() returns 0. I don't expect it
to happen in practice. But, maybe with new hardware and eventually
different hypervisors being involved, it is a possibility.
We can either call it a bug that an NMI happened without source
information. Or be extra nice and do this:
if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
source = fred_event_data(regs);
if (!source || (source & BIT(0)))
source = ~0UL;
}
>> /*
>> * NMIs are edge-triggered, which means if you have enough
>> * of them concurrently, you can lose some because only one
>> * can be latched at any given time. Walk the whole list
>> * to handle those situations.
>> + *
>> + * However, NMI-source reporting does not have this limitation.
>> + * When NMI-source information is available, only run the
>> + * handlers that match the reported vectors.
>> */
>> list_for_each_entry_rcu(a, &desc->head, list) {
>> int thishandled;
>> u64 delta;
>>
>> + if (source_bitmap && !match_nmi_source(source_bitmap, a))
>> + continue;
>
> if (!(souce & BIT(a->source_vector)))
> continue;
>
>> delta = sched_clock();
>> thishandled = a->handler(type, regs);
>> handled += thishandled;
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 6/9] x86/nmi: Prepare for the new NMI-source vector encoding
2025-05-07 9:17 ` Peter Zijlstra
@ 2025-05-07 22:10 ` Sohil Mehta
0 siblings, 0 replies; 24+ messages in thread
From: Sohil Mehta @ 2025-05-07 22:10 UTC (permalink / raw)
To: Peter Zijlstra
Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brian Gerst, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
Andi Kleen, Kai Huang, Nikolay Borisov, linux-perf-users,
linux-edac, kvm, linux-pm, linux-trace-kernel
On 5/7/2025 2:17 AM, Peter Zijlstra wrote:
> On Tue, May 06, 2025 at 06:21:42PM -0700, Sohil Mehta wrote:
>> + if (vector == NMI_VECTOR)
>> + return APIC_DM_NMI;
>> + else
>
> Please drop that else, that's pointless.
>
Will do.
>> + return APIC_DM_FIXED | vector;
>> +}
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 9/9] x86/nmi: Include NMI-source information in tracepoint and debug prints
2025-05-07 21:48 ` Steven Rostedt
@ 2025-05-08 0:02 ` Sohil Mehta
0 siblings, 0 replies; 24+ messages in thread
From: Sohil Mehta @ 2025-05-08 0:02 UTC (permalink / raw)
To: Steven Rostedt
Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Sean Christopherson, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Masami Hiramatsu, Mathieu Desnoyers, Brian Gerst,
Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
Kai Huang, Nikolay Borisov, linux-perf-users, linux-edac, kvm,
linux-pm, linux-trace-kernel
On 5/7/2025 2:48 PM, Steven Rostedt wrote:
> On Tue, 6 May 2025 18:21:45 -0700
> Sohil Mehta <sohil.mehta@intel.com> wrote:
>
>> diff --git a/include/trace/events/nmi.h b/include/trace/events/nmi.h
>> index 18e0411398ba..6e4a1ff70a44 100644
>> --- a/include/trace/events/nmi.h
>> +++ b/include/trace/events/nmi.h
>> @@ -10,29 +10,32 @@
>>
>> TRACE_EVENT(nmi_handler,
>>
>> - TP_PROTO(void *handler, s64 delta_ns, int handled),
>> + TP_PROTO(void *handler, s64 delta_ns, int handled, unsigned long source_bitmap),
>
> Even though x86 is currently the only architecture using the nmi
> tracepoint, this "source_bitmap" makes it become very x86 specific.
>
Sure. Will move it into x86.
> This file should be moved into arch/x86/include/asm/trace/
>
> And that would require adding to the Makefile:
>
> CFLAGS_nmi.o := -I $(src)/../include/asm/trace
>
Thank you for the detailed instructions. It makes it a lot easier.
> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> index 183e3e717326..b9ece0b63ca7 100644
> --- a/arch/x86/kernel/nmi.c
> +++ b/arch/x86/kernel/nmi.c
> @@ -202,7 +202,7 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
> thishandled = a->handler(type, regs);
> handled += thishandled;
> delta = sched_clock() - delta;
> - trace_nmi_handler(a->handler, (int)delta, thishandled);
> + trace_nmi_handler(a->handler, (int)delta, thishandled, source_bitmap);
>
> nmi_check_duration(a, delta);
> }
Also, I just realized that the source_bitmap information might be
incorrect for !NMI_LOCAL. With the new changes suggested by PeterZ in
patch 5, the source_bitmap would be significantly different from the
original one received from fred_event_data().
I need to figure out a solution to print it accurately in all situations.
Sohil
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 8/9] perf/x86: Enable NMI-source reporting for perfmon
2025-05-07 1:21 ` [PATCH v5 8/9] perf/x86: Enable NMI-source reporting for perfmon Sohil Mehta
@ 2025-05-08 11:20 ` Sandipan Das
2025-05-09 0:46 ` Sohil Mehta
0 siblings, 1 reply; 24+ messages in thread
From: Sandipan Das @ 2025-05-08 11:20 UTC (permalink / raw)
To: Sohil Mehta, x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brian Gerst, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
Andi Kleen, Kai Huang, Nikolay Borisov, linux-perf-users,
linux-edac, kvm, linux-pm, linux-trace-kernel
On 5/7/2025 6:51 AM, Sohil Mehta wrote:
> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
>
> Program the designated PMI NMI-source vector into the local vector table
> for the PMU. An NMI for the PMU would directly invoke the PMI handler
> without polling other NMI handlers, resulting in reduced PMI delivery
> latency.
>
> Co-developed-by: Zeng Guang <guang.zeng@intel.com>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
> ---
> v5: No significant change.
> ---
> arch/x86/events/core.c | 4 ++--
> arch/x86/events/intel/core.c | 6 +++---
> arch/x86/include/asm/apic.h | 1 +
> 3 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 031e908f0d61..42b270526631 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -1695,7 +1695,7 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
> * This generic handler doesn't seem to have any issues where the
> * unmasking occurs so it was left at the top.
> */
> - apic_write(APIC_LVTPC, APIC_DM_NMI);
> + apic_write(APIC_LVTPC, PERF_NMI);
>
> for_each_set_bit(idx, x86_pmu.cntr_mask, X86_PMC_IDX_MAX) {
> if (!test_bit(idx, cpuc->active_mask))
> @@ -1737,7 +1737,7 @@ void perf_events_lapic_init(void)
> /*
> * Always use NMI for PMU
> */
> - apic_write(APIC_LVTPC, APIC_DM_NMI);
> + apic_write(APIC_LVTPC, PERF_NMI);
> }
>
> static int
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 09d2d66c9f21..87c624686c58 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3202,7 +3202,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
> * NMI handler.
> */
> if (!late_ack && !mid_ack)
> - apic_write(APIC_LVTPC, APIC_DM_NMI);
> + apic_write(APIC_LVTPC, PERF_NMI);
> intel_bts_disable_local();
> cpuc->enabled = 0;
> __intel_pmu_disable_all(true);
> @@ -3239,7 +3239,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
>
> done:
> if (mid_ack)
> - apic_write(APIC_LVTPC, APIC_DM_NMI);
> + apic_write(APIC_LVTPC, PERF_NMI);
> /* Only restore PMU state when it's active. See x86_pmu_disable(). */
> cpuc->enabled = pmu_enabled;
> if (pmu_enabled)
> @@ -3252,7 +3252,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
> * Haswell CPUs.
> */
> if (late_ack)
> - apic_write(APIC_LVTPC, APIC_DM_NMI);
> + apic_write(APIC_LVTPC, PERF_NMI);
> return handled;
> }
>
> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
> index 9bade39b5feb..b2f864e77d84 100644
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -29,6 +29,7 @@
> #define BT_NMI (APIC_DM_NMI | NMIS_VECTOR_BT)
> #define KGDB_NMI (APIC_DM_NMI | NMIS_VECTOR_KGDB)
> #define MCE_NMI (APIC_DM_NMI | NMIS_VECTOR_MCE)
> +#define PERF_NMI (APIC_DM_NMI | NMIS_VECTOR_PMI)
>
> /*
> * Debugging macros
For AMD processors that do not support NMI source reporting but use
x86_pmu_handle_irq() and perf_events_lapic_init()
Tested-by: Sandipan Das <sandipan.das@amd.com>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information
2025-05-07 21:48 ` Sohil Mehta
@ 2025-05-08 12:15 ` Peter Zijlstra
2025-05-08 20:23 ` H. Peter Anvin
0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2025-05-08 12:15 UTC (permalink / raw)
To: Sohil Mehta
Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brian Gerst, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
Andi Kleen, Kai Huang, Nikolay Borisov, linux-perf-users,
linux-edac, kvm, linux-pm, linux-trace-kernel
On Wed, May 07, 2025 at 02:48:34PM -0700, Sohil Mehta wrote:
> On 5/7/2025 2:14 AM, Peter Zijlstra wrote:
> > On Tue, May 06, 2025 at 06:21:41PM -0700, Sohil Mehta wrote:
> >>
> >> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> >> index a1d672dcb6f0..183e3e717326 100644
> >> --- a/arch/x86/kernel/nmi.c
> >> +++ b/arch/x86/kernel/nmi.c
> >
> >> static int nmi_handle(unsigned int type, struct pt_regs *regs)
> >> {
> >> struct nmi_desc *desc = nmi_to_desc(type);
> >> + unsigned long source_bitmap = 0;
> >
> > unsigned long source = ~0UL;
> >
>
> Thanks! This makes the logic even simpler by getting rid of
> match_nmi_source(). A minor change described further down.
>
> Also, do you prefer "source" over "source_bitmap"? I had it as such to
> avoid confusion between source_vector and source_bitmap.
Yeah, I was lazy typing. Perhaps just call it bitmap then?
> >> nmi_handler_t ehandler;
> >> struct nmiaction *a;
> >> int handled=0;
> >> @@ -148,16 +164,40 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
> >>
> >> rcu_read_lock();
> >>
> >> + /*
> >> + * Activate NMI source-based filtering only for Local NMIs.
> >> + *
> >> + * Platform NMI types (such as SERR and IOCHK) have only one
> >> + * handler registered per type, so there is no need to
> >> + * disambiguate between multiple handlers.
> >> + *
> >> + * Also, if a platform source ends up setting bit 2 in the
> >> + * source bitmap, the local NMI handlers would be skipped since
> >> + * none of them use this reserved vector.
> >> + *
> >> + * For Unknown NMIs, avoid using the source bitmap to ensure all
> >> + * potential handlers have a chance to claim responsibility.
> >> + */
> >> + if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL)
> >> + source_bitmap = fred_event_data(regs);
> >
> > if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
> > source = fred_event_data(regs);
> > if (source & BIT(0))
> > source = ~0UL;
> > }
> >
>
> Looks good, except when fred_event_data() returns 0. I don't expect it
> to happen in practice. But, maybe with new hardware and eventually
> different hypervisors being involved, it is a possibility.
>
> We can either call it a bug that an NMI happened without source
> information. Or be extra nice and do this:
>
> if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
> source = fred_event_data(regs);
> if (!source || (source & BIT(0)))
> source = ~0UL;
> }
Perhaps also WARN about the !source case?
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information
2025-05-08 12:15 ` Peter Zijlstra
@ 2025-05-08 20:23 ` H. Peter Anvin
2025-05-08 20:49 ` Peter Zijlstra
0 siblings, 1 reply; 24+ messages in thread
From: H. Peter Anvin @ 2025-05-08 20:23 UTC (permalink / raw)
To: Peter Zijlstra, Sohil Mehta
Cc: x86, linux-kernel, Xin Li, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Sean Christopherson,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Tony Luck, Paolo Bonzini, Vitaly Kuznetsov,
Rafael J . Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, Brian Gerst,
Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
Kai Huang, Nikolay Borisov, linux-perf-users, linux-edac, kvm,
linux-pm, linux-trace-kernel
On May 8, 2025 5:15:44 AM PDT, Peter Zijlstra <peterz@infradead.org> wrote:
>On Wed, May 07, 2025 at 02:48:34PM -0700, Sohil Mehta wrote:
>> On 5/7/2025 2:14 AM, Peter Zijlstra wrote:
>> > On Tue, May 06, 2025 at 06:21:41PM -0700, Sohil Mehta wrote:
>> >>
>> >> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
>> >> index a1d672dcb6f0..183e3e717326 100644
>> >> --- a/arch/x86/kernel/nmi.c
>> >> +++ b/arch/x86/kernel/nmi.c
>> >
>> >> static int nmi_handle(unsigned int type, struct pt_regs *regs)
>> >> {
>> >> struct nmi_desc *desc = nmi_to_desc(type);
>> >> + unsigned long source_bitmap = 0;
>> >
>> > unsigned long source = ~0UL;
>> >
>>
>> Thanks! This makes the logic even simpler by getting rid of
>> match_nmi_source(). A minor change described further down.
>>
>> Also, do you prefer "source" over "source_bitmap"? I had it as such to
>> avoid confusion between source_vector and source_bitmap.
>
>Yeah, I was lazy typing. Perhaps just call it bitmap then?
>
>> >> nmi_handler_t ehandler;
>> >> struct nmiaction *a;
>> >> int handled=0;
>> >> @@ -148,16 +164,40 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
>> >>
>> >> rcu_read_lock();
>> >>
>> >> + /*
>> >> + * Activate NMI source-based filtering only for Local NMIs.
>> >> + *
>> >> + * Platform NMI types (such as SERR and IOCHK) have only one
>> >> + * handler registered per type, so there is no need to
>> >> + * disambiguate between multiple handlers.
>> >> + *
>> >> + * Also, if a platform source ends up setting bit 2 in the
>> >> + * source bitmap, the local NMI handlers would be skipped since
>> >> + * none of them use this reserved vector.
>> >> + *
>> >> + * For Unknown NMIs, avoid using the source bitmap to ensure all
>> >> + * potential handlers have a chance to claim responsibility.
>> >> + */
>> >> + if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL)
>> >> + source_bitmap = fred_event_data(regs);
>> >
>> > if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
>> > source = fred_event_data(regs);
>> > if (source & BIT(0))
>> > source = ~0UL;
>> > }
>> >
>>
>> Looks good, except when fred_event_data() returns 0. I don't expect it
>> to happen in practice. But, maybe with new hardware and eventually
>> different hypervisors being involved, it is a possibility.
>>
>> We can either call it a bug that an NMI happened without source
>> information. Or be extra nice and do this:
>>
>> if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
>> source = fred_event_data(regs);
>> if (!source || (source & BIT(0)))
>> source = ~0UL;
>> }
>
>Perhaps also WARN about the !source case?
A 0 should be interpreted such that NMI source is not available, e.g. due to a broken hypervisor or similar.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information
2025-05-08 20:23 ` H. Peter Anvin
@ 2025-05-08 20:49 ` Peter Zijlstra
2025-05-09 0:45 ` Sohil Mehta
0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2025-05-08 20:49 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Sohil Mehta, x86, linux-kernel, Xin Li, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brian Gerst, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
Andi Kleen, Kai Huang, Nikolay Borisov, linux-perf-users,
linux-edac, kvm, linux-pm, linux-trace-kernel
On Thu, May 08, 2025 at 01:23:04PM -0700, H. Peter Anvin wrote:
> On May 8, 2025 5:15:44 AM PDT, Peter Zijlstra <peterz@infradead.org> wrote:
> >> Looks good, except when fred_event_data() returns 0. I don't expect it
> >> to happen in practice. But, maybe with new hardware and eventually
> >> different hypervisors being involved, it is a possibility.
> >>
> >> We can either call it a bug that an NMI happened without source
> >> information. Or be extra nice and do this:
> >>
> >> if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
> >> source = fred_event_data(regs);
> >> if (!source || (source & BIT(0)))
> >> source = ~0UL;
> >> }
> >
> >Perhaps also WARN about the !source case?
>
> A 0 should be interpreted such that NMI source is not available, e.g.
> due to a broken hypervisor or similar.
I'm reading that as an agreement for WARN-ing on 0. We should definitely
WARN on broken hypervisors and similar.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information
2025-05-08 20:49 ` Peter Zijlstra
@ 2025-05-09 0:45 ` Sohil Mehta
0 siblings, 0 replies; 24+ messages in thread
From: Sohil Mehta @ 2025-05-09 0:45 UTC (permalink / raw)
To: Peter Zijlstra, H. Peter Anvin
Cc: x86, linux-kernel, Xin Li, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Sean Christopherson,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Tony Luck, Paolo Bonzini, Vitaly Kuznetsov,
Rafael J . Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, Brian Gerst,
Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
Kai Huang, Nikolay Borisov, linux-perf-users, linux-edac, kvm,
linux-pm, linux-trace-kernel
On 5/8/2025 1:49 PM, Peter Zijlstra wrote:
> On Thu, May 08, 2025 at 01:23:04PM -0700, H. Peter Anvin wrote:
>> On May 8, 2025 5:15:44 AM PDT, Peter Zijlstra <peterz@infradead.org> wrote:
>
>>>> Looks good, except when fred_event_data() returns 0. I don't expect it
>>>> to happen in practice. But, maybe with new hardware and eventually
>>>> different hypervisors being involved, it is a possibility.
>>>>
>>>> We can either call it a bug that an NMI happened without source
>>>> information. Or be extra nice and do this:
>>>>
>>>> if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
>>>> source = fred_event_data(regs);
>>>> if (!source || (source & BIT(0)))
>>>> source = ~0UL;
>>>> }
>>>
>>> Perhaps also WARN about the !source case?
>>
>> A 0 should be interpreted such that NMI source is not available, e.g.
>> due to a broken hypervisor or similar.
>
> I'm reading that as an agreement for WARN-ing on 0. We should definitely
> WARN on broken hypervisors and similar.
Yup, will do.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 8/9] perf/x86: Enable NMI-source reporting for perfmon
2025-05-08 11:20 ` Sandipan Das
@ 2025-05-09 0:46 ` Sohil Mehta
0 siblings, 0 replies; 24+ messages in thread
From: Sohil Mehta @ 2025-05-09 0:46 UTC (permalink / raw)
To: Sandipan Das, x86, linux-kernel
Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Sean Christopherson, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Kan Liang, Tony Luck, Paolo Bonzini,
Vitaly Kuznetsov, Rafael J . Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brian Gerst, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
Andi Kleen, Kai Huang, Nikolay Borisov, linux-perf-users,
linux-edac, kvm, linux-pm, linux-trace-kernel
On 5/8/2025 4:20 AM, Sandipan Das wrote:
> For AMD processors that do not support NMI source reporting but use
> x86_pmu_handle_irq() and perf_events_lapic_init()
>
> Tested-by: Sandipan Das <sandipan.das@amd.com>
Thank you for testing this.
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2025-05-09 0:47 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-07 1:21 [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 2/9] x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 4/9] x86/nmi: Assign and register NMI-source vectors Sohil Mehta
2025-05-07 8:22 ` Peter Zijlstra
2025-05-07 20:43 ` Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information Sohil Mehta
2025-05-07 9:14 ` Peter Zijlstra
2025-05-07 21:48 ` Sohil Mehta
2025-05-08 12:15 ` Peter Zijlstra
2025-05-08 20:23 ` H. Peter Anvin
2025-05-08 20:49 ` Peter Zijlstra
2025-05-09 0:45 ` Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 6/9] x86/nmi: Prepare for the new NMI-source vector encoding Sohil Mehta
2025-05-07 9:17 ` Peter Zijlstra
2025-05-07 22:10 ` Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 7/9] x86/nmi: Enable NMI-source for IPIs delivered as NMIs Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 8/9] perf/x86: Enable NMI-source reporting for perfmon Sohil Mehta
2025-05-08 11:20 ` Sandipan Das
2025-05-09 0:46 ` Sohil Mehta
2025-05-07 1:21 ` [PATCH v5 9/9] x86/nmi: Include NMI-source information in tracepoint and debug prints Sohil Mehta
2025-05-07 21:48 ` Steven Rostedt
2025-05-08 0:02 ` Sohil Mehta
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).