linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/9] x86: Add support for NMI-source reporting with FRED
@ 2025-05-13 20:37 Sohil Mehta
  2025-05-13 20:37 ` [PATCH v6 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM Sohil Mehta
                   ` (8 more replies)
  0 siblings, 9 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-05-13 20:37 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Sohil Mehta, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

Introduction
============
NMI-source reporting with FRED [1] provides a new mechanism for
identifying the source of NMIs. As part of the FRED event delivery
framework, a 16-bit vector bitmap is provided that identifies one or
more sources that caused the NMI.

Using the source bitmap, the kernel can precisely run the relevant NMI
handlers instead of polling the entire NMI handler list. Additionally,
the source information would be invaluable for debugging misbehaving
handlers and unknown NMIs.

Changes since v5
================
This series mainly implements PeterZ's suggestions in patches 4,5,6:

 * Simplify NMI handling by always setting and using the source bitmap.
 * Add runtime warnings for unexpected values.
 * Fix a compile issue in apic.h with a specific config.
 * Drop the tracepoint changes for now (include it with the DebugFS series).
 * Pick-up Sandipan's tested-by for the perf patch.

The previous posting included a major simplification compared to the
series posted last year[2]. Refer the v5 cover letter for details.
v5: https://lore.kernel.org/lkml/20250507012145.2998143-1-sohil.mehta@intel.com/

Overview of NMI-source usage
============================
Code snippets:

// Allocate a static source vector at compile time
#define NMIS_VECTOR_TEST	1

// Register an NMI handler with the vector
register_nmi_handler(NMI_LOCAL, test_handler, 0, "nmi_test", NMIS_VECTOR_TEST);

// Generate an NMI with the source vector using NMI encoded delivery
__apic_send_IPI_mask(cpumask, APIC_DM_NMI | NMIS_VECTOR_TEST);

// Handle an NMI with or without the source information (oversimplified)
source_bitmap = fred_event_data(regs);
if (!source_bitmap || (source_bitmap & BIT(NMIS_VECTOR_TEST)))
        test_handler();

// Unregister handler along with the vector
unregister_nmi_handler(NMI_LOCAL, "nmi_test");

Patch structure
===============
The patches are based on tip:x86/nmi because they depend on the NMI
cleanup series merged earlier [3].

Patch 1-2: Prepare FRED/KVM and enumerate NMI-source reporting
Patch 3-5: Register and handle NMI-source vectors 
Patch 6-8: APIC changes to generate NMIs with vectors
Patch   9: Improve debug print with NMI-source information

Many thanks to Sean Christopherson, Xin Li, H. Peter Anvin, Andi Kleen,
Tony Luck, Kan Liang, Jacob Pan Jun, Zeng Guang, Peter Zijlstra,
Sandipan Das, Steven Rostedt and others for their contributions, reviews
and feedback.

Future work / Opens
===================
I am considering a few additional changes that would be valuable for
enhancing NMI handling support. Any feedback, preferences or suggestions
on the following would be helpful.

Assigning more NMI-source vectors
---------------------------------
The current patches assign NMI vectors to a limited number of sources.
The microcode rendezvous and crash reboot code use NMI but do not go
through the typical register_nmi_handler() path. Their handling is
special-cased in exc_nmi(). To isolate blame and improve debugging, it
would be useful to assign vectors to them, even if the vectors are
ignored during handling.

Other NMI sources, such as GHES and Platform NMIs, can also be assigned
vectors to speed up their NMI handling and improve isolation. However,
this would require a software/hardware agreement on vector reservation
and usage. Such an endeavor is likely not worth the effort.

Explicitly enabling NMIs
------------------------
HPA brought up the idea [4] of explicitly enabling NMIs only when the
kernel is ready to take them. With FRED, if we enter the kernel with
NMIs disabled, they could remain disabled until returning back to
userspace. I am evaluating the request and related code changes.

Debug support (Tracing and DebugFS)
-----------------------
NMI-source information can help identify issues when multiple NMIs occur
simultaneously or if certain NMI handlers consistently misbehave. Based
on feedback from Steven Rostedt[5], the plan is to move the trace event
to arch/x86 and then add source_bitmap to the nmi_handler() trace event.

Currently, the kernel has counters for unknown NMIs, swallowed NMIs and
other NMI handling data. However, there is no easy way to access that.
To identify issues that happen over a longer timeframe, it might be
useful to add DebugFS support for NMI statistics.

KVM support
-----------
The NMI-source feature can be useful for perf users and other NMI use
cases in guest VMs. Exposing NMI-source to guests once FRED support is
in place should be relatively easier. The prototype code for this is
under evaluation.

Links
=====
[1]: Chapter 9, https://www.intel.com/content/www/us/en/content-details/819481/flexible-return-and-event-delivery-fred-specification.html
[2]: https://lore.kernel.org/lkml/20240709143906.1040477-1-jacob.jun.pan@linux.intel.com/
[3]: https://lore.kernel.org/lkml/20250327234629.3953536-1-sohil.mehta@intel.com/
[4]: https://lore.kernel.org/lkml/F5D36889-A868-46D2-A678-8EE26E28556D@zytor.com/
[5]: https://lore.kernel.org/lkml/20250507174809.10cfc5ac@gandalf.local.home/

Jacob Pan (1):
  perf/x86: Enable NMI-source reporting for perfmon

Sohil Mehta (7):
  x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting
  x86/nmi: Extend the registration interface to include the NMI-source
    vector
  x86/nmi: Assign and register NMI-source vectors
  x86/nmi: Add support to handle NMIs with source information
  x86/nmi: Prepare for the new NMI-source vector encoding
  x86/nmi: Enable NMI-source for IPIs delivered as NMIs
  x86/nmi: Print source information with the unknown NMI console message

Zeng Guang (1):
  x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM

 arch/x86/entry/entry_64_fred.S      |  2 +-
 arch/x86/events/amd/ibs.c           |  2 +-
 arch/x86/events/core.c              |  6 ++---
 arch/x86/events/intel/core.c        |  6 ++---
 arch/x86/include/asm/apic.h         | 39 +++++++++++++++++++++++++++++
 arch/x86/include/asm/apicdef.h      |  2 +-
 arch/x86/include/asm/cpufeatures.h  |  1 +
 arch/x86/include/asm/fred.h         |  9 ++++---
 arch/x86/include/asm/nmi.h          | 37 ++++++++++++++++++++++++++-
 arch/x86/kernel/apic/hw_nmi.c       |  5 ++--
 arch/x86/kernel/apic/ipi.c          |  4 +--
 arch/x86/kernel/apic/local.h        | 24 ++++++++++--------
 arch/x86/kernel/cpu/cpuid-deps.c    |  1 +
 arch/x86/kernel/cpu/mce/inject.c    |  4 +--
 arch/x86/kernel/cpu/mshyperv.c      |  3 +--
 arch/x86/kernel/kgdb.c              |  8 +++---
 arch/x86/kernel/kvm.c               |  9 +------
 arch/x86/kernel/nmi.c               | 37 +++++++++++++++++++++++++++
 arch/x86/kernel/nmi_selftest.c      |  9 +++----
 arch/x86/kernel/smp.c               |  6 ++---
 arch/x86/kvm/vmx/vmx.c              |  5 ++--
 arch/x86/platform/uv/uv_nmi.c       |  4 +--
 drivers/acpi/apei/ghes.c            |  2 +-
 drivers/char/ipmi/ipmi_watchdog.c   |  3 +--
 drivers/edac/igen6_edac.c           |  3 +--
 drivers/thermal/intel/therm_throt.c |  2 +-
 drivers/watchdog/hpwdt.c            |  6 ++---
 27 files changed, 171 insertions(+), 68 deletions(-)


base-commit: f2e01dcf6df2d12e86c363ea9c37d53994d89dd6
-- 
2.43.0


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v6 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM
  2025-05-13 20:37 [PATCH v6 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
@ 2025-05-13 20:37 ` Sohil Mehta
  2025-05-14 14:15   ` Sean Christopherson
  2025-05-13 20:37 ` [PATCH v6 2/9] x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting Sohil Mehta
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 37+ messages in thread
From: Sohil Mehta @ 2025-05-13 20:37 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Sohil Mehta, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

From: Zeng Guang <guang.zeng@intel.com>

Extend the FRED entry point from KVM to take an extra argument to allow
KVM to invoke the FRED event dispatch framework with event data.

The first use of this extended API is to pass the NMI-source bitmap for
NMI-induced VM exits. Read the VMCS exit qualification field to get the
NMI-source information and store it as event data precisely in the
format expected by the FRED event framework.

Read the VMCS exit qualification unconditionally since almost all
upcoming CPUs are expected to enable FRED and NMI-source together. In
the rare case that NMI-source isn't enabled, the extra VMREAD would be
harmless since the exit qualification is expected to be zero.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v6: No change

v5: Read the VMCS exit qualification unconditionally. (Sean)
    Combine related patches into one.
---
 arch/x86/entry/entry_64_fred.S | 2 +-
 arch/x86/include/asm/fred.h    | 9 +++++----
 arch/x86/kvm/vmx/vmx.c         | 5 +++--
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S
index 29c5c32c16c3..a61256be9703 100644
--- a/arch/x86/entry/entry_64_fred.S
+++ b/arch/x86/entry/entry_64_fred.S
@@ -93,7 +93,7 @@ SYM_FUNC_START(asm_fred_entry_from_kvm)
 	 * +--------+-----------------+
 	 */
 	push $0				/* Reserved, must be 0 */
-	push $0				/* Event data, 0 for IRQ/NMI */
+	push %rsi			/* Event data for IRQ/NMI */
 	push %rdi			/* fred_ss handed in by the caller */
 	push %rbp
 	pushf
diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
index 2a29e5216881..a4de57e578c4 100644
--- a/arch/x86/include/asm/fred.h
+++ b/arch/x86/include/asm/fred.h
@@ -64,14 +64,15 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs)
 
 void asm_fred_entrypoint_user(void);
 void asm_fred_entrypoint_kernel(void);
-void asm_fred_entry_from_kvm(struct fred_ss);
+void asm_fred_entry_from_kvm(struct fred_ss ss, unsigned long edata);
 
 __visible void fred_entry_from_user(struct pt_regs *regs);
 __visible void fred_entry_from_kernel(struct pt_regs *regs);
 __visible void __fred_entry_from_kvm(struct pt_regs *regs);
 
 /* Can be called from noinstr code, thus __always_inline */
-static __always_inline void fred_entry_from_kvm(unsigned int type, unsigned int vector)
+static __always_inline void fred_entry_from_kvm(unsigned int type, unsigned int vector,
+						unsigned long edata)
 {
 	struct fred_ss ss = {
 		.ss     =__KERNEL_DS,
@@ -81,7 +82,7 @@ static __always_inline void fred_entry_from_kvm(unsigned int type, unsigned int
 		.lm     = 1,
 	};
 
-	asm_fred_entry_from_kvm(ss);
+	asm_fred_entry_from_kvm(ss, edata);
 }
 
 void cpu_init_fred_exceptions(void);
@@ -109,7 +110,7 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs) { ret
 static inline void cpu_init_fred_exceptions(void) { }
 static inline void cpu_init_fred_rsps(void) { }
 static inline void fred_complete_exception_setup(void) { }
-static inline void fred_entry_from_kvm(unsigned int type, unsigned int vector) { }
+static inline void fred_entry_from_kvm(unsigned int type, unsigned int vector, unsigned long edata) { }
 static inline void fred_sync_rsp0(unsigned long rsp0) { }
 static inline void fred_update_rsp0(void) { }
 #endif /* CONFIG_X86_FRED */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5c5766467a61..1d43d4a2f6b6 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7079,7 +7079,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu,
 
 	kvm_before_interrupt(vcpu, KVM_HANDLING_IRQ);
 	if (cpu_feature_enabled(X86_FEATURE_FRED))
-		fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector);
+		fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector, 0);
 	else
 		vmx_do_interrupt_irqoff(gate_offset((gate_desc *)host_idt_base + vector));
 	kvm_after_interrupt(vcpu);
@@ -7393,7 +7393,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	    is_nmi(vmx_get_intr_info(vcpu))) {
 		kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
 		if (cpu_feature_enabled(X86_FEATURE_FRED))
-			fred_entry_from_kvm(EVENT_TYPE_NMI, NMI_VECTOR);
+			fred_entry_from_kvm(EVENT_TYPE_NMI, NMI_VECTOR,
+					    vmx_get_exit_qual(vcpu));
 		else
 			vmx_do_nmi_irqoff();
 		kvm_after_interrupt(vcpu);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v6 2/9] x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting
  2025-05-13 20:37 [PATCH v6 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
  2025-05-13 20:37 ` [PATCH v6 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM Sohil Mehta
@ 2025-05-13 20:37 ` Sohil Mehta
  2025-05-13 20:37 ` [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector Sohil Mehta
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-05-13 20:37 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Sohil Mehta, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

NMI-source reporting is introduced to report the sources of NMIs with
FRED event delivery based on vectors in NMI interrupt messages or the
local APIC. This enables the kernel to avoid the latency incurred by
going over the entire NMI handler list and reduces ambiguity about the
source of an NMI.

Enumerate NMI-source reporting in cpufeatures. Also, since NMI-source
reporting uses the FRED event dispatch framework, make it dependent on
FRED in the CPUID dependency table. This ensures that NMI-source
reporting gets disabled when FRED is disabled.

NMI-source reporting is intended as a kernel feature and does not need
userspace enumeration or configuration. There is no need to expose it to
userspace through /proc/cpuinfo.

Originally-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v6: No change.

v5: Add NMI-source to the CPUID dependency table.
    Do not expose NMI-source feature through /proc/cpuinfo.
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 6c2c152d8a67..2ced1bc64548 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -321,6 +321,7 @@
 #define X86_FEATURE_FRED		(12*32+17) /* "fred" Flexible Return and Event Delivery */
 #define X86_FEATURE_LKGS		(12*32+18) /* Load "kernel" (userspace) GS */
 #define X86_FEATURE_WRMSRNS		(12*32+19) /* Non-serializing WRMSR */
+#define X86_FEATURE_NMI_SOURCE		(12*32+20) /* NMI-Source reporting with FRED */
 #define X86_FEATURE_AMX_FP16		(12*32+21) /* AMX fp16 Support */
 #define X86_FEATURE_AVX_IFMA            (12*32+23) /* Support for VPMADD52[H,L]UQ */
 #define X86_FEATURE_LAM			(12*32+26) /* "lam" Linear Address Masking */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index a2fbea0be535..ed0fd35c9290 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -84,6 +84,7 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_AMX_TILE,			X86_FEATURE_XFD       },
 	{ X86_FEATURE_SHSTK,			X86_FEATURE_XSAVES    },
 	{ X86_FEATURE_FRED,			X86_FEATURE_LKGS      },
+	{ X86_FEATURE_NMI_SOURCE,		X86_FEATURE_FRED      },
 	{}
 };
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector
  2025-05-13 20:37 [PATCH v6 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
  2025-05-13 20:37 ` [PATCH v6 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM Sohil Mehta
  2025-05-13 20:37 ` [PATCH v6 2/9] x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting Sohil Mehta
@ 2025-05-13 20:37 ` Sohil Mehta
  2025-06-03  7:23   ` Xin Li
  2025-06-03 17:07   ` Dave Hansen
  2025-05-13 20:37 ` [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors Sohil Mehta
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-05-13 20:37 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Sohil Mehta, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

To prepare for NMI-source reporting, add a source vector argument to the
NMI handler registration interface. Later, this will be used to
register NMI handlers with a unique source vector that can be used to
identify the originator of the NMI.

For now, just extend the interface and pass zero as the source vector
for all handlers. No functional change intended.

Originally-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v6: No change.

v5: Split the patch into two parts. This one only extends the interface.
---
 arch/x86/events/amd/ibs.c         | 2 +-
 arch/x86/events/core.c            | 2 +-
 arch/x86/include/asm/nmi.h        | 5 ++++-
 arch/x86/kernel/apic/hw_nmi.c     | 3 +--
 arch/x86/kernel/cpu/mce/inject.c  | 2 +-
 arch/x86/kernel/cpu/mshyperv.c    | 3 +--
 arch/x86/kernel/kgdb.c            | 6 ++----
 arch/x86/kernel/nmi_selftest.c    | 7 +++----
 arch/x86/kernel/smp.c             | 4 ++--
 arch/x86/platform/uv/uv_nmi.c     | 4 ++--
 drivers/acpi/apei/ghes.c          | 2 +-
 drivers/char/ipmi/ipmi_watchdog.c | 3 +--
 drivers/edac/igen6_edac.c         | 3 +--
 drivers/watchdog/hpwdt.c          | 6 +++---
 14 files changed, 24 insertions(+), 28 deletions(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 0252b7ea8bca..45dece8bee84 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1486,7 +1486,7 @@ static __init int perf_event_ibs_init(void)
 	if (ret)
 		goto err_op;
 
-	ret = register_nmi_handler(NMI_LOCAL, perf_ibs_nmi_handler, 0, "perf_ibs");
+	ret = register_nmi_handler(NMI_LOCAL, perf_ibs_nmi_handler, 0, "perf_ibs", 0);
 	if (ret)
 		goto err_nmi;
 
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 6866cc5acb0b..b84b8be1f075 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2115,7 +2115,7 @@ static int __init init_hw_perf_events(void)
 		x86_pmu.config_mask = X86_RAW_EVENT_MASK;
 
 	perf_events_lapic_init();
-	register_nmi_handler(NMI_LOCAL, perf_event_nmi_handler, 0, "PMI");
+	register_nmi_handler(NMI_LOCAL, perf_event_nmi_handler, 0, "PMI", 0);
 
 	unconstrained = (struct event_constraint)
 		__EVENT_CONSTRAINT(0, x86_pmu.cntr_mask64,
diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
index 79d88d12c8fb..f0a577bf7bba 100644
--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -54,6 +54,7 @@ struct nmiaction {
 	u64			max_duration;
 	unsigned long		flags;
 	const char		*name;
+	u8			source_vector;
 };
 
 /**
@@ -62,6 +63,7 @@ struct nmiaction {
  * @fn:   The NMI handler
  * @fg:   Flags associated with the NMI handler
  * @n:    Name of the NMI handler
+ * @src:  NMI-source based vector for the NMI handler
  * @init: Optional __init* attributes for struct nmiaction
  *
  * Adds the provided handler to the list of handlers for the specified
@@ -75,13 +77,14 @@ struct nmiaction {
  *
  * Return: 0 on success, or an error code on failure.
  */
-#define register_nmi_handler(t, fn, fg, n, init...)	\
+#define register_nmi_handler(t, fn, fg, n, src, init...)	\
 ({							\
 	static struct nmiaction init fn##_na = {	\
 		.list = LIST_HEAD_INIT(fn##_na.list),	\
 		.handler = (fn),			\
 		.name = (n),				\
 		.flags = (fg),				\
+		.source_vector = (src),			\
 	};						\
 	__register_nmi_handler((t), &fn##_na);		\
 })
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 45af535c44a0..612b77660d05 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -53,8 +53,7 @@ NOKPROBE_SYMBOL(nmi_cpu_backtrace_handler);
 
 static int __init register_nmi_cpu_backtrace_handler(void)
 {
-	register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler,
-				0, "arch_bt");
+	register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", 0);
 	return 0;
 }
 early_initcall(register_nmi_cpu_backtrace_handler);
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 06e3cf7229ce..17804ba0b02f 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -774,7 +774,7 @@ static int __init inject_init(void)
 
 	debugfs_init();
 
-	register_nmi_handler(NMI_LOCAL, mce_raise_notify, 0, "mce_notify");
+	register_nmi_handler(NMI_LOCAL, mce_raise_notify, 0, "mce_notify", 0);
 	mce_register_injector_chain(&inject_nb);
 
 	setup_inj_struct(&i_mce);
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 3e2533954675..d643d6fb3cfa 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -548,8 +548,7 @@ static void __init ms_hyperv_init_platform(void)
 			lapic_timer_period);
 	}
 
-	register_nmi_handler(NMI_UNKNOWN, hv_nmi_unknown, NMI_FLAG_FIRST,
-			     "hv_nmi_unknown");
+	register_nmi_handler(NMI_UNKNOWN, hv_nmi_unknown, NMI_FLAG_FIRST, "hv_nmi_unknown", 0);
 #endif
 
 #ifdef CONFIG_X86_IO_APIC
diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index 9c9faa1634fb..ab2d1b79b79e 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -602,13 +602,11 @@ int kgdb_arch_init(void)
 	if (retval)
 		goto out;
 
-	retval = register_nmi_handler(NMI_LOCAL, kgdb_nmi_handler,
-					0, "kgdb");
+	retval = register_nmi_handler(NMI_LOCAL, kgdb_nmi_handler, 0, "kgdb", 0);
 	if (retval)
 		goto out1;
 
-	retval = register_nmi_handler(NMI_UNKNOWN, kgdb_nmi_handler,
-					0, "kgdb");
+	retval = register_nmi_handler(NMI_UNKNOWN, kgdb_nmi_handler, 0, "kgdb", 0);
 
 	if (retval)
 		goto out2;
diff --git a/arch/x86/kernel/nmi_selftest.c b/arch/x86/kernel/nmi_selftest.c
index a010e9d062bf..b203e4371816 100644
--- a/arch/x86/kernel/nmi_selftest.c
+++ b/arch/x86/kernel/nmi_selftest.c
@@ -40,8 +40,7 @@ static int __init nmi_unk_cb(unsigned int val, struct pt_regs *regs)
 static void __init init_nmi_testsuite(void)
 {
 	/* trap all the unknown NMIs we may generate */
-	register_nmi_handler(NMI_UNKNOWN, nmi_unk_cb, 0, "nmi_selftest_unk",
-			__initdata);
+	register_nmi_handler(NMI_UNKNOWN, nmi_unk_cb, 0, "nmi_selftest_unk", 0, __initdata);
 }
 
 static void __init cleanup_nmi_testsuite(void)
@@ -63,8 +62,8 @@ static void __init test_nmi_ipi(struct cpumask *mask)
 {
 	unsigned long timeout;
 
-	if (register_nmi_handler(NMI_LOCAL, test_nmi_ipi_callback,
-				 NMI_FLAG_FIRST, "nmi_selftest", __initdata)) {
+	if (register_nmi_handler(NMI_LOCAL, test_nmi_ipi_callback, NMI_FLAG_FIRST,
+				 "nmi_selftest", 0, __initdata)) {
 		nmi_fail = FAILURE;
 		return;
 	}
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 18266cc3d98c..b80812aa06c3 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -142,8 +142,8 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_reboot)
 
 static int register_stop_handler(void)
 {
-	return register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback,
-				    NMI_FLAG_FIRST, "smp_stop");
+	return register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback, NMI_FLAG_FIRST, "smp_stop",
+				    0);
 }
 
 static void native_stop_other_cpus(int wait)
diff --git a/arch/x86/platform/uv/uv_nmi.c b/arch/x86/platform/uv/uv_nmi.c
index 5c50e550ab63..473c34eb264c 100644
--- a/arch/x86/platform/uv/uv_nmi.c
+++ b/arch/x86/platform/uv/uv_nmi.c
@@ -1029,10 +1029,10 @@ static int uv_handle_nmi_ping(unsigned int reason, struct pt_regs *regs)
 
 static void uv_register_nmi_notifier(void)
 {
-	if (register_nmi_handler(NMI_UNKNOWN, uv_handle_nmi, 0, "uv"))
+	if (register_nmi_handler(NMI_UNKNOWN, uv_handle_nmi, 0, "uv", 0))
 		pr_warn("UV: NMI handler failed to register\n");
 
-	if (register_nmi_handler(NMI_LOCAL, uv_handle_nmi_ping, 0, "uvping"))
+	if (register_nmi_handler(NMI_LOCAL, uv_handle_nmi_ping, 0, "uvping", 0))
 		pr_warn("UV: PING NMI handler failed to register\n");
 }
 
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index b72772494655..95bd3a64608f 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -1318,7 +1318,7 @@ static void ghes_nmi_add(struct ghes *ghes)
 {
 	mutex_lock(&ghes_list_mutex);
 	if (list_empty(&ghes_nmi))
-		register_nmi_handler(NMI_LOCAL, ghes_notify_nmi, 0, "ghes");
+		register_nmi_handler(NMI_LOCAL, ghes_notify_nmi, 0, "ghes", 0);
 	list_add_rcu(&ghes->list, &ghes_nmi);
 	mutex_unlock(&ghes_list_mutex);
 }
diff --git a/drivers/char/ipmi/ipmi_watchdog.c b/drivers/char/ipmi/ipmi_watchdog.c
index f1875b2bebbc..5db402c4b9e7 100644
--- a/drivers/char/ipmi/ipmi_watchdog.c
+++ b/drivers/char/ipmi/ipmi_watchdog.c
@@ -1267,8 +1267,7 @@ static void check_parms(void)
 		}
 	}
 	if (do_nmi && !nmi_handler_registered) {
-		rv = register_nmi_handler(NMI_UNKNOWN, ipmi_nmi, 0,
-						"ipmi");
+		rv = register_nmi_handler(NMI_UNKNOWN, ipmi_nmi, 0, "ipmi", 0);
 		if (rv) {
 			pr_warn("Can't register nmi handler\n");
 			return;
diff --git a/drivers/edac/igen6_edac.c b/drivers/edac/igen6_edac.c
index 5807517ee32d..3a6e7334e94c 100644
--- a/drivers/edac/igen6_edac.c
+++ b/drivers/edac/igen6_edac.c
@@ -1363,8 +1363,7 @@ static int register_err_handler(void)
 		return 0;
 	}
 
-	rc = register_nmi_handler(NMI_SERR, ecclog_nmi_handler,
-				  0, IGEN6_NMI_NAME);
+	rc = register_nmi_handler(NMI_SERR, ecclog_nmi_handler, 0, IGEN6_NMI_NAME, 0);
 	if (rc) {
 		igen6_printk(KERN_ERR, "Failed to register NMI handler\n");
 		return rc;
diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index ae30e394d176..5246706afcf6 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -242,13 +242,13 @@ static int hpwdt_init_nmi_decoding(struct pci_dev *dev)
 	/*
 	 * Only one function can register for NMI_UNKNOWN
 	 */
-	retval = register_nmi_handler(NMI_UNKNOWN, hpwdt_pretimeout, 0, "hpwdt");
+	retval = register_nmi_handler(NMI_UNKNOWN, hpwdt_pretimeout, 0, "hpwdt", 0);
 	if (retval)
 		goto error;
-	retval = register_nmi_handler(NMI_SERR, hpwdt_pretimeout, 0, "hpwdt");
+	retval = register_nmi_handler(NMI_SERR, hpwdt_pretimeout, 0, "hpwdt", 0);
 	if (retval)
 		goto error1;
-	retval = register_nmi_handler(NMI_IO_CHECK, hpwdt_pretimeout, 0, "hpwdt");
+	retval = register_nmi_handler(NMI_IO_CHECK, hpwdt_pretimeout, 0, "hpwdt", 0);
 	if (retval)
 		goto error2;
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors
  2025-05-13 20:37 [PATCH v6 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
                   ` (2 preceding siblings ...)
  2025-05-13 20:37 ` [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector Sohil Mehta
@ 2025-05-13 20:37 ` Sohil Mehta
  2025-06-03 16:34   ` Xin Li
  2025-06-03 17:23   ` Dave Hansen
  2025-05-13 20:37 ` [PATCH v6 5/9] x86/nmi: Add support to handle NMIs with source information Sohil Mehta
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-05-13 20:37 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Sohil Mehta, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

Prior to NMI-source support, the vector information was ignored by the
hardware while delivering NMIs. With NMI-source, the architecture
currently supports a 16-bit source bitmap to identify the source of the
NMI. Upon receiving an NMI, this bitmap is delivered as part of the FRED
event delivery mechanism to the kernel.

Assign a vector space of 0-15 that is specific to NMI-source and
independent of the IDT vector space of 0-255. Being a bitmap, the
NMI-source vectors do not have any inherent priority associated with
them. The order of executing the NMI handlers is up to the kernel.

Existing NMI handling already has a priority mechanism for the NMI
handlers, with CPU-specific (NMI_LOCAL) handlers executed first,
followed by platform NMI handlers and unknown NMI (NMI_UNKNOWN) handlers
being last. Within each of these NMI types, the handlers registered with
NMI_FLAG_FIRST are given priority.

NMI-source follows the same priority scheme to avoid unnecessary
complexity. Therefore, the NMI-source vectors are assigned arbitrarily,
except for vectors 0 and 2.

Vector 0 is set by the hardware whenever a source vector was not used
while generating an NMI or the originator could not be reliably
identified. Do not assign it to any handler.

Vector 2 is reserved for external NMIs corresponding to Local APIC -
LINT1. Some third-party chipsets may send NMI messages with a hardcoded
vector of 2, which would result in vector 2 being set in the NMI-source
bitmap. To avoid confusion, do not assign vector 2 to any handler.

NMI-source vectors are only assigned for NMI_LOCAL type handlers.
Platform NMI handlers have a single handler registered per type. They
don't need additional source information to differentiate among them.

Use the assigned vectors to register the respective NMI handlers. Warn
if the vector values are unexpected.

A couple of NMI handlers, such as the microcode rendezvous and the crash
reboot, do not use the typical NMI registration interface. Leave them
as-is for now.

Originally-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v6: Store source vector unconditionally.
    Add a warning for unexpected source vector values.

v5: Move the vector defines to nmi.h.
    Combine vector allocation and registration into one patch.
    Simplify NMI vector names.
    Describe usage of vector 2 for external NMIs.
    Get rid of vector priorities.
---
 arch/x86/events/core.c           |  2 +-
 arch/x86/include/asm/nmi.h       | 32 ++++++++++++++++++++++++++++++++
 arch/x86/kernel/apic/hw_nmi.c    |  2 +-
 arch/x86/kernel/cpu/mce/inject.c |  2 +-
 arch/x86/kernel/kgdb.c           |  2 +-
 arch/x86/kernel/nmi.c            |  4 ++++
 arch/x86/kernel/nmi_selftest.c   |  2 +-
 arch/x86/kernel/smp.c            |  2 +-
 8 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index b84b8be1f075..031e908f0d61 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2115,7 +2115,7 @@ static int __init init_hw_perf_events(void)
 		x86_pmu.config_mask = X86_RAW_EVENT_MASK;
 
 	perf_events_lapic_init();
-	register_nmi_handler(NMI_LOCAL, perf_event_nmi_handler, 0, "PMI", 0);
+	register_nmi_handler(NMI_LOCAL, perf_event_nmi_handler, 0, "PMI", NMIS_VECTOR_PMI);
 
 	unconstrained = (struct event_constraint)
 		__EVENT_CONSTRAINT(0, x86_pmu.cntr_mask64,
diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
index f0a577bf7bba..b9beb216f2d0 100644
--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -57,6 +57,38 @@ struct nmiaction {
 	u8			source_vector;
 };
 
+/**
+ * NMI-source vectors are used to identify the origin of an NMI and to
+ * route the NMI directly to the appropriate handler.
+ *
+ * On CPUs that support NMI-source reporting with FRED, receiving an NMI
+ * with a valid vector sets the corresponding bit in the NMI-source
+ * bitmap. The bitmap is delivered as FRED event data on the stack.
+ * Multiple NMIs are coalesced in the NMI-source bitmap until the next
+ * NMI delivery.
+ *
+ * If an NMI is received without a vector or beyond the defined range,
+ * the CPU sets bit 0 of the NMI-source bitmap.
+ *
+ * Vector 2 is reserved for external NMIs related to the Local APIC -
+ * LINT1. Some third-party chipsets may send NMI messages with a
+ * hardcoded vector of 2, which would result in bit 2 being set in the
+ * NMI-source bitmap.
+ *
+ * The vectors are in no particular priority order. Add new vector
+ * assignments sequentially in the list below.
+ */
+#define NMIS_VECTOR_NONE	0	/* Reserved - Set for all unidentified sources */
+#define NMIS_VECTOR_TEST	1	/* NMI selftest */
+#define NMIS_VECTOR_EXTERNAL	2	/* Reserved - Match External NMI vector 2 */
+#define NMIS_VECTOR_SMP_STOP	3	/* Panic stop CPU */
+#define NMIS_VECTOR_BT		4	/* CPU backtrace */
+#define NMIS_VECTOR_KGDB	5	/* Kernel debugger */
+#define NMIS_VECTOR_MCE		6	/* MCE injection */
+#define NMIS_VECTOR_PMI		7	/* PerfMon counters */
+
+#define NMIS_VECTORS_MAX	16	/* Maximum number of NMI-source vectors */
+
 /**
  * register_nmi_handler - Register a handler for a specific NMI type
  * @t:    NMI type (e.g. NMI_LOCAL)
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 612b77660d05..4e04f13d2de9 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -53,7 +53,7 @@ NOKPROBE_SYMBOL(nmi_cpu_backtrace_handler);
 
 static int __init register_nmi_cpu_backtrace_handler(void)
 {
-	register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", 0);
+	register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", NMIS_VECTOR_BT);
 	return 0;
 }
 early_initcall(register_nmi_cpu_backtrace_handler);
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 17804ba0b02f..a3c753dfce91 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -774,7 +774,7 @@ static int __init inject_init(void)
 
 	debugfs_init();
 
-	register_nmi_handler(NMI_LOCAL, mce_raise_notify, 0, "mce_notify", 0);
+	register_nmi_handler(NMI_LOCAL, mce_raise_notify, 0, "mce_notify", NMIS_VECTOR_MCE);
 	mce_register_injector_chain(&inject_nb);
 
 	setup_inj_struct(&i_mce);
diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index ab2d1b79b79e..9ca4b141da0c 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -602,7 +602,7 @@ int kgdb_arch_init(void)
 	if (retval)
 		goto out;
 
-	retval = register_nmi_handler(NMI_LOCAL, kgdb_nmi_handler, 0, "kgdb", 0);
+	retval = register_nmi_handler(NMI_LOCAL, kgdb_nmi_handler, 0, "kgdb", NMIS_VECTOR_KGDB);
 	if (retval)
 		goto out1;
 
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index be93ec7255bf..1a24e8df1bdf 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -182,6 +182,10 @@ int __register_nmi_handler(unsigned int type, struct nmiaction *action)
 	if (WARN_ON_ONCE(!action->handler || !list_empty(&action->list)))
 		return -EINVAL;
 
+	/* NMI-source reporting should only be used for NMI_LOCAL */
+	WARN_ON_ONCE(type != NMI_LOCAL && action->source_vector);
+	WARN_ON_ONCE(action->source_vector >= NMIS_VECTORS_MAX);
+
 	raw_spin_lock_irqsave(&desc->lock, flags);
 
 	/*
diff --git a/arch/x86/kernel/nmi_selftest.c b/arch/x86/kernel/nmi_selftest.c
index b203e4371816..5196023b31dc 100644
--- a/arch/x86/kernel/nmi_selftest.c
+++ b/arch/x86/kernel/nmi_selftest.c
@@ -63,7 +63,7 @@ static void __init test_nmi_ipi(struct cpumask *mask)
 	unsigned long timeout;
 
 	if (register_nmi_handler(NMI_LOCAL, test_nmi_ipi_callback, NMI_FLAG_FIRST,
-				 "nmi_selftest", 0, __initdata)) {
+				 "nmi_selftest", NMIS_VECTOR_TEST, __initdata)) {
 		nmi_fail = FAILURE;
 		return;
 	}
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index b80812aa06c3..5be1c0bdf901 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -143,7 +143,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_reboot)
 static int register_stop_handler(void)
 {
 	return register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback, NMI_FLAG_FIRST, "smp_stop",
-				    0);
+				    NMIS_VECTOR_SMP_STOP);
 }
 
 static void native_stop_other_cpus(int wait)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v6 5/9] x86/nmi: Add support to handle NMIs with source information
  2025-05-13 20:37 [PATCH v6 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
                   ` (3 preceding siblings ...)
  2025-05-13 20:37 ` [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors Sohil Mehta
@ 2025-05-13 20:37 ` Sohil Mehta
  2025-06-03 16:53   ` Xin Li
  2025-05-13 20:38 ` [PATCH v6 6/9] x86/nmi: Prepare for the new NMI-source vector encoding Sohil Mehta
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 37+ messages in thread
From: Sohil Mehta @ 2025-05-13 20:37 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Sohil Mehta, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

The NMI-source bitmap is delivered as FRED event data to the kernel.
When available, use NMI-source based filtering to determine the exact
handlers to run.

Activate NMI-source based filtering only for Local NMIs. While handling
platform NMI types (such as SERR and IOCHK), do not use the source
bitmap. They have only one handler registered per type, so there is no
need to disambiguate between multiple handlers.

Some third-party chipsets may send NMI messages with a hardcoded vector
of 2, which would result in bit 2 being set in the NMI-source bitmap.
Skip the local NMI handlers in this situation.

Bit 0 of the source bitmap is set by the hardware whenever a source
vector was not used while generating an NMI, or the originator could not
be reliably identified. Poll all the registered handlers in that case.

When multiple handlers need to be executed, adhere to the existing
priority scheme and execute the handlers registered with NMI_FLAG_FIRST
before others.

The logic for handling legacy NMIs is unaffected since the source bitmap
would always have all bits set.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v6: Get rid of a separate NMI source matching function
    Set source_bitmap to ULONG_MAX to match all sources by default

v5: Significantly simplify NMI-source handling logic.
    Get rid of a separate lookup table for NMI-source vectors.
    Adhere to existing priority scheme for handling NMIs.
---
 arch/x86/kernel/nmi.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 1a24e8df1bdf..55ecbe2ab5e4 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -130,6 +130,7 @@ static void nmi_check_duration(struct nmiaction *action, u64 duration)
 static int nmi_handle(unsigned int type, struct pt_regs *regs)
 {
 	struct nmi_desc *desc = nmi_to_desc(type);
+	unsigned long source_bitmap = ULONG_MAX;
 	nmi_handler_t ehandler;
 	struct nmiaction *a;
 	int handled=0;
@@ -148,16 +149,45 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
 
 	rcu_read_lock();
 
+	/*
+	 * Activate NMI source-based filtering only for Local NMIs.
+	 *
+	 * Platform NMI types (such as SERR and IOCHK) have only one
+	 * handler registered per type, so there is no need to
+	 * disambiguate between multiple handlers.
+	 *
+	 * Also, if a platform source ends up setting bit 2 in the
+	 * source bitmap, the local NMI handlers would be skipped since
+	 * none of them use this reserved vector.
+	 *
+	 * For Unknown NMIs, avoid using the source bitmap to ensure all
+	 * potential handlers have a chance to claim responsibility.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
+		source_bitmap = fred_event_data(regs);
+
+		/* Reset the bitmap if a valid source could not be identified */
+		if (WARN_ON_ONCE(!source_bitmap) || (source_bitmap & BIT(NMIS_VECTOR_NONE)))
+			source_bitmap = ULONG_MAX;
+	}
+
 	/*
 	 * NMIs are edge-triggered, which means if you have enough
 	 * of them concurrently, you can lose some because only one
 	 * can be latched at any given time.  Walk the whole list
 	 * to handle those situations.
+	 *
+	 * However, NMI-source reporting does not have this limitation.
+	 * When NMI sources have been identified, only run the handlers
+	 * that match the reported vectors.
 	 */
 	list_for_each_entry_rcu(a, &desc->head, list) {
 		int thishandled;
 		u64 delta;
 
+		if (!(source_bitmap & BIT(a->source_vector)))
+			continue;
+
 		delta = sched_clock();
 		thishandled = a->handler(type, regs);
 		handled += thishandled;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v6 6/9] x86/nmi: Prepare for the new NMI-source vector encoding
  2025-05-13 20:37 [PATCH v6 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
                   ` (4 preceding siblings ...)
  2025-05-13 20:37 ` [PATCH v6 5/9] x86/nmi: Add support to handle NMIs with source information Sohil Mehta
@ 2025-05-13 20:38 ` Sohil Mehta
  2025-05-13 20:38 ` [PATCH v6 7/9] x86/nmi: Enable NMI-source for IPIs delivered as NMIs Sohil Mehta
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-05-13 20:38 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Sohil Mehta, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

When using the send_IPI_* APIC calls, callers typically use NMI vector
0x2 to trigger NMIs. The APIC APIs convert the NMI vector to the NMI
delivery mode, which is eventually used to program the APIC.

Before FRED, the hardware would ignore the vector used with NMI delivery
mode. However, with NMI-source reporting, the vector information is
relayed to the destination CPU, which sets the corresponding bit in the
NMI-source bitmap. Unfortunately, the kernel now needs to maintain a new
set of NMI vectors and differentiate them from the IDT vectors.

Instead of creating a parallel set of send_NMI_* APIs to handle
NMI-source vectors, enhance the existing send_IPI_* APIs with a new
encoding scheme to handle the NMI delivery mode along with the
NMI-source vector.

NMI-source vectors would be encoded as:
    APIC_DM_NMI (0x400) | NMI_SOURCE_VECTOR (0x1-0xF)

Also, introduce a helper to prepare the ICR value with the encoded
delivery mode and vector. Update the guest paravirtual APIC code to use
the new helper as well.

While at it, rename APIC_DM_FIXED_MASK to the more appropriate
APIC_DM_MASK.

Suggested-by: Sean Christopherson <seanjc@google.com>
Co-developed-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v6: Remove a redundant else statement.

v5: Use a simiplified encoding scheme for NMI-source vectors.
---
 arch/x86/include/asm/apic.h         | 30 +++++++++++++++++++++++++++++
 arch/x86/include/asm/apicdef.h      |  2 +-
 arch/x86/kernel/apic/ipi.c          |  4 ++--
 arch/x86/kernel/apic/local.h        | 24 ++++++++++++-----------
 arch/x86/kernel/kvm.c               |  9 +--------
 drivers/thermal/intel/therm_throt.c |  2 +-
 6 files changed, 48 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index c903d358405d..9c3d5932d591 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -470,6 +470,36 @@ static __always_inline bool apic_id_valid(u32 apic_id)
 	return apic_id <= apic->max_apic_id;
 }
 
+/*
+ * Prepare the delivery mode and vector for the ICR.
+ *
+ * NMI-source vectors have the NMI delivery mode encoded within them to
+ * differentiate them from the IDT vectors. IDT vector 0x2 (NMI_VECTOR)
+ * is treated as an NMI request but without any NMI-source information.
+ */
+static inline u16 __prepare_ICR_DM_vector(u16 dm_vector)
+{
+	u16 vector = dm_vector & APIC_VECTOR_MASK;
+	u16 dm = dm_vector & APIC_DM_MASK;
+
+	if (dm == APIC_DM_NMI) {
+		/*
+		 * Pre-FRED, the actual vector is ignored for NMIs, but
+		 * zero it if NMI-source reporting is not supported to
+		 * avoid breakage on misbehaving hardware or hypervisors.
+		 */
+		if (!cpu_feature_enabled(X86_FEATURE_NMI_SOURCE))
+			vector = 0;
+
+		return dm | vector;
+	}
+
+	if (vector == NMI_VECTOR)
+		return APIC_DM_NMI;
+
+	return APIC_DM_FIXED | vector;
+}
+
 #else /* CONFIG_X86_LOCAL_APIC */
 
 static inline u32 apic_read(u32 reg) { return 0; }
diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
index 094106b6a538..3fb8fa73f6aa 100644
--- a/arch/x86/include/asm/apicdef.h
+++ b/arch/x86/include/asm/apicdef.h
@@ -87,8 +87,8 @@
 #define		APIC_ICR_BUSY		0x01000
 #define		APIC_DEST_LOGICAL	0x00800
 #define		APIC_DEST_PHYSICAL	0x00000
+#define		APIC_DM_MASK		0x00700
 #define		APIC_DM_FIXED		0x00000
-#define		APIC_DM_FIXED_MASK	0x00700
 #define		APIC_DM_LOWEST		0x00100
 #define		APIC_DM_SMI		0x00200
 #define		APIC_DM_REMRD		0x00300
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 98a57cb4aa86..4e8bc42f3bd5 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -158,7 +158,7 @@ static void __default_send_IPI_shortcut(unsigned int shortcut, int vector)
 	 * issues where otherwise the system hangs when the panic CPU tries
 	 * to stop the others before launching the kdump kernel.
 	 */
-	if (unlikely(vector == NMI_VECTOR))
+	if (unlikely(is_nmi_vector(vector)))
 		apic_mem_wait_icr_idle_timeout();
 	else
 		apic_mem_wait_icr_idle();
@@ -175,7 +175,7 @@ void __default_send_IPI_dest_field(unsigned int dest_mask, int vector,
 				   unsigned int dest_mode)
 {
 	/* See comment in __default_send_IPI_shortcut() */
-	if (unlikely(vector == NMI_VECTOR))
+	if (unlikely(is_nmi_vector(vector)))
 		apic_mem_wait_icr_idle_timeout();
 	else
 		apic_mem_wait_icr_idle();
diff --git a/arch/x86/kernel/apic/local.h b/arch/x86/kernel/apic/local.h
index bdcf609eb283..9a54c589a4bf 100644
--- a/arch/x86/kernel/apic/local.h
+++ b/arch/x86/kernel/apic/local.h
@@ -24,22 +24,24 @@ extern u32 x2apic_max_apicid;
 
 /* IPI */
 
+u16 __prepare_ICR_DM_vector(u16 vector);
+
 DECLARE_STATIC_KEY_FALSE(apic_use_ipi_shorthand);
 
+/* NMI-source vectors have the delivery mode encoded within them */
+static inline bool is_nmi_vector(u16 vector)
+{
+	if ((vector & APIC_DM_MASK) == APIC_DM_NMI)
+		return true;
+	if ((vector & APIC_VECTOR_MASK) == NMI_VECTOR)
+		return true;
+	return false;
+}
+
 static inline unsigned int __prepare_ICR(unsigned int shortcut, int vector,
 					 unsigned int dest)
 {
-	unsigned int icr = shortcut | dest;
-
-	switch (vector) {
-	default:
-		icr |= APIC_DM_FIXED | vector;
-		break;
-	case NMI_VECTOR:
-		icr |= APIC_DM_NMI;
-		break;
-	}
-	return icr;
+	return shortcut | dest | __prepare_ICR_DM_vector(vector);
 }
 
 void default_init_apic_ldr(void);
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 3be9b3342c67..aa45fe4fecd0 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -517,14 +517,7 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
 
 	local_irq_save(flags);
 
-	switch (vector) {
-	default:
-		icr = APIC_DM_FIXED | vector;
-		break;
-	case NMI_VECTOR:
-		icr = APIC_DM_NMI;
-		break;
-	}
+	icr = __prepare_ICR_DM_vector(vector);
 
 	for_each_cpu(cpu, mask) {
 		apic_id = per_cpu(x86_cpu_to_apicid, cpu);
diff --git a/drivers/thermal/intel/therm_throt.c b/drivers/thermal/intel/therm_throt.c
index e69868e868eb..83dd53cb4fc7 100644
--- a/drivers/thermal/intel/therm_throt.c
+++ b/drivers/thermal/intel/therm_throt.c
@@ -740,7 +740,7 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
 	 * BIOS has programmed on AP based on BSP's info we saved since BIOS
 	 * is always setting the same value for all threads/cores.
 	 */
-	if ((h & APIC_DM_FIXED_MASK) != APIC_DM_FIXED)
+	if ((h & APIC_DM_MASK) != APIC_DM_FIXED)
 		apic_write(APIC_LVTTHMR, lvtthmr_init);
 
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v6 7/9] x86/nmi: Enable NMI-source for IPIs delivered as NMIs
  2025-05-13 20:37 [PATCH v6 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
                   ` (5 preceding siblings ...)
  2025-05-13 20:38 ` [PATCH v6 6/9] x86/nmi: Prepare for the new NMI-source vector encoding Sohil Mehta
@ 2025-05-13 20:38 ` Sohil Mehta
  2025-05-13 20:38 ` [PATCH v6 8/9] perf/x86: Enable NMI-source reporting for perfmon Sohil Mehta
  2025-05-13 20:38 ` [PATCH v6 9/9] x86/nmi: Print source information with the unknown NMI console message Sohil Mehta
  8 siblings, 0 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-05-13 20:38 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Sohil Mehta, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

With the IPI handling APIs ready to support the new NMI encoding, encode
the NMI delivery mode directly with the NMI-source vectors to trigger
NMIs.

Move most of the existing NMI-based IPIs to use the new NMI-source
vectors, except for the microcode rendezvous NMI and the crash reboot
NMI. NMI handling for them is special-cased in exc_nmi() and does not
need NMI-source reporting.

However, in the future, it might be useful to assign a source vector to
all NMI sources to improve isolation and debuggability.

Originally-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Co-developed-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v6: Include asm/nmi.h to avoid compile errors. (LKP)

v5: Encode APIC_DM_NMI directly with the NMI-source vector.
---
 arch/x86/include/asm/apic.h      | 8 ++++++++
 arch/x86/kernel/apic/hw_nmi.c    | 2 +-
 arch/x86/kernel/cpu/mce/inject.c | 2 +-
 arch/x86/kernel/kgdb.c           | 2 +-
 arch/x86/kernel/nmi_selftest.c   | 2 +-
 arch/x86/kernel/smp.c            | 2 +-
 6 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 9c3d5932d591..99033bfb26ea 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -14,6 +14,7 @@
 #include <asm/msr.h>
 #include <asm/hardirq.h>
 #include <asm/io.h>
+#include <asm/nmi.h>
 #include <asm/posted_intr.h>
 
 #define ARCH_APICTIMER_STOPS_ON_C3	1
@@ -23,6 +24,13 @@
 #define APIC_EXTNMI_ALL		1
 #define APIC_EXTNMI_NONE	2
 
+/* Trigger NMIs with source information */
+#define TEST_NMI		(APIC_DM_NMI | NMIS_VECTOR_TEST)
+#define SMP_STOP_NMI		(APIC_DM_NMI | NMIS_VECTOR_SMP_STOP)
+#define BT_NMI			(APIC_DM_NMI | NMIS_VECTOR_BT)
+#define KGDB_NMI		(APIC_DM_NMI | NMIS_VECTOR_KGDB)
+#define MCE_NMI			(APIC_DM_NMI | NMIS_VECTOR_MCE)
+
 /*
  * Debugging macros
  */
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 4e04f13d2de9..586f4b25feae 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -33,7 +33,7 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh)
 #ifdef arch_trigger_cpumask_backtrace
 static void nmi_raise_cpu_backtrace(cpumask_t *mask)
 {
-	__apic_send_IPI_mask(mask, NMI_VECTOR);
+	__apic_send_IPI_mask(mask, BT_NMI);
 }
 
 void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index a3c753dfce91..6328a607ffc4 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -269,7 +269,7 @@ static void __maybe_unused raise_mce(struct mce *m)
 					mce_irq_ipi, NULL, 0);
 				preempt_enable();
 			} else if (m->inject_flags & MCJ_NMI_BROADCAST)
-				__apic_send_IPI_mask(mce_inject_cpumask, NMI_VECTOR);
+				__apic_send_IPI_mask(mce_inject_cpumask, MCE_NMI);
 		}
 		start = jiffies;
 		while (!cpumask_empty(mce_inject_cpumask)) {
diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index 9ca4b141da0c..3dedc5f57541 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -416,7 +416,7 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
  */
 void kgdb_roundup_cpus(void)
 {
-	apic_send_IPI_allbutself(NMI_VECTOR);
+	apic_send_IPI_allbutself(KGDB_NMI);
 }
 #endif
 
diff --git a/arch/x86/kernel/nmi_selftest.c b/arch/x86/kernel/nmi_selftest.c
index 5196023b31dc..c5c91f520c69 100644
--- a/arch/x86/kernel/nmi_selftest.c
+++ b/arch/x86/kernel/nmi_selftest.c
@@ -71,7 +71,7 @@ static void __init test_nmi_ipi(struct cpumask *mask)
 	/* sync above data before sending NMI */
 	wmb();
 
-	__apic_send_IPI_mask(mask, NMI_VECTOR);
+	__apic_send_IPI_mask(mask, TEST_NMI);
 
 	/* Don't wait longer than a second */
 	timeout = USEC_PER_SEC;
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 5be1c0bdf901..614acec5655f 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -217,7 +217,7 @@ static void native_stop_other_cpus(int wait)
 			pr_emerg("Shutting down cpus with NMI\n");
 
 			for_each_cpu(cpu, &cpus_stop_mask)
-				__apic_send_IPI(cpu, NMI_VECTOR);
+				__apic_send_IPI(cpu, SMP_STOP_NMI);
 		}
 		/*
 		 * Don't wait longer than 10 ms if the caller didn't
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v6 8/9] perf/x86: Enable NMI-source reporting for perfmon
  2025-05-13 20:37 [PATCH v6 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
                   ` (6 preceding siblings ...)
  2025-05-13 20:38 ` [PATCH v6 7/9] x86/nmi: Enable NMI-source for IPIs delivered as NMIs Sohil Mehta
@ 2025-05-13 20:38 ` Sohil Mehta
  2025-06-03 16:57   ` Xin Li
  2025-05-13 20:38 ` [PATCH v6 9/9] x86/nmi: Print source information with the unknown NMI console message Sohil Mehta
  8 siblings, 1 reply; 37+ messages in thread
From: Sohil Mehta @ 2025-05-13 20:38 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Sohil Mehta, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

From: Jacob Pan <jacob.jun.pan@linux.intel.com>

Program the designated PMI NMI-source vector into the local vector table
for the PMU. An NMI for the PMU would directly invoke the PMI handler
without polling other NMI handlers, resulting in reduced PMI delivery
latency.

Co-developed-by: Zeng Guang <guang.zeng@intel.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Sandipan Das <sandipan.das@amd.com> # AMD overlapping bits
---
v6: Picked up a tested-by tag.

v5: No significant change.
---
 arch/x86/events/core.c       | 4 ++--
 arch/x86/events/intel/core.c | 6 +++---
 arch/x86/include/asm/apic.h  | 1 +
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 031e908f0d61..42b270526631 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1695,7 +1695,7 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
 	 * This generic handler doesn't seem to have any issues where the
 	 * unmasking occurs so it was left at the top.
 	 */
-	apic_write(APIC_LVTPC, APIC_DM_NMI);
+	apic_write(APIC_LVTPC, PERF_NMI);
 
 	for_each_set_bit(idx, x86_pmu.cntr_mask, X86_PMC_IDX_MAX) {
 		if (!test_bit(idx, cpuc->active_mask))
@@ -1737,7 +1737,7 @@ void perf_events_lapic_init(void)
 	/*
 	 * Always use NMI for PMU
 	 */
-	apic_write(APIC_LVTPC, APIC_DM_NMI);
+	apic_write(APIC_LVTPC, PERF_NMI);
 }
 
 static int
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 09d2d66c9f21..87c624686c58 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3202,7 +3202,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
 	 * NMI handler.
 	 */
 	if (!late_ack && !mid_ack)
-		apic_write(APIC_LVTPC, APIC_DM_NMI);
+		apic_write(APIC_LVTPC, PERF_NMI);
 	intel_bts_disable_local();
 	cpuc->enabled = 0;
 	__intel_pmu_disable_all(true);
@@ -3239,7 +3239,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
 
 done:
 	if (mid_ack)
-		apic_write(APIC_LVTPC, APIC_DM_NMI);
+		apic_write(APIC_LVTPC, PERF_NMI);
 	/* Only restore PMU state when it's active. See x86_pmu_disable(). */
 	cpuc->enabled = pmu_enabled;
 	if (pmu_enabled)
@@ -3252,7 +3252,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
 	 * Haswell CPUs.
 	 */
 	if (late_ack)
-		apic_write(APIC_LVTPC, APIC_DM_NMI);
+		apic_write(APIC_LVTPC, PERF_NMI);
 	return handled;
 }
 
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 99033bfb26ea..d637717d42bd 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -30,6 +30,7 @@
 #define BT_NMI			(APIC_DM_NMI | NMIS_VECTOR_BT)
 #define KGDB_NMI		(APIC_DM_NMI | NMIS_VECTOR_KGDB)
 #define MCE_NMI			(APIC_DM_NMI | NMIS_VECTOR_MCE)
+#define PERF_NMI		(APIC_DM_NMI | NMIS_VECTOR_PMI)
 
 /*
  * Debugging macros
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v6 9/9] x86/nmi: Print source information with the unknown NMI console message
  2025-05-13 20:37 [PATCH v6 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
                   ` (7 preceding siblings ...)
  2025-05-13 20:38 ` [PATCH v6 8/9] perf/x86: Enable NMI-source reporting for perfmon Sohil Mehta
@ 2025-05-13 20:38 ` Sohil Mehta
  2025-06-03 16:55   ` Xin Li
  8 siblings, 1 reply; 37+ messages in thread
From: Sohil Mehta @ 2025-05-13 20:38 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Sohil Mehta, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

The NMI-source bitmap is critical information provided by the NMI-source
reporting feature. It is very useful in debugging unknown NMIs since it
can pinpoint the exact source that caused the NMI.

Print the source bitmap along with the "Unknown NMI" kernel log message.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v6: Drop the tracepoint modification part for now.

v5: New patch
---
 arch/x86/kernel/nmi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 55ecbe2ab5e4..7a288603683f 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -376,6 +376,9 @@ unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
 	pr_emerg_ratelimited("Uhhuh. NMI received for unknown reason %02x on CPU %d.\n",
 			     reason, smp_processor_id());
 
+	if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE))
+		pr_emerg_ratelimited("NMI-source bitmap is 0x%lx\n", fred_event_data(regs));
+
 	if (unknown_nmi_panic || panic_on_unrecovered_nmi)
 		nmi_panic(regs, "NMI: Not continuing");
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM
  2025-05-13 20:37 ` [PATCH v6 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM Sohil Mehta
@ 2025-05-14 14:15   ` Sean Christopherson
  2025-05-14 21:34     ` Sohil Mehta
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2025-05-14 14:15 UTC (permalink / raw)
  To: Sohil Mehta
  Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui,
	Steven Rostedt, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
	Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users, linux-edac,
	kvm, linux-pm, linux-trace-kernel

On Tue, May 13, 2025, Sohil Mehta wrote:
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 5c5766467a61..1d43d4a2f6b6 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7079,7 +7079,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu,
>  
>  	kvm_before_interrupt(vcpu, KVM_HANDLING_IRQ);
>  	if (cpu_feature_enabled(X86_FEATURE_FRED))
> -		fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector);
> +		fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector, 0);
>  	else
>  		vmx_do_interrupt_irqoff(gate_offset((gate_desc *)host_idt_base + vector));
>  	kvm_after_interrupt(vcpu);
> @@ -7393,7 +7393,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
>  	    is_nmi(vmx_get_intr_info(vcpu))) {
>  		kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
>  		if (cpu_feature_enabled(X86_FEATURE_FRED))
> -			fred_entry_from_kvm(EVENT_TYPE_NMI, NMI_VECTOR);
> +			fred_entry_from_kvm(EVENT_TYPE_NMI, NMI_VECTOR,
> +					    vmx_get_exit_qual(vcpu));
>  		else
>  			vmx_do_nmi_irqoff();
>  		kvm_after_interrupt(vcpu);

As a prep patch, what if we provide separate wrappers for IRQ vs. NMI?  That way
KVM doesn't need to shove a '0' literal for the IRQ case.  There isn't that much
code that's actually shared between the two, once you account for KVM having to
hardcode the NMI information.

Compile tested only...

--
From: Sean Christopherson <seanjc@google.com>
Date: Wed, 14 May 2025 07:07:55 -0700
Subject: [PATCH] x86/fred: Provide separate IRQ vs. NMI wrappers for "entry"
 from KVM

Provide separate wrappers for forwarding IRQs vs NMIs from KVM in
anticipation of adding support for NMI source reporting, which will add
an NMI-only parameter, i.e. will further pollute the current API with a
param that is a hardcoded for one of the two call sites.

Opportunistically tag the non-FRED NMI wrapper __always_inline, as the
compiler could theoretically generate a function call and trigger and a
(completely benign) "leaving noinstr" warning.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/fred.h | 24 +++++++++++++++++++-----
 arch/x86/kvm/vmx/vmx.c      |  4 ++--
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
index 2a29e5216881..dfb4f5e6a37a 100644
--- a/arch/x86/include/asm/fred.h
+++ b/arch/x86/include/asm/fred.h
@@ -10,6 +10,7 @@
 
 #include <asm/asm.h>
 #include <asm/trapnr.h>
+#include <asm/irq_vectors.h>
 
 /*
  * FRED event return instruction opcodes for ERET{S,U}; supported in
@@ -70,14 +71,26 @@ __visible void fred_entry_from_user(struct pt_regs *regs);
 __visible void fred_entry_from_kernel(struct pt_regs *regs);
 __visible void __fred_entry_from_kvm(struct pt_regs *regs);
 
-/* Can be called from noinstr code, thus __always_inline */
-static __always_inline void fred_entry_from_kvm(unsigned int type, unsigned int vector)
+/* Must be called from noinstr code, thus __always_inline */
+static __always_inline void fred_nmi_from_kvm(void)
 {
 	struct fred_ss ss = {
 		.ss     =__KERNEL_DS,
-		.type   = type,
+		.type   = EVENT_TYPE_NMI,
+		.vector = NMI_VECTOR,
+		.nmi    = true,
+		.lm     = 1,
+	};
+
+	asm_fred_entry_from_kvm(ss);
+}
+
+static inline void fred_irq_from_kvm(unsigned int vector)
+{
+	struct fred_ss ss = {
+		.ss     =__KERNEL_DS,
+		.type   = EVENT_TYPE_EXTINT,
 		.vector = vector,
-		.nmi    = type == EVENT_TYPE_NMI,
 		.lm     = 1,
 	};
 
@@ -109,7 +122,8 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs) { ret
 static inline void cpu_init_fred_exceptions(void) { }
 static inline void cpu_init_fred_rsps(void) { }
 static inline void fred_complete_exception_setup(void) { }
-static inline void fred_entry_from_kvm(unsigned int type, unsigned int vector) { }
+static __always_inline void fred_nmi_from_kvm(void) { }
+static inline void fred_irq_from_kvm(unsigned int vector) { }
 static inline void fred_sync_rsp0(unsigned long rsp0) { }
 static inline void fred_update_rsp0(void) { }
 #endif /* CONFIG_X86_FRED */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5c5766467a61..271f92fee76b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7079,7 +7079,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu,
 
 	kvm_before_interrupt(vcpu, KVM_HANDLING_IRQ);
 	if (cpu_feature_enabled(X86_FEATURE_FRED))
-		fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector);
+		fred_irq_from_kvm(vector);
 	else
 		vmx_do_interrupt_irqoff(gate_offset((gate_desc *)host_idt_base + vector));
 	kvm_after_interrupt(vcpu);
@@ -7393,7 +7393,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	    is_nmi(vmx_get_intr_info(vcpu))) {
 		kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
 		if (cpu_feature_enabled(X86_FEATURE_FRED))
-			fred_entry_from_kvm(EVENT_TYPE_NMI, NMI_VECTOR);
+			fred_nmi_from_kvm();
 		else
 			vmx_do_nmi_irqoff();
 		kvm_after_interrupt(vcpu);

base-commit: 9f35e33144ae5377d6a8de86dd3bd4d995c6ac65
--

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM
  2025-05-14 14:15   ` Sean Christopherson
@ 2025-05-14 21:34     ` Sohil Mehta
  2025-05-16  1:15       ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Sohil Mehta @ 2025-05-14 21:34 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui,
	Steven Rostedt, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
	Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users, linux-edac,
	kvm, linux-pm, linux-trace-kernel

On 5/14/2025 7:15 AM, Sean Christopherson wrote:
> On Tue, May 13, 2025, Sohil Mehta wrote:
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 5c5766467a61..1d43d4a2f6b6 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -7079,7 +7079,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu,
>>  
>>  	kvm_before_interrupt(vcpu, KVM_HANDLING_IRQ);
>>  	if (cpu_feature_enabled(X86_FEATURE_FRED))
>> -		fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector);
>> +		fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector, 0);
>>  	else
>>  		vmx_do_interrupt_irqoff(gate_offset((gate_desc *)host_idt_base + vector));
>>  	kvm_after_interrupt(vcpu);
>> @@ -7393,7 +7393,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
>>  	    is_nmi(vmx_get_intr_info(vcpu))) {
>>  		kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
>>  		if (cpu_feature_enabled(X86_FEATURE_FRED))
>> -			fred_entry_from_kvm(EVENT_TYPE_NMI, NMI_VECTOR);
>> +			fred_entry_from_kvm(EVENT_TYPE_NMI, NMI_VECTOR,
>> +					    vmx_get_exit_qual(vcpu));
>>  		else
>>  			vmx_do_nmi_irqoff();
>>  		kvm_after_interrupt(vcpu);
> 
> As a prep patch, what if we provide separate wrappers for IRQ vs. NMI?  That way
> KVM doesn't need to shove a '0' literal for the IRQ case.  There isn't that much
> code that's actually shared between the two, once you account for KVM having to
> hardcode the NMI information.
> 

Seems reasonable to me.

I don't see the IRQ side using the Event data anytime soon (or ever). We
still need to pass 0 to asm_fred_entry_from_kvm() for the IRQ case but
that would be contained in asm/fred.h and would not affect KVM.

> Compile tested only...
> 

No worries. I'll test it out. I am assuming you want this patch to go as
part of this series.

> --
> From: Sean Christopherson <seanjc@google.com>
> Date: Wed, 14 May 2025 07:07:55 -0700
> Subject: [PATCH] x86/fred: Provide separate IRQ vs. NMI wrappers for "entry"
>  from KVM
> 
> Provide separate wrappers for forwarding IRQs vs NMIs from KVM in
> anticipation of adding support for NMI source reporting, which will add
> an NMI-only parameter, i.e. will further pollute the current API with a
> param that is a hardcoded for one of the two call sites.
> 
> Opportunistically tag the non-FRED NMI wrapper __always_inline, as the
> compiler could theoretically generate a function call and trigger and a
> (completely benign) "leaving noinstr" warning.
> 

If this is really a concern, wouldn't there be similar semantics in
other places as well?

> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/fred.h | 24 +++++++++++++++++++-----
>  arch/x86/kvm/vmx/vmx.c      |  4 ++--
>  2 files changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
> index 2a29e5216881..dfb4f5e6a37a 100644
> --- a/arch/x86/include/asm/fred.h
> +++ b/arch/x86/include/asm/fred.h
> @@ -10,6 +10,7 @@
>  
>  #include <asm/asm.h>
>  #include <asm/trapnr.h>
> +#include <asm/irq_vectors.h>
>  

I'll move the header insertion to keep it alphabetically ordered.

>  /*
>   * FRED event return instruction opcodes for ERET{S,U}; supported in
> @@ -70,14 +71,26 @@ __visible void fred_entry_from_user(struct pt_regs *regs);
>  __visible void fred_entry_from_kernel(struct pt_regs *regs);
>  __visible void __fred_entry_from_kvm(struct pt_regs *regs);
>  
> -/* Can be called from noinstr code, thus __always_inline */
> -static __always_inline void fred_entry_from_kvm(unsigned int type, unsigned int vector)
> +/* Must be called from noinstr code, thus __always_inline */
> +static __always_inline void fred_nmi_from_kvm(void)
>  {
>  	struct fred_ss ss = {
>  		.ss     =__KERNEL_DS,
> -		.type   = type,
> +		.type   = EVENT_TYPE_NMI,
> +		.vector = NMI_VECTOR,
> +		.nmi    = true,
> +		.lm     = 1,
> +	};
> +
> +	asm_fred_entry_from_kvm(ss);
> +}
> +

The original code uses spaces for alignment. Since we are modifying it,
I am thinking of changing it to tabs.

> +static inline void fred_irq_from_kvm(unsigned int vector)
> +{
> +	struct fred_ss ss = {
> +		.ss     =__KERNEL_DS,
> +		.type   = EVENT_TYPE_EXTINT,
>  		.vector = vector,
> -		.nmi    = type == EVENT_TYPE_NMI,
>  		.lm     = 1,
>  	};
>  
Thanks,
Sohil

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM
  2025-05-14 21:34     ` Sohil Mehta
@ 2025-05-16  1:15       ` Sean Christopherson
  2025-06-02 20:45         ` Sohil Mehta
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2025-05-16  1:15 UTC (permalink / raw)
  To: Sohil Mehta
  Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui,
	Steven Rostedt, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
	Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users, linux-edac,
	kvm, linux-pm, linux-trace-kernel

On Wed, May 14, 2025, Sohil Mehta wrote:
> On 5/14/2025 7:15 AM, Sean Christopherson wrote:
> > Compile tested only...
> > 
> 
> No worries. I'll test it out. I am assuming you want this patch to go as
> part of this series.

Yes please.  I can also post it separately, but that seems unnecessary.

> > --
> > From: Sean Christopherson <seanjc@google.com>
> > Date: Wed, 14 May 2025 07:07:55 -0700
> > Subject: [PATCH] x86/fred: Provide separate IRQ vs. NMI wrappers for "entry"
> >  from KVM
> > 
> > Provide separate wrappers for forwarding IRQs vs NMIs from KVM in
> > anticipation of adding support for NMI source reporting, which will add
> > an NMI-only parameter, i.e. will further pollute the current API with a
> > param that is a hardcoded for one of the two call sites.
> > 
> > Opportunistically tag the non-FRED NMI wrapper __always_inline, as the
> > compiler could theoretically generate a function call and trigger and a
> > (completely benign) "leaving noinstr" warning.
> > 
> 
> If this is really a concern, wouldn't there be similar semantics in
> other places as well?

There are, e.g. the stubs in include/linux/context_tracking_state.h and many
other places.  It looks ridiculous, but the compiler will generate RET+CALL for
literal nops if the right sanitizers are enabled.  E.g. see commit
432727f1cb6e ("KVM: VMX: Always inline to_vmx() and to_kvm_vmx()").

> > @@ -70,14 +71,26 @@ __visible void fred_entry_from_user(struct pt_regs *regs);
> >  __visible void fred_entry_from_kernel(struct pt_regs *regs);
> >  __visible void __fred_entry_from_kvm(struct pt_regs *regs);
> >  
> > -/* Can be called from noinstr code, thus __always_inline */
> > -static __always_inline void fred_entry_from_kvm(unsigned int type, unsigned int vector)
> > +/* Must be called from noinstr code, thus __always_inline */
> > +static __always_inline void fred_nmi_from_kvm(void)
> >  {
> >  	struct fred_ss ss = {
> >  		.ss     =__KERNEL_DS,
> > -		.type   = type,
> > +		.type   = EVENT_TYPE_NMI,
> > +		.vector = NMI_VECTOR,
> > +		.nmi    = true,
> > +		.lm     = 1,
> > +	};
> > +
> > +	asm_fred_entry_from_kvm(ss);
> > +}
> > +
> 
> The original code uses spaces for alignment. Since we are modifying it,
> I am thinking of changing it to tabs.

Oof, yeah, definitely do that.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM
  2025-05-16  1:15       ` Sean Christopherson
@ 2025-06-02 20:45         ` Sohil Mehta
  0 siblings, 0 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-06-02 20:45 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: x86, linux-kernel, Xin Li, H . Peter Anvin, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui,
	Steven Rostedt, Andrew Cooper, Kirill A . Shutemov, Jacob Pan,
	Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users, linux-edac,
	kvm, linux-pm, linux-trace-kernel

On 5/15/2025 6:15 PM, Sean Christopherson wrote:
> On Wed, May 14, 2025, Sohil Mehta wrote:
>> On 5/14/2025 7:15 AM, Sean Christopherson wrote:
>>> Compile tested only...
>>>
>>
>> No worries. I'll test it out. I am assuming you want this patch to go as
>> part of this series.
> 
> Yes please.  I can also post it separately, but that seems unnecessary.

The proposed pre-patch works fine. I'll include in the next revision
along with minor changes to patch 1.

Wondering if there is any feedback for the rest of the patches 2-9? They
haven't received any comments and stay unmodified even with the new
pre-patch.

Thanks,
Sohil

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector
  2025-05-13 20:37 ` [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector Sohil Mehta
@ 2025-06-03  7:23   ` Xin Li
  2025-06-03 17:35     ` Sohil Mehta
  2025-06-03 17:07   ` Dave Hansen
  1 sibling, 1 reply; 37+ messages in thread
From: Xin Li @ 2025-06-03  7:23 UTC (permalink / raw)
  To: Sohil Mehta, x86, linux-kernel
  Cc: H . Peter Anvin, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Peter Zijlstra, Sean Christopherson,
	Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt,
	Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
	Kai Huang, Sandipan Das, linux-perf-users, linux-edac, kvm,
	linux-pm, linux-trace-kernel

On 5/13/2025 1:37 PM, Sohil Mehta wrote:
> To prepare for NMI-source reporting, add a source vector argument to the
> NMI handler registration interface. Later, this will be used to
> register NMI handlers with a unique source vector that can be used to
> identify the originator of the NMI.
> 
> For now, just extend the interface and pass zero as the source vector
> for all handlers. No functional change intended.
> 
> Originally-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

Just two nits below, other than that:

Reviewed-by: Xin Li (Intel) <xin@zytor.com>
> diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
> index 79d88d12c8fb..f0a577bf7bba 100644
> --- a/arch/x86/include/asm/nmi.h
> +++ b/arch/x86/include/asm/nmi.h
> @@ -54,6 +54,7 @@ struct nmiaction {
>   	u64			max_duration;
>   	unsigned long		flags;
>   	const char		*name;
> +	u8			source_vector;
>   };
>   
>   /**
> @@ -62,6 +63,7 @@ struct nmiaction {
>    * @fn:   The NMI handler
>    * @fg:   Flags associated with the NMI handler
>    * @n:    Name of the NMI handler
> + * @src:  NMI-source based vector for the NMI handler

"NMI-source based" sounds weird to me.

>    * @init: Optional __init* attributes for struct nmiaction
>    *
>    * Adds the provided handler to the list of handlers for the specified
> @@ -75,13 +77,14 @@ struct nmiaction {
>    *
>    * Return: 0 on success, or an error code on failure.
>    */
> -#define register_nmi_handler(t, fn, fg, n, init...)	\
> +#define register_nmi_handler(t, fn, fg, n, src, init...)	\
>   ({							\
>   	static struct nmiaction init fn##_na = {	\
>   		.list = LIST_HEAD_INIT(fn##_na.list),	\
>   		.handler = (fn),			\
>   		.name = (n),				\
>   		.flags = (fg),				\
> +		.source_vector = (src),			\
>   	};						\
>   	__register_nmi_handler((t), &fn##_na);		\
>   })
Please keep the line-ending backslashes (\) aligned.

I guess you want to keep the change minimal.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors
  2025-05-13 20:37 ` [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors Sohil Mehta
@ 2025-06-03 16:34   ` Xin Li
  2025-06-03 21:45     ` Sohil Mehta
  2025-06-04 15:28     ` H. Peter Anvin
  2025-06-03 17:23   ` Dave Hansen
  1 sibling, 2 replies; 37+ messages in thread
From: Xin Li @ 2025-06-03 16:34 UTC (permalink / raw)
  To: Sohil Mehta, x86, linux-kernel
  Cc: H . Peter Anvin, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Peter Zijlstra, Sean Christopherson,
	Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt,
	Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
	Kai Huang, Sandipan Das, linux-perf-users, linux-edac, kvm,
	linux-pm, linux-trace-kernel

On 5/13/2025 1:37 PM, Sohil Mehta wrote:
> Prior to NMI-source support, the vector information was ignored by the
> hardware while delivering NMIs. With NMI-source, the architecture
> currently supports a 16-bit source bitmap to identify the source of the
> NMI. Upon receiving an NMI, this bitmap is delivered as part of the FRED
> event delivery mechanism to the kernel.
> 
> Assign a vector space of 0-15 that is specific to NMI-source and
> independent of the IDT vector space of 0-255. Being a bitmap, the
> NMI-source vectors do not have any inherent priority associated with
> them. The order of executing the NMI handlers is up to the kernel.

I'm thinking should we mention that the bitmap could be extended more
than 16 bits in future?  Or we just don't emphasize 16-bit or 0~15?


> 
> Existing NMI handling already has a priority mechanism for the NMI
> handlers, with CPU-specific (NMI_LOCAL) handlers executed first,
> followed by platform NMI handlers and unknown NMI (NMI_UNKNOWN) handlers
> being last. Within each of these NMI types, the handlers registered with
> NMI_FLAG_FIRST are given priority.
> 
> NMI-source follows the same priority scheme to avoid unnecessary
> complexity. Therefore, the NMI-source vectors are assigned arbitrarily,
> except for vectors 0 and 2.
> 
> Vector 0 is set by the hardware whenever a source vector was not used
> while generating an NMI or the originator could not be reliably
> identified. Do not assign it to any handler.
> 
> Vector 2 is reserved for external NMIs corresponding to Local APIC -
> LINT1. Some third-party chipsets may send NMI messages with a hardcoded
> vector of 2, which would result in vector 2 being set in the NMI-source
> bitmap. To avoid confusion, do not assign vector 2 to any handler.
> 
> NMI-source vectors are only assigned for NMI_LOCAL type handlers.
> Platform NMI handlers have a single handler registered per type. They
> don't need additional source information to differentiate among them.
> 
> Use the assigned vectors to register the respective NMI handlers. Warn
> if the vector values are unexpected.
> 
> A couple of NMI handlers, such as the microcode rendezvous and the crash
> reboot, do not use the typical NMI registration interface. Leave them
> as-is for now.
> 
> Originally-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

Reviewed-by: Xin Li (Intel) <xin@zytor.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 5/9] x86/nmi: Add support to handle NMIs with source information
  2025-05-13 20:37 ` [PATCH v6 5/9] x86/nmi: Add support to handle NMIs with source information Sohil Mehta
@ 2025-06-03 16:53   ` Xin Li
  0 siblings, 0 replies; 37+ messages in thread
From: Xin Li @ 2025-06-03 16:53 UTC (permalink / raw)
  To: Sohil Mehta, x86, linux-kernel
  Cc: H . Peter Anvin, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Peter Zijlstra, Sean Christopherson,
	Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt,
	Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
	Kai Huang, Sandipan Das, linux-perf-users, linux-edac, kvm,
	linux-pm, linux-trace-kernel

On 5/13/2025 1:37 PM, Sohil Mehta wrote:
> The NMI-source bitmap is delivered as FRED event data to the kernel.
> When available, use NMI-source based filtering to determine the exact
> handlers to run.
> 
> Activate NMI-source based filtering only for Local NMIs. While handling
> platform NMI types (such as SERR and IOCHK), do not use the source
> bitmap. They have only one handler registered per type, so there is no
> need to disambiguate between multiple handlers.
> 
> Some third-party chipsets may send NMI messages with a hardcoded vector
> of 2, which would result in bit 2 being set in the NMI-source bitmap.
> Skip the local NMI handlers in this situation.
> 
> Bit 0 of the source bitmap is set by the hardware whenever a source
> vector was not used while generating an NMI, or the originator could not
> be reliably identified. Poll all the registered handlers in that case.
> 
> When multiple handlers need to be executed, adhere to the existing
> priority scheme and execute the handlers registered with NMI_FLAG_FIRST
> before others.
> 
> The logic for handling legacy NMIs is unaffected since the source bitmap
> would always have all bits set.
> 
> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

Reviewed-by: Xin Li (Intel) <xin@zytor.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 9/9] x86/nmi: Print source information with the unknown NMI console message
  2025-05-13 20:38 ` [PATCH v6 9/9] x86/nmi: Print source information with the unknown NMI console message Sohil Mehta
@ 2025-06-03 16:55   ` Xin Li
  2025-06-04  1:55     ` Sohil Mehta
  2025-06-04 15:29     ` H. Peter Anvin
  0 siblings, 2 replies; 37+ messages in thread
From: Xin Li @ 2025-06-03 16:55 UTC (permalink / raw)
  To: Sohil Mehta, x86, linux-kernel
  Cc: H . Peter Anvin, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Peter Zijlstra, Sean Christopherson,
	Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt,
	Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
	Kai Huang, Sandipan Das, linux-perf-users, linux-edac, kvm,
	linux-pm, linux-trace-kernel

On 5/13/2025 1:38 PM, Sohil Mehta wrote:
> +	if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE))
> +		pr_emerg_ratelimited("NMI-source bitmap is 0x%lx\n", fred_event_data(regs));

"0x%04lx"?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 8/9] perf/x86: Enable NMI-source reporting for perfmon
  2025-05-13 20:38 ` [PATCH v6 8/9] perf/x86: Enable NMI-source reporting for perfmon Sohil Mehta
@ 2025-06-03 16:57   ` Xin Li
  0 siblings, 0 replies; 37+ messages in thread
From: Xin Li @ 2025-06-03 16:57 UTC (permalink / raw)
  To: Sohil Mehta, x86, linux-kernel
  Cc: H . Peter Anvin, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Peter Zijlstra, Sean Christopherson,
	Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt,
	Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
	Kai Huang, Sandipan Das, linux-perf-users, linux-edac, kvm,
	linux-pm, linux-trace-kernel

On 5/13/2025 1:38 PM, Sohil Mehta wrote:
> 
> Program the designated PMI NMI-source vector into the local vector table
> for the PMU. An NMI for the PMU would directly invoke the PMI handler
> without polling other NMI handlers, resulting in reduced PMI delivery
> latency.
> 
> Co-developed-by: Zeng Guang<guang.zeng@intel.com>
> Signed-off-by: Zeng Guang<guang.zeng@intel.com>
> Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sohil Mehta<sohil.mehta@intel.com>
> Reviewed-by: Kan Liang<kan.liang@linux.intel.com>
> Tested-by: Sandipan Das<sandipan.das@amd.com> # AMD overlapping bits

Reviewed-by: Xin Li (Intel) <xin@zytor.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector
  2025-05-13 20:37 ` [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector Sohil Mehta
  2025-06-03  7:23   ` Xin Li
@ 2025-06-03 17:07   ` Dave Hansen
  2025-06-03 18:02     ` Sohil Mehta
  1 sibling, 1 reply; 37+ messages in thread
From: Dave Hansen @ 2025-06-03 17:07 UTC (permalink / raw)
  To: Sohil Mehta, x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Andrew Cooper, Kirill A . Shutemov,
	Jacob Pan, Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users,
	linux-edac, kvm, linux-pm, linux-trace-kernel

On 5/13/25 13:37, Sohil Mehta wrote:
> --- a/drivers/watchdog/hpwdt.c
> +++ b/drivers/watchdog/hpwdt.c
> @@ -242,13 +242,13 @@ static int hpwdt_init_nmi_decoding(struct pci_dev *dev)
>  	/*
>  	 * Only one function can register for NMI_UNKNOWN
>  	 */
> -	retval = register_nmi_handler(NMI_UNKNOWN, hpwdt_pretimeout, 0, "hpwdt");
> +	retval = register_nmi_handler(NMI_UNKNOWN, hpwdt_pretimeout, 0, "hpwdt", 0);
>  	if (retval)
>  		goto error;
> -	retval = register_nmi_handler(NMI_SERR, hpwdt_pretimeout, 0, "hpwdt");
> +	retval = register_nmi_handler(NMI_SERR, hpwdt_pretimeout, 0, "hpwdt", 0);
>  	if (retval)
>  		goto error1;
> -	retval = register_nmi_handler(NMI_IO_CHECK, hpwdt_pretimeout, 0, "hpwdt");
> +	retval = register_nmi_handler(NMI_IO_CHECK, hpwdt_pretimeout, 0, "hpwdt", 0);
>  	if (retval)

Could we get rid of all these random 0's, please? (or at least try to
keep them from proliferating).

Either do a:

	register_nmi_handler_source()

that takes a source and leave

	register_nmi_handler()

in place and not take a source. Or, do this:

	retval = register_nmi_handler(NMI_IO_CHECK, hpwdt_pretimeout,
				      0, "hpwdt", NMI_NO_SOURCE);

where the 0 is at least given a symbolic name.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors
  2025-05-13 20:37 ` [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors Sohil Mehta
  2025-06-03 16:34   ` Xin Li
@ 2025-06-03 17:23   ` Dave Hansen
  2025-06-03 20:22     ` Sohil Mehta
  2025-06-04 15:22     ` H. Peter Anvin
  1 sibling, 2 replies; 37+ messages in thread
From: Dave Hansen @ 2025-06-03 17:23 UTC (permalink / raw)
  To: Sohil Mehta, x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Andrew Cooper, Kirill A . Shutemov,
	Jacob Pan, Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users,
	linux-edac, kvm, linux-pm, linux-trace-kernel

On 5/13/25 13:37, Sohil Mehta wrote:
...
> + * Vector 2 is reserved for external NMIs related to the Local APIC -
> + * LINT1. Some third-party chipsets may send NMI messages with a
> + * hardcoded vector of 2, which would result in bit 2 being set in the
> + * NMI-source bitmap.

This doesn't actually say what problem this causes. Is this better?

	Third-party chipsets send NMI messages with a fixed vector of 2.
	Using vector 2 for some other purpose would cause confusion
	between those Local APIC messages and the other purpose. Avoid
	using it.

> + * The vectors are in no particular priority order. Add new vector
> + * assignments sequentially in the list below.
> + */
> +#define NMIS_VECTOR_NONE	0	/* Reserved - Set for all unidentified sources */
> +#define NMIS_VECTOR_TEST	1	/* NMI selftest */
> +#define NMIS_VECTOR_EXTERNAL	2	/* Reserved - Match External NMI vector 2 */
> +#define NMIS_VECTOR_SMP_STOP	3	/* Panic stop CPU */
> +#define NMIS_VECTOR_BT		4	/* CPU backtrace */
> +#define NMIS_VECTOR_KGDB	5	/* Kernel debugger */
> +#define NMIS_VECTOR_MCE		6	/* MCE injection */
> +#define NMIS_VECTOR_PMI		7	/* PerfMon counters */
> +
> +#define NMIS_VECTORS_MAX	16	/* Maximum number of NMI-source vectors */

Would an enum fit here?

You could also add a:

	NMIS_VECTOR_COUNT

as the last entry and then just:

	BUILD_BUG_ON(NMIS_VECTOR_COUNT >= 16);

somewhere.

I guess it's a little annoying that you need NMIS_VECTOR_EXTERNAL to
have a fixed value of 2, but I do like way the enum makes the type explicit.


>  static int __init register_nmi_cpu_backtrace_handler(void)
>  {
> -	register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", 0);
> +	register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", NMIS_VECTOR_BT);
>  	return 0;
>  }

... Oh you replaced _most_ of the random 0's in this patch. That helps
for sure.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector
  2025-06-03  7:23   ` Xin Li
@ 2025-06-03 17:35     ` Sohil Mehta
  0 siblings, 0 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-06-03 17:35 UTC (permalink / raw)
  To: Xin Li
  Cc: H . Peter Anvin, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Peter Zijlstra, Sean Christopherson,
	Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt,
	Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
	Kai Huang, Sandipan Das, linux-perf-users, linux-edac, kvm,
	linux-pm, linux-trace-kernel, x86, linux-kernel

On 6/3/2025 12:23 AM, Xin Li wrote:
> 
> Just two nits below, other than that:
> 
> Reviewed-by: Xin Li (Intel) <xin@zytor.com>

Thanks!

>> diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
>> index 79d88d12c8fb..f0a577bf7bba 100644
>> --- a/arch/x86/include/asm/nmi.h
>> +++ b/arch/x86/include/asm/nmi.h
>> @@ -54,6 +54,7 @@ struct nmiaction {
>>   	u64			max_duration;
>>   	unsigned long		flags;
>>   	const char		*name;
>> +	u8			source_vector;
>>   };
>>   
>>   /**
>> @@ -62,6 +63,7 @@ struct nmiaction {
>>    * @fn:   The NMI handler
>>    * @fg:   Flags associated with the NMI handler
>>    * @n:    Name of the NMI handler
>> + * @src:  NMI-source based vector for the NMI handler
> 
> "NMI-source based" sounds weird to me.
> 

It sounds odd to me as well. I'll get rid of "based".

"NMI-source vector for the NMI handler".

>>    * @init: Optional __init* attributes for struct nmiaction
>>    *
>>    * Adds the provided handler to the list of handlers for the specified
>> @@ -75,13 +77,14 @@ struct nmiaction {
>>    *
>>    * Return: 0 on success, or an error code on failure.
>>    */
>> -#define register_nmi_handler(t, fn, fg, n, init...)	\
>> +#define register_nmi_handler(t, fn, fg, n, src, init...)	\
>>   ({							\
>>   	static struct nmiaction init fn##_na = {	\
>>   		.list = LIST_HEAD_INIT(fn##_na.list),	\
>>   		.handler = (fn),			\
>>   		.name = (n),				\
>>   		.flags = (fg),				\
>> +		.source_vector = (src),			\
>>   	};						\
>>   	__register_nmi_handler((t), &fn##_na);		\
>>   })
> Please keep the line-ending backslashes (\) aligned.
> 

I somehow missed this. It should be possible to keep them aligned
without changing the rest of the lines.

> I guess you want to keep the change minimal.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector
  2025-06-03 17:07   ` Dave Hansen
@ 2025-06-03 18:02     ` Sohil Mehta
  2025-06-03 18:19       ` Dave Hansen
  0 siblings, 1 reply; 37+ messages in thread
From: Sohil Mehta @ 2025-06-03 18:02 UTC (permalink / raw)
  To: Dave Hansen, x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Andrew Cooper, Kirill A . Shutemov,
	Jacob Pan, Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users,
	linux-edac, kvm, linux-pm, linux-trace-kernel

On 6/3/2025 10:07 AM, Dave Hansen wrote:
> On 5/13/25 13:37, Sohil Mehta wrote:

>> -	retval = register_nmi_handler(NMI_IO_CHECK, hpwdt_pretimeout, 0, "hpwdt");
>> +	retval = register_nmi_handler(NMI_IO_CHECK, hpwdt_pretimeout, 0, "hpwdt", 0);
>>  	if (retval)
> 
> Could we get rid of all these random 0's, please? (or at least try to
> keep them from proliferating).
> 

I had patches for avoiding both zeros, but I ended up not including
them. I wasn't sure if folks prefer '0' when not using a parameter or
explicitly prefer to deny.

> Either do a:
> 
> 	register_nmi_handler_source()
> 
> that takes a source and leave
> 
> 	register_nmi_handler()
> 
> in place and not take a source. Or, do this:

> 
> 	retval = register_nmi_handler(NMI_IO_CHECK, hpwdt_pretimeout,
> 				      0, "hpwdt", NMI_NO_SOURCE);
> 

I prefer this approach. Since we are touching all these lines, maybe
it's a good time to get rid of the other 0 as well (in a separate patch).

The 3rd parameter pertains to handler "flags". The only flag in use
right now is NMI_FLAG_FIRST. Assuming that more flags might get added
later, the 0 should probably correspond to NMI_FLAG_NONE. Agree?

The other option would be NMI_FLAG_LAST, which would be the opposite of
NMI_FLAG_FIRST, but that seems shortsighted.


> where the 0 is at least given a symbolic name.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector
  2025-06-03 18:02     ` Sohil Mehta
@ 2025-06-03 18:19       ` Dave Hansen
  2025-06-03 20:24         ` Sohil Mehta
  0 siblings, 1 reply; 37+ messages in thread
From: Dave Hansen @ 2025-06-03 18:19 UTC (permalink / raw)
  To: Sohil Mehta, x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Andrew Cooper, Kirill A . Shutemov,
	Jacob Pan, Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users,
	linux-edac, kvm, linux-pm, linux-trace-kernel

On 6/3/25 11:02, Sohil Mehta wrote:
> The 3rd parameter pertains to handler "flags". The only flag in use
> right now is NMI_FLAG_FIRST. Assuming that more flags might get added
> later, the 0 should probably correspond to NMI_FLAG_NONE. Agree?
> 
> The other option would be NMI_FLAG_LAST, which would be the opposite of
> NMI_FLAG_FIRST, but that seems shortsighted.
I don't feel as strongly about the flags. But this code has been there
for over a decade as-is, so I don't think we need to be planning to add
a bunch more flags.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors
  2025-06-03 17:23   ` Dave Hansen
@ 2025-06-03 20:22     ` Sohil Mehta
  2025-06-03 21:54       ` Dave Hansen
  2025-06-04 15:22     ` H. Peter Anvin
  1 sibling, 1 reply; 37+ messages in thread
From: Sohil Mehta @ 2025-06-03 20:22 UTC (permalink / raw)
  To: Dave Hansen, x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Andrew Cooper, Kirill A . Shutemov,
	Jacob Pan, Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users,
	linux-edac, kvm, linux-pm, linux-trace-kernel

On 6/3/2025 10:23 AM, Dave Hansen wrote:
> On 5/13/25 13:37, Sohil Mehta wrote:
> ...
>> + * Vector 2 is reserved for external NMIs related to the Local APIC -
>> + * LINT1. Some third-party chipsets may send NMI messages with a
>> + * hardcoded vector of 2, which would result in bit 2 being set in the
>> + * NMI-source bitmap.
> 
> This doesn't actually say what problem this causes. Is this better?
> 
> 	Third-party chipsets send NMI messages with a fixed vector of 2.
> 	Using vector 2 for some other purpose would cause confusion
> 	between those Local APIC messages and the other purpose. Avoid
> 	using it.
> 

Yes, that is better. Though, the behavior of these external NMIs isn't
expected to consistent across all third-parties.

Third-party chipsets may send NMI messages with a fixed vector of 2.
Using vector 2 for some other purpose would cause confusion between
those external NMI messages and the other purpose. Avoid using it.


>> + * The vectors are in no particular priority order. Add new vector
>> + * assignments sequentially in the list below.
>> + */
>> +#define NMIS_VECTOR_NONE	0	/* Reserved - Set for all unidentified sources */
>> +#define NMIS_VECTOR_TEST	1	/* NMI selftest */
>> +#define NMIS_VECTOR_EXTERNAL	2	/* Reserved - Match External NMI vector 2 */
>> +#define NMIS_VECTOR_SMP_STOP	3	/* Panic stop CPU */
>> +#define NMIS_VECTOR_BT		4	/* CPU backtrace */
>> +#define NMIS_VECTOR_KGDB	5	/* Kernel debugger */
>> +#define NMIS_VECTOR_MCE		6	/* MCE injection */
>> +#define NMIS_VECTOR_PMI		7	/* PerfMon counters */
>> +
>> +#define NMIS_VECTORS_MAX	16	/* Maximum number of NMI-source vectors */
> 
> Would an enum fit here?
> 

I had experimented using an enum, but I avoided that approach because
the source bitmap would also be visible to users/developers. For
example, patch 9 includes it in the "unknown NMI" print. I am planning
to add it to trace_nmi_handler() as well.

With an enum, it's harder to figure out the exact sources when let's say
the source bitmap is printed as 0x0090.

enum nmi_source_vector {
    NMIS_VECTOR_NONE,
    NMIS_VECTOR_TEST,
    NMIS_VECTOR_EXTERNAL,
    NMIS_VECTOR_SMP_STOP,
    NMIS_VECTOR_BT,
    NMIS_VECTOR_KGDB,
    NMIS_VECTOR_MCE,
    NMIS_VECTOR_PMI,
    ...,
    NMIS_VECTOR_COUNT
};

Users would have to convert the enum back to numbers to make sense of
the bitmap. It isn't bad but feels inconvenient.

> You could also add a:
> 
> 	NMIS_VECTOR_COUNT
> 
> as the last entry and then just:
> 
> 	BUILD_BUG_ON(NMIS_VECTOR_COUNT >= 16);
> 
> somewhere.
> 
> I guess it's a little annoying that you need NMIS_VECTOR_EXTERNAL to
> have a fixed value of 2, but I do like way the enum makes the type explicit.
> 

Yeah, fixing the external vector makes the enum annoying to use.

We could probably define an enum in this unusual way to keep the table
consistent and help users quickly refer bit offsets. But I am not sure
if this is any better than the current macros.

enum nmi_source_vector {
    NMIS_VECTOR_NONE        = 0,
    NMIS_VECTOR_TEST        = 1,
    NMIS_VECTOR_EXTERNAL    = 2,
    NMIS_VECTOR_SMP_STOP    = 3,
    NMIS_VECTOR_BT          = 4,
    NMIS_VECTOR_KGDB        = 5,
    NMIS_VECTOR_MCE         = 6,
    NMIS_VECTOR_PMI         = 7
};

Any suggestions?

> 
>>  static int __init register_nmi_cpu_backtrace_handler(void)
>>  {
>> -	register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", 0);
>> +	register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", NMIS_VECTOR_BT);
>>  	return 0;
>>  }
> 
> ... Oh you replaced _most_ of the random 0's in this patch. That helps
> for sure.

There are still quite a few places left with an extra 0 at the end.
Explicitly using NMIS_VECTOR_NONE for them should be useful.


> ret = register_nmi_handler(NMI_LOCAL, perf_ibs_nmi_handler, 0, "perf_ibs", 0);
> register_nmi_handler(NMI_UNKNOWN, hv_nmi_unknown, NMI_FLAG_FIRST, "hv_nmi_unknown", 0);
> retval = register_nmi_handler(NMI_UNKNOWN, kgdb_nmi_handler, 0, "kgdb", 0);
> if (register_nmi_handler(NMI_UNKNOWN, uv_handle_nmi, 0, "uv", 0))
> if (register_nmi_handler(NMI_LOCAL, uv_handle_nmi_ping, 0, "uvping", 0))
> register_nmi_handler(NMI_LOCAL, ghes_notify_nmi, 0, "ghes", 0);
> rv = register_nmi_handler(NMI_UNKNOWN, ipmi_nmi, 0, "ipmi", 0);
> rc = register_nmi_handler(NMI_SERR, ecclog_nmi_handler, 0, IGEN6_NMI_NAME, 0);
> retval = register_nmi_handler(NMI_UNKNOWN, hpwdt_pretimeout, 0, "hpwdt", 0);
> retval = register_nmi_handler(NMI_SERR, hpwdt_pretimeout, 0, "hpwdt", 0);
> retval = register_nmi_handler(NMI_IO_CHECK, hpwdt_pretimeout, 0, "hpwdt", 0);






^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector
  2025-06-03 18:19       ` Dave Hansen
@ 2025-06-03 20:24         ` Sohil Mehta
  0 siblings, 0 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-06-03 20:24 UTC (permalink / raw)
  To: Dave Hansen, x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Andrew Cooper, Kirill A . Shutemov,
	Jacob Pan, Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users,
	linux-edac, kvm, linux-pm, linux-trace-kernel

On 6/3/2025 11:19 AM, Dave Hansen wrote:
> On 6/3/25 11:02, Sohil Mehta wrote:
>> The 3rd parameter pertains to handler "flags". The only flag in use
>> right now is NMI_FLAG_FIRST. Assuming that more flags might get added
>> later, the 0 should probably correspond to NMI_FLAG_NONE. Agree?
>>
>> The other option would be NMI_FLAG_LAST, which would be the opposite of
>> NMI_FLAG_FIRST, but that seems shortsighted.
> I don't feel as strongly about the flags. But this code has been there
> for over a decade as-is, so I don't think we need to be planning to add
> a bunch more flags.

Sure, I'll leave the flags as-is for now to keep the focus of this
series on NMI-source.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors
  2025-06-03 16:34   ` Xin Li
@ 2025-06-03 21:45     ` Sohil Mehta
  2025-06-04 15:28     ` H. Peter Anvin
  1 sibling, 0 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-06-03 21:45 UTC (permalink / raw)
  To: Xin Li, x86, linux-kernel
  Cc: H . Peter Anvin, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Peter Zijlstra, Sean Christopherson,
	Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt,
	Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
	Kai Huang, Sandipan Das, linux-perf-users, linux-edac, kvm,
	linux-pm, linux-trace-kernel

On 6/3/2025 9:34 AM, Xin Li wrote:
> On 5/13/2025 1:37 PM, Sohil Mehta wrote:
>> Prior to NMI-source support, the vector information was ignored by the
>> hardware while delivering NMIs. With NMI-source, the architecture
>> currently supports a 16-bit source bitmap to identify the source of the
>> NMI. Upon receiving an NMI, this bitmap is delivered as part of the FRED
>> event delivery mechanism to the kernel.
>>
>> Assign a vector space of 0-15 that is specific to NMI-source and
>> independent of the IDT vector space of 0-255. Being a bitmap, the
>> NMI-source vectors do not have any inherent priority associated with
>> them. The order of executing the NMI handlers is up to the kernel.
> 
> I'm thinking should we mention that the bitmap could be extended more
> than 16 bits in future?  Or we just don't emphasize 16-bit or 0~15?
> 

That was mainly to justify the value of NMIS_VECTORS_MAX defined in this
patch. I will include a sentence to mention that the bitmap size could
be extended in the future.

We could even set NMIS_VECTORS_MAX to 64 right now to make this
future-proof. Though in practice, I don't see it happening any time soon.

Sohil

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors
  2025-06-03 20:22     ` Sohil Mehta
@ 2025-06-03 21:54       ` Dave Hansen
  2025-06-03 22:33         ` Sohil Mehta
  0 siblings, 1 reply; 37+ messages in thread
From: Dave Hansen @ 2025-06-03 21:54 UTC (permalink / raw)
  To: Sohil Mehta, x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Andrew Cooper, Kirill A . Shutemov,
	Jacob Pan, Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users,
	linux-edac, kvm, linux-pm, linux-trace-kernel

On 6/3/25 13:22, Sohil Mehta wrote:
> With an enum, it's harder to figure out the exact sources when let's say
> the source bitmap is printed as 0x0090.

Uhh, then don't print a bitmap. ;)

/proc/cpuinfo doesn't print out CPUID leaves, it prints out bits mapped
to strings.

Look at the kmalloc trace points:

            Xorg-4589    [003] ..... 1568557.823993: kmalloc: ...
gfp_flags=GFP_KERNEL
            Xorg-4589    [003] ..... 1568557.823993: kmalloc: ...
gfp_flags=GFP_KERNEL

gfp_flags are a bitmap, yet they're mapped out with strings and symbolic
names.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors
  2025-06-03 21:54       ` Dave Hansen
@ 2025-06-03 22:33         ` Sohil Mehta
  0 siblings, 0 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-06-03 22:33 UTC (permalink / raw)
  To: Dave Hansen, x86, linux-kernel
  Cc: Xin Li, H . Peter Anvin, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra,
	Sean Christopherson, Adrian Hunter, Kan Liang, Tony Luck,
	Zhang Rui, Steven Rostedt, Andrew Cooper, Kirill A . Shutemov,
	Jacob Pan, Andi Kleen, Kai Huang, Sandipan Das, linux-perf-users,
	linux-edac, kvm, linux-pm, linux-trace-kernel

On 6/3/2025 2:54 PM, Dave Hansen wrote:
> On 6/3/25 13:22, Sohil Mehta wrote:
>> With an enum, it's harder to figure out the exact sources when let's say
>> the source bitmap is printed as 0x0090.
> 
> Uhh, then don't print a bitmap. ;)
> 

Ah! Didn't look at it that way.

> /proc/cpuinfo doesn't print out CPUID leaves, it prints out bits mapped
> to strings.
> 

Hasn't that caused more pains than gains :p But, point taken, I'll give
it a shot.

> Look at the kmalloc trace points:
> 
>             Xorg-4589    [003] ..... 1568557.823993: kmalloc: ...
> gfp_flags=GFP_KERNEL
>             Xorg-4589    [003] ..... 1568557.823993: kmalloc: ...
> gfp_flags=GFP_KERNEL
> 
> gfp_flags are a bitmap, yet they're mapped out with strings and symbolic
> names.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 9/9] x86/nmi: Print source information with the unknown NMI console message
  2025-06-03 16:55   ` Xin Li
@ 2025-06-04  1:55     ` Sohil Mehta
  2025-06-04  2:41       ` Xin Li
  2025-06-04 15:41       ` H. Peter Anvin
  2025-06-04 15:29     ` H. Peter Anvin
  1 sibling, 2 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-06-04  1:55 UTC (permalink / raw)
  To: Xin Li, x86, linux-kernel
  Cc: H . Peter Anvin, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Peter Zijlstra, Sean Christopherson,
	Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt,
	Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
	Kai Huang, Sandipan Das, linux-perf-users, linux-edac, kvm,
	linux-pm, linux-trace-kernel

On 6/3/2025 9:55 AM, Xin Li wrote:
> On 5/13/2025 1:38 PM, Sohil Mehta wrote:
>> +	if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE))
>> +		pr_emerg_ratelimited("NMI-source bitmap is 0x%lx\n", fred_event_data(regs));
> 
> "0x%04lx"?

Yeah, this would be better. But, it might not matter now.

Based on the discussion in the other patch, I am considering printing
the source names rather than the raw bitmap.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 9/9] x86/nmi: Print source information with the unknown NMI console message
  2025-06-04  1:55     ` Sohil Mehta
@ 2025-06-04  2:41       ` Xin Li
  2025-06-04 15:41       ` H. Peter Anvin
  1 sibling, 0 replies; 37+ messages in thread
From: Xin Li @ 2025-06-04  2:41 UTC (permalink / raw)
  To: Sohil Mehta, x86, linux-kernel
  Cc: H . Peter Anvin, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Peter Zijlstra, Sean Christopherson,
	Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt,
	Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
	Kai Huang, Sandipan Das, linux-perf-users, linux-edac, kvm,
	linux-pm, linux-trace-kernel

On 6/3/2025 6:55 PM, Sohil Mehta wrote:
> On 6/3/2025 9:55 AM, Xin Li wrote:
>> On 5/13/2025 1:38 PM, Sohil Mehta wrote:
>>> +	if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE))
>>> +		pr_emerg_ratelimited("NMI-source bitmap is 0x%lx\n", fred_event_data(regs));
>>
>> "0x%04lx"?
> 
> Yeah, this would be better. But, it might not matter now.
> 
> Based on the discussion in the other patch, I am considering printing
> the source names rather than the raw bitmap.

Yep, I saw that :)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors
  2025-06-03 17:23   ` Dave Hansen
  2025-06-03 20:22     ` Sohil Mehta
@ 2025-06-04 15:22     ` H. Peter Anvin
  1 sibling, 0 replies; 37+ messages in thread
From: H. Peter Anvin @ 2025-06-04 15:22 UTC (permalink / raw)
  To: Dave Hansen, Sohil Mehta, x86, linux-kernel
  Cc: Xin Li, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Peter Zijlstra, Sean Christopherson,
	Adrian Hunter, Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt,
	Andrew Cooper, Kirill A . Shutemov, Jacob Pan, Andi Kleen,
	Kai Huang, Sandipan Das, linux-perf-users, linux-edac, kvm,
	linux-pm, linux-trace-kernel

On June 3, 2025 10:23:42 AM PDT, Dave Hansen <dave.hansen@intel.com> wrote:
>On 5/13/25 13:37, Sohil Mehta wrote:
>...
>> + * Vector 2 is reserved for external NMIs related to the Local APIC -
>> + * LINT1. Some third-party chipsets may send NMI messages with a
>> + * hardcoded vector of 2, which would result in bit 2 being set in the
>> + * NMI-source bitmap.
>
>This doesn't actually say what problem this causes. Is this better?
>
>	Third-party chipsets send NMI messages with a fixed vector of 2.
>	Using vector 2 for some other purpose would cause confusion
>	between those Local APIC messages and the other purpose. Avoid
>	using it.
>
>> + * The vectors are in no particular priority order. Add new vector
>> + * assignments sequentially in the list below.
>> + */
>> +#define NMIS_VECTOR_NONE	0	/* Reserved - Set for all unidentified sources */
>> +#define NMIS_VECTOR_TEST	1	/* NMI selftest */
>> +#define NMIS_VECTOR_EXTERNAL	2	/* Reserved - Match External NMI vector 2 */
>> +#define NMIS_VECTOR_SMP_STOP	3	/* Panic stop CPU */
>> +#define NMIS_VECTOR_BT		4	/* CPU backtrace */
>> +#define NMIS_VECTOR_KGDB	5	/* Kernel debugger */
>> +#define NMIS_VECTOR_MCE		6	/* MCE injection */
>> +#define NMIS_VECTOR_PMI		7	/* PerfMon counters */
>> +
>> +#define NMIS_VECTORS_MAX	16	/* Maximum number of NMI-source vectors */
>
>Would an enum fit here?
>
>You could also add a:
>
>	NMIS_VECTOR_COUNT
>
>as the last entry and then just:
>
>	BUILD_BUG_ON(NMIS_VECTOR_COUNT >= 16);
>
>somewhere.
>
>I guess it's a little annoying that you need NMIS_VECTOR_EXTERNAL to
>have a fixed value of 2, but I do like way the enum makes the type explicit.
>
>
>>  static int __init register_nmi_cpu_backtrace_handler(void)
>>  {
>> -	register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", 0);
>> +	register_nmi_handler(NMI_LOCAL, nmi_cpu_backtrace_handler, 0, "arch_bt", NMIS_VECTOR_BT);
>>  	return 0;
>>  }
>
>... Oh you replaced _most_ of the random 0's in this patch. That helps
>for sure.

The "may" here is important. *Most* sources are expected to send these as either a programmable MSI or as LINT1, which is programmable in the APIC, but since the vector hasn't mattered, we can't rule out that some hardware might have been built to send an MSI with a hard-coded vector, in which case it is most likely they send either 0, 2 or 255. 0 and 255 will set the unknown source bit (0), but 2 is a legitimate source so it is defensive programming to leave it fallow.

Note also that although the current implementation is limited to 15 sources (+ the unknown/lost source bit), the architecture allows for up to 63.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors
  2025-06-03 16:34   ` Xin Li
  2025-06-03 21:45     ` Sohil Mehta
@ 2025-06-04 15:28     ` H. Peter Anvin
  2025-06-11 21:34       ` Sohil Mehta
  1 sibling, 1 reply; 37+ messages in thread
From: H. Peter Anvin @ 2025-06-04 15:28 UTC (permalink / raw)
  To: Xin Li, Sohil Mehta, x86, linux-kernel
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, Peter Zijlstra, Sean Christopherson, Adrian Hunter,
	Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

On June 3, 2025 9:34:27 AM PDT, Xin Li <xin@zytor.com> wrote:
>On 5/13/2025 1:37 PM, Sohil Mehta wrote:
>> Prior to NMI-source support, the vector information was ignored by the
>> hardware while delivering NMIs. With NMI-source, the architecture
>> currently supports a 16-bit source bitmap to identify the source of the
>> NMI. Upon receiving an NMI, this bitmap is delivered as part of the FRED
>> event delivery mechanism to the kernel.
>> 
>> Assign a vector space of 0-15 that is specific to NMI-source and
>> independent of the IDT vector space of 0-255. Being a bitmap, the
>> NMI-source vectors do not have any inherent priority associated with
>> them. The order of executing the NMI handlers is up to the kernel.
>
>I'm thinking should we mention that the bitmap could be extended more
>than 16 bits in future?  Or we just don't emphasize 16-bit or 0~15?
>
>
>> 
>> Existing NMI handling already has a priority mechanism for the NMI
>> handlers, with CPU-specific (NMI_LOCAL) handlers executed first,
>> followed by platform NMI handlers and unknown NMI (NMI_UNKNOWN) handlers
>> being last. Within each of these NMI types, the handlers registered with
>> NMI_FLAG_FIRST are given priority.
>> 
>> NMI-source follows the same priority scheme to avoid unnecessary
>> complexity. Therefore, the NMI-source vectors are assigned arbitrarily,
>> except for vectors 0 and 2.
>> 
>> Vector 0 is set by the hardware whenever a source vector was not used
>> while generating an NMI or the originator could not be reliably
>> identified. Do not assign it to any handler.
>> 
>> Vector 2 is reserved for external NMIs corresponding to Local APIC -
>> LINT1. Some third-party chipsets may send NMI messages with a hardcoded
>> vector of 2, which would result in vector 2 being set in the NMI-source
>> bitmap. To avoid confusion, do not assign vector 2 to any handler.
>> 
>> NMI-source vectors are only assigned for NMI_LOCAL type handlers.
>> Platform NMI handlers have a single handler registered per type. They
>> don't need additional source information to differentiate among them.
>> 
>> Use the assigned vectors to register the respective NMI handlers. Warn
>> if the vector values are unexpected.
>> 
>> A couple of NMI handlers, such as the microcode rendezvous and the crash
>> reboot, do not use the typical NMI registration interface. Leave them
>> as-is for now.
>> 
>> Originally-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
>
>Reviewed-by: Xin Li (Intel) <xin@zytor.com>

The architectural maximum* is 63 sources (plus the error bit), although the current version only supports 1-15. If extended to be wider, we would presumably add a new cpuid bit.

However, we should make sure there is nothing in the implementation that limits us to 16 or 32 bits; when manipulating the bitmask we should use a 64 bit type. 

   -hpa

* If we were to use the additional error reporting fields in the FRED frame at some future point it could be extended as far as 207, but since we at this point don't have anyone clamouring for more than 15 this seems like a very remote possibility.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 9/9] x86/nmi: Print source information with the unknown NMI console message
  2025-06-03 16:55   ` Xin Li
  2025-06-04  1:55     ` Sohil Mehta
@ 2025-06-04 15:29     ` H. Peter Anvin
  1 sibling, 0 replies; 37+ messages in thread
From: H. Peter Anvin @ 2025-06-04 15:29 UTC (permalink / raw)
  To: Xin Li, Sohil Mehta, x86, linux-kernel
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, Peter Zijlstra, Sean Christopherson, Adrian Hunter,
	Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

On June 3, 2025 9:55:32 AM PDT, Xin Li <xin@zytor.com> wrote:
>On 5/13/2025 1:38 PM, Sohil Mehta wrote:
>> +	if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE))
>> +		pr_emerg_ratelimited("NMI-source bitmap is 0x%lx\n", fred_event_data(regs));
>
>"0x%04lx"?

Seems unnecessary to me to zero-pad it; it just makes small numbers harder to read. We don't even use vector 12+ at this time, do we?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 9/9] x86/nmi: Print source information with the unknown NMI console message
  2025-06-04  1:55     ` Sohil Mehta
  2025-06-04  2:41       ` Xin Li
@ 2025-06-04 15:41       ` H. Peter Anvin
  2025-06-11 21:39         ` Sohil Mehta
  1 sibling, 1 reply; 37+ messages in thread
From: H. Peter Anvin @ 2025-06-04 15:41 UTC (permalink / raw)
  To: Sohil Mehta, Xin Li, x86, linux-kernel
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, Peter Zijlstra, Sean Christopherson, Adrian Hunter,
	Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

I think printing the bitmap is fine for unknown bits set... after all, you might have surprise vectors. Maybe a list would be cleaner by some definition, but we already print bitmaps all over the place. It is not critical as long as the information necessary for debugging is there. In fact, it is often better for the information to be compact so users don't get tempted to abbreviate a bug report.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors
  2025-06-04 15:28     ` H. Peter Anvin
@ 2025-06-11 21:34       ` Sohil Mehta
  0 siblings, 0 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-06-11 21:34 UTC (permalink / raw)
  To: H. Peter Anvin, Xin Li, x86, linux-kernel
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, Peter Zijlstra, Sean Christopherson, Adrian Hunter,
	Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

On 6/4/2025 8:28 AM, H. Peter Anvin wrote:
> 
> The architectural maximum* is 63 sources (plus the error bit), 
> although the current version only supports 1-15. If extended to be 
> wider, we would presumably add a new cpuid bit.
> 
> However, we should make sure there is nothing in the implementation 
> that limits us to 16 or 32 bits; when manipulating the bitmask we 
> should use a 64 bit type.
> 

There isn't anything in the implementation that would limit it to less
than 64 bits. However, (as suggested by Dave), I am planning to include:

	BUILD_BUG_ON(NMIS_VECTORS_MAX > 16);

If someone hits this compile issue while adding new vectors, it would
help them realize the limitations of the existing hardware.

We can recommend vector aliasing at that point or if there is hardware
in the pipeline that extends this beyond 16, we can simply get rid of
the BUILD_BUG_ON() along with the patches that add the new vectors.

> -hpa
> 
> * If we were to use the additional error reporting fields in the 
> FRED frame at some future point it could be extended as far as 207, 
> but since we at this point don't have anyone clamouring for more 
> than 15 this seems like a very remote possibility.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 9/9] x86/nmi: Print source information with the unknown NMI console message
  2025-06-04 15:41       ` H. Peter Anvin
@ 2025-06-11 21:39         ` Sohil Mehta
  0 siblings, 0 replies; 37+ messages in thread
From: Sohil Mehta @ 2025-06-11 21:39 UTC (permalink / raw)
  To: H. Peter Anvin, Xin Li, x86, linux-kernel
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, Peter Zijlstra, Sean Christopherson, Adrian Hunter,
	Kan Liang, Tony Luck, Zhang Rui, Steven Rostedt, Andrew Cooper,
	Kirill A . Shutemov, Jacob Pan, Andi Kleen, Kai Huang,
	Sandipan Das, linux-perf-users, linux-edac, kvm, linux-pm,
	linux-trace-kernel

On 6/4/2025 8:41 AM, H. Peter Anvin wrote:
> I think printing the bitmap is fine for unknown bits set... after
> all, you might have surprise vectors. Maybe a list would be cleaner
> by some definition, but we already print bitmaps all over the place.
> It is not critical as long as the information necessary for
> debugging is there. In fact, it is often better for the information
> to be compact so users don't get tempted to abbreviate a bug report.
Sure, will print the full bitmap here. Since this is the unknown NMI
message, it may have unexpected bits set in the bitmap.



^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2025-06-11 21:39 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-13 20:37 [PATCH v6 0/9] x86: Add support for NMI-source reporting with FRED Sohil Mehta
2025-05-13 20:37 ` [PATCH v6 1/9] x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM Sohil Mehta
2025-05-14 14:15   ` Sean Christopherson
2025-05-14 21:34     ` Sohil Mehta
2025-05-16  1:15       ` Sean Christopherson
2025-06-02 20:45         ` Sohil Mehta
2025-05-13 20:37 ` [PATCH v6 2/9] x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting Sohil Mehta
2025-05-13 20:37 ` [PATCH v6 3/9] x86/nmi: Extend the registration interface to include the NMI-source vector Sohil Mehta
2025-06-03  7:23   ` Xin Li
2025-06-03 17:35     ` Sohil Mehta
2025-06-03 17:07   ` Dave Hansen
2025-06-03 18:02     ` Sohil Mehta
2025-06-03 18:19       ` Dave Hansen
2025-06-03 20:24         ` Sohil Mehta
2025-05-13 20:37 ` [PATCH v6 4/9] x86/nmi: Assign and register NMI-source vectors Sohil Mehta
2025-06-03 16:34   ` Xin Li
2025-06-03 21:45     ` Sohil Mehta
2025-06-04 15:28     ` H. Peter Anvin
2025-06-11 21:34       ` Sohil Mehta
2025-06-03 17:23   ` Dave Hansen
2025-06-03 20:22     ` Sohil Mehta
2025-06-03 21:54       ` Dave Hansen
2025-06-03 22:33         ` Sohil Mehta
2025-06-04 15:22     ` H. Peter Anvin
2025-05-13 20:37 ` [PATCH v6 5/9] x86/nmi: Add support to handle NMIs with source information Sohil Mehta
2025-06-03 16:53   ` Xin Li
2025-05-13 20:38 ` [PATCH v6 6/9] x86/nmi: Prepare for the new NMI-source vector encoding Sohil Mehta
2025-05-13 20:38 ` [PATCH v6 7/9] x86/nmi: Enable NMI-source for IPIs delivered as NMIs Sohil Mehta
2025-05-13 20:38 ` [PATCH v6 8/9] perf/x86: Enable NMI-source reporting for perfmon Sohil Mehta
2025-06-03 16:57   ` Xin Li
2025-05-13 20:38 ` [PATCH v6 9/9] x86/nmi: Print source information with the unknown NMI console message Sohil Mehta
2025-06-03 16:55   ` Xin Li
2025-06-04  1:55     ` Sohil Mehta
2025-06-04  2:41       ` Xin Li
2025-06-04 15:41       ` H. Peter Anvin
2025-06-11 21:39         ` Sohil Mehta
2025-06-04 15:29     ` H. Peter Anvin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).