* [PATCH] perf/x86/intel: Fix spurious NMI on fixed counter
@ 2019-06-25 14:21 kan.liang
  2019-06-25 14:58 ` Jiri Olsa
  2019-07-13 11:13 ` [tip:perf/urgent] " tip-bot for Kan Liang
  0 siblings, 2 replies; 6+ messages in thread
From: kan.liang @ 2019-06-25 14:21 UTC
  To: mingo, jolsa, peterz, linux-kernel; +Cc: ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

If a user first samples a PEBS event on a fixed counter and then samples
a non-PEBS event on the same fixed counter on Icelake, a spurious NMI is
triggered. For example:

  perf record -e 'cycles:p' -a
  perf record -e 'cycles' -a

The error message for the spurious NMI is:

  [June 21 15:38] Uhhuh. NMI received for unknown reason 30 on CPU 2.
  [  +0.000000] Do you have a strange power saving mode enabled?
  [  +0.000000] Dazed and confused, but trying to continue

The issue was introduced by the following commit:

  commit 6f55967ad9d9 ("perf/x86/intel: Fix race in intel_pmu_disable_event()")

That commit moved intel_pmu_pebs_disable() after the
intel_pmu_disable_fixed() branch, which returns immediately. As a result,
intel_pmu_pebs_disable() is never reached for a fixed counter, and the
related bit in the PEBS_ENABLE MSR is never cleared. When a non-PEBS event
later runs on the same fixed counter, the bit in PEBS_ENABLE is still set,
which triggers the spurious NMI.

Drop the early return so that PEBS is also checked and disabled for a
fixed counter after intel_pmu_disable_fixed().
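
The control-flow difference can be illustrated with a small user-space
model. This is only a sketch under simplifying assumptions, not kernel
code: every toy_* name is hypothetical, the two uint64_t fields merely
stand in for the fixed counter control and PEBS_ENABLE MSRs, and the
counter index is used directly as the bit position.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical stand-ins, not kernel structures. */
struct toy_pmu {
	uint64_t fixed_ctrl;	/* models the fixed counter control MSR */
	uint64_t pebs_enable;	/* models the PEBS_ENABLE MSR */
};

struct toy_event {
	int idx;		/* counter index, used as the bit position */
	bool is_fixed;		/* event runs on a fixed counter */
	bool is_pebs;		/* event was opened as a PEBS event */
};

static void toy_enable(struct toy_pmu *pmu, const struct toy_event *ev)
{
	if (ev->is_fixed)
		pmu->fixed_ctrl |= 1ull << ev->idx;
	if (ev->is_pebs)
		pmu->pebs_enable |= 1ull << ev->idx;
}

/* Pre-patch flow: the fixed-counter branch returns before PEBS is disabled. */
static void toy_disable_buggy(struct toy_pmu *pmu, const struct toy_event *ev)
{
	if (ev->is_fixed) {
		pmu->fixed_ctrl &= ~(1ull << ev->idx);
		return;
	}
	if (ev->is_pebs)
		pmu->pebs_enable &= ~(1ull << ev->idx);
}

/* Post-patch flow: no early return, so the PEBS check always runs. */
static void toy_disable_patched(struct toy_pmu *pmu, const struct toy_event *ev)
{
	if (ev->is_fixed)
		pmu->fixed_ctrl &= ~(1ull << ev->idx);
	if (ev->is_pebs)
		pmu->pebs_enable &= ~(1ull << ev->idx);
}

/*
 * Enable and then disable a PEBS event on a fixed counter, and report
 * whether its PEBS enable bit is left behind for the next event.
 */
bool toy_stale_pebs_bit(bool patched)
{
	struct toy_pmu pmu = { 0, 0 };
	struct toy_event ev = { .idx = 0, .is_fixed = true, .is_pebs = true };

	toy_enable(&pmu, &ev);
	assert(pmu.pebs_enable != 0);	/* the PEBS bit is set while running */

	if (patched)
		toy_disable_patched(&pmu, &ev);
	else
		toy_disable_buggy(&pmu, &ev);

	return (pmu.pebs_enable >> ev.idx) & 1;
}
```

In this model, toy_stale_pebs_bit(false) reports a leftover bit and
toy_stale_pebs_bit(true) does not, which mirrors the stale PEBS_ENABLE
bit that a later non-PEBS event on the same fixed counter inherits.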

Reported-by: Yi, Ammy <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Fixes: 6f55967ad9d9 ("perf/x86/intel: Fix race in intel_pmu_disable_event()")
---
 arch/x86/events/intel/core.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 4377bf6a6f82..464316218b77 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2160,12 +2160,10 @@ static void intel_pmu_disable_event(struct perf_event *event)
 	cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
 	cpuc->intel_cp_status &= ~(1ull << hwc->idx);
 
-	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
+	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL))
 		intel_pmu_disable_fixed(hwc);
-		return;
-	}
-
-	x86_pmu_disable_event(event);
+	else
+		x86_pmu_disable_event(event);
 
 	/*
 	 * Needs to be called after x86_pmu_disable_event,
-- 
2.14.5


Thread overview: 6+ messages
2019-06-25 14:21 [PATCH] perf/x86/intel: Fix spurious NMI on fixed counter kan.liang
2019-06-25 14:58 ` Jiri Olsa
2019-07-05  0:23   ` Jin, Yao
2019-07-05 11:44     ` Peter Zijlstra
2019-07-10 19:37       ` Andi Kleen
2019-07-13 11:13 ` [tip:perf/urgent] " tip-bot for Kan Liang
