* [PATCH] perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
@ 2011-03-24 20:36 Cyrill Gorcunov
2011-03-25 11:51 ` [tip:perf/urgent] " tip-bot for Don Zickus
0 siblings, 1 reply; 2+ messages in thread
From: Cyrill Gorcunov @ 2011-03-24 20:36 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Don Zickus, Lin Ming, lkml
From: Don Zickus <dzickus@redhat.com>
Subject: [PATCH -tip] perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
The read of the proper MSR register was missed, and instead of the counter
the configuration register was tested (it has ARCH_P4_UNFLAGGED_BIT always
cleared), leading to an unknown NMI hitting the system. As a result, the user
may see the "Dazed and confused, but trying to continue" message. Fix it by
reading the proper MSR register.
When an NMI happens on a P4, the perf nmi handler checks the configuration
register to see if the overflow bit is set or not before taking
appropriate action. Unfortunately, various P4 machines had a broken
overflow bit, so a backup mechanism was implemented. This mechanism
checked to see if the counter rolled over or not.
A previous commit that implemented this backup mechanism was broken.
Instead of reading the counter register, it used the configuration
register to determine if the counter rolled over or not. Reading that bit
would give incorrect results.
This would lead to 'Dazed and confused' messages for the end user when
using the perf tool (or if the nmi watchdog is running).
The fix is to read the counter register before determining if the counter
rolled over or not.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Lin Ming <ming.m.lin@intel.com>
---
arch/x86/kernel/cpu/perf_event_p4.c | 1 +
1 file changed, 1 insertion(+)
Index: linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
@@ -777,6 +777,7 @@ static inline int p4_pmu_clear_cccr_ovf(
* the counter has reached zero value and continued counting before
* real NMI signal was received:
*/
+ rdmsrl(hwc->event_base, v);
if (!(v & ARCH_P4_UNFLAGGED_BIT))
return 1;
--
Cyrill
* [tip:perf/urgent] perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
2011-03-24 20:36 [PATCH] perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows Cyrill Gorcunov
@ 2011-03-25 11:51 ` tip-bot for Don Zickus
0 siblings, 0 replies; 2+ messages in thread
From: tip-bot for Don Zickus @ 2011-03-25 11:51 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, gorcunov, ming.m.lin, tglx, dzickus,
mingo
Commit-ID: 242214f9c1eeaae40eca11e3b4d37bfce960a7cd
Gitweb: http://git.kernel.org/tip/242214f9c1eeaae40eca11e3b4d37bfce960a7cd
Author: Don Zickus <dzickus@redhat.com>
AuthorDate: Thu, 24 Mar 2011 23:36:25 +0300
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 24 Mar 2011 21:40:01 +0100
perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
The read of the proper MSR register was missed, and instead of
the counter the configuration register was tested (it has
ARCH_P4_UNFLAGGED_BIT always cleared), leading to an unknown NMI
hitting the system. As a result, the user may see the "Dazed and
confused, but trying to continue" message. Fix it by reading the
proper MSR register.
When an NMI happens on a P4, the perf nmi handler checks the
configuration register to see if the overflow bit is set or not
before taking appropriate action. Unfortunately, various P4
machines had a broken overflow bit, so a backup mechanism was
implemented. This mechanism checked to see if the counter
rolled over or not.
A previous commit that implemented this backup mechanism was
broken. Instead of reading the counter register, it used the
configuration register to determine if the counter rolled over
or not. Reading that bit would give incorrect results.
This would lead to 'Dazed and confused' messages for the end
user when using the perf tool (or if the nmi watchdog is
running).
The fix is to read the counter register before determining if
the counter rolled over or not.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <4D8BAB49.3080701@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/kernel/cpu/perf_event_p4.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_p4.c b/arch/x86/kernel/cpu/perf_event_p4.c
index 3769ac8..d3d7b59 100644
--- a/arch/x86/kernel/cpu/perf_event_p4.c
+++ b/arch/x86/kernel/cpu/perf_event_p4.c
@@ -777,6 +777,7 @@ static inline int p4_pmu_clear_cccr_ovf(struct hw_perf_event *hwc)
* the counter has reached zero value and continued counting before
* real NMI signal was received:
*/
+ rdmsrl(hwc->event_base, v);
if (!(v & ARCH_P4_UNFLAGGED_BIT))
return 1;
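
For readers who want to see the effect of the one-line change outside the
kernel, here is a minimal userspace sketch of the two-step check described in
the commit message. It is not the kernel code: the struct, the function names
and the SIM_* constants are hypothetical stand-ins chosen for illustration,
register reads are modeled as plain struct field loads, and clearing the
overflow flag is left out. The only thing it tries to show is that the fallback
test is meaningful only when `v` holds the counter value.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins (assumptions), not the kernel's definitions. */
#define SIM_CNTR_BITS      40
#define SIM_CNTR_MASK      ((1ULL << SIM_CNTR_BITS) - 1)
#define SIM_UNFLAGGED_BIT  (1ULL << (SIM_CNTR_BITS - 1))
#define SIM_CCCR_OVF       (1ULL << 31)

/* Hypothetical model of one event: a configuration and a counter register. */
struct sim_event {
	uint64_t cccr;     /* configuration register contents */
	uint64_t counter;  /* counter register contents */
};

/* Arm a counter with -period, so it counts up towards zero. */
static uint64_t sim_arm_counter(uint64_t period)
{
	assert(period > 0 && period < SIM_UNFLAGGED_BIT);
	return (0 - period) & SIM_CNTR_MASK;
}

/* Fallback test applied to the stale configuration value (the bug). */
static int overflowed_buggy(const struct sim_event *e)
{
	uint64_t v = e->cccr;

	if (v & SIM_CCCR_OVF)
		return 1;
	/* v still holds the configuration value, whose high bit is clear */
	return !(v & SIM_UNFLAGGED_BIT);
}

/* Fallback test applied to a fresh read of the counter (the fix). */
static int overflowed_fixed(const struct sim_event *e)
{
	uint64_t v = e->cccr;

	if (v & SIM_CCCR_OVF)
		return 1;
	v = e->counter;  /* models rdmsrl(hwc->event_base, v) */
	return !(v & SIM_UNFLAGGED_BIT);
}
```

With a counter armed via sim_arm_counter(1000) and no overflow flag set, the
fixed check reports no overflow, while the buggy check gives the same answer
whatever the counter holds, because it never looks at the counter.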