public inbox for linux-kernel@vger.kernel.org
* [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
@ 2011-03-24  8:29 Cyrill Gorcunov
  2011-03-24  8:48 ` Ingo Molnar
  2011-03-24 12:28 ` Don Zickus
  0 siblings, 2 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2011-03-24  8:29 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Lin Ming, Don Zickus, Linux kernel mailing list

[-- Attachment #1: Type: text/plain, Size: 106 bytes --]

Don, I've added your SOB, ok? (The patch is attached to avoid
space/tab problems caused by the web-mail client.)

[-- Attachment #2: perf-x86-p4-pmu-unflagged --]
[-- Type: application/octet-stream, Size: 896 bytes --]

From: Don Zickus <dzickus@redhat.com>
Subject: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test

The read of the real MSR register was missing before the
if () clause, so the test never succeeds.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 arch/x86/kernel/cpu/perf_event_p4.c |    1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
@@ -777,6 +777,7 @@ static inline int p4_pmu_clear_cccr_ovf(
 	 * the counter has reached zero value and continued counting before
 	 * real NMI signal was received:
 	 */
+	rdmsrl(hwc->event_base, v);
 	if (!(v & ARCH_P4_UNFLAGGED_BIT))
 		return 1;
 


* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24  8:29 [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test Cyrill Gorcunov
@ 2011-03-24  8:48 ` Ingo Molnar
  2011-03-24  9:33   ` Cyrill Gorcunov
  2011-03-24 15:38   ` Cyrill Gorcunov
  2011-03-24 12:28 ` Don Zickus
  1 sibling, 2 replies; 13+ messages in thread
From: Ingo Molnar @ 2011-03-24  8:48 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Lin Ming, Don Zickus, Linux kernel mailing list


* Cyrill Gorcunov <gorcunov@gmail.com> wrote:

> Don, I've added yours SOB, ok? (The patch is attached to avoid
> space/tabs problem
> due to web-mail client)

The patch lacks a proper description about the motivation and effects of the 
patch.

Thanks,

	Ingo


* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24  8:48 ` Ingo Molnar
@ 2011-03-24  9:33   ` Cyrill Gorcunov
  2011-03-24 15:38   ` Cyrill Gorcunov
  1 sibling, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2011-03-24  9:33 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Lin Ming, Don Zickus, Linux kernel mailing list

On Thursday, March 24, 2011, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Cyrill Gorcunov <gorcunov@gmail.com> wrote:
>
>> Don, I've added yours SOB, ok? (The patch is attached to avoid
>> space/tabs problem
>> due to web-mail client)
>
> The patch lacks a proper description about the motivation and effects of the
> patch.
>
> Thanks,
>
>         Ingo
>

OK, I'll resend with a more detailed description later, thanks.


* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24  8:29 [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test Cyrill Gorcunov
  2011-03-24  8:48 ` Ingo Molnar
@ 2011-03-24 12:28 ` Don Zickus
  1 sibling, 0 replies; 13+ messages in thread
From: Don Zickus @ 2011-03-24 12:28 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Ingo Molnar, Lin Ming, Linux kernel mailing list

On Thu, Mar 24, 2011 at 11:29:43AM +0300, Cyrill Gorcunov wrote:
> Don, I've added yours SOB, ok? (The patch is attached to avoid
> space/tabs problem
> due to web-mail client)

Fine by me.

Cheers,
Don



* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24  8:48 ` Ingo Molnar
  2011-03-24  9:33   ` Cyrill Gorcunov
@ 2011-03-24 15:38   ` Cyrill Gorcunov
  2011-03-24 16:33     ` Ingo Molnar
  1 sibling, 1 reply; 13+ messages in thread
From: Cyrill Gorcunov @ 2011-03-24 15:38 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Lin Ming, Don Zickus, Linux kernel mailing list

On 03/24/2011 11:48 AM, Ingo Molnar wrote:
> 
> * Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> 
>> Don, I've added yours SOB, ok? (The patch is attached to avoid
>> space/tabs problem
>> due to web-mail client)
> 
> The patch lacks a proper description about the motivation and effects of the 
> patch.
> 
> Thanks,
> 
> 	Ingo

Ingo, does this one look better?

---
From: Don Zickus <dzickus@redhat.com>
Subject: [PATCH -tip] perf, x86: P4 PMU - Add missing read of MSR register to catch unflagged overflows

The read of the proper MSR register was missing, so the configuration
register was tested instead of the counter (the configuration register
always has ARCH_P4_UNFLAGGED_BIT cleared) and unflagged overflows were
never caught. Fix it by reading the proper MSR register.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 arch/x86/kernel/cpu/perf_event_p4.c |    1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
@@ -777,6 +777,7 @@ static inline int p4_pmu_clear_cccr_ovf(
 	 * the counter has reached zero value and continued counting before
 	 * real NMI signal was received:
 	 */
+	rdmsrl(hwc->event_base, v);
 	if (!(v & ARCH_P4_UNFLAGGED_BIT))
 		return 1;


-- 
    Cyrill


* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24 15:38   ` Cyrill Gorcunov
@ 2011-03-24 16:33     ` Ingo Molnar
  2011-03-24 16:46       ` Cyrill Gorcunov
  0 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2011-03-24 16:33 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Lin Ming, Don Zickus, Linux kernel mailing list


* Cyrill Gorcunov <gorcunov@gmail.com> wrote:

> On 03/24/2011 11:48 AM, Ingo Molnar wrote:
> > 
> > * Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> > 
> >> Don, I've added yours SOB, ok? (The patch is attached to avoid
> >> space/tabs problem
> >> due to web-mail client)
> > 
> > The patch lacks a proper description about the motivation and effects of the 
> > patch.
> > 
> > Thanks,
> > 
> > 	Ingo
> 
> Ingo, does this one looks better?
> 
> ---
> From: Don Zickus <dzickus@redhat.com>
> Subject: [PATCH -tip] perf, x86: P4 PMU - Add missing read of MSR register to catch unflagged overflows
> 
> The read of a proper MSR register was missed so instead of a counter the 
> configration register is tested (it has ARCH_P4_UNFLAGGED_BIT always cleared) 
> and unflagged overflows never have been catched. Fix it by reading a proper 
> MSR register.

So what effect does this have on the regular perf user? Please try to describe 
the real-life effect of the bug/problem fixed here.

Thanks,

	Ingo


* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24 16:33     ` Ingo Molnar
@ 2011-03-24 16:46       ` Cyrill Gorcunov
  2011-03-24 16:47         ` Cyrill Gorcunov
                           ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2011-03-24 16:46 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Lin Ming, Don Zickus, Linux kernel mailing list

On 03/24/2011 07:33 PM, Ingo Molnar wrote:
> 
> * Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> 
>> On 03/24/2011 11:48 AM, Ingo Molnar wrote:
>>>
>>> * Cyrill Gorcunov <gorcunov@gmail.com> wrote:
>>>
>>>> Don, I've added yours SOB, ok? (The patch is attached to avoid
>>>> space/tabs problem
>>>> due to web-mail client)
>>>
>>> The patch lacks a proper description about the motivation and effects of the 
>>> patch.
>>>
>>> Thanks,
>>>
>>> 	Ingo
>>
>> Ingo, does this one looks better?
>>
>> ---
>> From: Don Zickus <dzickus@redhat.com>
>> Subject: [PATCH -tip] perf, x86: P4 PMU - Add missing read of MSR register to catch unflagged overflows
>>
>> The read of a proper MSR register was missed so instead of a counter the 
>> configration register is tested (it has ARCH_P4_UNFLAGGED_BIT always cleared) 
>> and unflagged overflows never have been catched. Fix it by reading a proper 
>> MSR register.
> 
> So what effect does this have on the regular perf user? Please try to describe 
> the real-life effect of the bug/problem fixed here.
> 
> Thanks,
> 
> 	Ingo

Unflagged overflows were never caught because the read of the register that
signals them was missing, and as a result an unknown NMI may happen, leading to
the "Dazed and confused" message. Is that what is supposed to be in the changelog?

-- 
    Cyrill


* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24 16:46       ` Cyrill Gorcunov
@ 2011-03-24 16:47         ` Cyrill Gorcunov
  2011-03-24 16:51         ` Ingo Molnar
  2011-03-24 18:22         ` Don Zickus
  2 siblings, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2011-03-24 16:47 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Lin Ming, Don Zickus, Linux kernel mailing list

On 03/24/2011 07:46 PM, Cyrill Gorcunov wrote:
...
>>
>> So what effect does this have on the regular perf user? Please try to describe 
>> the real-life effect of the bug/problem fixed here.
>>
>> Thanks,
>>
>> 	Ingo
> 
> Unflagged overflows never have been catched due to missed read of a register which
> is to signalize about it, and as result unknown nmi may happen leading to
> "Dazen and confused" message. That is what supposed to be in changelog?
> 

Or do you mean the technical details would be appropriate in the changelog as
well? If so, I can put such details there.

-- 
    Cyrill


* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24 16:46       ` Cyrill Gorcunov
  2011-03-24 16:47         ` Cyrill Gorcunov
@ 2011-03-24 16:51         ` Ingo Molnar
  2011-03-24 17:06           ` Cyrill Gorcunov
  2011-03-24 18:22         ` Don Zickus
  2 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2011-03-24 16:51 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Lin Ming, Don Zickus, Linux kernel mailing list


* Cyrill Gorcunov <gorcunov@gmail.com> wrote:

> Unflagged overflows never have been catched due to missed read of a register which
> is to signalize about it, and as result unknown nmi may happen leading to
> "Dazen and confused" message. That is what supposed to be in changelog?

Exactly, the 'Dazed and confused' message is *all* that the user cares about so 
it must feature prominently in the changelog.

If a P4 user searches lkml he wants to know which fixes address 
dazed-and-confused messages. He will know nothing about 'unflagged overflows' 
or other internals ...

All the other details about how the patch does the fix are secondary to what 
users experience when they hit this bug.

Thanks,

	Ingo


* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24 16:51         ` Ingo Molnar
@ 2011-03-24 17:06           ` Cyrill Gorcunov
  0 siblings, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2011-03-24 17:06 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Lin Ming, Don Zickus, Linux kernel mailing list

On 03/24/2011 07:51 PM, Ingo Molnar wrote:
> 
> * Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> 
>> Unflagged overflows never have been catched due to missed read of a register which
>> is to signalize about it, and as result unknown nmi may happen leading to
>> "Dazen and confused" message. That is what supposed to be in changelog?
> 
> Exactly, the 'Dazed and confused' message is *all* that the user cares about so 
> it must feature prominently in the changelog.
> 
> If a P4 user searches lkml he wants to know which fixed address 
> dazed-and-confused messages. He will know nothing about 'unflagged overflows' 
> or other internals ...
> 
> All the other details about how the patch does the fix is secondary to what 
> users experience when they hit this bug.
> 
> Thanks,
> 
> 	Ingo

ok, let me try
---
From: Don Zickus <dzickus@redhat.com>
Subject: [PATCH -tip] perf, x86: P4 PMU - Catch unknown NMI on unflagged overflows

The read of the proper MSR register was missing, so the configuration
register was tested instead of the counter (the configuration register
always has ARCH_P4_UNFLAGGED_BIT cleared), leading to unknown NMIs hitting
the system. As a result the user may see the "Dazed and confused, but
trying to continue" message. Fix it by reading the proper MSR register.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 arch/x86/kernel/cpu/perf_event_p4.c |    1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
@@ -777,6 +777,7 @@ static inline int p4_pmu_clear_cccr_ovf(
 	 * the counter has reached zero value and continued counting before
 	 * real NMI signal was received:
 	 */
+	rdmsrl(hwc->event_base, v);
 	if (!(v & ARCH_P4_UNFLAGGED_BIT))
 		return 1;



* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24 16:46       ` Cyrill Gorcunov
  2011-03-24 16:47         ` Cyrill Gorcunov
  2011-03-24 16:51         ` Ingo Molnar
@ 2011-03-24 18:22         ` Don Zickus
  2011-03-24 18:26           ` Cyrill Gorcunov
  2 siblings, 1 reply; 13+ messages in thread
From: Don Zickus @ 2011-03-24 18:22 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Ingo Molnar, Lin Ming, Linux kernel mailing list

On Thu, Mar 24, 2011 at 07:46:40PM +0300, Cyrill Gorcunov wrote:
> >> The read of a proper MSR register was missed so instead of a counter the 
> >> configration register is tested (it has ARCH_P4_UNFLAGGED_BIT always cleared) 
> >> and unflagged overflows never have been catched. Fix it by reading a proper 
> >> MSR register.
> > 
> > So what effect does this have on the regular perf user? Please try to describe 
> > the real-life effect of the bug/problem fixed here.
> > 
> > Thanks,
> > 
> > 	Ingo
> 
> Unflagged overflows never have been catched due to missed read of a register which
> is to signalize about it, and as result unknown nmi may happen leading to
> "Dazen and confused" message. That is what supposed to be in changelog?

I think Ingo is looking for something like this:

When an NMI happens on a P4, the perf nmi handler checks the configuration
register to see if the overflow bit is set or not before taking
appropriate action.  Unfortunately, various P4 machines had a broken
overflow bit, so a backup mechanism was implemented.  This mechanism
checked to see if the counter rolled over or not.

A previous commit that implemented this backup mechanism was broken.
Instead of reading the counter register, it used the configuration
register to determine if the counter rolled over or not.  Reading that bit
would give incorrect results.

This would lead to 'Dazed and confused' messages for the end user when
using the perf tool (or if the nmi watchdog is running).

The fix is to read the counter register before determining if the counter
rolled over or not.

Cheers,
Don
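
The two-step check described above can be sketched in user space. This is
only an illustration under stated assumptions, not the kernel code: the
struct, the function names, the read_counter switch and the bit positions
(SIM_CCCR_OVF, SIM_UNFLAGGED_BIT) are hypothetical stand-ins, and the
registers are plain memory rather than MSRs. It shows why testing the
configuration value in the backup check gives the wrong answer, and how
reading the counter first changes the result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; the values are assumptions. */
#define SIM_CCCR_OVF      (1ULL << 31)  /* overflow flag in the configuration register */
#define SIM_UNFLAGGED_BIT (1ULL << 39)  /* high bit of the simulated counter */

/* Simulated pair of registers for one event (plain memory, not real MSRs). */
struct sim_event {
	uint64_t config;   /* configuration register */
	uint64_t counter;  /* counter register */
};

/*
 * Returns 1 if the event is considered overflowed, 0 otherwise.
 * With read_counter == 0 the backup test reuses the configuration value,
 * which is the mistake described in this thread.
 */
static int sim_clear_ovf(struct sim_event *ev, int read_counter)
{
	uint64_t v = ev->config;

	/* Primary check: the overflow flag in the configuration register. */
	if (v & SIM_CCCR_OVF) {
		ev->config = v & ~SIM_CCCR_OVF;
		return 1;
	}

	/* Backup check: a cleared high bit means the counter rolled over. */
	if (read_counter)
		v = ev->counter;   /* the one-line fix: read the counter first */
	if (!(v & SIM_UNFLAGGED_BIT))
		return 1;

	return 0;
}

/* Usage example: a counter that is still counting and has not rolled over. */
static void sim_self_check(void)
{
	struct sim_event ev = { .config = 0, .counter = SIM_UNFLAGGED_BIT | 5 };

	assert(sim_clear_ovf(&ev, 1) == 0);  /* fixed: no overflow reported */
	assert(sim_clear_ovf(&ev, 0) == 1);  /* buggy: config never has the bit */

	ev.counter = 5;                      /* high bit cleared: rolled over */
	assert(sim_clear_ovf(&ev, 1) == 1);
}
```

In this sketch the buggy path reports an overflow regardless of the counter
state, because the simulated configuration value never has the high bit set.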


* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24 18:22         ` Don Zickus
@ 2011-03-24 18:26           ` Cyrill Gorcunov
  2011-03-24 20:03             ` Ingo Molnar
  0 siblings, 1 reply; 13+ messages in thread
From: Cyrill Gorcunov @ 2011-03-24 18:26 UTC (permalink / raw)
  To: Don Zickus; +Cc: Ingo Molnar, Lin Ming, Linux kernel mailing list

On 03/24/2011 09:22 PM, Don Zickus wrote:
> On Thu, Mar 24, 2011 at 07:46:40PM +0300, Cyrill Gorcunov wrote:
>>>> The read of a proper MSR register was missed so instead of a counter the 
>>>> configration register is tested (it has ARCH_P4_UNFLAGGED_BIT always cleared) 
>>>> and unflagged overflows never have been catched. Fix it by reading a proper 
>>>> MSR register.
>>>
>>> So what effect does this have on the regular perf user? Please try to describe 
>>> the real-life effect of the bug/problem fixed here.
>>>
>>> Thanks,
>>>
>>> 	Ingo
>>
>> Unflagged overflows never have been catched due to missed read of a register which
>> is to signalize about it, and as result unknown nmi may happen leading to
>> "Dazen and confused" message. That is what supposed to be in changelog?
> 
> I think Ingo is looking for something like this:
> 

Thanks Don, if Ingo agrees I can update it.

-- 
    Cyrill


* Re: [PATCH -tip] perf, x86: P4 PMU - Add missing read of a counter before test
  2011-03-24 18:26           ` Cyrill Gorcunov
@ 2011-03-24 20:03             ` Ingo Molnar
  0 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2011-03-24 20:03 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Don Zickus, Lin Ming, Linux kernel mailing list


* Cyrill Gorcunov <gorcunov@gmail.com> wrote:

> On 03/24/2011 09:22 PM, Don Zickus wrote:
> > On Thu, Mar 24, 2011 at 07:46:40PM +0300, Cyrill Gorcunov wrote:
> >>>> The read of a proper MSR register was missed so instead of a counter the 
> >>>> configration register is tested (it has ARCH_P4_UNFLAGGED_BIT always cleared) 
> >>>> and unflagged overflows never have been catched. Fix it by reading a proper 
> >>>> MSR register.
> >>>
> >>> So what effect does this have on the regular perf user? Please try to describe 
> >>> the real-life effect of the bug/problem fixed here.
> >>>
> >>> Thanks,
> >>>
> >>> 	Ingo
> >>
> >> Unflagged overflows never have been catched due to missed read of a register which
> >> is to signalize about it, and as result unknown nmi may happen leading to
> >> "Dazen and confused" message. That is what supposed to be in changelog?
> > 
> > I think Ingo is looking for something like this:
> > 
> 
> Thanks Don, if Ingo agree I can update it.

Sure - please resend the final patch in a clean thread, with a proper title, 
etc.

Thanks,

	Ingo

