public inbox for linux-kernel@vger.kernel.org
* [PATCH] oprofile: fix CPU unplug panic in ppro_stop()
@ 2008-12-02  6:21 Eric Dumazet
  2008-12-02  8:17 ` Ingo Molnar
  2008-12-04  0:04 ` Robert Richter
  0 siblings, 2 replies; 5+ messages in thread
From: Eric Dumazet @ 2008-12-02  6:21 UTC
  To: Andi Kleen; +Cc: Ingo Molnar, linux kernel, Robert Richter

[-- Attachment #1: Type: text/plain, Size: 265 bytes --]

If oprofile is statically compiled into the kernel, a CPU unplug triggers
a panic in ppro_stop(), because a NULL pointer is dereferenced.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
 arch/x86/oprofile/op_model_ppro.c |    4 ++++
 1 files changed, 4 insertions(+)

[-- Attachment #2: ppro_stop.patch --]
[-- Type: text/plain, Size: 660 bytes --]

diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
index 716d26f..e9f80c7 100644
--- a/arch/x86/oprofile/op_model_ppro.c
+++ b/arch/x86/oprofile/op_model_ppro.c
@@ -156,6 +156,8 @@ static void ppro_start(struct op_msrs const * const msrs)
 	unsigned int low, high;
 	int i;
 
+	if (!reset_value)
+		return;
 	for (i = 0; i < num_counters; ++i) {
 		if (reset_value[i]) {
 			CTRL_READ(low, high, msrs, i);
@@ -171,6 +173,8 @@ static void ppro_stop(struct op_msrs const * const msrs)
 	unsigned int low, high;
 	int i;
 
+	if (!reset_value)
+		return;
 	for (i = 0; i < num_counters; ++i) {
 		if (!reset_value[i])
 			continue;
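
[Editor's sketch] The failure mode the patch guards against can be modelled in user space. This is only an illustration under stated assumptions, not the kernel code: model_setup(), model_stop(), counters_stopped and the NUM_COUNTERS value are hypothetical stand-ins, and the MSR accesses are replaced by a simple counter. What it shows is that the array pointer stays NULL until a setup step allocates it, so a stop call that arrives first (as on CPU unplug) has to return early.

```c
#include <stddef.h>
#include <stdlib.h>

#define NUM_COUNTERS 4              /* hypothetical counter count */

static unsigned long *reset_value;  /* NULL until model_setup() has run */
static int counters_stopped;        /* stand-in for the MSR writes */

/* Stand-in for the setup path that allocates the per-counter array. */
static int model_setup(void)
{
	if (!reset_value)
		reset_value = calloc(NUM_COUNTERS, sizeof(*reset_value));
	if (!reset_value)
		return -1;
	reset_value[0] = 100000;    /* pretend counter 0 is in use */
	return 0;
}

/* Same shape as the patched ppro_stop(): bail out if never set up. */
static void model_stop(void)
{
	int i;

	if (!reset_value)
		return;
	for (i = 0; i < NUM_COUNTERS; ++i) {
		if (!reset_value[i])
			continue;
		counters_stopped++;
	}
}
```

Calling model_stop() before model_setup() is a no-op here. Without the early return, the loop would index through a NULL pointer.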


* Re: [PATCH] oprofile: fix CPU unplug panic in ppro_stop()
  2008-12-02  6:21 [PATCH] oprofile: fix CPU unplug panic in ppro_stop() Eric Dumazet
@ 2008-12-02  8:17 ` Ingo Molnar
  2008-12-03 11:59   ` Robert Richter
  2008-12-04  0:04 ` Robert Richter
  1 sibling, 1 reply; 5+ messages in thread
From: Ingo Molnar @ 2008-12-02  8:17 UTC
  To: Eric Dumazet
  Cc: Andi Kleen, linux kernel, Robert Richter, Thomas Gleixner,
	H. Peter Anvin


* Eric Dumazet <dada1@cosmosbay.com> wrote:

> If oprofile statically compiled in kernel, a cpu unplug triggers
> a panic in ppro_stop(), because a NULL pointer is dereferenced.
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> ---
> arch/x86/oprofile/op_model_ppro.c |    4 ++++
> 1 files changed, 4 insertions(+)

> diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
> index 716d26f..e9f80c7 100644
> --- a/arch/x86/oprofile/op_model_ppro.c
> +++ b/arch/x86/oprofile/op_model_ppro.c
> @@ -156,6 +156,8 @@ static void ppro_start(struct op_msrs const * const msrs)
>  	unsigned int low, high;
>  	int i;
>  
> +	if (!reset_value)
> +		return;
>
> 	for (i = 0; i < num_counters; ++i) {
>  		if (reset_value[i]) {
>  			CTRL_READ(low, high, msrs, i);

i checked which commit caused this, and it is:

  From b99170288421c79f0c2efa8b33e26e65f4bb7fb8 Mon Sep 17 00:00:00 2001
  From: Andi Kleen <ak@linux.intel.com>
  Date: Mon, 18 Aug 2008 14:50:31 +0200
  Subject: [PATCH] oprofile: Implement Intel architectural perfmon support

it is an absolutely horrible commit - which has caused the second 
regression in a row already. The _real_ "perfmon support" patch should 
have been a _oneliner_:

  -#define NUM_COUNTERS 2
  -#define NUM_CONTROLS 2
  +#define NUM_COUNTERS 8
  +#define NUM_CONTROLS 8

as Nehalem has 4 performance counters so 8 is plenty - and we don't expect 
more than 8 in the next 5 years or so.

It was absolutely unnecessary to add kmalloc to this rarely executed 
codepath - and the way it was added was absolutely horrible as well, it 
was tacked on in the middle of an existing codepath, instead of factoring 
it out nicely. Perfmon will eventually replace PMC management anyway, so 
there was no "this way it's cleaner" argument either. So this code should 
have been changed minimally, instead of slapping in a full kmalloc for a 
simple array extension from 2 to 4 entries ...
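
[Editor's sketch] The fixed-size alternative argued for above might look roughly like the following. This is a user-space illustration under assumptions, not code from the tree: count_active() and the runtime value of num_counters are hypothetical. The point it shows is that a statically sized array can never be NULL, so no guard is needed in the start/stop paths, at the cost of a hard-coded upper bound.

```c
#include <stddef.h>

#define NUM_COUNTERS 8   /* compile-time upper bound, as in the one-liner */

/* Static storage: always present and zero-initialized, never NULL. */
static unsigned long reset_value[NUM_COUNTERS];

/* Hypothetical: number of counters actually detected at runtime. */
static int num_counters = 4;

/* Count the counters that have a non-zero reset value configured. */
static int count_active(void)
{
	int i, active = 0;

	/* Clamp to the array size in case detection reports more. */
	for (i = 0; i < num_counters && i < NUM_COUNTERS; ++i)
		if (reset_value[i])
			active++;
	return active;
}
```

The clamp against NUM_COUNTERS is what a hard-coded bound costs: a CPU reporting more counters than the bound would simply have the extra ones ignored.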

You need to be more careful when changing x86 architecture code.

	Ingo


* Re: [PATCH] oprofile: fix CPU unplug panic in ppro_stop()
  2008-12-02  8:17 ` Ingo Molnar
@ 2008-12-03 11:59   ` Robert Richter
  2008-12-03 14:08     ` Andi Kleen
  0 siblings, 1 reply; 5+ messages in thread
From: Robert Richter @ 2008-12-03 11:59 UTC
  To: Ingo Molnar
  Cc: Eric Dumazet, Andi Kleen, linux kernel, Thomas Gleixner,
	H. Peter Anvin

On 02.12.08 09:17:29, Ingo Molnar wrote:
> 
> * Eric Dumazet <dada1@cosmosbay.com> wrote:
> 
> > If oprofile statically compiled in kernel, a cpu unplug triggers
> > a panic in ppro_stop(), because a NULL pointer is dereferenced.
> >
> > Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> > ---
> > arch/x86/oprofile/op_model_ppro.c |    4 ++++
> > 1 files changed, 4 insertions(+)
> 
> > diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
> > index 716d26f..e9f80c7 100644
> > --- a/arch/x86/oprofile/op_model_ppro.c
> > +++ b/arch/x86/oprofile/op_model_ppro.c
> > @@ -156,6 +156,8 @@ static void ppro_start(struct op_msrs const * const msrs)
> >  	unsigned int low, high;
> >  	int i;
> >  
> > +	if (!reset_value)
> > +		return;
> >
> > 	for (i = 0; i < num_counters; ++i) {
> >  		if (reset_value[i]) {
> >  			CTRL_READ(low, high, msrs, i);

The patch fixes the NULL pointer access and this is ok. But the root
cause seems to be in the cpu hotplug and initialization
code. xxx_start() should not be called before xxx_setup_ctrs() or
after xxx_shutdown(). Also, running only xxx_start() and xxx_stop() in
the cpu notifier functions is not sufficient. There is at least some
on_each_cpu code in nmi_setup() that should be called also in the cpu
notifier functions. I have to review that code.
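
[Editor's sketch] One way to picture the ordering constraint described above is a small state machine that the hotplug hooks consult before touching any counters. This is a user-space model under assumptions, not the oprofile notifier code: the state enum, the profiler_*() helpers, the cpu_*_hook() functions and MAX_CPUS are all hypothetical names.

```c
#define MAX_CPUS 4                  /* hypothetical CPU count */

enum pmu_state { PMU_OFF, PMU_SETUP, PMU_RUNNING };

static enum pmu_state state = PMU_OFF;
static int cpu_started[MAX_CPUS];   /* stand-in for per-CPU counter state */

static void profiler_setup(void)    { if (state == PMU_OFF) state = PMU_SETUP; }
static void profiler_start(void)    { if (state == PMU_SETUP) state = PMU_RUNNING; }
static void profiler_stop(void)     { if (state == PMU_RUNNING) state = PMU_SETUP; }
static void profiler_shutdown(void) { state = PMU_OFF; }

/* Hotplug hook: only start counters on a CPU while profiling is running. */
static void cpu_online_hook(int cpu)
{
	if (cpu < 0 || cpu >= MAX_CPUS)
		return;
	if (state != PMU_RUNNING)
		return;             /* never set up, or already shut down */
	cpu_started[cpu] = 1;
}

/* Hotplug hook: only stop counters that could have been started. */
static void cpu_offline_hook(int cpu)
{
	if (cpu < 0 || cpu >= MAX_CPUS)
		return;
	if (state != PMU_RUNNING)
		return;
	cpu_started[cpu] = 0;
}
```

With a check like this in the notifier path, the model-specific start/stop functions would not need their own NULL guards. Whether that is the right place for it in the real code is exactly what the review mentioned above would have to decide.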

[...]

> It was absolutely unnecessary to add kmalloc to this rarely executed 
> codepath - and the way it was added was absolutely horrible as well, it 
> was tacked on in the middle of an existing codepath, instead of factoring 
> it out nicely. Perfmon will eventually replace PMC management anyway, so 
> there was no "this way it's cleaner" argument either. So this code should 
> have been changed minimally, instead of slapping in a full kmalloc for a 
> simple array extension from 2 to 4 entries ...

Ingo, you are right that using kmalloc is unnecessary for
reset_value. So, Andi, maybe you could make this code easier?

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter@amd.com



* Re: [PATCH] oprofile: fix CPU unplug panic in ppro_stop()
  2008-12-03 11:59   ` Robert Richter
@ 2008-12-03 14:08     ` Andi Kleen
  0 siblings, 0 replies; 5+ messages in thread
From: Andi Kleen @ 2008-12-03 14:08 UTC
  To: Robert Richter
  Cc: Ingo Molnar, Eric Dumazet, linux kernel, Thomas Gleixner,
	H. Peter Anvin

Robert Richter wrote:
> On 02.12.08 09:17:29, Ingo Molnar wrote:
>> * Eric Dumazet <dada1@cosmosbay.com> wrote:
>>
>>> If oprofile statically compiled in kernel, a cpu unplug triggers
>>> a panic in ppro_stop(), because a NULL pointer is dereferenced.
>>>
>>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>>> ---
>>> arch/x86/oprofile/op_model_ppro.c |    4 ++++
>>> 1 files changed, 4 insertions(+)
>>> diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
>>> index 716d26f..e9f80c7 100644
>>> --- a/arch/x86/oprofile/op_model_ppro.c
>>> +++ b/arch/x86/oprofile/op_model_ppro.c
>>> @@ -156,6 +156,8 @@ static void ppro_start(struct op_msrs const * const msrs)
>>>  	unsigned int low, high;
>>>  	int i;
>>>  
>>> +	if (!reset_value)
>>> +		return;
>>>
>>> 	for (i = 0; i < num_counters; ++i) {
>>>  		if (reset_value[i]) {
>>>  			CTRL_READ(low, high, msrs, i);
> 
> The patch fixes the null pointer access and this ok. But the root
> cause seems to be in the cpu hotplug and initialization
> code. xxx_start() should not be called before xxx_setup_ctrs() or
> after xxx_shutdown(). 

Yes, it would be better to fix that. At least it would make
the code cleaner than adding checks for this backdoor everywhere.

> Also, running only xxx_start() and xxx_stop() in
> the cpu notifier functions is not sufficient. There is at least some
> on_each_cpu code in nmi_setup() that should be called also in the cpu
> notifier functions. I have to review that code.

AFAIK cpu hotplug has more problems in oprofile anyways. That is why
I didn't test that case.

> 
> [...]
> 
>> It was absolutely unnecessary to add kmalloc to this rarely executed 
>> codepath - and the way it was added was absolutely horrible as well, it 
>> was tacked on in the middle of an existing codepath, instead of factoring 
>> it out nicely. Perfmon will eventually replace PMC management anyway, so 
>> there was no "this way it's cleaner" argument either. So this code should 
>> have been changed minimally, instead of slapping in a full kmalloc for a 
>> simple array extension from 2 to 4 entries ...
> 
> Ingo, you are right that using kmalloc is unnecessary for
> reset_value. So, Andi, maybe you could make this code easier?

The reason I added the kmalloc is that there's also a varying number
of separate fixed function counters (although that's not currently
submitted).

Also I would prefer to not have a hard coded number for future
CPUs. Contrary to other people's opinion architectural perfmon is
not for Nehalem only.
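
[Editor's sketch] For context on why the counter count varies: with architectural perfmon, the number of general-purpose and fixed-function counters is reported by CPUID leaf 0xA rather than fixed per model. The decoder below is a minimal illustration that takes the raw EAX and EDX values as arguments, so it runs anywhere. The struct and function names are hypothetical, and this is not the detection code from the commit under discussion.

```c
#include <stdint.h>

/* Fields of CPUID leaf 0xA that describe the counter layout. */
struct perfmon_info {
	unsigned int version;        /* EAX[7:0]   version ID */
	unsigned int num_counters;   /* EAX[15:8]  general-purpose counters */
	unsigned int counter_width;  /* EAX[23:16] counter bit width */
	unsigned int num_fixed;      /* EDX[4:0]   fixed-function counters */
};

/* Decode raw register values; the caller obtains them from CPUID. */
static struct perfmon_info decode_cpuid_0a(uint32_t eax, uint32_t edx)
{
	struct perfmon_info info;

	info.version       = eax & 0xff;
	info.num_counters  = (eax >> 8) & 0xff;
	info.counter_width = (eax >> 16) & 0xff;
	info.num_fixed     = edx & 0x1f;
	return info;
}
```

As an illustrative input, decode_cpuid_0a(0x07300403, 0x00000603) yields version 3, 4 general-purpose counters of 48 bits, and 3 fixed-function counters. Arrays sized from these fields are what a dynamic allocation accommodates and a hard-coded bound has to anticipate.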

-Andi


* Re: [PATCH] oprofile: fix CPU unplug panic in ppro_stop()
  2008-12-02  6:21 [PATCH] oprofile: fix CPU unplug panic in ppro_stop() Eric Dumazet
  2008-12-02  8:17 ` Ingo Molnar
@ 2008-12-04  0:04 ` Robert Richter
  1 sibling, 0 replies; 5+ messages in thread
From: Robert Richter @ 2008-12-04  0:04 UTC
  To: Eric Dumazet; +Cc: Andi Kleen, Ingo Molnar, linux kernel

On 02.12.08 07:21:21, Eric Dumazet wrote:
> If oprofile statically compiled in kernel, a cpu unplug triggers
> a panic in ppro_stop(), because a NULL pointer is dereferenced.
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

Eric, I applied your patch and it will go upstream for 2.6.28.

Thanks,

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter@amd.com


