public inbox for linux-kernel@vger.kernel.org
From: Eric Dumazet <dada1@cosmosbay.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Kosina <jkosina@suse.cz>, Andi Kleen <andi@firstfloor.org>,
	Robert Richter <robert.richter@amd.com>,
	oprofile-list@lists.sf.net, Jiri Benc <jbenc@suse.cz>,
	Vilem Marsik <vmarsik@suse.cz>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	linux-kernel@vger.kernel.org
Subject: Re: Oprofile [still] doesn't work on 2.6.28-rc4 on certain CPU
Date: Thu, 13 Nov 2008 22:49:16 +0100
Message-ID: <491CA0DC.8070405@cosmosbay.com>
In-Reply-To: <20081113213744.GA8429@elte.hu>

[-- Attachment #1: Type: text/plain, Size: 3372 bytes --]

Ingo Molnar wrote:
> * Jiri Kosina <jkosina@suse.cz> wrote:
> 
>> On Thu, 13 Nov 2008, Ingo Molnar wrote:
>>
>>>> I haven't yet found a time to start bisecting this.
>>> Would be nice to identify a commit to revert - in case we run out of 
>>> time fixing it.
>> Yup, I first wanted to make this known to the public in hope that it 
>> will ring a bell somewhere.
>>
>> If no one sees an obvious reason for this, I will do my best to bisect 
>> this tomorrow.
> 
> We've got the one patch below pending, but that's not for AMD CPUs so 
> it shouldn't impact your case.
> 
> But ... some change made it all much more fragile. I'm curious why 
> things became more fragile.
> 
> 	Ingo
> 
> --------------->
> Subject: oprofile: un-mask APIC before resetting counter in ppro_check_ctrs()
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Tue, 11 Nov 2008 09:32:12 +0100
> 
> While using oprofile on my HP BL460c G1 (two quad-core Intel E5450 CPUs),
> I noticed that one CPU after another stopped receiving NMIs.
> 
> After a while, all cores were blocked (i.e. not generating events for oprofile).
> I tried all major Linux versions and all were affected by this freeze.
> 
> I found that we have to un-mask APIC *before* writing to MSR counter
> when we get event notification, because we use APIC_LVTPC in edge triggered mode.
> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  arch/x86/oprofile/op_model_ppro.c |   10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> Index: tip/arch/x86/oprofile/op_model_ppro.c
> ===================================================================
> --- tip.orig/arch/x86/oprofile/op_model_ppro.c
> +++ tip/arch/x86/oprofile/op_model_ppro.c
> @@ -126,6 +126,12 @@ static int ppro_check_ctrs(struct pt_reg
>  	u64 val;
>  	int i;
>  
> +	/*
> +	 * We need to unmask the apic vector *before* writing reset_value
> +	 * to msr counter, because we use edge trigger
> +	 */
> +	apic_write(APIC_LVTPC, apic_read(APIC_LVTPC) & ~APIC_LVT_MASKED);
> +
>  	for (i = 0 ; i < num_counters; ++i) {
>  		if (!reset_value[i])
>  			continue;
> @@ -136,10 +142,6 @@ static int ppro_check_ctrs(struct pt_reg
>  		}
>  	}
>  
> -	/* Only P6 based Pentium M need to re-unmask the apic vector but it
> -	 * doesn't hurt other P6 variant */
> -	apic_write(APIC_LVTPC, apic_read(APIC_LVTPC) & ~APIC_LVT_MASKED);
> -
>  	/* We can't work out if we really handled an interrupt. We
>  	 * might have caught a *second* counter just after overflowing
>  	 * the interrupt for this counter then arrives

Just to clarify, I found this patch necessary for previous Linux versions as well.

Maybe new CPUs from Intel trigger a software bug; I don't know.
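
To illustrate why the ordering in the quoted patch could matter, here is a toy model in userspace C. It is only a sketch, not a description of documented APIC behaviour. It assumes that delivering the interrupt masks the LVT entry and that an overflow edge raised while the entry is masked is simply dropped. The names toy_pmu, toy_event, toy_handler, toy_run and TOY_RESET are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RESET 2	/* events between overflows (hypothetical value) */

/* Toy model: one counter and one edge-triggered, maskable LVT entry. */
struct toy_pmu {
	int64_t  counter;	/* counts up; reaching 0 is the "overflow" */
	bool     masked;	/* stands in for APIC_LVT_MASKED */
	bool     pending;	/* an interrupt is waiting to be handled */
	unsigned handled;	/* how many times the handler ran */
};

/* One counted event. An overflow edge is delivered only if unmasked. */
static void toy_event(struct toy_pmu *p)
{
	if (++p->counter != 0)
		return;
	if (p->masked)
		return;		/* edge raised while masked: lost for good */
	p->pending = true;
	p->masked = true;	/* assumption: delivery masks the entry */
}

/*
 * Handler in either order. `racing` events arrive right after the counter
 * is re-armed: with unmask_first they see an unmasked entry, otherwise
 * they arrive before the unmask.
 */
static void toy_handler(struct toy_pmu *p, bool unmask_first, unsigned racing)
{
	assert(p->pending);
	p->pending = false;
	p->handled++;
	if (unmask_first)
		p->masked = false;
	p->counter = -TOY_RESET;	/* re-arm the counter */
	for (unsigned i = 0; i < racing; i++)
		toy_event(p);
	if (!unmask_first)
		p->masked = false;
}

/* Feed `events` events; only the first handler invocation is raced. */
static unsigned toy_run(bool unmask_first, unsigned events)
{
	struct toy_pmu p = { .counter = -TOY_RESET };
	unsigned racing = TOY_RESET;

	for (unsigned i = 0; i < events; i++) {
		toy_event(&p);
		while (p.pending) {
			toy_handler(&p, unmask_first, racing);
			racing = 0;
		}
	}
	return p.handled;
}
```

In this model, toy_run(true, 10) keeps handling overflows, while toy_run(false, 10) handles the first one and then never sees another, which resembles the "one CPU after another stops getting NMIs" symptom. Whether the real hardware loses the edge in exactly this way is an assumption of the sketch.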


Also, I posted a patch for the kmalloc() of reset_value; I am not sure that patch was pushed.

This one is a real bug.

[PATCH] oprofile: fix an overflow in ppro code

reset_value was changed from long to u64 in commit b99170288421c79f0c2efa8b33e26e65f4bb7fb8
(oprofile: Implement Intel architectural perfmon support)

But the dynamic allocation of this array uses the wrong element type
(sizeof(unsigned) rather than sizeof(u64)), so the array is under-allocated.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
arch/x86/oprofile/op_model_ppro.c |    2 +-
1 files changed, 1 insertion(+), 1 deletion(-)


[-- Attachment #2: oprofile_ppro.patch --]
[-- Type: text/plain, Size: 481 bytes --]

diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
index 3f1b81a..716d26f 100644
--- a/arch/x86/oprofile/op_model_ppro.c
+++ b/arch/x86/oprofile/op_model_ppro.c
@@ -69,7 +69,7 @@ static void ppro_setup_ctrs(struct op_msrs const * const msrs)
 	int i;
 
 	if (!reset_value) {
-		reset_value = kmalloc(sizeof(unsigned) * num_counters,
+		reset_value = kmalloc(sizeof(reset_value[0]) * num_counters,
 					GFP_ATOMIC);
 		if (!reset_value)
 			return;
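
As a rough userspace illustration of the size mismatch fixed above, the sketch below compares the byte count requested by the old expression with the one requested by the fixed expression. The u64 typedef is a stand-in for the kernel type, and the helper names (old_alloc_bytes, new_alloc_bytes, overflow_bytes) are hypothetical; none of them exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;	/* stand-in for the kernel's u64 */

/* The sketch assumes the common case where unsigned is not wider than u64. */
static_assert(sizeof(u64) >= sizeof(unsigned), "unexpected type sizes");

/* Bytes requested by the old expression: sizeof(unsigned) per counter. */
static size_t old_alloc_bytes(size_t num_counters)
{
	return sizeof(unsigned) * num_counters;
}

/* Bytes requested by the fixed expression: sized from the element itself. */
static size_t new_alloc_bytes(size_t num_counters)
{
	u64 *reset_value = NULL;

	/* sizeof does not evaluate its operand, so no dereference happens. */
	return sizeof(reset_value[0]) * num_counters;
}

/* Bytes written past the old buffer once every u64 slot is stored. */
static size_t overflow_bytes(size_t num_counters)
{
	return new_alloc_bytes(num_counters) - old_alloc_bytes(num_counters);
}
```

Sizing from reset_value[0] rather than from a named type means the allocation follows the array's element type if it ever changes again, which is presumably why the fix is written that way.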
