public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Oops running oprofile in 2.5.62
@ 2003-02-27  6:29 Dave Hansen
  2003-02-27 17:37 ` John Levon
  2003-03-05 20:15 ` Philippe Elie
  0 siblings, 2 replies; 4+ messages in thread
From: Dave Hansen @ 2003-02-27  6:29 UTC (permalink / raw)
  To: oprofile-list; +Cc: Linux Kernel Mailing List

This happened while running dbench on 2.5.62.  I haven't seen it before,
but I thought I'd report it anyway.  I'm using the 0.5 version of the
userspace tools.

I'm pretty sure it happened on this line in oprofile_add_sample():
	cpu_buf->buffer[cpu_buf->pos].eip = eip;

Unable to handle kernel paging request at virtual address f8c3c000
c0212022
*pde = 00000000
Oops: 0002
CPU:    13
EIP:    0060:[<c0212022>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010046
eax: 40082d94   ebx: 00000340   ecx: 00002000   edx: f8c2c000
esi: 00000000   edi: c0310f00   ebp: 00000000   esp: e2fbdf60
ds: 007b   es: 007b   ss: 0068
Stack: 00000000 00000000 c03a22c8 c02139f0 40082d94 00000000 00000000
       0000000d e2fbc000 e2fbdfc4 bffff7a0 bffffbb8 40082d94 c0213291
       0000000d c03a22c8 e2fbdfc4 c010a1eb e2fbdfc4 0000000d 40122020
       00000000 c010962e e2fbdfc4
Call Trace:
 [<c02139f0>] ppro_check_ctrs+0x4c/0x80
 [<c0213291>] nmi_callback+0x21/0x28
 [<c010a1eb>] do_nmi+0x2b/0x48
 [<c010962e>] nmi+0x1e/0x30
Code: 89 04 ca 8b 8b c4 0b 31 c0 8b 93 d0 0b 31 c0 8b 44 24 18 89
p

>>EIP; c0212022 <oprofile_add_sample+102/128>   <=====

>>edi; c0310f00 <cpu_buffer+340/800>

Trace; c02139f0 <ppro_check_ctrs+4c/80>
Trace; c0213291 <nmi_callback+21/28>
Trace; c010a1eb <do_nmi+2b/48>
Trace; c010962e <nmi+1e/30>

Code;  c0212022 <oprofile_add_sample+102/128>
00000000 <_EIP>:
Code;  c0212022 <oprofile_add_sample+102/128>   <=====
   0:   89 04 ca                  mov    %eax,(%edx,%ecx,8)   <=====
Code;  c0212025 <oprofile_add_sample+105/128>
   3:   8b 8b c4 0b 31 c0         mov    0xc0310bc4(%ebx),%ecx
Code;  c021202b <oprofile_add_sample+10b/128>
   9:   8b 93 d0 0b 31 c0         mov    0xc0310bd0(%ebx),%edx
Code;  c0212031 <oprofile_add_sample+111/128>
   f:   8b 44 24 18               mov    0x18(%esp,1),%eax
Code;  c0212035 <oprofile_add_sample+115/128>
  13:   89 00                     mov    %eax,(%eax)



-- 
Dave Hansen
haveblue@us.ibm.com


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Oops running oprofile in 2.5.62
  2003-02-27  6:29 Oops running oprofile in 2.5.62 Dave Hansen
@ 2003-02-27 17:37 ` John Levon
  2003-02-27 17:54   ` Dave Hansen
  2003-03-05 20:15 ` Philippe Elie
  1 sibling, 1 reply; 4+ messages in thread
From: John Levon @ 2003-02-27 17:37 UTC (permalink / raw)
  To: Dave Hansen; +Cc: oprofile-list, Linux Kernel Mailing List

On Wed, Feb 26, 2003 at 10:29:43PM -0800, Dave Hansen wrote:

> I'm pretty sure it happened on this line in oprofile_add_sample():
> 	cpu_buf->buffer[cpu_buf->pos].eip = eip;
> 
> Unable to handle kernel paging request at virtual address f8c3c000

Odd. UP or SMP ? Were you shutting down oprofile at the time (or did the
daemon crash ? check /var/lib/oprofile/oprofiled.log)

The only thing I can think is that the buffer got freed then we  got an
NMI afterwards (which obviously isn't supposed to happen...)

UP oprofile in current linus is still totally broken, but I can't
imagine it causing the above.

regards
john

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Oops running oprofile in 2.5.62
  2003-02-27 17:37 ` John Levon
@ 2003-02-27 17:54   ` Dave Hansen
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Hansen @ 2003-02-27 17:54 UTC (permalink / raw)
  To: John Levon; +Cc: oprofile-list, Linux Kernel Mailing List

John Levon wrote:
> On Wed, Feb 26, 2003 at 10:29:43PM -0800, Dave Hansen wrote:
> 
>>I'm pretty sure it happened on this line in oprofile_add_sample():
>>	cpu_buf->buffer[cpu_buf->pos].eip = eip;
>>
>>Unable to handle kernel paging request at virtual address f8c3c000
> 
> Odd. UP or SMP ? Were you shutting down oprofile at the time (or did the
> daemon crash ? check /var/lib/oprofile/oprofiled.log)

16-way NUMAQ :)  It was in the middle of a dbench run, so nothing
critical was happening with oprofile.

> The only thing I can think is that the buffer got freed then we  got an
> NMI afterwards (which obviously isn't supposed to happen...)

The irqbalance code is causing some funniness on these machines, so it
might be related.  I'll speak up if I see it again.

-- 
Dave Hansen
haveblue@us.ibm.com


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Oops running oprofile in 2.5.62
  2003-02-27  6:29 Oops running oprofile in 2.5.62 Dave Hansen
  2003-02-27 17:37 ` John Levon
@ 2003-03-05 20:15 ` Philippe Elie
  1 sibling, 0 replies; 4+ messages in thread
From: Philippe Elie @ 2003-03-05 20:15 UTC (permalink / raw)
  To: Dave Hansen; +Cc: oprofile-list, Linux Kernel Mailing List, John Levon

Dave Hansen wrote:
> This happened while running dbench on 2.5.62.  I haven't seen it before,
> but I thought I'd report it anyway.  I'm using the 0.5 version of the
> userspace tools.
> 
> I'm pretty sure it happened on this line in oprofile_add_sample():
> 	cpu_buf->buffer[cpu_buf->pos].eip = eip;

yes, in the last chunk of code in oprofile_add_sample()

> Unable to handle kernel paging request at virtual address f8c3c000
> c0212022
> *pde = 00000000
> Oops: 0002
> CPU:    13
> EIP:    0060:[<c0212022>]    Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010046
> eax: 40082d94   ebx: 00000340   ecx: 00002000   edx: f8c2c000
                                        ^^^^^^^^

buffer overrrun by one entry (8192 entry by default of 8 bytes
each on x86), potentially oprofile_add_sample() add 3 events
in buffer but the protection at begin of code protect against
two addition not three

The bug is rare because add_sample use three entry in rare case,
and thing are wrong only when cpu_buf->pos == buffer_size - 2
the code is not fixed in 2.5.64, John if you have not already
a patch pending for this can you push it in your tree ?

void oprofile_add_sample(unsigned long eip, unsigned int is_kernel,
	unsigned long event, int cpu)
.....
-	if (cpu_buf->pos > buffer_size - 2) {
+ 
if (cpu_buf->pos > buffer_size - 3) {
		cpu_buf->sample_lost_overflow++;
		goto out;
	}

>>>EIP; c0212022 <oprofile_add_sample+102/128>   <=====
>>
> 
>>>edi; c0310f00 <cpu_buffer+340/800>
>>
> 
> Trace; c02139f0 <ppro_check_ctrs+4c/80>
> Trace; c0213291 <nmi_callback+21/28>
> Trace; c010a1eb <do_nmi+2b/48>
> Trace; c010962e <nmi+1e/30>
> 
> Code;  c0212022 <oprofile_add_sample+102/128>
> 00000000 <_EIP>:
> Code;  c0212022 <oprofile_add_sample+102/128>   <=====
>    0:   89 04 ca                  mov    %eax,(%edx,%ecx,8)   <=====

ecx == cpu_bufffer->pos == buffer_size ... boom ...

regards,
Philippe Elie


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2003-03-05 19:01 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-02-27  6:29 Oops running oprofile in 2.5.62 Dave Hansen
2003-02-27 17:37 ` John Levon
2003-02-27 17:54   ` Dave Hansen
2003-03-05 20:15 ` Philippe Elie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox