public inbox for linux-kernel@vger.kernel.org
* [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
@ 2006-09-28 20:25 Bryce Harrington
  2006-09-28 20:34 ` [Eng] [OOPS] -git8, 9: " Bryce Harrington
  2006-09-28 21:51 ` [OOPS] -git8,9: " Andrew Morton
  0 siblings, 2 replies; 6+ messages in thread
From: Bryce Harrington @ 2006-09-28 20:25 UTC
  To: linux-kernel

Apologies if this has already been reported; I didn't spot it on the
list.  We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
-git9, but not -git7:

 mptbase: Initiating ioc0 recovery
 Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
 PGD 0 
 Oops: 0000 [1] PREEMPT SMP 
 CPU 0 
 Modules linked in:
 Pid: 8, comm: events/0 Not tainted 2.6.18-git8 #1
 RIP: 0010:[<ffffffff80489aa2>]  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
 RSP: 0000:ffff81003ec65e40  EFLAGS: 00010282
 RAX: 0000000000000002 RBX: ffff81003e86f640 RCX: 000000000000001e
 RDX: 0000000000000001 RSI: 0000000000000213 RDI: 000000000003e86f
 RBP: 0000000000000500 R08: ffff81003ec64000 R09: ffff81003ed0cf40
 R10: ffff81003e86f640 R11: ffff81003ed0cf40 R12: ffff81003ed0cf40
 R13: 0000000000000213 R14: ffff81003e86f640 R15: ffffffff80489a96
 FS:  0000000000000000(0000) GS:ffffffff80779000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
 CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
 Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
 Stack:  ffff81003ec65ef8 ffff81003e86f640 ffff81003e86f648 ffffffff8023f1bd
  ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
  00000000fffffffc ffffffff80593ffd 0000000000000000 ffffffff8023f300
 Call Trace:
  [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
  [<ffffffff8023f204>] worker_thread+0x0/0x12e
  [<ffffffff8023f300>] worker_thread+0xfc/0x12e
  [<ffffffff80229f62>] default_wake_function+0x0/0xe
  [<ffffffff80229f62>] default_wake_function+0x0/0xe
  [<ffffffff80242433>] kthread+0xc8/0xf1
  [<ffffffff8020a3f8>] child_rip+0xa/0x12
  [<ffffffff8024236b>] kthread+0x0/0xf1
  [<ffffffff8020a3ee>] child_rip+0x0/0x12
 
 
 Code: 48 8b 45 00 31 f6 48 8b b8 50 01 00 00 e8 5c 4d fe ff 48 85 
 RIP  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
  RSP <ffff81003ec65e40>
 CR2: 0000000000000500
  <6>mptbase: Initiating ioc0 recovery

Full console logs showing the above oops are here:
-git7:   ok   http://crucible.osdl.org/runs/2223/sysinfo/amd01.console
-git8:  Oops  http://crucible.osdl.org/runs/2233/sysinfo/amd01.console
-git9:  Oops  http://crucible.osdl.org/runs/2241/sysinfo/amd01.console

Reference information about the machine this is run on:
    http://crucible.osdl.org/runs/2223/sysinfo/amd01.1/

Config files:
-git7:  http://crucible.osdl.org/runs/2223/sysinfo/amd01.config
-git8:  http://crucible.osdl.org/runs/2233/sysinfo/amd01.config

Bryce


* Re: [Eng] [OOPS] -git8, 9:  NULL pointer dereference in mptspi_dv_renegotiate_work
  2006-09-28 20:25 [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work Bryce Harrington
@ 2006-09-28 20:34 ` Bryce Harrington
  2006-09-28 21:51 ` [OOPS] -git8,9: " Andrew Morton
  1 sibling, 0 replies; 6+ messages in thread
From: Bryce Harrington @ 2006-09-28 20:34 UTC
  To: linux-kernel

Just checked against latest -git10, same oops:

   http://crucible.osdl.org/runs/2256/sysinfo/amd01.console

However, it is not occurring on our ia64, x86, or x86_64 systems
running the same kernels.

Bryce



* Re: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
  2006-09-28 20:25 [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work Bryce Harrington
  2006-09-28 20:34 ` [Eng] [OOPS] -git8, 9: " Bryce Harrington
@ 2006-09-28 21:51 ` Andrew Morton
  2006-09-28 22:54   ` Bryce Harrington
  1 sibling, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2006-09-28 21:51 UTC
  To: Bryce Harrington; +Cc: linux-kernel, Moore, Eric Dean, linux-scsi


(cc's added)

On Thu, 28 Sep 2006 13:25:48 -0700
Bryce Harrington <bryce@osdl.org> wrote:

> Apologies if this has already been reported;

It has not.

>  I didn't spot it on the
> list.  We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> -git9, but not -git7:
> 
>  mptbase: Initiating ioc0 recovery
>  Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
>   [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
>  PGD 0 
>  Oops: 0000 [1] PREEMPT SMP 
>  CPU 0 
>  Modules linked in:
>  Pid: 8, comm: events/0 Not tainted 2.6.18-git8 #1
>  RIP: 0010:[<ffffffff80489aa2>]  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
>  RSP: 0000:ffff81003ec65e40  EFLAGS: 00010282
>  RAX: 0000000000000002 RBX: ffff81003e86f640 RCX: 000000000000001e
>  RDX: 0000000000000001 RSI: 0000000000000213 RDI: 000000000003e86f
>  RBP: 0000000000000500 R08: ffff81003ec64000 R09: ffff81003ed0cf40
>  R10: ffff81003e86f640 R11: ffff81003ed0cf40 R12: ffff81003ed0cf40
>  R13: 0000000000000213 R14: ffff81003e86f640 R15: ffffffff80489a96
>  FS:  0000000000000000(0000) GS:ffffffff80779000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>  CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
>  Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
>  Stack:  ffff81003ec65ef8 ffff81003e86f640 ffff81003e86f648 ffffffff8023f1bd
>   ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
>   00000000fffffffc ffffffff80593ffd 0000000000000000 ffffffff8023f300
>  Call Trace:
>   [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
>   [<ffffffff8023f204>] worker_thread+0x0/0x12e
>   [<ffffffff8023f300>] worker_thread+0xfc/0x12e
>   [<ffffffff80229f62>] default_wake_function+0x0/0xe
>   [<ffffffff80229f62>] default_wake_function+0x0/0xe
>   [<ffffffff80242433>] kthread+0xc8/0xf1
>   [<ffffffff8020a3f8>] child_rip+0xa/0x12
>   [<ffffffff8024236b>] kthread+0x0/0xf1
>   [<ffffffff8020a3ee>] child_rip+0x0/0x12
>  
>  
>  Code: 48 8b 45 00 31 f6 48 8b b8 50 01 00 00 e8 5c 4d fe ff 48 85 
>  RIP  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
>   RSP <ffff81003ec65e40>
>  CR2: 0000000000000500
>   <6>mptbase: Initiating ioc0 recovery

That's very clever.  With gcc-4.0.2 and your .config I get

(gdb) x/20i mptspi_dv_renegotiate_work
0xffffffff8048475e <mptspi_dv_renegotiate_work>:        push   %rbp
0xffffffff8048475f <mptspi_dv_renegotiate_work+1>:      push   %rbx
0xffffffff80484760 <mptspi_dv_renegotiate_work+2>:      push   %rbp
0xffffffff80484761 <mptspi_dv_renegotiate_work+3>:      mov    0x60(%rdi),%rbp
0xffffffff80484765 <mptspi_dv_renegotiate_work+7>:      callq  0xffffffff8026df58 <kfree>
0xffffffff8048476a <mptspi_dv_renegotiate_work+12>:     mov    0x0(%rbp),%rax
0xffffffff8048476e <mptspi_dv_renegotiate_work+16>:     xor    %esi,%esi
0xffffffff80484770 <mptspi_dv_renegotiate_work+18>:     mov    0x150(%rax),%rdi

So on entry to this function, wqw->hd is 0x500.

Or kfree() somehow scrogged your %rbp register.
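
The arithmetic behind that reading can be sketched in user space. This is
a minimal illustration, not driver code: struct work_wrapper_sketch and
effective_address() are hypothetical names, and only the 0x60 and 0x150
displacements and the 0x500 value are taken from the disassembly and the
oops above. The oops "Code:" bytes 48 8b 45 00 decode to
mov 0x0(%rbp),%rax, so with RBP = 0x500 the faulting address is 0x500 + 0,
which matches CR2.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the work wrapper. Only the offset of hd (0x60)
 * comes from the disassembly; the padding array is illustrative and merely
 * reserves the space that precedes hd.
 */
struct work_wrapper_sketch {
	unsigned char work[0x60];	/* placeholder for the embedded work item */
	void *hd;			/* loaded by: mov 0x60(%rdi),%rbp */
};

/* Effective address of a base + displacement memory operand. */
static uint64_t effective_address(uint64_t base, int64_t disp)
{
	return base + (uint64_t)disp;
}
```

With base 0x500 and displacement 0 this gives 0x500, the address in CR2.
Had the first load succeeded, the next one (0x150 off the loaded value)
would have been the following candidate for a fault.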


> Full console logs showing the above oops are here:
> -git7:   ok   http://crucible.osdl.org/runs/2223/sysinfo/amd01.console
> -git8:  Oops  http://crucible.osdl.org/runs/2233/sysinfo/amd01.console
> -git9:  Oops  http://crucible.osdl.org/runs/2241/sysinfo/amd01.console
> 
> Reference information about the machine this is run on:
>     http://crucible.osdl.org/runs/2223/sysinfo/amd01.1/
> 
> Config files:
> -git7:  http://crucible.osdl.org/runs/2223/sysinfo/amd01.config
> -git8:  http://crucible.osdl.org/runs/2233/sysinfo/amd01.config

> ...

> Just checked against latest -git10, same oops:
> 
>    http://crucible.osdl.org/runs/2256/sysinfo/amd01.console
> 
> However, it is not occurring on our ia64, x86, or x86_64 systems
> running the same kernels.
> 

I'd be suspecting a miscompile, or something horrid in kfree().

Does it change anything if you move that kfree() down a bit?

--- a/drivers/message/fusion/mptspi.c~a
+++ a/drivers/message/fusion/mptspi.c
@@ -790,10 +790,9 @@ mptspi_dv_renegotiate_work(void *data)
 	struct _MPT_SCSI_HOST *hd = wqw->hd;
 	struct scsi_device *sdev;
 
-	kfree(wqw);
-
 	shost_for_each_device(sdev, hd->ioc->sh)
 		mptspi_dv_device(hd, sdev);
+	kfree(wqw);
 }
 
 static void
_
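
For context on why the position of kfree() is worth testing: the function
copies wqw->hd into a local before freeing wqw, so the free should be
harmless in either position. A minimal user-space sketch of that
copy-then-free pattern follows, with malloc()/free() standing in for the
kernel allocator. All type and function names here are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the host structure and the work wrapper. */
struct host_sketch {
	int id;
};

struct wrapper_sketch {
	struct host_sketch *hd;
};

/*
 * Consume a heap-allocated wrapper and return hd->id.
 * The hd pointer is copied into a local first, so freeing the wrapper
 * before or after the use of hd gives the same result.
 */
static int consume_wrapper(struct wrapper_sketch *wqw)
{
	struct host_sketch *hd = wqw->hd;	/* copy out before the free */
	int id;

	free(wqw);				/* wqw is not touched after this */
	id = hd->id;				/* uses only the local copy */
	return id;
}
```

If the fault persists with either ordering, the stored pointer itself is
the more likely culprit than the free.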



* Re: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
  2006-09-28 21:51 ` [OOPS] -git8,9: " Andrew Morton
@ 2006-09-28 22:54   ` Bryce Harrington
  2006-09-29  0:26     ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: Bryce Harrington @ 2006-09-28 22:54 UTC
  To: Andrew Morton; +Cc: linux-kernel, Moore, Eric Dean, linux-scsi

On Thu, Sep 28, 2006 at 02:51:21PM -0700, Andrew Morton wrote:
> On Thu, 28 Sep 2006 13:25:48 -0700
> Bryce Harrington <bryce@osdl.org> wrote:
> 
> > Apologies if this has already been reported;
> 
> It has not.
> 
> >  I didn't spot it on the
> > list.  We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> > -git9, but not -git7:
> > 
> >  mptbase: Initiating ioc0 recovery
> >  Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
> >   [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> >  PGD 0 
> >  Oops: 0000 [1] PREEMPT SMP 
> 

> That's very clever.  
>
> I'd be suspecting a miscompile, or something horrid in kfree().
> 
> Does it change anything if you move that kfree() down a bit?
> 

Got essentially the same oops, although the addresses have changed a
little:

mptbase: Initiating ioc0 recovery
Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
 [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
PGD 0
Oops: 0000 [1] PREEMPT SMP
CPU 0
Modules linked in:
Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
RIP: 0010:[<ffffffff80489aa3>]  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
RSP: 0000:ffff81003ec65e40  EFLAGS: 00010246
RAX: ffff81003ec65ef8 RBX: ffff81003eff6640 RCX: ffff81003ec65ef8
RDX: ffff81003ed0cf58 RSI: 0000000000000000 RDI: ffff81003eff6640
RBP: 0000000000000500 R08: ffff81003ec64000 R09: 00000000ffffffff
R10: 00000000ffffffff R11: ffff81003ed0cf40 R12: ffff81003eff6640
R13: 0000000000000213 R14: ffff81003eff6640 R15: ffffffff80489a96
FS:  0000000000000000(0000) GS:ffffffff8077a000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
Stack:  ffff81003eff6640 ffff81003eff6648 ffff81003ed0cf40 ffffffff8023f1bd
 ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
 00000000fffffffc ffffffff8059457d 0000000000000000 ffffffff8023f30
Call Trace:
 [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
 [<ffffffff8023f204>] worker_thread+0x0/0x12e
 [<ffffffff8023f300>] worker_thread+0xfc/0x12e
 [<ffffffff80229f62>] default_wake_function+0x0/0xe
 [<ffffffff80229f62>] default_wake_function+0x0/0xe
 [<ffffffff80242433>] kthread+0xc8/0xf1
 [<ffffffff8020a3f8>] child_rip+0xa/0x12
 [<ffffffff8024236b>] kthread+0x0/0xf1
 [<ffffffff8020a3ee>] child_rip+0x0/0x12


Code: 48 8b 45 00 48 8b b8 50 01 00 00 e8 5d 4d fe ff 48 85 c0 48
RIP  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
 RSP <ffff81003ec65e40>
CR2: 0000000000000500
 <6>mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=185
 target0:0:0: dma_alloc_coherent for parameters failed
mptscsih: ioc0: attempting task abort! (sc=ffff81003e840c80)
scsi 0:0:0:0:
        command: cdb[0]=0x12: 12 00 00 00 24 00
mptbase: Initiating ioc0 recovery

Bryce



* Re: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
  2006-09-28 22:54   ` Bryce Harrington
@ 2006-09-29  0:26     ` Andrew Morton
  2006-09-29 17:17       ` Bryce Harrington
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2006-09-29  0:26 UTC
  To: Bryce Harrington; +Cc: linux-kernel, Moore, Eric Dean, linux-scsi

On Thu, 28 Sep 2006 15:54:26 -0700
Bryce Harrington <bryce@osdl.org> wrote:

> On Thu, Sep 28, 2006 at 02:51:21PM -0700, Andrew Morton wrote:
> > On Thu, 28 Sep 2006 13:25:48 -0700
> > Bryce Harrington <bryce@osdl.org> wrote:
> > 
> > > Apologies if this has already been reported;
> > 
> > It has not.
> > 
> > >  I didn't spot it on the
> > > list.  We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> > > -git9, but not -git7:
> > > 
> > >  mptbase: Initiating ioc0 recovery
> > >  Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
> > >   [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> > >  PGD 0 
> > >  Oops: 0000 [1] PREEMPT SMP 
> > 
> 
> > That's very clever.  
> >
> > I'd be suspecting a miscompile, or something horrid in kfree().
> > 
> > Does it change anything if you move that kfree() down a bit?
> > 
> 
> Got essentially the same oops, although the addresses have changed a
> little:
> 
> mptbase: Initiating ioc0 recovery
> Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
>  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> PGD 0
> Oops: 0000 [1] PREEMPT SMP
> CPU 0
> Modules linked in:
> Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
> RIP: 0010:[<ffffffff80489aa3>]  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> RSP: 0000:ffff81003ec65e40  EFLAGS: 00010246
> RAX: ffff81003ec65ef8 RBX: ffff81003eff6640 RCX: ffff81003ec65ef8
> RDX: ffff81003ed0cf58 RSI: 0000000000000000 RDI: ffff81003eff6640
> RBP: 0000000000000500 R08: ffff81003ec64000 R09: 00000000ffffffff
> R10: 00000000ffffffff R11: ffff81003ed0cf40 R12: ffff81003eff6640
> R13: 0000000000000213 R14: ffff81003eff6640 R15: ffffffff80489a96
> FS:  0000000000000000(0000) GS:ffffffff8077a000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
> Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
> Stack:  ffff81003eff6640 ffff81003eff6648 ffff81003ed0cf40 ffffffff8023f1bd
>  ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
>  00000000fffffffc ffffffff8059457d 0000000000000000 ffffffff8023f30
> Call Trace:
>  [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
>  [<ffffffff8023f204>] worker_thread+0x0/0x12e
>  [<ffffffff8023f300>] worker_thread+0xfc/0x12e
>  [<ffffffff80229f62>] default_wake_function+0x0/0xe
>  [<ffffffff80229f62>] default_wake_function+0x0/0xe
>  [<ffffffff80242433>] kthread+0xc8/0xf1
>  [<ffffffff8020a3f8>] child_rip+0xa/0x12
>  [<ffffffff8024236b>] kthread+0x0/0xf1
>  [<ffffffff8020a3ee>] child_rip+0x0/0x12
> 
> 
> Code: 48 8b 45 00 48 8b b8 50 01 00 00 e8 5d 4d fe ff 48 85 c0 48
> RIP  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
>  RSP <ffff81003ec65e40>
> CR2: 0000000000000500
>  <6>mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=185
>  target0:0:0: dma_alloc_coherent for parameters failed
> mptscsih: ioc0: attempting task abort! (sc=ffff81003e840c80)
> scsi 0:0:0:0:
>         command: cdb[0]=0x12: 12 00 00 00 24 00
> mptbase: Initiating ioc0 recovery
> 

Ah.  Maybe we're simply being passed a junk pointer.  This, please:

--- a/drivers/message/fusion/mptspi.c~a
+++ a/drivers/message/fusion/mptspi.c
@@ -804,6 +804,9 @@ mptspi_dv_renegotiate(struct _MPT_SCSI_H
 	if (!wqw)
 		return;
 
+	printk("%p\n", hd);
+	if ((unsigned long)hd < 4000UL)
+		dump_stack();
 	INIT_WORK(&wqw->work, mptspi_dv_renegotiate_work, wqw);
 	wqw->hd = hd;
 
_
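
The check in that patch can be exercised in isolation. Below is a minimal
user-space sketch, assuming the same decimal threshold of 4000.
looks_like_junk_pointer() and LOW_ADDR_LIMIT are hypothetical names, and
the kernel's printk() and dump_stack() calls are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same threshold as the debugging patch: decimal 4000, not 0x4000. */
#define LOW_ADDR_LIMIT 4000UL

/*
 * Return true when a pointer value is too small to be a plausible
 * object address, which also covers NULL.
 */
static bool looks_like_junk_pointer(const void *p)
{
	return (uintptr_t)p < LOW_ADDR_LIMIT;
}
```

The value seen in the oops, 0x500 (1280 decimal), falls below the
threshold, so the dump_stack() branch would be taken for it.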



* Re: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
  2006-09-29  0:26     ` Andrew Morton
@ 2006-09-29 17:17       ` Bryce Harrington
  0 siblings, 0 replies; 6+ messages in thread
From: Bryce Harrington @ 2006-09-29 17:17 UTC
  To: Andrew Morton; +Cc: linux-kernel, Moore, Eric Dean, linux-scsi

On Thu, Sep 28, 2006 at 05:26:52PM -0700, Andrew Morton wrote:
> On Thu, 28 Sep 2006 15:54:26 -0700
> Bryce Harrington <bryce@osdl.org> wrote:
> 
> > On Thu, Sep 28, 2006 at 02:51:21PM -0700, Andrew Morton wrote:
> > > On Thu, 28 Sep 2006 13:25:48 -0700
> > > Bryce Harrington <bryce@osdl.org> wrote:
> > > 
> > > > Apologies if this has already been reported;
> > > 
> > > It has not.
> > > 
> > > >  I didn't spot it on the
> > > > list.  We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> > > > -git9, but not -git7:
> > > > 
> > > >  mptbase: Initiating ioc0 recovery
> > > >  Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
> > > >   [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> > > >  PGD 0 
> > > >  Oops: 0000 [1] PREEMPT SMP 
> > > 
> > 
> > > That's very clever.  
> > >
> > > I'd be suspecting a miscompile, or something horrid in kfree().
> > > 
> > > Does it change anything if you move that kfree() down a bit?
> > > 
> > 
> > Got essentially the same oops, although the addresses have changed a
> > little:
> > 
> > mptbase: Initiating ioc0 recovery
> > Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
> >  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> > PGD 0
> > Oops: 0000 [1] PREEMPT SMP
> > CPU 0
> > Modules linked in:
> > Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
> > RIP: 0010:[<ffffffff80489aa3>]  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> > RSP: 0000:ffff81003ec65e40  EFLAGS: 00010246
> > RAX: ffff81003ec65ef8 RBX: ffff81003eff6640 RCX: ffff81003ec65ef8
> > RDX: ffff81003ed0cf58 RSI: 0000000000000000 RDI: ffff81003eff6640
> > RBP: 0000000000000500 R08: ffff81003ec64000 R09: 00000000ffffffff
> > R10: 00000000ffffffff R11: ffff81003ed0cf40 R12: ffff81003eff6640
> > R13: 0000000000000213 R14: ffff81003eff6640 R15: ffffffff80489a96
> > FS:  0000000000000000(0000) GS:ffffffff8077a000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
> > Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
> > Stack:  ffff81003eff6640 ffff81003eff6648 ffff81003ed0cf40 ffffffff8023f1bd
> >  ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
> >  00000000fffffffc ffffffff8059457d 0000000000000000 ffffffff8023f30
> > Call Trace:
> >  [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
> >  [<ffffffff8023f204>] worker_thread+0x0/0x12e
> >  [<ffffffff8023f300>] worker_thread+0xfc/0x12e
> >  [<ffffffff80229f62>] default_wake_function+0x0/0xe
> >  [<ffffffff80229f62>] default_wake_function+0x0/0xe
> >  [<ffffffff80242433>] kthread+0xc8/0xf1
> >  [<ffffffff8020a3f8>] child_rip+0xa/0x12
> >  [<ffffffff8024236b>] kthread+0x0/0xf1
> >  [<ffffffff8020a3ee>] child_rip+0x0/0x12
> > 
> > 
> > Code: 48 8b 45 00 48 8b b8 50 01 00 00 e8 5d 4d fe ff 48 85 c0 48
> > RIP  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> >  RSP <ffff81003ec65e40>
> > CR2: 0000000000000500
> >  <6>mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=185
> >  target0:0:0: dma_alloc_coherent for parameters failed
> > mptscsih: ioc0: attempting task abort! (sc=ffff81003e840c80)
> > scsi 0:0:0:0:
> >         command: cdb[0]=0x12: 12 00 00 00 24 00
> > mptbase: Initiating ioc0 recovery
> > 
> 
> Ah.  Maybe we're simply being passed a junk pointer.  This, please:
> 
> --- a/drivers/message/fusion/mptspi.c~a
> +++ a/drivers/message/fusion/mptspi.c
> @@ -804,6 +804,9 @@ mptspi_dv_renegotiate(struct _MPT_SCSI_H
>  	if (!wqw)
>  		return;
>  
> +	printk("%p\n", hd);
> +	if ((unsigned long)hd < 4000UL)
> +		dump_stack();
>  	INIT_WORK(&wqw->work, mptspi_dv_renegotiate_work, wqw);
>  	wqw->hd = hd;
>  
> _

Here's the stack dump:

mptbase: Initiating ioc0 recovery
0000000000000500

Call Trace:
<IRQ>  [<ffffffff803e0d37>] vgacon_cursor+0x0/0x1a7
[<ffffffff80489b19>] mptspi_dv_renegotiate+0x3e/0x79
[<ffffffff80489b7b>] mptspi_ioc_reset+0x27/0x2e
[<ffffffff804846ea>] mpt_do_ioc_recovery+0x115f/0x11dd
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff80238dd4>] do_timer+0x2f6/0x574
[<ffffffff8023851c>] lock_timer_base+0x1b/0x3c
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8022797a>] task_rq_lock+0x3d/0x6f
[<ffffffff80227d03>] resched_task+0x4e/0x71
[<ffffffff8022847f>] try_to_wake_up+0x3d4/0x3e6
[<ffffffff80461794>] atapi_output_bytes+0x21/0x5c
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8022797a>] task_rq_lock+0x3d/0x6f
[<ffffffff80227d03>] resched_task+0x4e/0x71
[<ffffffff8022847f>] try_to_wake_up+0x3d4/0x3e6
[<ffffffff8020cf66>] main_timer_handler+0x1e6/0x3a6
[<ffffffff80228f29>] find_busiest_group+0x21f/0x66f
[<ffffffff80484f94>] mpt_HardResetHandler+0xb4/0x12c
[<ffffffff8048500c>] mpt_timer_expired+0x0/0x24
[<ffffffff80485017>] mpt_timer_expired+0xb/0x24
[<ffffffff80238a07>] run_timer_softirq+0x156/0x1b2
[<ffffffff802357a1>] __do_softirq+0x46/0xb1
[<ffffffff8020a76c>] call_softirq+0x1c/0x28
[<ffffffff8020bb5f>] do_softirq+0x2c/0x7d
[<ffffffff80207c37>] default_idle+0x0/0x47
[<ffffffff8023583f>] irq_exit+0x33/0x3e
[<ffffffff8020a216>] apic_timer_interrupt+0x66/0x70
<EOI>  [<ffffffff80207c60>] default_idle+0x29/0x47
[<ffffffff80207e3e>] cpu_idle+0x87/0xbe
[<ffffffff80794704>] start_kernel+0x203/0x205
[<ffffffff80794179>] _sinittext+0x179/0x17d

Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
[<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
PGD 0 
Oops: 0000 [1] PREEMPT SMP 
CPU 0 
Modules linked in:
Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
RIP: 0010:[<ffffffff80489aa2>]  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
RSP: 0000:ffff81003ec65e40  EFLAGS: 00010282
RAX: 0000000000000004 RBX: ffff81003eff3640 RCX: 000000000000001e
RDX: 0000000000000003 RSI: 0000000000000213 RDI: 000000000003eff3
RBP: 0000000000000500 R08: ffff81003ed0cf88 R09: ffff81003ed0cf40
R10: ffff81003eff3640 R11: ffff81003ed0cf40 R12: ffff81003ed0cf40
R13: 0000000000000213 R14: ffff81003eff3640 R15: ffffffff80489a96
FS:  0000000000knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
Stack:  0000000000000000 ffff81003eff3640 ffff81003eff3648 ffffffff8023f1bd
ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
00000000fffffffc fd 0000000000000000 ffffffff8023f300
Call Trace:
[<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
[<ffffffff8023f204>] worker_thread+0x0/0x12e
[<ffffffff8023f300>] worker_thread+0xfc/0x12e
[<ffffffff80229f62>] default_wake_function+0x0/0xe
[<ffffffff80229f62>] default_wake_function+0x0/0xe
[<ffffffff80242433>] kthread+0xc8/0xf1
[<ffffffff8020a3f8>] child_rip+0xa/0x12
[<ffffffff8024236b>] kthread+0x0/0xf1
[<ffffffff8020a3ee>] child_rip+0x0/0x12


Code: 48 8b 45 00 31 f6 48 8b b8 50 01 00 00 e8 5c 4d fe ff 48 85 
RIP  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
RSP <ffff81003ec65e40>
CR2: 0000000000000500
<6>mptbase: Initiating ioc0 recovery
0000000000000500

Call Trace:
<IRQ>  [<ffffffff803e0d37>] vgacon_cursor+0x0/0x1a7
[<ffffffff80489b19>] mptspi_dv_renegotiate+0x3e/0x79
[<ffffffff80489b7b>] mptspi_ioc_reset+0x27/0x2e
[<ffffffff804846ea>] mpt_do_ioc_recovery+0x115f/0x11dd
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff80238dd4>] do_timer+0x2f6/0x574
[<ffffffff8023851c>] lock_timer_base+0x1b/0x3c
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8022797a>] task_rq_lock+0x3d/0x6f
[<ffffffff80227d03>] resched_task+0x4e/0x71
[<ffffffff8022847f>] try_to_wake_up+0x3d4/0x3e6
[<ffffffff80461794>] atapi_output_bytes+0x21/0x5c
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff80238dd4>] do_timer+0x2f6/0x574
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8020cf66>] main_timer_handler+0x1e6/0x3a6
[<ffffffff80228f29>] find_busiest_group+0x21f/0x66f
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff80484f94>] mpt_HardResetHandler+0xb4/0x12c
[<ffffffff8048500c>] mpt_timer_expired+0x0/0x24
[<ffffffff80485017>] mpt_timer_expired+0xb/0x24
[<ffffffff80238a07>] run_timer_softirq+0x156/0x1b2
[<ffffffff802357a1>] __do_softirq+0x46/0xb1
[<ffffffff8020a76c>] call_softirq+0x1c/0x28
[<ffffffff8020bb5f>] do_softirq+0x2c/0x7d
[<ffffffff80207c37>] default_idle+0x0/0x47
[<ffffffff8023583f>] irq_exit+0x33/0x3e
[<ffffffff8020a216>] apic_timer_interrupt+0x66/0x70
<EOI>  [<ffffffff80207c60>] default_idle+0x29/0x47
[<ffffffff80207e3e>] cpu_idle+0x87/0xbe
[<ffffffff80794704>] start_kernel+0x203/0x205
[<ffffffff80794179>] _sinittext+0x179/0x17d


Bryce

