public inbox for linux-kernel@vger.kernel.org
* [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
@ 2006-09-28 20:25 Bryce Harrington
  2006-09-28 21:51 ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Bryce Harrington @ 2006-09-28 20:25 UTC
  To: linux-kernel

Apologies if this has already been reported; I didn't spot it on the
list.  We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
-git9, but not -git7:

 mptbase: Initiating ioc0 recovery
 Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
 PGD 0 
 Oops: 0000 [1] PREEMPT SMP 
 CPU 0 
 Modules linked in:
 Pid: 8, comm: events/0 Not tainted 2.6.18-git8 #1
 RIP: 0010:[<ffffffff80489aa2>]  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
 RSP: 0000:ffff81003ec65e40  EFLAGS: 00010282
 RAX: 0000000000000002 RBX: ffff81003e86f640 RCX: 000000000000001e
 RDX: 0000000000000001 RSI: 0000000000000213 RDI: 000000000003e86f
 RBP: 0000000000000500 R08: ffff81003ec64000 R09: ffff81003ed0cf40
 R10: ffff81003e86f640 R11: ffff81003ed0cf40 R12: ffff81003ed0cf40
 R13: 0000000000000213 R14: ffff81003e86f640 R15: ffffffff80489a96
 FS:  0000000000000000(0000) GS:ffffffff80779000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
 CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
 Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
 Stack:  ffff81003ec65ef8 ffff81003e86f640 ffff81003e86f648 ffffffff8023f1bd
  ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
  00000000fffffffc ffffffff80593ffd 0000000000000000 ffffffff8023f300
 Call Trace:
  [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
  [<ffffffff8023f204>] worker_thread+0x0/0x12e
  [<ffffffff8023f300>] worker_thread+0xfc/0x12e
  [<ffffffff80229f62>] default_wake_function+0x0/0xe
  [<ffffffff80229f62>] default_wake_function+0x0/0xe
  [<ffffffff80242433>] kthread+0xc8/0xf1
  [<ffffffff8020a3f8>] child_rip+0xa/0x12
  [<ffffffff8024236b>] kthread+0x0/0xf1
  [<ffffffff8020a3ee>] child_rip+0x0/0x12
 
 
 Code: 48 8b 45 00 31 f6 48 8b b8 50 01 00 00 e8 5c 4d fe ff 48 85 
 RIP  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
  RSP <ffff81003ec65e40>
 CR2: 0000000000000500
  <6>mptbase: Initiating ioc0 recovery

Full console logs showing the above oops are here:
-git7:   ok   http://crucible.osdl.org/runs/2223/sysinfo/amd01.console
-git8:  Oops  http://crucible.osdl.org/runs/2233/sysinfo/amd01.console
-git9:  Oops  http://crucible.osdl.org/runs/2241/sysinfo/amd01.console

Reference information about the machine this is run on:
    http://crucible.osdl.org/runs/2223/sysinfo/amd01.1/

Config files:
-git7:  http://crucible.osdl.org/runs/2223/sysinfo/amd01.config
-git8:  http://crucible.osdl.org/runs/2233/sysinfo/amd01.config

Bryce


* Re: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
  2006-09-28 20:25 Bryce Harrington
@ 2006-09-28 21:51 ` Andrew Morton
  2006-09-28 22:54   ` Bryce Harrington
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-09-28 21:51 UTC
  To: Bryce Harrington; +Cc: linux-kernel, Moore, Eric Dean, linux-scsi


(cc's added)

On Thu, 28 Sep 2006 13:25:48 -0700
Bryce Harrington <bryce@osdl.org> wrote:

> Apologies if this has already been reported;

It has not.

>  I didn't spot it on the
> list.  We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> -git9, but not -git7:
> 
>  mptbase: Initiating ioc0 recovery
>  Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
>   [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
>  PGD 0 
>  Oops: 0000 [1] PREEMPT SMP 
>  CPU 0 
>  Modules linked in:
>  Pid: 8, comm: events/0 Not tainted 2.6.18-git8 #1
>  RIP: 0010:[<ffffffff80489aa2>]  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
>  RSP: 0000:ffff81003ec65e40  EFLAGS: 00010282
>  RAX: 0000000000000002 RBX: ffff81003e86f640 RCX: 000000000000001e
>  RDX: 0000000000000001 RSI: 0000000000000213 RDI: 000000000003e86f
>  RBP: 0000000000000500 R08: ffff81003ec64000 R09: ffff81003ed0cf40
>  R10: ffff81003e86f640 R11: ffff81003ed0cf40 R12: ffff81003ed0cf40
>  R13: 0000000000000213 R14: ffff81003e86f640 R15: ffffffff80489a96
>  FS:  0000000000000000(0000) GS:ffffffff80779000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>  CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
>  Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
>  Stack:  ffff81003ec65ef8 ffff81003e86f640 ffff81003e86f648 ffffffff8023f1bd
>   ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
>   00000000fffffffc ffffffff80593ffd 0000000000000000 ffffffff8023f300
>  Call Trace:
>   [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
>   [<ffffffff8023f204>] worker_thread+0x0/0x12e
>   [<ffffffff8023f300>] worker_thread+0xfc/0x12e
>   [<ffffffff80229f62>] default_wake_function+0x0/0xe
>   [<ffffffff80229f62>] default_wake_function+0x0/0xe
>   [<ffffffff80242433>] kthread+0xc8/0xf1
>   [<ffffffff8020a3f8>] child_rip+0xa/0x12
>   [<ffffffff8024236b>] kthread+0x0/0xf1
>   [<ffffffff8020a3ee>] child_rip+0x0/0x12
>  
>  
>  Code: 48 8b 45 00 31 f6 48 8b b8 50 01 00 00 e8 5c 4d fe ff 48 85 
>  RIP  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
>   RSP <ffff81003ec65e40>
>  CR2: 0000000000000500
>   <6>mptbase: Initiating ioc0 recovery

That's very clever.  With gcc-4.0.2 and your .config I get

(gdb) x/20i mptspi_dv_renegotiate_work
0xffffffff8048475e <mptspi_dv_renegotiate_work>:        push   %rbp
0xffffffff8048475f <mptspi_dv_renegotiate_work+1>:      push   %rbx
0xffffffff80484760 <mptspi_dv_renegotiate_work+2>:      push   %rbp
0xffffffff80484761 <mptspi_dv_renegotiate_work+3>:      mov    0x60(%rdi),%rbp
0xffffffff80484765 <mptspi_dv_renegotiate_work+7>:      callq  0xffffffff8026df58 <kfree>
0xffffffff8048476a <mptspi_dv_renegotiate_work+12>:     mov    0x0(%rbp),%rax
0xffffffff8048476e <mptspi_dv_renegotiate_work+16>:     xor    %esi,%esi
0xffffffff80484770 <mptspi_dv_renegotiate_work+18>:     mov    0x150(%rax),%rdi

So on entry to this function, wqw->hd is 0x500.

Or kfree() somehow scrogged your %rbp register.
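
To make the register arithmetic concrete: the faulting instruction is
`mov 0x0(%rbp),%rax`, so the fault address reported in CR2 is the value
of %rbp plus the member offset. The following is a minimal userspace
sketch, assuming hypothetical structure layouts that only reproduce the
offsets visible in the disassembly (0x60 for `hd` inside the wrapper,
0x0 for `ioc` inside the host structure). The `fake_*` types and
`fault_address()` are illustrative names, not the real driver
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: only the member offsets mirror the disassembly. */
struct fake_ioc;

struct fake_host {
	struct fake_ioc *ioc;      /* offset 0x0: mov 0x0(%rbp),%rax */
};

struct fake_wrapper {
	unsigned char work[0x60];  /* placeholder for the embedded work item */
	struct fake_host *hd;      /* offset 0x60: mov 0x60(%rdi),%rbp */
};

/* Check at compile time that the toy layout reproduces the 0x60 offset. */
static_assert(offsetof(struct fake_wrapper, hd) == 0x60,
	      "hd must sit at offset 0x60 in the toy layout");

/* Address the CPU reads when loading hd->ioc, given the raw value of hd. */
uintptr_t fault_address(uintptr_t hd_value)
{
	return hd_value + offsetof(struct fake_host, ioc);
}
```

Feeding in the RBP value from the report, `fault_address(0x500)` yields
0x500, which matches CR2. That is why the analysis concludes that `hd`
itself held 0x500, rather than being NULL.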


> Full console logs showing the above oops are here:
> -git7:   ok   http://crucible.osdl.org/runs/2223/sysinfo/amd01.console
> -git8:  Oops  http://crucible.osdl.org/runs/2233/sysinfo/amd01.console
> -git9:  Oops  http://crucible.osdl.org/runs/2241/sysinfo/amd01.console
> 
> Reference information about the machine this is run on:
>     http://crucible.osdl.org/runs/2223/sysinfo/amd01.1/
> 
> Config files:
> -git7:  http://crucible.osdl.org/runs/2223/sysinfo/amd01.config
> -git8:  http://crucible.osdl.org/runs/2233/sysinfo/amd01.config

> ...

> Just checked against latest -git10, same oops:
> 
>    http://crucible.osdl.org/runs/2256/sysinfo/amd01.console
> 
> However, it is not occurring on our ia64, x86, or x86_64 systems
> running the same kernels.
> 

I'd be suspecting a miscompile, or something horrid in kfree().

Does it change anything if you move that kfree() down a bit?

--- a/drivers/message/fusion/mptspi.c~a
+++ a/drivers/message/fusion/mptspi.c
@@ -790,10 +790,9 @@ mptspi_dv_renegotiate_work(void *data)
 	struct _MPT_SCSI_HOST *hd = wqw->hd;
 	struct scsi_device *sdev;
 
-	kfree(wqw);
-
 	shost_for_each_device(sdev, hd->ioc->sh)
 		mptspi_dv_device(hd, sdev);
+	kfree(wqw);
 }
 
 static void
_



* Re: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
  2006-09-28 21:51 ` Andrew Morton
@ 2006-09-28 22:54   ` Bryce Harrington
  2006-09-29  0:26     ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Bryce Harrington @ 2006-09-28 22:54 UTC
  To: Andrew Morton; +Cc: linux-kernel, Moore, Eric Dean, linux-scsi

On Thu, Sep 28, 2006 at 02:51:21PM -0700, Andrew Morton wrote:
> On Thu, 28 Sep 2006 13:25:48 -0700
> Bryce Harrington <bryce@osdl.org> wrote:
> 
> > Apologies if this has already been reported;
> 
> It has not.
> 
> >  I didn't spot it on the
> > list.  We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> > -git9, but not -git7:
> > 
> >  mptbase: Initiating ioc0 recovery
> >  Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
> >   [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> >  PGD 0 
> >  Oops: 0000 [1] PREEMPT SMP 
> 

> That's very clever.  
>
> I'd be suspecting a miscompile, or something horrid in kfree().
> 
> Does it change anything if you move that kfree() down a bit?
> 

Got essentially the same oops, although the addresses have changed a
little:

mptbase: Initiating ioc0 recovery
Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
 [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
PGD 0
Oops: 0000 [1] PREEMPT SMP
CPU 0
Modules linked in:
Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
RIP: 0010:[<ffffffff80489aa3>]  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
RSP: 0000:ffff81003ec65e40  EFLAGS: 00010246
RAX: ffff81003ec65ef8 RBX: ffff81003eff6640 RCX: ffff81003ec65ef8
RDX: ffff81003ed0cf58 RSI: 0000000000000000 RDI: ffff81003eff6640
RBP: 0000000000000500 R08: ffff81003ec64000 R09: 00000000ffffffff
R10: 00000000ffffffff R11: ffff81003ed0cf40 R12: ffff81003eff6640
R13: 0000000000000213 R14: ffff81003eff6640 R15: ffffffff80489a96
FS:  0000000000000000(0000) GS:ffffffff8077a000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
Stack:  ffff81003eff6640 ffff81003eff6648 ffff81003ed0cf40 ffffffff8023f1bd
 ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
 00000000fffffffc ffffffff8059457d 0000000000000000 ffffffff8023f30
Call Trace:
 [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
 [<ffffffff8023f204>] worker_thread+0x0/0x12e
 [<ffffffff8023f300>] worker_thread+0xfc/0x12e
 [<ffffffff80229f62>] default_wake_function+0x0/0xe
 [<ffffffff80229f62>] default_wake_function+0x0/0xe
 [<ffffffff80242433>] kthread+0xc8/0xf1
 [<ffffffff8020a3f8>] child_rip+0xa/0x12
 [<ffffffff8024236b>] kthread+0x0/0xf1
 [<ffffffff8020a3ee>] child_rip+0x0/0x12


Code: 48 8b 45 00 48 8b b8 50 01 00 00 e8 5d 4d fe ff 48 85 c0 48
RIP  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
 RSP <ffff81003ec65e40>
CR2: 0000000000000500
 <6>mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=185
 target0:0:0: dma_alloc_coherent for parameters failed
mptscsih: ioc0: attempting task abort! (sc=ffff81003e840c80)
scsi 0:0:0:0:
        command: cdb[0]=0x12: 12 00 00 00 24 00
mptbase: Initiating ioc0 recovery

Bryce

> With gcc-4.0.2 and your .config I get
> 
> (gdb) x/20i mptspi_dv_renegotiate_work
> 0xffffffff8048475e <mptspi_dv_renegotiate_work>:        push   %rbp
> 0xffffffff8048475f <mptspi_dv_renegotiate_work+1>:      push   %rbx
> 0xffffffff80484760 <mptspi_dv_renegotiate_work+2>:      push   %rbp
> 0xffffffff80484761 <mptspi_dv_renegotiate_work+3>:      mov    0x60(%rdi),%rbp
> 0xffffffff80484765 <mptspi_dv_renegotiate_work+7>:      callq  0xffffffff8026df58 <kfree>
> 0xffffffff8048476a <mptspi_dv_renegotiate_work+12>:     mov    0x0(%rbp),%rax
> 0xffffffff8048476e <mptspi_dv_renegotiate_work+16>:     xor    %esi,%esi
> 0xffffffff80484770 <mptspi_dv_renegotiate_work+18>:     mov    0x150(%rax),%rdi
> 
> So on entry to this function, wqw->hd is 0x500.
> 
> Or kfree() somehow scrogged your %rbp register.
> 
> 
> > Full console logs showing the above oops are here:
> > -git7:   ok   http://crucible.osdl.org/runs/2223/sysinfo/amd01.console
> > -git8:  Oops  http://crucible.osdl.org/runs/2233/sysinfo/amd01.console
> > -git9:  Oops  http://crucible.osdl.org/runs/2241/sysinfo/amd01.console
> > 
> > Reference information about the machine this is run on:
> >     http://crucible.osdl.org/runs/2223/sysinfo/amd01.1/
> > 
> > Config files:
> > -git7:  http://crucible.osdl.org/runs/2223/sysinfo/amd01.config
> > -git8:  http://crucible.osdl.org/runs/2233/sysinfo/amd01.config
> 
> > ...
> 
> > Just checked against latest -git10, same oops:
> > 
> >    http://crucible.osdl.org/runs/2256/sysinfo/amd01.console
> > 
> > However, it is not occurring on our ia64, x86, or x86_64 systems
> > running the same kernels.
> > 
> 
> I'd be suspecting a miscompile, or something horrid in kfree().
> 
> Does it change anything if you move that kfree() down a bit?
> 
> --- a/drivers/message/fusion/mptspi.c~a
> +++ a/drivers/message/fusion/mptspi.c
> @@ -790,10 +790,9 @@ mptspi_dv_renegotiate_work(void *data)
>  	struct _MPT_SCSI_HOST *hd = wqw->hd;
>  	struct scsi_device *sdev;
>  
> -	kfree(wqw);
> -
>  	shost_for_each_device(sdev, hd->ioc->sh)
>  		mptspi_dv_device(hd, sdev);
> +	kfree(wqw);
>  }
>  
>  static void
> _


* Re: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
  2006-09-28 22:54   ` Bryce Harrington
@ 2006-09-29  0:26     ` Andrew Morton
  2006-09-29 17:17       ` Bryce Harrington
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-09-29  0:26 UTC
  To: Bryce Harrington; +Cc: linux-kernel, Moore, Eric Dean, linux-scsi

On Thu, 28 Sep 2006 15:54:26 -0700
Bryce Harrington <bryce@osdl.org> wrote:

> On Thu, Sep 28, 2006 at 02:51:21PM -0700, Andrew Morton wrote:
> > On Thu, 28 Sep 2006 13:25:48 -0700
> > Bryce Harrington <bryce@osdl.org> wrote:
> > 
> > > Apologies if this has already been reported;
> > 
> > It has not.
> > 
> > >  I didn't spot it on the
> > > list.  We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> > > -git9, but not -git7:
> > > 
> > >  mptbase: Initiating ioc0 recovery
> > >  Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
> > >   [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> > >  PGD 0 
> > >  Oops: 0000 [1] PREEMPT SMP 
> > 
> 
> > That's very clever.  
> >
> > I'd be suspecting a miscompile, or something horrid in kfree().
> > 
> > Does it change anything if you move that kfree() down a bit?
> > 
> 
> Got essentially the same oops, although the addresses have changed a
> little:
> 
> mptbase: Initiating ioc0 recovery
> Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
>  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> PGD 0
> Oops: 0000 [1] PREEMPT SMP
> CPU 0
> Modules linked in:
> Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
> RIP: 0010:[<ffffffff80489aa3>]  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> RSP: 0000:ffff81003ec65e40  EFLAGS: 00010246
> RAX: ffff81003ec65ef8 RBX: ffff81003eff6640 RCX: ffff81003ec65ef8
> RDX: ffff81003ed0cf58 RSI: 0000000000000000 RDI: ffff81003eff6640
> RBP: 0000000000000500 R08: ffff81003ec64000 R09: 00000000ffffffff
> R10: 00000000ffffffff R11: ffff81003ed0cf40 R12: ffff81003eff6640
> R13: 0000000000000213 R14: ffff81003eff6640 R15: ffffffff80489a96
> FS:  0000000000000000(0000) GS:ffffffff8077a000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
> Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
> Stack:  ffff81003eff6640 ffff81003eff6648 ffff81003ed0cf40 ffffffff8023f1bd
>  ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
>  00000000fffffffc ffffffff8059457d 0000000000000000 ffffffff8023f30
> Call Trace:
>  [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
>  [<ffffffff8023f204>] worker_thread+0x0/0x12e
>  [<ffffffff8023f300>] worker_thread+0xfc/0x12e
>  [<ffffffff80229f62>] default_wake_function+0x0/0xe
>  [<ffffffff80229f62>] default_wake_function+0x0/0xe
>  [<ffffffff80242433>] kthread+0xc8/0xf1
>  [<ffffffff8020a3f8>] child_rip+0xa/0x12
>  [<ffffffff8024236b>] kthread+0x0/0xf1
>  [<ffffffff8020a3ee>] child_rip+0x0/0x12
> 
> 
> Code: 48 8b 45 00 48 8b b8 50 01 00 00 e8 5d 4d fe ff 48 85 c0 48
> RIP  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
>  RSP <ffff81003ec65e40>
> CR2: 0000000000000500
>  <6>mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=185
>  target0:0:0: dma_alloc_coherent for parameters failed
> mptscsih: ioc0: attempting task abort! (sc=ffff81003e840c80)
> scsi 0:0:0:0:
>         command: cdb[0]=0x12: 12 00 00 00 24 00
> mptbase: Initiating ioc0 recovery
> 

Ah.  Maybe we're simply being passed a junk pointer.  This, please:

--- a/drivers/message/fusion/mptspi.c~a
+++ a/drivers/message/fusion/mptspi.c
@@ -804,6 +804,9 @@ mptspi_dv_renegotiate(struct _MPT_SCSI_H
 	if (!wqw)
 		return;
 
+	printk("%p\n", hd);
+	if ((unsigned long)hd < 4000UL)
+		dump_stack();
 	INIT_WORK(&wqw->work, mptspi_dv_renegotiate_work, wqw);
 	wqw->hd = hd;
 
_
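
The idea behind this debug patch is to catch the bad value where the
work item is queued rather than where it is later dereferenced, because
only at queue time does the call stack still identify the caller. Below
is a minimal userspace sketch of the same check. It assumes hypothetical
helper names (`looks_like_junk_pointer`, `check_before_queueing`) and
reuses the patch's threshold of 4000; it is not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Threshold taken from the debug patch; any small value would do. */
#define JUNK_POINTER_LIMIT 4000UL

/* The limit must be large enough to catch the 0x500 value from the oops. */
static_assert(JUNK_POINTER_LIMIT > 0x500, "limit must cover 0x500");

/*
 * Hypothetical helper: a pointer whose numeric value is below the limit
 * cannot refer to a real heap object, so treat it as junk. NULL is
 * flagged as well.
 */
bool looks_like_junk_pointer(const void *p)
{
	return (uintptr_t)p < JUNK_POINTER_LIMIT;
}

/*
 * Hypothetical queue-time check: log the pointer and report whether the
 * caller should dump its stack (the userspace analogue of dump_stack()).
 */
bool check_before_queueing(const void *hd)
{
	printf("%p\n", hd);
	return looks_like_junk_pointer(hd);
}
```

A caller would run `check_before_queueing(hd)` just before handing the
work item to the queue and print a backtrace when it returns true, so
the trace names the function that supplied the junk value.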



* Re: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
  2006-09-29  0:26     ` Andrew Morton
@ 2006-09-29 17:17       ` Bryce Harrington
  0 siblings, 0 replies; 10+ messages in thread
From: Bryce Harrington @ 2006-09-29 17:17 UTC
  To: Andrew Morton; +Cc: linux-kernel, Moore, Eric Dean, linux-scsi

On Thu, Sep 28, 2006 at 05:26:52PM -0700, Andrew Morton wrote:
> On Thu, 28 Sep 2006 15:54:26 -0700
> Bryce Harrington <bryce@osdl.org> wrote:
> 
> > On Thu, Sep 28, 2006 at 02:51:21PM -0700, Andrew Morton wrote:
> > > On Thu, 28 Sep 2006 13:25:48 -0700
> > > Bryce Harrington <bryce@osdl.org> wrote:
> > > 
> > > > Apologies if this has already been reported;
> > > 
> > > It has not.
> > > 
> > > >  I didn't spot it on the
> > > > list.  We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> > > > -git9, but not -git7:
> > > > 
> > > >  mptbase: Initiating ioc0 recovery
> > > >  Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
> > > >   [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> > > >  PGD 0 
> > > >  Oops: 0000 [1] PREEMPT SMP 
> > > 
> > 
> > > That's very clever.  
> > >
> > > I'd be suspecting a miscompile, or something horrid in kfree().
> > > 
> > > Does it change anything if you move that kfree() down a bit?
> > > 
> > 
> > Got essentially the same oops, although the addresses have changed a
> > little:
> > 
> > mptbase: Initiating ioc0 recovery
> > Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
> >  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> > PGD 0
> > Oops: 0000 [1] PREEMPT SMP
> > CPU 0
> > Modules linked in:
> > Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
> > RIP: 0010:[<ffffffff80489aa3>]  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> > RSP: 0000:ffff81003ec65e40  EFLAGS: 00010246
> > RAX: ffff81003ec65ef8 RBX: ffff81003eff6640 RCX: ffff81003ec65ef8
> > RDX: ffff81003ed0cf58 RSI: 0000000000000000 RDI: ffff81003eff6640
> > RBP: 0000000000000500 R08: ffff81003ec64000 R09: 00000000ffffffff
> > R10: 00000000ffffffff R11: ffff81003ed0cf40 R12: ffff81003eff6640
> > R13: 0000000000000213 R14: ffff81003eff6640 R15: ffffffff80489a96
> > FS:  0000000000000000(0000) GS:ffffffff8077a000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
> > Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
> > Stack:  ffff81003eff6640 ffff81003eff6648 ffff81003ed0cf40 ffffffff8023f1bd
> >  ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
> >  00000000fffffffc ffffffff8059457d 0000000000000000 ffffffff8023f30
> > Call Trace:
> >  [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
> >  [<ffffffff8023f204>] worker_thread+0x0/0x12e
> >  [<ffffffff8023f300>] worker_thread+0xfc/0x12e
> >  [<ffffffff80229f62>] default_wake_function+0x0/0xe
> >  [<ffffffff80229f62>] default_wake_function+0x0/0xe
> >  [<ffffffff80242433>] kthread+0xc8/0xf1
> >  [<ffffffff8020a3f8>] child_rip+0xa/0x12
> >  [<ffffffff8024236b>] kthread+0x0/0xf1
> >  [<ffffffff8020a3ee>] child_rip+0x0/0x12
> > 
> > 
> > Code: 48 8b 45 00 48 8b b8 50 01 00 00 e8 5d 4d fe ff 48 85 c0 48
> > RIP  [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> >  RSP <ffff81003ec65e40>
> > CR2: 0000000000000500
> >  <6>mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=185
> >  target0:0:0: dma_alloc_coherent for parameters failed
> > mptscsih: ioc0: attempting task abort! (sc=ffff81003e840c80)
> > scsi 0:0:0:0:
> >         command: cdb[0]=0x12: 12 00 00 00 24 00
> > mptbase: Initiating ioc0 recovery
> > 
> 
> Ah.  Maybe we're simply being passed a junk pointer.  This, please:
> 
> --- a/drivers/message/fusion/mptspi.c~a
> +++ a/drivers/message/fusion/mptspi.c
> @@ -804,6 +804,9 @@ mptspi_dv_renegotiate(struct _MPT_SCSI_H
>  	if (!wqw)
>  		return;
>  
> +	printk("%p\n", hd);
> +	if ((unsigned long)hd < 4000UL)
> +		dump_stack();
>  	INIT_WORK(&wqw->work, mptspi_dv_renegotiate_work, wqw);
>  	wqw->hd = hd;
>  
> _

Here's the stack dump:

mptbase: Initiating ioc0 recovery
0000000000000500

Call Trace:
<IRQ>  [<ffffffff803e0d37>] vgacon_cursor+0x0/0x1a7
[<ffffffff80489b19>] mptspi_dv_renegotiate+0x3e/0x79
[<ffffffff80489b7b>] mptspi_ioc_reset+0x27/0x2e
[<ffffffff804846ea>] mpt_do_ioc_recovery+0x115f/0x11dd
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff80238dd4>] do_timer+0x2f6/0x574
[<ffffffff8023851c>] lock_timer_base+0x1b/0x3c
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8022797a>] task_rq_lock+0x3d/0x6f
[<ffffffff80227d03>] resched_task+0x4e/0x71
[<ffffffff8022847f>] try_to_wake_up+0x3d4/0x3e6
[<ffffffff80461794>] atapi_output_bytes+0x21/0x5c
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8022797a>] task_rq_lock+0x3d/0x6f
[<ffffffff80227d03>] resched_task+0x4e/0x71
[<ffffffff8022847f>] try_to_wake_up+0x3d4/0x3e6
[<ffffffff8020cf66>] main_timer_handler+0x1e6/0x3a6
[<ffffffff80228f29>] find_busiest_group+0x21f/0x66f
[<ffffffff80484f94>] mpt_HardResetHandler+0xb4/0x12c
[<ffffffff8048500c>] mpt_timer_expired+0x0/0x24
[<ffffffff80485017>] mpt_timer_expired+0xb/0x24
[<ffffffff80238a07>] run_timer_softirq+0x156/0x1b2
[<ffffffff802357a1>] __do_softirq+0x46/0xb1
[<ffffffff8020a76c>] call_softirq+0x1c/0x28
[<ffffffff8020bb5f>] do_softirq+0x2c/0x7d
[<ffffffff80207c37>] default_idle+0x0/0x47
[<ffffffff8023583f>] irq_exit+0x33/0x3e
[<ffffffff8020a216>] apic_timer_interrupt+0x66/0x70
<EOI>  [<ffffffff80207c60>] default_idle+0x29/0x47
[<ffffffff80207e3e>] cpu_idle+0x87/0xbe
[<ffffffff80794704>] start_kernel+0x203/0x205
[<ffffffff80794179>] _sinittext+0x179/0x17d

Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP: 
[<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
PGD 0 
Oops: 0000 [1] PREEMPT SMP 
CPU 0 
Modules linked in:
Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
RIP: 0010:[<ffffffff80489aa2>]  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
RSP: 0000:ffff81003ec65e40  EFLAGS: 00010282
RAX: 0000000000000004 RBX: ffff81003eff3640 RCX: 000000000000001e
RDX: 0000000000000003 RSI: 0000000000000213 RDI: 000000000003eff3
RBP: 0000000000000500 R08: ffff81003ed0cf88 R09: ffff81003ed0cf40
R10: ffff81003eff3640 R11: ffff81003ed0cf40 R12: ffff81003ed0cf40
R13: 0000000000000213 R14: ffff81003eff3640 R15: ffffffff80489a96
FS:  0000000000knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
Stack:  0000000000000000 ffff81003eff3640 ffff81003eff3648 ffffffff8023f1bd
ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
00000000fffffffc fd 0000000000000000 ffffffff8023f300
Call Trace:
[<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
[<ffffffff8023f204>] worker_thread+0x0/0x12e
[<ffffffff8023f300>] worker_thread+0xfc/0x12e
[<ffffffff80229f62>] default_wake_function+0x0/0xe
[<ffffffff80229f62>] default_wake_function+0x0/0xe
[<ffffffff80242433>] kthread+0xc8/0xf1
[<ffffffff8020a3f8>] child_rip+0xa/0x12
[<ffffffff8024236b>] kthread+0x0/0xf1
[<ffffffff8020a3ee>] child_rip+0x0/0x12


Code: 48 8b 45 00 31 f6 48 8b b8 50 01 00 00 e8 5c 4d fe ff 48 85 
RIP  [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
RSP <ffff81003ec65e40>
CR2: 0000000000000500
<6>mptbase: Initiating ioc0 recovery
0000000000000500

Call Trace:
<IRQ>  [<ffffffff803e0d37>] vgacon_cursor+0x0/0x1a7
[<ffffffff80489b19>] mptspi_dv_renegotiate+0x3e/0x79
[<ffffffff80489b7b>] mptspi_ioc_reset+0x27/0x2e
[<ffffffff804846ea>] mpt_do_ioc_recovery+0x115f/0x11dd
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff80238dd4>] do_timer+0x2f6/0x574
[<ffffffff8023851c>] lock_timer_base+0x1b/0x3c
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8022797a>] task_rq_lock+0x3d/0x6f
[<ffffffff80227d03>] resched_task+0x4e/0x71
[<ffffffff8022847f>] try_to_wake_up+0x3d4/0x3e6
[<ffffffff80461794>] atapi_output_bytes+0x21/0x5c
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff80238dd4>] do_timer+0x2f6/0x574
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8020cf66>] main_timer_handler+0x1e6/0x3a6
[<ffffffff80228f29>] find_busiest_group+0x21f/0x66f
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff80484f94>] mpt_HardResetHandler+0xb4/0x12c
[<ffffffff8048500c>] mpt_timer_expired+0x0/0x24
[<ffffffff80485017>] mpt_timer_expired+0xb/0x24
[<ffffffff80238a07>] run_timer_softirq+0x156/0x1b2
[<ffffffff802357a1>] __do_softirq+0x46/0xb1
[<ffffffff8020a76c>] call_softirq+0x1c/0x28
[<ffffffff8020bb5f>] do_softirq+0x2c/0x7d
[<ffffffff80207c37>] default_idle+0x0/0x47
[<ffffffff8023583f>] irq_exit+0x33/0x3e
[<ffffffff8020a216>] apic_timer_interrupt+0x66/0x70
<EOI>  [<ffffffff80207c60>] default_idle+0x29/0x47
[<ffffffff80207e3e>] cpu_idle+0x87/0xbe
[<ffffffff80794704>] start_kernel+0x203/0x205
[<ffffffff80794179>] _sinittext+0x179/0x17d


Bryce


* RE: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
@ 2006-09-29 18:29 Moore, Eric
  2006-09-29 21:41 ` Bryce Harrington
  0 siblings, 1 reply; 10+ messages in thread
From: Moore, Eric @ 2006-09-29 18:29 UTC
  To: Bryce Harrington, Andrew Morton; +Cc: linux-kernel, linux-scsi

On Friday, September 29, 2006 11:18 AM, Bryce Harrington wrote:  

> [<ffffffff80484f94>] mpt_HardResetHandler+0xb4/0x12c
> [<ffffffff8048500c>] mpt_timer_expired+0x0/0x24

mpt_timer_expired most likely means we timed out sending a
request for a config page from the firmware.  The timeout results
in a host reset, which results in domain validation being called.
Perhaps the config pages failed before we allocated memory for hd.
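
The hypothesis here is that the reset path can run before the host
structure exists. The following is a minimal sketch of how that could
produce a small non-NULL pointer, and of a guard that would avoid it.
It assumes a hypothetical layout in which the private host data sits
0x500 bytes into the host structure. The `toy_*` types, the function
names and the 0x500 offset are illustrative assumptions chosen to match
the oops, not facts established in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: private host data trails a 0x500-byte header. */
struct toy_shost {
	unsigned char header[0x500];
	unsigned char hostdata[16];
};

struct toy_ioc {
	struct toy_shost *sh;   /* NULL until the SCSI host has been allocated */
};

static_assert(offsetof(struct toy_shost, hostdata) == 0x500,
	      "toy layout must place hostdata at offset 0x500");

/*
 * Numeric value that "address of sh->hostdata" would have for a given sh.
 * Integer arithmetic is used so the sketch never forms an invalid pointer.
 */
uintptr_t hostdata_address(uintptr_t sh_value)
{
	return sh_value + offsetof(struct toy_shost, hostdata);
}

/* Guard for a reset callback: only renegotiate once the host exists. */
bool should_renegotiate(const struct toy_ioc *ioc)
{
	return ioc != NULL && ioc->sh != NULL;
}
```

With `sh_value` equal to 0, `hostdata_address()` returns 0x500. That
value is nonzero, so a plain NULL test on the derived pointer would not
catch it, which is why the guard in this sketch tests `ioc->sh` itself.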

Can you enable debug messages in the driver Makefile by turning
on the MPT_DEBUG_CONFIG line?  That way we can find out which
config page failed.

There were some changes in scsi_transport_spi.c that occurred
between 2.6.18-git1 and 2.6.18-git2.  I doubt these changes
would have affected this.  Can you determine between which
git releases this problem began occurring?

Also, can you describe your configuration?  For example, which
kind of devices are you using, and are they U320 devices
or older ones, such as U160?

Eric


* Re: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
  2006-09-29 18:29 Moore, Eric
@ 2006-09-29 21:41 ` Bryce Harrington
  0 siblings, 0 replies; 10+ messages in thread
From: Bryce Harrington @ 2006-09-29 21:41 UTC
  To: Moore, Eric; +Cc: Andrew Morton, linux-kernel, linux-scsi

On Fri, Sep 29, 2006 at 12:29:55PM -0600, Moore, Eric wrote:
> On Friday, September 29, 2006 11:18 AM, Bryce Harrington wrote:  
> 
> > [<ffffffff80484f94>] mpt_HardResetHandler+0xb4/0x12c
> > [<ffffffff8048500c>] mpt_timer_expired+0x0/0x24
> 
> mpt_timer_expired most likely means we timed out sending a
> request for a config page from the firmware.  The timeout results
> in a host reset, which results in domain validation being called.
> Perhaps the config pages failed before we allocated memory for hd.
> 
> Can you enable debug messages in the driver Makefile by turning
> on the MPT_DEBUG_CONFIG line?  That way we can find out which
> config page failed.

Sure; not sure what the interesting part is, but here's the full log
from this:

   http://crucible.osdl.org/runs/2265/sysinfo/amd01.2.console

> There were some changes in scsi_transport_spi.c that occurred
> between 2.6.18-git1 and 2.6.18-git2.  I doubt these changes
> would have affected this.  Can you determine between which
> git releases this problem began occurring?

I found that the problem did not occur with -git7, but did occur with
-git8, 9, 10, 11, and 12.  I didn't check kernels prior to that but
could if you think it would help.
 
> Also, can you describe your configuration?  For example, which
> kind of devices are you using, and are they U320 devices
> or older ones, such as U160?

Sure.  Yes, there are two U320 SCSI hard disks.

    Host:               amd01
    Kernel:             2.6.12-gentoo-r10
    Distribution:       gentoo 1.6.14
    Memory:             2053852 kB
    Arch:               x86_64
    CPU(s):             2x AMD Opteron(tm) Processor 242

SCSI:
     *-pci:1
          description: PCI bridge
          product: AMD-8131 PCI-X Bridge
          vendor: Advanced Micro Devices [AMD]
          physical id: 2
          bus info: pci@00:02.0
          version: 12
          width: 32 bits
          clock: 66MHz
          capabilities: pci normal_decode bus_master cap_list
        *-scsi:0
             description: SCSI storage controller
             product: 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI
             vendor: LSI Logic / Symbios Logic
             physical id: 1
             bus info: pci@02:01.0
             version: 07
             width: 64 bits
             clock: 33MHz
             capabilities: scsi bus_master cap_list
             configuration: driver=mptbase
             resources: ioport:c400-c4ff iomemory:fe980000-fe98ffff iomemory:fe970000-fe97ffff irq:185
        *-scsi:1
             description: SCSI storage controller
             product: 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI
             vendor: LSI Logic / Symbios Logic
             physical id: 1.1
             bus info: pci@02:01.1
             version: 07
             width: 64 bits
             clock: 33MHz
             capabilities: scsi bus_master cap_list
             configuration: driver=mptbase
             resources: ioport:c800-c8ff iomemory:fe9f0000-fe9fffff iomemory:fe9e0000-fe9effff irq:193


PCI:

    00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
    00:01.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])
    00:02.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
    00:02.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])
    00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07) (prog-if 00 [Normal decode])
    00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)
    00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-8111 IDE (rev 03) (prog-if 8a [Master SecP PriP])
    00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-8111 ACPI (rev 05)
    00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
    00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
    00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
    00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
    00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
    00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
    00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
    00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
    01:00.0 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b) (prog-if 10 [OHCI])
    01:00.1 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b) (prog-if 10 [OHCI])
    01:04.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27) (prog-if 00 [VGA])
    02:01.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
    02:01.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
    02:03.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)
    02:03.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)

More info about this machine can be found here (for a different testrun).
The INFO directory has the full output from lshw:

    http://crucible.osdl.org/runs/2284/sysinfo/amd01.1/

Bryce


* RE: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
@ 2006-09-30  0:10 Moore, Eric
  2006-09-30  0:27 ` Bryce Harrington
  0 siblings, 1 reply; 10+ messages in thread
From: Moore, Eric @ 2006-09-30  0:10 UTC (permalink / raw)
  To: Bryce Harrington; +Cc: Andrew Morton, linux-kernel, linux-scsi

On Friday, September 29, 2006 3:41 PM, Bryce Harrington wrote: 
> > Can you enable debug messages in the driver Makefile, for
> > the line called MPT_DEBUG_CONFIG; that way we can find out which
> > config page failed.  
> 
> Sure; not sure what the interesting part is, but here's the full log
> from this:
> 
>    http://crucible.osdl.org/runs/2265/sysinfo/amd01.2.console
> 


Thanks.  It appears you enabled MPT_DEBUG instead of MPT_DEBUG_CONFIG.
All the "WaitForDoorbell" debugs are from that.  Can you recheck your
Makefile?

Thanks,
Eric


* Re: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
  2006-09-30  0:10 [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work Moore, Eric
@ 2006-09-30  0:27 ` Bryce Harrington
       [not found]   ` <664A4EBB07F29743873A87CF62C26D702A994F@NAMAIL4.ad.lsil.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Bryce Harrington @ 2006-09-30  0:27 UTC (permalink / raw)
  To: Moore, Eric; +Cc: Andrew Morton, linux-kernel, linux-scsi

On Fri, Sep 29, 2006 at 06:10:42PM -0600, Moore, Eric wrote:
> On Friday, September 29, 2006 3:41 PM, Bryce Harrington wrote: 
> > > Can you enable debug messages in the driver Makefile, for
> > > the line called MPT_DEBUG_CONFIG; that way we can find out which
> > > config page failed.  
> > 
> > Sure; not sure what the interesting part is, but here's the full log
> > from this:
> > 
> >    http://crucible.osdl.org/runs/2265/sysinfo/amd01.2.console
> > 
> 
> 
> Thanks.  It appears you enabled MPT_DEBUG instead of MPT_DEBUG_CONFIG.
> All the "WaitForDoorbell" debugs are from that.  Can you recheck your
> Makefile?

Does this look better?

    http://crucible.osdl.org/runs/2265/sysinfo/amd01.3.console

Bryce



* Re: [OOPS] -git8,9:  NULL pointer dereference in mptspi_dv_renegotiate_work
       [not found]   ` <664A4EBB07F29743873A87CF62C26D702A994F@NAMAIL4.ad.lsil.com>
@ 2006-09-30 21:55     ` Bryce Harrington
  0 siblings, 0 replies; 10+ messages in thread
From: Bryce Harrington @ 2006-09-30 21:55 UTC (permalink / raw)
  To: Moore, Eric; +Cc: Andrew Morton, linux-kernel, linux-scsi

On Fri, Sep 29, 2006 at 10:22:50PM -0600, Moore, Eric wrote:
> On Fri 9/29/2006 6:27 PM, Bryce Harrington wrote:
> 
> > Does this look better?
> >
> >    http://crucible.osdl.org/runs/2265/sysinfo/amd01.3.console
> 
> 
> It appears that the problem is we're not receiving interrupts.
> The first command after interrupts are enabled is not getting a
> response back from firmware, thus timing out.   I noticed in the log
> it's saying interrupt is at 185, but apparently the INT line is not
> getting raised.  
> 
> In addition, I understand why the panic occurs.  You've compiled the
> drivers into the kernel, instead of as modules.  i.e. if you compiled
> as modules, mptspi wouldn't have been called while mptbase is loaded,
> as in your case.  I guess we would need to add a sanity check for that
> case. I'm usually testing as modules.

Ah, that could explain it; when doing testing we do compile everything
in.  So it sounds like we could eliminate the panic by compiling as a
module.  However, is it intended that the driver should work when
compiled in as well?  If so, I'd be happy to do additional testing to
verify any fixes worth trying out.
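
For illustration only, the kind of sanity check described above might
look roughly like the following user-space C sketch.  It is not the
actual mptspi code: the struct and function names (fake_hd, fake_work,
renegotiate_work) are hypothetical stand-ins, and the real fix may well
look different.  The idea is simply that a deferred work handler bails
out when the state it depends on was never set up, instead of
dereferencing it.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for per-host driver data (assumption, not the
 * real mptspi structure). */
struct fake_hd {
    int ioc_id;
};

/* Hypothetical work item; hd stays NULL if host setup never completed. */
struct fake_work {
    struct fake_hd *hd;
};

/* Returns 0 if the work ran, -1 if it was skipped because the state it
 * needs is missing. */
static int renegotiate_work(struct fake_work *w)
{
    /* Sanity check: refuse to touch state that was never initialized. */
    if (w == NULL || w->hd == NULL) {
        fprintf(stderr, "renegotiate: host data not set up, skipping\n");
        return -1;
    }

    printf("renegotiating on ioc%d\n", w->hd->ioc_id);
    return 0;
}
```

With a check like this, a work item queued before (or after a failed)
setup is skipped with a message rather than faulting on a bad pointer.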

> Besides, we need to understand why your interrupt controller is not
> generating interrupts.

We typically boot every -mm, -git, -rc and mainline kernel on this
machine, but it's only been relatively recently that this particular
behavior has occurred.  Could this suggest that there was a regression
due to recent changes?

Thanks,
Bryce

