From: Randy Dunlap <randy.dunlap@oracle.com>
To: scsi <linux-scsi@vger.kernel.org>
Cc: "Miller, Mike (OS Dev)" <Mike.Miller@hp.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	lkml <linux-kernel@vger.kernel.org>,
	akpm <akpm@linux-foundation.org>
Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr
Date: Tue, 18 Nov 2008 13:32:45 -0800
Message-ID: <4923347D.8000205@oracle.com>
In-Reply-To: <4923237B.9090707@xenotime.net>

Randy Dunlap wrote:
> Randy Dunlap wrote:
>> Miller, Mike (OS Dev) wrote:
>>>> -----Original Message-----
>>>> From: Randy Dunlap [mailto:randy.dunlap@oracle.com]
>>>> Sent: Thursday, September 25, 2008 3:40 PM
>>>> To: scsi
>>>> Cc: Jens Axboe; Miller, Mike (OS Dev); James Bottomley; lkml; akpm
>>>> Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr
>>>>
>>>> On Thu, 25 Sep 2008 13:33:07 -0700 Randy Dunlap wrote:
>>>>
>>>>> Jens Axboe wrote:
>>>>>> On Thu, Sep 04 2008, Miller, Mike (OS Dev) wrote:
>>>>>>>>>>> 0x3bb2 <do_cciss_intr+1649>:    mov    0x2(%r8),%dx
>>>>>>>>>>> 0x3bb7 <do_cciss_intr+1654>:    test   %dx,%dx
>>>>>>>>>>> 0x3bba <do_cciss_intr+1657>:    je     0x3f0e <do_cciss_intr+2509>
>>>>>>>>>>> $ addr2line -e cciss.o -f  do_cciss_intr+0x627 SA5_fifo_full
>>>>>>>>>>> /home/rdunlap/linsrc/linux-2.6.27-rc3-git7/drivers/block/cciss.h:206
>>>>>>>>>> OK ...that's confusing.  It seems to be saying that ctrlr_info_t
>>>>>>>>>> * was NULL.  However, I can't see a way of getting into the fifo_full
>>>>>>>>>> callback from do_cciss_intr ..
>>>>>>>>>> especially not with a NULL host.
>>>>>>>>>>
>>>>>>>>>> James
>>>>>>>>> That is weird. Even if we could get there fifo_full doesn't
>>>>>>>>> do anything but wait for a bit.
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> This just happened again.  This time it's on 2.6.27-rc5-git3.
>>>>>>>>
>>>>>>>> ~Randy
>>>>>>> Thanks Randy. I think. :)
>>>>>>>
>>>>>>> I'll try to recreate in my lab.
>>>>>> This looks somewhat strange, mostly like 'c' is NULL and it's
>>>>>> oopsing in removeQ (I don't think Randy's analysis is correct in
>>>>>> assuming it's 'h' and it's in fifo_full). Given that 'c' cannot be
>>>>>> NULL, it's c->prev or c->next that are NULL.
>> This BUG: has happened (now) 5 times today.  Higher frequency than usual for
>> some reason.
>>
>> I enabled CCISS_DEBUG and added one printk in removeQ().  On the first call
> 
> s/first/second/
> 
> 
>> to removeQ(), both c->next and c->prev are NULL.
>>
>> Here's the kernel log output from cciss:

I added a printk() in addQ() as well.  Here's the new output:

HP CISS Driver (v 3.6.20)
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 54
cciss 0000:42:08.0: PCI INT A -> Link[LNKA] -> GSI 54 (level, high) -> IRQ 54
command = 147
irq = 36
board_id = 3211103c
cciss 0000:42:08.0: irq 87 for MSI/MSI-X
address 0 = fdf80000
cfg base address = 10
cfg base address index = 0
cfg offset = 400
Controller Configuration information
------------------------------------
   Signature = CISS
   Spec Number = 1
   Transport methods supported = 0x6
   Transport methods active = 0x3
   Requested transport Method = 0x0
   Coalesce Interrupt Delay = 0x0
   Coalesce Interrupt Count = 0x1
   Max outstanding commands = 0x256
   Bus Types = 0x200000
   Server Name =
   Heartbeat Counter = 0x1672


Trying to put board into Simple mode
I counter got to 1 0
Controller Configuration information
------------------------------------
   Signature = CISS
   Spec Number = 1
   Transport methods supported = 0x6
   Transport methods active = 0x3
   Requested transport Method = 0x0
   Coalesce Interrupt Delay = 0x0
   Coalesce Interrupt Count = 0x1
   Max outstanding commands = 0x256
   Bus Types = 0x200000
   Server Name =
   Heartbeat Counter = 0x1672


cciss0: <0x3238> at PCI 0000:42:08.0 IRQ 87 using DAC
cciss: intr_pending 8
cciss: addQ: Qptr=ffff88027e0100b8, c=ffff88007f83e000
cciss: removeQ: Qptr=ffff88027e0100b8, c=ffff88007f83e000, next=ffff88007f83e000, prev=ffff88007f83e000
Sending 7f83e000 - down to controller
cciss: addQ: Qptr=ffff88027e0100c0, c=ffff88007f83e000
cciss: intr_pending 8
cciss:  Read 4 back from board
cciss: removeQ: Qptr=ffff88027e0100c0, c=ffff88007f840000, next=0000000000000000, prev=0000000000000000
BUG: unable to handle kernel NULL pointer dereference at 0000000000000248
IP: [<ffffffffa0025106>] do_cciss_intr+0x706/0xb6c [cciss]
PGD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/block/ram15/dev
CPU 2
Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
Pid: 0, comm: swapper Not tainted 2.6.28-rc5 #1
RIP: 0010:[<ffffffffa0025106>]  [<ffffffffa0025106>] do_cciss_intr+0x706/0xb6c [cciss]
RSP: 0018:ffff88027f643ee8  EFLAGS: 00010087
RAX: 0000000000000000 RBX: ffff88007f840000 RCX: 000000000000a44f
RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffffffff8080e634
RBP: ffff88027f643f18 R08: 0000000000000000 R09: ffff88017e964800
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88027e000000
R13: 0000000000000000 R14: 0000000000000057 R15: 0000000000000086
FS:  00000000008558f0(0000) GS:ffff88017fc01c80(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000248 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff88017fa9e000, task ffff88017fa5d400)
Stack:
 0000000000000030 ffff88027f627500 0000000000000000 0000000000000000
 0000000000000057 0000000000000000 ffff88027f643f48 ffffffff8026a8b9
 ffffffff8074ab00 0000000000000057 ffff88027f627500 ffffffff8074ab58
Call Trace:
 <IRQ> <0> [<ffffffff8026a8b9>] handle_IRQ_event+0x27/0x57
 [<ffffffff8026c424>] handle_edge_irq+0xde/0x11f
 [<ffffffff8020e29b>] do_IRQ+0xfc/0x175
 [<ffffffff8020c3e6>] ret_from_intr+0x0/0xa
 <EOI> <0> [<ffffffff8023c7d2>] ? ksoftirqd+0x0/0xa6
 [<ffffffff80212575>] ? default_idle+0x2b/0x40
 [<ffffffff80212799>] ? c1e_idle+0xe5/0xec
 [<ffffffff8056a7f6>] ? atomic_notifier_call_chain+0xf/0x11
 [<ffffffff8020acd1>] ? cpu_idle+0x40/0x5e
 [<ffffffff8056284e>] ? start_secondary+0x174/0x179
Code: 8b 83 48 02 00 00 48 39 d8 74 37 49 39 9c 24 c0 00 01 00 75 08 49 89 84 24 c0 00 01 00 48 8b 83 40 02 00 00 48 8b 93 48 02 00 00 <48> 89 90 48 02 00 00 48 8b 93 48 02 00 00 48 89 82 40 02 00 00
RIP  [<ffffffffa0025106>] do_cciss_intr+0x706/0xb6c [cciss]
 RSP <ffff88027f643ee8>
CR2: 0000000000000248
Kernel panic - not syncing: Fatal exception in interrupt
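
Reading the Code: line by hand: the bytes at the <48> marker,
48 89 90 48 02 00 00, decode as "mov %rdx,0x248(%rax)".  RAX is 0 in the
register dump, so the store lands at address 0x248, which matches CR2, and
"Oops: 0002" indicates a write to a not-present page.  The two loads just
before it read offsets 0x240 and 0x248 from RBX (the 'c' printed by removeQ),
which would be consistent with c->prev->next = c->next if prev sits at 0x240
and next at 0x248; that field layout is an inference from the bytes, not
something checked against the source.  The C sketch below only redoes the
address arithmetic; the helper names are made up for illustration and are
not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit displacement from instruction bytes. */
static uint32_t disp32_le(const uint8_t *p)
{
	assert(p != NULL);
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Effective address of a "mov %reg, disp32(%base)" store. */
static uint64_t store_address(uint64_t base, uint32_t disp)
{
	return base + disp;
}

/*
 * Faulting instruction from the Code: line, starting at the <48> marker:
 * REX.W prefix, opcode 0x89, ModRM 0x90, then the 4-byte displacement.
 */
static const uint8_t faulting_insn[] = {
	0x48, 0x89, 0x90, 0x48, 0x02, 0x00, 0x00
};
```

With RAX = 0 as the base, store_address(0, disp32_le(faulting_insn + 3))
evaluates to 0x248, the CR2 value reported above.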

~Randy
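
For readers following along: in the log above, the command passed to the
failing removeQ() (c=ffff88007f840000) is not the one that was added to that
queue (c=ffff88007f83e000), and its next/prev are both NULL.  The sketch below
is a simplified model of a circular doubly linked queue, not the driver's
actual code; struct cmd, add_q(), is_linked() and remove_q_checked() are
hypothetical names.  It shows why unlinking a command that was never queued
dereferences NULL, and how a link check would turn that into an error return.
It does not verify that c is on the particular queue *q.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's command structure. */
struct cmd {
	struct cmd *prev;
	struct cmd *next;
};

/* Append c to the tail of the circular list headed by *q. */
static void add_q(struct cmd **q, struct cmd *c)
{
	assert(q != NULL && c != NULL);
	if (*q == NULL) {
		*q = c;
		c->next = c->prev = c;	/* single element points at itself */
	} else {
		c->prev = (*q)->prev;
		c->next = *q;
		(*q)->prev->next = c;
		(*q)->prev = c;
	}
}

/* A command with NULL links was never queued (or was already removed). */
static int is_linked(const struct cmd *c)
{
	return c != NULL && c->next != NULL && c->prev != NULL;
}

/*
 * Unlink c from the list headed by *q.  Returns 0 on success, or -1 if c
 * has NULL links; without this check, c->prev->next below would write
 * through a NULL pointer, as in the oops.
 */
static int remove_q_checked(struct cmd **q, struct cmd *c)
{
	if (!is_linked(c))
		return -1;
	if (c->next == c) {
		*q = NULL;		/* c was the only element */
	} else {
		if (*q == c)
			*q = c->next;
		c->prev->next = c->next;
		c->next->prev = c->prev;
	}
	c->next = c->prev = NULL;	/* mark as no longer queued */
	return 0;
}
```

A command that was zero-initialised and never passed to add_q() makes
remove_q_checked() return -1 instead of faulting.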


Thread overview: 36+ messages
2008-08-21  5:52 BUG: in 2.6.23-rc3-git7 in do_cciss_intr rdunlap
2008-08-21  7:16 ` Andrew Morton
2008-08-21 14:26 ` Miller, Mike (OS Dev)
2008-08-21 15:43   ` Randy Dunlap
2008-08-21 15:48     ` Miller, Mike (OS Dev)
2008-08-21 16:15       ` Randy Dunlap
2008-08-21 16:25         ` Miller, Mike (OS Dev)
2008-08-22  0:26           ` Randy Dunlap
2008-08-22 15:48             ` Miller, Mike (OS Dev)
2008-08-22 15:54               ` James Bottomley
2008-08-22 16:49                 ` Randy Dunlap
2008-08-22 17:02                   ` James Bottomley
2008-08-22 18:25                     ` Miller, Mike (OS Dev)
2008-09-04 16:59                       ` Randy Dunlap
2008-09-04 18:00                         ` Miller, Mike (OS Dev)
2008-09-05  9:28                           ` Jens Axboe
2008-09-25 20:33                             ` Randy Dunlap
2008-09-25 20:40                               ` Randy Dunlap
2008-09-25 20:56                                 ` Miller, Mike (OS Dev)
2008-11-18 20:14                                   ` Randy Dunlap
2008-11-18 20:20                                     ` Randy Dunlap
2008-11-18 21:32                                       ` Randy Dunlap [this message]
2008-11-18 21:32                                         ` Randy Dunlap
2008-11-19  8:52                                         ` Jens Axboe
2008-11-19 17:00                                           ` Miller, Mike (OS Dev)
2008-11-19 17:22                                             ` Randy Dunlap
2008-11-19 17:27                                               ` Miller, Mike (OS Dev)
2008-11-19 17:29                                                 ` Jens Axboe
2008-11-19 19:15                                                   ` Miller, Mike (OS Dev)
2008-11-19 20:46                                                     ` Jens Axboe
2008-11-20  9:13                                                       ` Jens Axboe
2008-11-20 16:41                                                         ` Miller, Mike (OS Dev)
2008-11-20 17:50                                                           ` Jens Axboe
2008-11-20 19:12                                                             ` Miller, Mike (OS Dev)
2008-11-19 17:18                                           ` Randy Dunlap
2008-11-18 21:32                                     ` Miller, Mike (OS Dev)
