linux-scsi.vger.kernel.org archive mirror
* [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops
@ 2009-05-14 18:17 bugzilla-daemon
  2009-05-28  8:00 ` Andrew Morton
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: bugzilla-daemon @ 2009-05-14 18:17 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=13311

           Summary: mptsas: ioc0: removing ssp device, kernel oops
           Product: SCSI Drivers
           Version: 2.5
    Kernel Version: 2.6.27.21
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
        AssignedTo: scsi_drivers-other@kernel-bugs.osdl.org
        ReportedBy: mike.tummy@gmail.com
        Regression: No


Created an attachment (id=21358)
 --> (http://bugzilla.kernel.org/attachment.cgi?id=21358)
System information

Distribution: openSUSE 11.1 (x86_64)

SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express 
Fusion-MPT SAS (rev 08)

This system had a kernel oops in the mptsas driver, as a result of which the
port was detached.  This removed the disks from the system, causing processes
like hald and umount to block permanently.

I've attached the system information.

Here are the syslog messages for the oops:
------------------------>8 Cut Here 8<------------------------
May 12 05:19:59 tile01-primary kernel: mptsas: ioc0: removing ssp device: fw_channel 0, fw_id 9, phy 8, sas_addr 0x5000155356664400
May 12 05:19:59 tile01-primary kernel:  phy-6:0:16: mptsas: ioc0: delete phy 8, phy-obj (0xffff8804367ed400)
May 12 05:19:59 tile01-primary kernel:  phy-6:0:17: mptsas: ioc0: delete phy 9, phy-obj (0xffff8804367f0400)
May 12 05:19:59 tile01-primary kernel:  phy-6:0:18: mptsas: ioc0: delete phy 10, phy-obj (0xffff8804367f0c00)
May 12 05:19:59 tile01-primary kernel:  phy-6:0:19: mptsas: ioc0: delete phy 11, phy-obj (0xffff8804367f4400)
May 12 05:19:59 tile01-primary kernel:  port-6:0:1: mptsas: ioc0: delete port 1, sas_addr (0x5000155356664400)
May 12 05:19:59 tile01-primary kernel: end_request: I/O error, dev sdc, sector 10975
May 12 05:19:59 tile01-primary kernel: REISERFS abort (device sdc1): Journal write error in flush_commit_list
May 12 05:20:20 tile01-primary kernel: mptsas: ioc0: removing ssp device: fw_channel 0, fw_id 9, phy 9, sas_addr 0x5000155356664400
May 12 05:20:20 tile01-primary kernel:  phy-6:0:16: mptsas: ioc0: delete phy 8, phy-obj (0xffff8804367ed400)
May 12 05:20:20 tile01-primary kernel:  phy-6:0:17: mptsas: ioc0: delete phy 9, phy-obj (0xffff8804367f0400)
May 12 05:20:20 tile01-primary kernel:  phy-6:0:18: mptsas: ioc0: delete phy 10, phy-obj (0xffff8804367f0c00)
May 12 05:20:20 tile01-primary kernel:  phy-6:0:19: mptsas: ioc0: delete phy 11, phy-obj (0xffff8804367f4400)
May 12 05:20:20 tile01-primary kernel:  port-6:0:1: mptsas: ioc0: delete port 1, sas_addr (0x5000155356664400)
May 12 05:20:20 tile01-primary kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
May 12 05:20:20 tile01-primary kernel: IP: [<ffffffff802fe7c2>] sysfs_find_dirent+0x9/0x2f
May 12 05:20:20 tile01-primary kernel: PGD 8350bd067 PUD 831964067 PMD 0 
May 12 05:20:20 tile01-primary kernel: Oops: 0000 [1] SMP 
May 12 05:20:20 tile01-primary kernel: last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
May 12 05:20:20 tile01-primary kernel: CPU 5 
May 12 05:20:20 tile01-primary kernel: Modules linked in: reiserfs ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit 8021q garp stp bonding ip6t_REJECT nf_conntrack_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack ip_tables ip6table_filter ip6_tables x_tables ipv6 cpufreq_conservative cpufreq_userspace cpufreq_powersave powernow_k8 ext3 jbd mbcache loop dm_mod cfi_cmdset_0002(N) cfi_util(N) jedec_probe(N) cfi_probe(N) gen_probe(N) ck804xrom(N) mtd sr_mod qla3xxx rtc_cmos joydev button i2c_nforce2 shpchp cdrom forcedeth rtc_core chipreg(N) mptctl i2c_core map_funcs(N) pcspkr rtc_lib sg pci_hotplug usbhid hid ff_memless ohci_hcd ehci_hcd sd_mod crc_t10dif usbcore qla4xxx scsi_transport_iscsi edd xfs fan 3w_9xxx ide_pci_generic amd74xx ide_core ata_generic pata_amd mptsas mptscsih mptbase scsi_transport_sas sata_nv libata scsi_mod dock thermal processor thermal_sys hwmon [last unloaded: libcrc32c]
May 12 05:20:20 tile01-primary kernel: Supported: No
May 12 05:20:20 tile01-primary kernel: Pid: 210, comm: mpt/0 Tainted: G         2.6.27.21-0.1-default #1
May 12 05:20:20 tile01-primary kernel: RIP: 0010:[<ffffffff802fe7c2>]  [<ffffffff802fe7c2>] sysfs_find_dirent+0x9/0x2f
May 12 05:20:20 tile01-primary kernel: RSP: 0018:ffff880434479c00  EFLAGS: 00010286
May 12 05:20:20 tile01-primary kernel: RAX: ffff8804367edda8 RBX: ffffffff805b9885 RCX: ffff8804367edda8
May 12 05:20:20 tile01-primary kernel: RDX: ffff8804367edda8 RSI: ffffffff805b9885 RDI: 0000000000000000
May 12 05:20:20 tile01-primary kernel: RBP: ffffffff805b9885 R08: ffff880400000030 R09: ffff880400000030
May 12 05:20:20 tile01-primary kernel: R10: 0000000000000010 R11: 0000000000018620 R12: 0000000000000000
May 12 05:20:20 tile01-primary kernel: R13: ffff8804367edcf0 R14: ffff8804367edcf0 R15: ffff8804367ede60
May 12 05:20:20 tile01-primary kernel: FS:  00007f9eb74ce6f0(0000) GS:ffff88083657c2c0(0000) knlGS:0000000000000000
May 12 05:20:20 tile01-primary kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
May 12 05:20:20 tile01-primary kernel: CR2: 0000000000000028 CR3: 0000000834c1d000 CR4: 00000000000006e0
May 12 05:20:20 tile01-primary kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 12 05:20:20 tile01-primary kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 12 05:20:20 tile01-primary kernel: Process mpt/0 (pid: 210, threadinfo ffff880434478000, task ffff880434f0e3c0)
May 12 05:20:20 tile01-primary kernel: Stack:  ffff880434479c40 ffffffff805b9885 0000000000000000 ffffffff802fe8be
May 12 05:20:20 tile01-primary kernel:  ffff8804365278a0 ffff8804367edc00 ffffffff807032a0 ffffffff802ffd64
May 12 05:20:20 tile01-primary kernel:  ffff880436527778 ffff8804367edc00 ffff8804367edc00 ffff880435cc0038
May 12 05:20:20 tile01-primary kernel: Call Trace:
May 12 05:20:20 tile01-primary kernel:  [<ffffffff802fe8be>] sysfs_get_dirent+0x24/0x59
May 12 05:20:20 tile01-primary kernel:  [<ffffffff802ffd64>] sysfs_remove_group+0x24/0xce
May 12 05:20:20 tile01-primary kernel:  [<ffffffff803e3b07>] device_del+0x1b/0x1ad
May 12 05:20:20 tile01-primary kernel:  [<ffffffffa008591d>] sas_port_delete+0x10d/0x129 [scsi_transport_sas]
May 12 05:20:20 tile01-primary kernel:  [<ffffffffa00b3a6a>] mptsas_delete_expander_siblings+0x3f/0xb5 [mptsas]
May 12 05:20:20 tile01-primary kernel:  [<ffffffffa00b38d2>] mptsas_expander_delete+0xb6/0x20f [mptsas]
May 12 05:20:20 tile01-primary kernel:  [<ffffffffa00b3b8e>] mptsas_send_expander_event+0xae/0xc2 [mptsas]
May 12 05:20:20 tile01-primary kernel:  [<ffffffffa00b5fd4>] mptsas_firmware_event_work+0x1dc/0x200 [mptsas]
May 12 05:20:20 tile01-primary kernel:  [<ffffffff8024c88d>] run_workqueue+0x7a/0x100
May 12 05:20:20 tile01-primary kernel:  [<ffffffff8024c9eb>] worker_thread+0xd8/0xe7
May 12 05:20:20 tile01-primary kernel:  [<ffffffff8024f9e7>] kthread+0x47/0x73
May 12 05:20:20 tile01-primary kernel:  [<ffffffff8020cf79>] child_rip+0xa/0x11
May 12 05:20:20 tile01-primary kernel: 
May 12 05:20:20 tile01-primary kernel: 
May 12 05:20:20 tile01-primary kernel: Code: c7 10 0d 6f 80 e8 bf c8 19 00 5a 5b 5d 41 5c 41 5d 41 5e 31 c0 41 5f c3 48 c7 45 38 ff ff ff 7f eb dc 55 48 89 f5 53 48 83 ec 08 <48> 8b 5f 28 eb 14 48 8b 7b 18 48 89 ee e8 50 52 06 00 85 c0 74 
May 12 05:20:20 tile01-primary kernel: RIP  [<ffffffff802fe7c2>] sysfs_find_dirent+0x9/0x2f
May 12 05:20:20 tile01-primary kernel:  RSP <ffff880434479c00>
May 12 05:20:20 tile01-primary kernel: CR2: 0000000000000028
------------------------>8 Cut Here 8<------------------------

The system needed a reset to recover, because so many processes (and the
processes depending on them) were blocked trying to access the devices.
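
A note for anyone decoding the oops above: the faulting address (CR2) is
0x28 while RDI is zero, and the bytes at the fault marker in the Code: line,
<48> 8b 5f 28, are an x86-64 load from offset 0x28 off RDI.  That reads as
sysfs_find_dirent() being handed a NULL first argument and faulting on a
field at offset 0x28.  Below is a minimal Python sketch of that check.  It
assumes the oops text is on single (unwrapped) lines, the Code: line is
shortened to an excerpt around the marker, and the helper names (register,
fault_offset, faulting_bytes) are made up for illustration.

```python
import re

# Three lines taken from the oops above; the Code: line is an excerpt.
OOPS = """\
RDX: ffff8804367edda8 RSI: ffffffff805b9885 RDI: 0000000000000000
CR2: 0000000000000028 CR3: 0000000834c1d000 CR4: 00000000000006e0
Code: 48 83 ec 08 <48> 8b 5f 28 eb 14 48 8b 7b 18
"""

def register(text, name):
    """Return the value of a register dumped as 'NAME: <16 hex digits>'."""
    match = re.search(rf"\b{name}: ([0-9a-f]{{16}})\b", text)
    if match is None:
        raise ValueError(f"{name} not found in oops text")
    return int(match.group(1), 16)

def fault_offset(text, base_reg):
    """Offset of the faulting address (CR2) from the given base register."""
    return register(text, "CR2") - register(text, base_reg)

def faulting_bytes(text, count=4):
    """Opcode bytes starting at the <xx> marker in the Code: line."""
    code = re.search(r"Code: (.*)", text).group(1).split()
    start = next(i for i, b in enumerate(code) if b.startswith("<"))
    return [b.strip("<>") for b in code[start:start + count]]

print(hex(fault_offset(OOPS, "RDI")))   # offset into the NULL structure
print(" ".join(faulting_bytes(OOPS)))   # bytes of the faulting load
```

This only shows where the NULL came from at the instruction level; it says
nothing about which caller let the pointer go stale.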

-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops
  2009-05-14 18:17 [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops bugzilla-daemon
@ 2009-05-28  8:00 ` Andrew Morton
  2009-05-28 11:54   ` Kay Sievers
  2009-06-09 21:27   ` Mike Loseke
  2009-05-28  8:01 ` [Bug 13311] " bugzilla-daemon
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 13+ messages in thread
From: Andrew Morton @ 2009-05-28  8:00 UTC (permalink / raw)
  To: mike.tummy; +Cc: linux-scsi, bugzilla-daemon, Kay Sievers, Greg KH


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 14 May 2009 18:17:10 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=13311
> 
>            Summary: mptsas: ioc0: removing ssp device, kernel oops

I'd have thought that the severity of this problem is not matched by
the response.

>            Product: SCSI Drivers
>            Version: 2.5
>     Kernel Version: 2.6.27.21

Is it reproducible?  If so, is there any chance that it can be retested
under a 2.6.29-based kernel?

Thanks.

<searches through a wordwrapped mess.  Sigh>

> May 12 05:20:20 tile01-primary kernel: RIP: 0010:[<ffffffff802fe7c2>]  [<ffffffff802fe7c2>] sysfs_find_dirent+0x9/0x2f

OK, I assume that the scsi driver did something bad to sysfs and
that sysfs then fell on its face.

Really, given the frequency and imaginativeness with which drivers
abuse sysfs, the driver-core should be more robust.

Kay, would you have time to plunk through this and see if we can
strengthen the sysfs code a bit so it doesn't crash?

But the core bug is presumably in the mptsas driver.  Perhaps you can
also see if you can work out what it did wrong?

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 13311] mptsas: ioc0: removing ssp device, kernel oops
  2009-05-14 18:17 [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops bugzilla-daemon
  2009-05-28  8:00 ` Andrew Morton
@ 2009-05-28  8:01 ` bugzilla-daemon
  2009-05-28 11:54 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2009-05-28  8:01 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=13311





--- Comment #1 from Andrew Morton <akpm@linux-foundation.org>  2009-05-28 08:01:06 ---


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops
  2009-05-28  8:00 ` Andrew Morton
@ 2009-05-28 11:54   ` Kay Sievers
  2009-06-09 21:27   ` Mike Loseke
  1 sibling, 0 replies; 13+ messages in thread
From: Kay Sievers @ 2009-05-28 11:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mike.tummy, linux-scsi, bugzilla-daemon, Greg KH

On Thu, May 28, 2009 at 10:00, Andrew Morton <akpm@linux-foundation.org> wrote:

>>            Product: SCSI Drivers
>>            Version: 2.5
>>     Kernel Version: 2.6.27.21
>
> Is it reproducible?  If so, is there any chance that it can be retested
> under a 2.6.29-based kernel?

This driver version:
  MPT_LINUX_VERSION_COMMON       "4.00.43.00suse"
seems not in the upstream kernel:
  MPT_LINUX_VERSION_COMMON       "3.04.07"

The failing code is only in the new driver. This bug should move to
the Novell bugzilla, so the people who added the driver to the SUSE
kernel can check with LSI directly.

> OK, I assume that the scsi driver did something bad to sysfs and
> that sysfs then fell on its face.
>
> Really, given the frequency and imaginativeness with which drivers
> abuse sysfs, the driver-core should be more robust.
>
> Kay, would you have time to plunk through this and see if we can
> strengthen the sysfs code a bit so it doesn't crash?

Might be a missing lock/wrong refcounting issue in the caller, where
two threads try to remove the same thing. With Eric's upcoming changes
to the cleanup logic in sysfs, this should be handled better.
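
To illustrate the kind of caller-side bug suggested here (two contexts
tearing down the same object), the following is a user-space Python analogy,
not kernel code and not the mptsas implementation.  The Port class and its
delete() method are invented for the sketch.  The point is only that the
"already removed?" check and the teardown have to happen under one lock;
otherwise the second caller operates on state the first has already torn
down.

```python
import threading

class Port:
    """Stand-in for an object that must be torn down exactly once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sysfs_dir = {"name": "port-6:0:1"}  # pretend sysfs state
        self.teardowns = 0

    def delete(self):
        """Tear down the port; a second caller becomes a no-op."""
        with self._lock:
            if self._sysfs_dir is None:
                return False          # someone else already removed it
            self._sysfs_dir = None    # drop the state while holding the lock
            self.teardowns += 1
            return True

port = Port()
results = []
workers = [threading.Thread(target=lambda: results.append(port.delete()))
           for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(sorted(results), port.teardowns)
```

Exactly one of the two callers performs the teardown; the other sees the
cleared state and returns without touching it.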

Thanks,
Kay
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 13311] mptsas: ioc0: removing ssp device, kernel oops
  2009-05-14 18:17 [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops bugzilla-daemon
  2009-05-28  8:00 ` Andrew Morton
  2009-05-28  8:01 ` [Bug 13311] " bugzilla-daemon
@ 2009-05-28 11:54 ` bugzilla-daemon
  2009-06-09 21:27 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2009-05-28 11:54 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=13311





--- Comment #2 from Kay Sievers <kay.sievers@vrfy.org>  2009-05-28 11:54:23 ---


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops
  2009-05-28  8:00 ` Andrew Morton
  2009-05-28 11:54   ` Kay Sievers
@ 2009-06-09 21:27   ` Mike Loseke
  2009-06-09 21:52     ` Andrew Morton
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Loseke @ 2009-06-09 21:27 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-scsi, bugzilla-daemon, Kay Sievers, Greg KH

[-- Attachment #1: Type: text/plain, Size: 1532 bytes --]

On Thu, May 28, 2009 at 2:00 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Thu, 14 May 2009 18:17:10 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=13311
> >
> >            Summary: mptsas: ioc0: removing ssp device, kernel oops
>
> I'd have thought that the severity of this problem is not matched by
> the response.
>
> >            Product: SCSI Drivers
> >            Version: 2.5
> >     Kernel Version: 2.6.27.21
>
> Is it reproducible?  If so, is there any chance that it can be retested
> under a 2.6.29-based kernel?

We've put a 2.6.29 kernel on these two systems and experienced another
kernel oops yesterday.  So far, we haven't been able to reproduce it
on demand, but it has occurred under a heavier system load each time
(load average of 16, with writes of 2,000 blocks/sec every 5 seconds to
the devices attached using the mptsas driver).

The oops from yesterday isn't identical to the previous oops, but the
end result is the same: the system has to be rebooted.  I've
attached the log capture of that oops.

The system is identical to the original specs, just the kernel has changed:

# cat /proc/version
Linux version 2.6.29.4-0.1-default (root@tile01-primary) (gcc version
4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP Tue May
26 22:50:58 CDT 2009

Hopefully this is helpful.

Mike

[-- Attachment #2: tile01-secondary.oops --]
[-- Type: application/octet-stream, Size: 7912 bytes --]

Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Unhandled error code
Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Jun  8 17:06:10 tile01-secondary kernel: end_request: I/O error, dev sda, sector 207
Jun  8 17:06:10 tile01-secondary kernel: device-mapper: multipath: Failing path 8:0.
Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Unhandled error code
Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Jun  8 17:06:10 tile01-secondary kernel: end_request: I/O error, dev sda, sector 65679
Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff88021e08e880)
Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f0 87 00 04 00 00
Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff88021e08e880)
Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff880106684dc0)
Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f4 87 00 04 00 00
Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff880106684dc0)
Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff8803b0a131c0)
Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f8 87 00 04 00 00
Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff8803b0a131c0)
Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff8803b0a13ec0)
Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 fc 87 00 00 08 00
Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff8803b0a13ec0)
Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff8803b0a13cc0)
Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 fc 8f 00 04 00 00
Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff8803b0a13cc0)
Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting bus reset! (sc=ffff88021e08e880)
Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f0 87 00 04 00 00
Jun  8 17:06:11 tile01-secondary kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
Jun  8 17:06:11 tile01-secondary kernel: IP: [<ffffffffa008cc98>] mptscsih_bus_reset+0x97/0xfa [mptscsih]
Jun  8 17:06:11 tile01-secondary kernel: PGD 82944c067 PUD 82e4e9067 PMD 0 
Jun  8 17:06:11 tile01-secondary kernel: Oops: 0000 [#1] SMP 
Jun  8 17:06:11 tile01-secondary kernel: last sysfs file: /sys/kernel/uevent_seqnum
Jun  8 17:06:11 tile01-secondary kernel: CPU 1 
Jun  8 17:06:11 tile01-secondary kernel: Modules linked in: reiserfs dm_round_robin ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp iptable_filter dm_multipath scsi_dh ip_tables iscsi_trgt crc32c x_tables 8021q garp stp bonding ipv6 cpufreq_conservative cpufreq_userspace cpufreq_powersave powernow_k8 ext3 jbd mbcache loop dm_mod qla4xxx scsi_transport_iscsi qla3xxx rtc_cmos i2c_nforce2 rtc_core rtc_lib shpchp forcedeth pcspkr joydev serio_raw mptctl pci_hotplug i2c_core button sr_mod sg cdrom usbhid hid ohci_hcd ehci_hcd sd_mod crc_t10dif usbcore edd xfs exportfs fan 3w_9xxx ide_pci_generic amd74xx ide_core ata_generic thermal processor thermal_sys hwmon sata_nv mptsas mptscsih mptbase scsi_transport_sas pata_amd libata scsi_mod
Jun  8 17:06:11 tile01-secondary kernel: Pid: 175, comm: scsi_eh_2 Not tainted 2.6.29.4-0.1-default #1 H8DM3-2
Jun  8 17:06:11 tile01-secondary kernel: RIP: 0010:[<ffffffffa008cc98>]  [<ffffffffa008cc98>] mptscsih_bus_reset+0x97/0xfa [mptscsih]
Jun  8 17:06:11 tile01-secondary kernel: RSP: 0018:ffff88083354ddb0  EFLAGS: 00010203
Jun  8 17:06:11 tile01-secondary kernel: RAX: ffff8804359cb002 RBX: ffff88043368a560 RCX: ffff88021e08e880
Jun  8 17:06:11 tile01-secondary kernel: RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88043368a560
Jun  8 17:06:11 tile01-secondary kernel: RBP: ffff88083354dde0 R08: 0000000000000002 R09: 0000000000000000
Jun  8 17:06:11 tile01-secondary kernel: R10: ffffffff80d7e600 R11: 0000000000000010 R12: ffff88021e08e880
Jun  8 17:06:11 tile01-secondary kernel: R13: ffff8804335a3000 R14: ffff8804335a3008 R15: ffff88083354dee0
Jun  8 17:06:11 tile01-secondary kernel: FS:  00007f66c7122740(0000) GS:ffff88043596edc0(0000) knlGS:0000000000000000
Jun  8 17:06:11 tile01-secondary kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jun  8 17:06:11 tile01-secondary kernel: CR2: 0000000000000000 CR3: 000000082d955000 CR4: 00000000000006e0
Jun  8 17:06:11 tile01-secondary kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun  8 17:06:11 tile01-secondary kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun  8 17:06:11 tile01-secondary kernel: Process scsi_eh_2 (pid: 175, threadinfo ffff88083354c000, task ffff8808331082c0)
Jun  8 17:06:11 tile01-secondary kernel: Stack:
Jun  8 17:06:11 tile01-secondary kernel:  ffff8804337b4810 0000000000000000 ffff88021e08e880 0000000000002003
Jun  8 17:06:11 tile01-secondary kernel:  ffff8804359cb000 0000000000000000 ffff88083354de00 ffffffffa00034ee
Jun  8 17:06:11 tile01-secondary kernel:  ffff88021e08e880 0000000000000000 ffff88083354de60 ffffffffa000441f
Jun  8 17:06:11 tile01-secondary kernel: Call Trace:
Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa00034ee>] scsi_try_bus_reset+0x52/0xde [scsi_mod]
Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa000441f>] scsi_eh_ready_devs+0x4c3/0x737 [scsi_mod]
Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa0004bfe>] scsi_error_handler+0x37d/0x51b [scsi_mod]
Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff8022f2ea>] ? __wake_up_common+0x46/0x76
Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa0004881>] ? scsi_error_handler+0x0/0x51b [scsi_mod]
Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff80251952>] kthread+0x49/0x76
Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff8020d03a>] child_rip+0xa/0x20
Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff80251909>] ? kthread+0x0/0x76
Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff8020d030>] ? child_rip+0x0/0x20
Jun  8 17:06:11 tile01-secondary kernel: Code: 00 48 83 f8 ff 74 0a 48 ff c0 48 89 83 b0 00 00 00 49 8b 04 24 48 89 df be 04 00 00 00 48 8b 90 88 00 00 00 41 8a 85 98 00 00 00 <48> 8b 12 3c 01 19 c0 45 31 c9 45 31 c0 83 e0 1e 31 c9 0f b6 52 
Jun  8 17:06:11 tile01-secondary kernel: RIP  [<ffffffffa008cc98>] mptscsih_bus_reset+0x97/0xfa [mptscsih]
Jun  8 17:06:11 tile01-secondary kernel:  RSP <ffff88083354ddb0>
Jun  8 17:06:11 tile01-secondary kernel: CR2: 0000000000000000
Jun  8 17:06:11 tile01-secondary kernel: ---[ end trace 54f83dcc0f7b0b26 ]---
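
The CDB lines in this capture can be read by hand.  A WRITE(10) command
descriptor block is 10 bytes: opcode 0x2a, a 4-byte big-endian logical block
address in bytes 2-5, and a 2-byte big-endian transfer length (in blocks) in
bytes 7-8.  The short Python sketch below decodes the five commands that
were task-aborted before the bus reset; the function name decode_write10 is
chosen for illustration.

```python
import struct

def decode_write10(cdb_hex):
    """Return (lba, blocks) from a 10-byte WRITE(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    # Byte 0 opcode, 1 flags, 2-5 LBA, 6 group, 7-8 length, 9 control.
    _op, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return lba, blocks

# The five CDBs logged alongside "attempting task abort!" above.
ABORTED = [
    "2a 00 00 00 f0 87 00 04 00 00",
    "2a 00 00 00 f4 87 00 04 00 00",
    "2a 00 00 00 f8 87 00 04 00 00",
    "2a 00 00 00 fc 87 00 00 08 00",
    "2a 00 00 00 fc 8f 00 04 00 00",
]

extents = [decode_write10(c) for c in ABORTED]
for lba, blocks in extents:
    print(f"LBA {lba:#x}, {blocks} blocks")

# Each write starts where the previous one ended.
contiguous = all(a[0] + a[1] == b[0] for a, b in zip(extents, extents[1:]))
print("contiguous:", contiguous)
```

Decoded this way, the aborted commands form one contiguous run of writes
starting at LBA 0xf087, which fits a single large sequential write that was
in flight when the path failed.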

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 13311] mptsas: ioc0: removing ssp device, kernel oops
  2009-05-14 18:17 [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops bugzilla-daemon
                   ` (2 preceding siblings ...)
  2009-05-28 11:54 ` bugzilla-daemon
@ 2009-06-09 21:27 ` bugzilla-daemon
  2009-06-09 21:30 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2009-06-09 21:27 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=13311





--- Comment #3 from Mike Loseke <mike.tummy@gmail.com>  2009-06-09 21:27:07 ---


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 13311] mptsas: ioc0: removing ssp device, kernel oops
  2009-05-14 18:17 [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops bugzilla-daemon
                   ` (3 preceding siblings ...)
  2009-06-09 21:27 ` bugzilla-daemon
@ 2009-06-09 21:30 ` bugzilla-daemon
  2009-06-09 21:53 ` bugzilla-daemon
  2009-07-16 21:41 ` bugzilla-daemon
  6 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2009-06-09 21:30 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=13311





--- Comment #4 from Mike Loseke <mike.tummy@gmail.com>  2009-06-09 21:30:46 ---
Created an attachment (id=21835)
 --> (http://bugzilla.kernel.org/attachment.cgi?id=21835)
log entries for the 2.6.29 oops, re message #3


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops
  2009-06-09 21:27   ` Mike Loseke
@ 2009-06-09 21:52     ` Andrew Morton
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2009-06-09 21:52 UTC (permalink / raw)
  To: Mike Loseke; +Cc: linux-scsi, bugzilla-daemon, kay.sievers, greg, Eric Moore

On Tue, 9 Jun 2009 15:27:05 -0600
Mike Loseke <mike.tummy@gmail.com> wrote:

> On Thu, May 28, 2009 at 2:00 AM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> >
> > (switched to email.  Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
> >
> > On Thu, 14 May 2009 18:17:10 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
> >
> > > http://bugzilla.kernel.org/show_bug.cgi?id=13311
> > >
> > >            Summary: mptsas: ioc0: removing ssp device, kernel oops
> >
> > I'd have thought that the severity of this problem is not matched by
> > the response.
> >
> > >            Product: SCSI Drivers
> > >            Version: 2.5
> > >     Kernel Version: 2.6.27.21
> >
> > Is it reproducible?  If so, is there any chance that it can be retested
> > under a 2.6.29-based kernel?
> 
> We've put a 2.6.29 kernel on these two systems and experienced another
> kernel oops yesterday.  So far, we haven't been able to reproduce it
> on demand, but it has occurred under a heavier system load each time
> (load average of 16, with writes of 2,000 blocks/sec every 5 seconds to
> the devices attached using the mptsas driver).
> 
> The oops from yesterday isn't identical to the previous oops, but the
> end result is the same: the system has to be rebooted.  I've
> attached the log capture of that oops.
> 
> The system is identical to the original specs, just the kernel has changed:
> 
> # cat /proc/version
> Linux version 2.6.29.4-0.1-default (root@tile01-primary) (gcc version
> 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP Tue May
> 26 22:50:58 CDT 2009
> 
> Hopefully this is helpful.
> 

So we have two issues here.  One is the IO errors - are they unexpected?

The other of course is that mptscsih_bus_reset() oopsed when trying to
handle those errors.
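
For what it's worth, here is a rough reading of the "Code:" line in the
oops quoted below.  The bytes marked <48> 8b 12 look like a 64-bit load
through RDX, and the register dump shows RDX = 0 and CR2 = 0, which is
consistent with the reported NULL pointer dereference.  The Python sketch
below decodes only that one instruction form; decode_simple_load and
REGS64 are names made up for illustration and are not part of any kernel
tool.  It does not establish which pointer in mptscsih_bus_reset() was
NULL.

```python
# Register names selected by a 3-bit ModRM field when REX.R and REX.B are clear.
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_simple_load(insn: bytes) -> str:
    """Decode only the 3-byte form REX.W + 8B /r with a register-indirect source."""
    if len(insn) != 3 or insn[0] != 0x48 or insn[1] != 0x8B:
        raise ValueError("only '48 8b <modrm>' is handled in this sketch")
    modrm = insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod != 0 means a displacement or register operand; rm 4/5 mean SIB or RIP-relative.
    if mod != 0 or rm in (4, 5):
        raise ValueError("SIB, displacement and RIP-relative forms are not handled")
    # AT&T syntax, as used by objdump on kernel code: source first, destination last.
    return f"mov (%{REGS64[rm]}),%{REGS64[reg]}"


# The three bytes at the faulting RIP, taken from the "Code:" line.
faulting = bytes.fromhex("48 8b 12")
print(decode_simple_load(faulting))  # prints: mov (%rdx),%rdx
```

A disassembly of the actual mptscsih module (for example with objdump)
would be the authoritative way to confirm this.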


> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Unhandled error code
> Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> Jun  8 17:06:10 tile01-secondary kernel: end_request: I/O error, dev sda, sector 207
> Jun  8 17:06:10 tile01-secondary kernel: device-mapper: multipath: Failing path 8:0.
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Unhandled error code
> Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> Jun  8 17:06:10 tile01-secondary kernel: end_request: I/O error, dev sda, sector 65679
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff88021e08e880)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f0 87 00 04 00 00
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff88021e08e880)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff880106684dc0)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f4 87 00 04 00 00
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff880106684dc0)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff8803b0a131c0)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f8 87 00 04 00 00
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff8803b0a131c0)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff8803b0a13ec0)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 fc 87 00 00 08 00
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff8803b0a13ec0)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff8803b0a13cc0)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 fc 8f 00 04 00 00
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff8803b0a13cc0)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting bus reset! (sc=ffff88021e08e880)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f0 87 00 04 00 00
> Jun  8 17:06:11 tile01-secondary kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
> Jun  8 17:06:11 tile01-secondary kernel: IP: [<ffffffffa008cc98>] mptscsih_bus_reset+0x97/0xfa [mptscsih]
> Jun  8 17:06:11 tile01-secondary kernel: PGD 82944c067 PUD 82e4e9067 PMD 0 
> Jun  8 17:06:11 tile01-secondary kernel: Oops: 0000 [#1] SMP 
> Jun  8 17:06:11 tile01-secondary kernel: last sysfs file: /sys/kernel/uevent_seqnum
> Jun  8 17:06:11 tile01-secondary kernel: CPU 1 
> Jun  8 17:06:11 tile01-secondary kernel: Modules linked in: reiserfs dm_round_robin ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp iptable_filter dm_multipath scsi_dh ip_tables iscsi_trgt crc32c x_tables 8021q garp stp bonding ipv6 cpufreq_conservative cpufreq_userspace cpufreq_powersave powernow_k8 ext3 jbd mbcache loop dm_mod qla4xxx scsi_transport_iscsi qla3xxx rtc_cmos i2c_nforce2 rtc_core rtc_lib shpchp forcedeth pcspkr joydev serio_raw mptctl pci_hotplug i2c_core button sr_mod sg cdrom usbhid hid ohci_hcd ehci_hcd sd_mod crc_t10dif usbcore edd xfs exportfs fan 3w_9xxx ide_pci_generic amd74xx ide_core ata_generic thermal processor thermal_sys hwmon sata_nv mptsas mptscsih mptbase scsi_transport_sas pata_amd libata scsi_mod
> Jun  8 17:06:11 tile01-secondary kernel: Pid: 175, comm: scsi_eh_2 Not tainted 2.6.29.4-0.1-default #1 H8DM3-2
> Jun  8 17:06:11 tile01-secondary kernel: RIP: 0010:[<ffffffffa008cc98>]  [<ffffffffa008cc98>] mptscsih_bus_reset+0x97/0xfa [mptscsih]
> Jun  8 17:06:11 tile01-secondary kernel: RSP: 0018:ffff88083354ddb0  EFLAGS: 00010203
> Jun  8 17:06:11 tile01-secondary kernel: RAX: ffff8804359cb002 RBX: ffff88043368a560 RCX: ffff88021e08e880
> Jun  8 17:06:11 tile01-secondary kernel: RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88043368a560
> Jun  8 17:06:11 tile01-secondary kernel: RBP: ffff88083354dde0 R08: 0000000000000002 R09: 0000000000000000
> Jun  8 17:06:11 tile01-secondary kernel: R10: ffffffff80d7e600 R11: 0000000000000010 R12: ffff88021e08e880
> Jun  8 17:06:11 tile01-secondary kernel: R13: ffff8804335a3000 R14: ffff8804335a3008 R15: ffff88083354dee0
> Jun  8 17:06:11 tile01-secondary kernel: FS:  00007f66c7122740(0000) GS:ffff88043596edc0(0000) knlGS:0000000000000000
> Jun  8 17:06:11 tile01-secondary kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> Jun  8 17:06:11 tile01-secondary kernel: CR2: 0000000000000000 CR3: 000000082d955000 CR4: 00000000000006e0
> Jun  8 17:06:11 tile01-secondary kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jun  8 17:06:11 tile01-secondary kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jun  8 17:06:11 tile01-secondary kernel: Process scsi_eh_2 (pid: 175, threadinfo ffff88083354c000, task ffff8808331082c0)
> Jun  8 17:06:11 tile01-secondary kernel: Stack:
> Jun  8 17:06:11 tile01-secondary kernel:  ffff8804337b4810 0000000000000000 ffff88021e08e880 0000000000002003
> Jun  8 17:06:11 tile01-secondary kernel:  ffff8804359cb000 0000000000000000 ffff88083354de00 ffffffffa00034ee
> Jun  8 17:06:11 tile01-secondary kernel:  ffff88021e08e880 0000000000000000 ffff88083354de60 ffffffffa000441f
> Jun  8 17:06:11 tile01-secondary kernel: Call Trace:
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa00034ee>] scsi_try_bus_reset+0x52/0xde [scsi_mod]
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa000441f>] scsi_eh_ready_devs+0x4c3/0x737 [scsi_mod]
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa0004bfe>] scsi_error_handler+0x37d/0x51b [scsi_mod]
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff8022f2ea>] ? __wake_up_common+0x46/0x76
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa0004881>] ? scsi_error_handler+0x0/0x51b [scsi_mod]
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff80251952>] kthread+0x49/0x76
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff8020d03a>] child_rip+0xa/0x20
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff80251909>] ? kthread+0x0/0x76
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff8020d030>] ? child_rip+0x0/0x20
> Jun  8 17:06:11 tile01-secondary kernel: Code: 00 48 83 f8 ff 74 0a 48 ff c0 48 89 83 b0 00 00 00 49 8b 04 24 48 89 df be 04 00 00 00 48 8b 90 88 00 00 00 41 8a 85 98 00 00 00 <48> 8b 12 3c 01 19 c0 45 31 c9 45 31 c0 83 e0 1e 31 c9 0f b6 52 
> Jun  8 17:06:11 tile01-secondary kernel: RIP  [<ffffffffa008cc98>] mptscsih_bus_reset+0x97/0xfa [mptscsih]
> Jun  8 17:06:11 tile01-secondary kernel:  RSP <ffff88083354ddb0>
> Jun  8 17:06:11 tile01-secondary kernel: CR2: 0000000000000000
> Jun  8 17:06:11 tile01-secondary kernel: ---[ end trace 54f83dcc0f7b0b26 ]---
> 
> 
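
As an aid to reading the log above, the Python sketch below splits the
two kinds of raw values it contains: the WRITE(10) CDBs of the aborted
commands and the LogInfo word.  The WRITE(10) layout (opcode 0x2a,
big-endian LBA in bytes 2-5, transfer length in bytes 7-8) is the
standard SCSI one.  The LogInfo bit layout (bus type, originator, code,
subcode) is an assumption based on how the driver prints the word, and
parse_write10 and split_sas_loginfo are hypothetical helper names, not
driver functions.

```python
def parse_write10(cdb_hex: str) -> dict:
    """Split a WRITE(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),     # starting logical block
        "blocks": int.from_bytes(cdb[7:9], "big"),  # transfer length in blocks
    }


def split_sas_loginfo(word: int) -> dict:
    """Split a 32-bit LogInfo word (assumed layout: 4/4/8/16 bits, high to low)."""
    return {
        "bus_type": (word >> 28) & 0xF,
        "originator": (word >> 24) & 0xF,
        "code": (word >> 16) & 0xFF,
        "subcode": word & 0xFFFF,
    }


# CDBs of the five aborted writes, copied from the log.
for cdb in (
    "2a 00 00 00 f0 87 00 04 00 00",
    "2a 00 00 00 f4 87 00 04 00 00",
    "2a 00 00 00 f8 87 00 04 00 00",
    "2a 00 00 00 fc 87 00 00 08 00",
    "2a 00 00 00 fc 8f 00 04 00 00",
):
    print(parse_write10(cdb))

print(split_sas_loginfo(0x31140000))
```

Read this way, the five aborted commands are back-to-back writes: each
one starts at the block where the previous one ends.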


* [Bug 13311] mptsas: ioc0: removing ssp device, kernel oops
  2009-05-14 18:17 [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops bugzilla-daemon
                   ` (4 preceding siblings ...)
  2009-06-09 21:30 ` bugzilla-daemon
@ 2009-06-09 21:53 ` bugzilla-daemon
  2009-07-16 21:41 ` bugzilla-daemon
  6 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2009-06-09 21:53 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=13311





--- Comment #5 from Andrew Morton <akpm@linux-foundation.org>  2009-06-09 21:53:09 ---
On Tue, 9 Jun 2009 15:27:05 -0600
Mike Loseke <mike.tummy@gmail.com> wrote:

> On Thu, May 28, 2009 at 2:00 AM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> >
> > (switched to email.  Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
> >
> > On Thu, 14 May 2009 18:17:10 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
> >
> > > http://bugzilla.kernel.org/show_bug.cgi?id=13311
> > >
> > >            Summary: mptsas: ioc0: removing ssp device, kernel oops
> >
> > I'd have thought that the severity of this problem is not matched by
> > the response.
> >
> > >            Product: SCSI Drivers
> > >            Version: 2.5
> > >     Kernel Version: 2.6.27.21
> >
> > Is it reproducible?  If so, is there any chance that it can be retested
> > under a 2.6.29-based kernel?
> 
> We've put a 2.6.29 kernel on these two systems and experienced another
> kernel oops yesterday.  So far, we haven't been able to reproduce it
> on demand, but it has occurred under a heavier system load each time
> (load average of 16, with writes of 2,000 blocks/sec every 5 seconds to
> the devices attached using the mptsas driver).
> 
> The oops from yesterday isn't identical to the previous oops, but the
> end result is the same: the system has to be rebooted.  I've
> attached the log capture of the oops.
> 
> The system is identical to the original specs; only the kernel has changed:
> 
> # cat /proc/version
> Linux version 2.6.29.4-0.1-default (root@tile01-primary) (gcc version
> 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP Tue May
> 26 22:50:58 CDT 2009
> 
> Hopefully this is helpful.
> 

So we have two issues here.  One is the IO errors - are they unexpected?

The other of course is that mptscsih_bus_reset() oopsed when trying to
handle those errors.


> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Unhandled error code
> Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> Jun  8 17:06:10 tile01-secondary kernel: end_request: I/O error, dev sda, sector 207
> Jun  8 17:06:10 tile01-secondary kernel: device-mapper: multipath: Failing path 8:0.
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Unhandled error code
> Jun  8 17:06:10 tile01-secondary kernel: sd 2:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> Jun  8 17:06:10 tile01-secondary kernel: end_request: I/O error, dev sda, sector 65679
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:10 tile01-secondary kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff88021e08e880)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f0 87 00 04 00 00
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff88021e08e880)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff880106684dc0)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f4 87 00 04 00 00
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff880106684dc0)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff8803b0a131c0)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f8 87 00 04 00 00
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff8803b0a131c0)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff8803b0a13ec0)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 fc 87 00 00 08 00
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff8803b0a13ec0)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting task abort! (sc=ffff8803b0a13cc0)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 fc 8f 00 04 00 00
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff8803b0a13cc0)
> Jun  8 17:06:11 tile01-secondary kernel: mptscsih: ioc0: attempting bus reset! (sc=ffff88021e08e880)
> Jun  8 17:06:11 tile01-secondary kernel: scsi 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 00 f0 87 00 04 00 00
> Jun  8 17:06:11 tile01-secondary kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
> Jun  8 17:06:11 tile01-secondary kernel: IP: [<ffffffffa008cc98>] mptscsih_bus_reset+0x97/0xfa [mptscsih]
> Jun  8 17:06:11 tile01-secondary kernel: PGD 82944c067 PUD 82e4e9067 PMD 0 
> Jun  8 17:06:11 tile01-secondary kernel: Oops: 0000 [#1] SMP 
> Jun  8 17:06:11 tile01-secondary kernel: last sysfs file: /sys/kernel/uevent_seqnum
> Jun  8 17:06:11 tile01-secondary kernel: CPU 1 
> Jun  8 17:06:11 tile01-secondary kernel: Modules linked in: reiserfs dm_round_robin ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp iptable_filter dm_multipath scsi_dh ip_tables iscsi_trgt crc32c x_tables 8021q garp stp bonding ipv6 cpufreq_conservative cpufreq_userspace cpufreq_powersave powernow_k8 ext3 jbd mbcache loop dm_mod qla4xxx scsi_transport_iscsi qla3xxx rtc_cmos i2c_nforce2 rtc_core rtc_lib shpchp forcedeth pcspkr joydev serio_raw mptctl pci_hotplug i2c_core button sr_mod sg cdrom usbhid hid ohci_hcd ehci_hcd sd_mod crc_t10dif usbcore edd xfs exportfs fan 3w_9xxx ide_pci_generic amd74xx ide_core ata_generic thermal processor thermal_sys hwmon sata_nv mptsas mptscsih mptbase scsi_transport_sas pata_amd libata scsi_mod
> Jun  8 17:06:11 tile01-secondary kernel: Pid: 175, comm: scsi_eh_2 Not tainted 2.6.29.4-0.1-default #1 H8DM3-2
> Jun  8 17:06:11 tile01-secondary kernel: RIP: 0010:[<ffffffffa008cc98>]  [<ffffffffa008cc98>] mptscsih_bus_reset+0x97/0xfa [mptscsih]
> Jun  8 17:06:11 tile01-secondary kernel: RSP: 0018:ffff88083354ddb0  EFLAGS: 00010203
> Jun  8 17:06:11 tile01-secondary kernel: RAX: ffff8804359cb002 RBX: ffff88043368a560 RCX: ffff88021e08e880
> Jun  8 17:06:11 tile01-secondary kernel: RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88043368a560
> Jun  8 17:06:11 tile01-secondary kernel: RBP: ffff88083354dde0 R08: 0000000000000002 R09: 0000000000000000
> Jun  8 17:06:11 tile01-secondary kernel: R10: ffffffff80d7e600 R11: 0000000000000010 R12: ffff88021e08e880
> Jun  8 17:06:11 tile01-secondary kernel: R13: ffff8804335a3000 R14: ffff8804335a3008 R15: ffff88083354dee0
> Jun  8 17:06:11 tile01-secondary kernel: FS:  00007f66c7122740(0000) GS:ffff88043596edc0(0000) knlGS:0000000000000000
> Jun  8 17:06:11 tile01-secondary kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> Jun  8 17:06:11 tile01-secondary kernel: CR2: 0000000000000000 CR3: 000000082d955000 CR4: 00000000000006e0
> Jun  8 17:06:11 tile01-secondary kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jun  8 17:06:11 tile01-secondary kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jun  8 17:06:11 tile01-secondary kernel: Process scsi_eh_2 (pid: 175, threadinfo ffff88083354c000, task ffff8808331082c0)
> Jun  8 17:06:11 tile01-secondary kernel: Stack:
> Jun  8 17:06:11 tile01-secondary kernel:  ffff8804337b4810 0000000000000000 ffff88021e08e880 0000000000002003
> Jun  8 17:06:11 tile01-secondary kernel:  ffff8804359cb000 0000000000000000 ffff88083354de00 ffffffffa00034ee
> Jun  8 17:06:11 tile01-secondary kernel:  ffff88021e08e880 0000000000000000 ffff88083354de60 ffffffffa000441f
> Jun  8 17:06:11 tile01-secondary kernel: Call Trace:
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa00034ee>] scsi_try_bus_reset+0x52/0xde [scsi_mod]
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa000441f>] scsi_eh_ready_devs+0x4c3/0x737 [scsi_mod]
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa0004bfe>] scsi_error_handler+0x37d/0x51b [scsi_mod]
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff8022f2ea>] ? __wake_up_common+0x46/0x76
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffffa0004881>] ? scsi_error_handler+0x0/0x51b [scsi_mod]
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff80251952>] kthread+0x49/0x76
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff8020d03a>] child_rip+0xa/0x20
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff80251909>] ? kthread+0x0/0x76
> Jun  8 17:06:11 tile01-secondary kernel:  [<ffffffff8020d030>] ? child_rip+0x0/0x20
> Jun  8 17:06:11 tile01-secondary kernel: Code: 00 48 83 f8 ff 74 0a 48 ff c0 48 89 83 b0 00 00 00 49 8b 04 24 48 89 df be 04 00 00 00 48 8b 90 88 00 00 00 41 8a 85 98 00 00 00 <48> 8b 12 3c 01 19 c0 45 31 c9 45 31 c0 83 e0 1e 31 c9 0f b6 52 
> Jun  8 17:06:11 tile01-secondary kernel: RIP  [<ffffffffa008cc98>] mptscsih_bus_reset+0x97/0xfa [mptscsih]
> Jun  8 17:06:11 tile01-secondary kernel:  RSP <ffff88083354ddb0>
> Jun  8 17:06:11 tile01-secondary kernel: CR2: 0000000000000000
> Jun  8 17:06:11 tile01-secondary kernel: ---[ end trace 54f83dcc0f7b0b26 ]---
> 
>

-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.


* [Bug 13311] mptsas: ioc0: removing ssp device, kernel oops
  2009-05-14 18:17 [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops bugzilla-daemon
                   ` (5 preceding siblings ...)
  2009-06-09 21:53 ` bugzilla-daemon
@ 2009-07-16 21:41 ` bugzilla-daemon
  6 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2009-07-16 21:41 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=13311





--- Comment #6 from Mike Loseke <mike.tummy@gmail.com>  2009-07-16 21:41:39 ---
Sorry for the delay in getting back on this.

On this system, the IO errors may not be completely unexpected.  The
disk is presented by a Promise RAID, and a few times now both of its
controllers have reset, which seems to cause these kernel oopses in
some cases (not every time, and not on both of the hosts connected to
the Promise this way during the same reset event).
We're actively working with Promise on the controller reset issue.

I do have some more dmesg/log output for another Oops that happened
today - what's the best way to present that information here?

Mike


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.


* [Bug 13311] mptsas: ioc0: removing ssp device, kernel oops
       [not found] <bug-13311-11613@https.bugzilla.kernel.org/>
@ 2012-06-07 10:49 ` bugzilla-daemon
  2012-06-07 10:49 ` bugzilla-daemon
  1 sibling, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2012-06-07 10:49 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=13311


Alan <alan@lxorguk.ukuu.org.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |alan@lxorguk.ukuu.org.uk
         Resolution|                            |OBSOLETE




-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.


* [Bug 13311] mptsas: ioc0: removing ssp device, kernel oops
       [not found] <bug-13311-11613@https.bugzilla.kernel.org/>
  2012-06-07 10:49 ` bugzilla-daemon
@ 2012-06-07 10:49 ` bugzilla-daemon
  1 sibling, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2012-06-07 10:49 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=13311


Alan <alan@lxorguk.ukuu.org.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED




-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.


Thread overview: 13+ messages
2009-05-14 18:17 [Bug 13311] New: mptsas: ioc0: removing ssp device, kernel oops bugzilla-daemon
2009-05-28  8:00 ` Andrew Morton
2009-05-28 11:54   ` Kay Sievers
2009-06-09 21:27   ` Mike Loseke
2009-06-09 21:52     ` Andrew Morton
2009-05-28  8:01 ` [Bug 13311] " bugzilla-daemon
2009-05-28 11:54 ` bugzilla-daemon
2009-06-09 21:27 ` bugzilla-daemon
2009-06-09 21:30 ` bugzilla-daemon
2009-06-09 21:53 ` bugzilla-daemon
2009-07-16 21:41 ` bugzilla-daemon
     [not found] <bug-13311-11613@https.bugzilla.kernel.org/>
2012-06-07 10:49 ` bugzilla-daemon
2012-06-07 10:49 ` bugzilla-daemon
