[Bug 12120] New: [Block layer or SCSI] requests aborted too early during check

public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed

* [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition()
@ 2008-11-29 13:19 bugme-daemon
  2008-11-29 15:12 ` Stefan Richter
                   ` (12 more replies)
  0 siblings, 13 replies; 21+ messages in thread
From: bugme-daemon @ 2008-11-29 13:19 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120

           Summary: [Block layer or SCSI] requests aborted too early during
                    check_partition()
           Product: IO/Storage
           Version: 2.5
     KernelVersion: 2.6.28-rc
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: SCSI
        AssignedTo: linux-scsi@vger.kernel.org
        ReportedBy: stefan-r-bz@s5r6.in-berlin.de
OtherBugsDependingO 11808
             nThis:
        Regression: 1


Latest working kernel version: 2.6.27.4
Earliest failing kernel version: 2.6.28-rc6  (I didn't test earlier -rcs yet)

Hardware Environment: OxSemi 912 and 92x based FireWire harddisks


See http://marc.info/?l=linux-scsi&m=122774022130397 --
Disks may still spin their motor up when the kernel attempts to read
partitions. In kernel 2.6.27, this was handled transparently.  But since
2.6.28-rc?, the kernel immediately aborts the associated requests and sets the
new scsi_device offline.


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
@ 2008-11-29 15:12 ` Stefan Richter
  2008-11-29 18:35   ` James Bottomley
  2008-11-29 15:12 ` [Bug 12120] " bugme-daemon
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: Stefan Richter @ 2008-11-29 15:12 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux1394-devel, bugme-daemon

On 29 Nov, bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12120
> 
>            Summary: [Block layer or SCSI] requests aborted too early during
>                     check_partition()
>            Product: IO/Storage
>            Version: 2.5
>      KernelVersion: 2.6.28-rc
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: SCSI
>         AssignedTo: linux-scsi@vger.kernel.org
>         ReportedBy: stefan-r-bz@s5r6.in-berlin.de
> OtherBugsDependingO 11808
>              nThis:
>         Regression: 1
> 
> 
> Latest working kernel version: 2.6.27.4
> Earliest failing kernel version: 2.6.28-rc6  (I didn't test earlier -rcs yet)
> 
> Hardware Environment: OxSemi 912 and 92x based FireWire harddisks
> 
> 
> See http://marc.info/?l=linux-scsi&m=122774022130397 --
> Disks may still spin their motor up when the kernel attempts to read
> partitions. In kernel 2.6.27, this was handled transparently.  But since
> 2.6.28-rc?, the kernel immediately aborts the associated requests and sets the
> new scsi_device offline.
> 
> 

I tried a workaround in firewire-sbp2:  Check whether the new sdev is
offline right after scsi_add_device.  If so, remove the sdev and try
everything again.  However,
  - a regression still remains because the addition of the HDD takes
    about 30...40 seconds instead of only a few seconds like in 2.6.27.
  - The workaround works with OXFW912 and OXUF922, but not with
    OXUF924DSB.  Login retries don't succeed with the latter at all.
  - The workaround would be very hard to implement in the older
    ieee1394/sbp2 driver.
  - Other hardware besides FireWire may be affected.
-- 
Stefan Richter
-=====-==--- =-== ===-=
http://arcgraph.de/sr/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
  2008-11-29 15:12 ` Stefan Richter
@ 2008-11-29 15:12 ` bugme-daemon
  2008-11-29 18:36 ` bugme-daemon
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: bugme-daemon @ 2008-11-29 15:12 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120





------- Comment #1 from anonymous@kernel-bugs.osdl.org  2008-11-29 07:12 -------
Reply-To: stefanr@s5r6.in-berlin.de

On 29 Nov, bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12120
> 
>            Summary: [Block layer or SCSI] requests aborted too early during
>                     check_partition()
>            Product: IO/Storage
>            Version: 2.5
>      KernelVersion: 2.6.28-rc
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: SCSI
>         AssignedTo: linux-scsi@vger.kernel.org
>         ReportedBy: stefan-r-bz@s5r6.in-berlin.de
> OtherBugsDependingO 11808
>              nThis:
>         Regression: 1
> 
> 
> Latest working kernel version: 2.6.27.4
> Earliest failing kernel version: 2.6.28-rc6  (I didn't test earlier -rcs yet)
> 
> Hardware Environment: OxSemi 912 and 92x based FireWire harddisks
> 
> 
> See http://marc.info/?l=linux-scsi&m=122774022130397 --
> Disks may still spin their motor up when the kernel attempts to read
> partitions. In kernel 2.6.27, this was handled transparently.  But since
> 2.6.28-rc?, the kernel immediately aborts the associated requests and sets the
> new scsi_device offline.
> 
> 

I tried a workaround in firewire-sbp2:  Check whether the new sdev is
offline right after scsi_add_device.  If so, remove the sdev and try
everything again.  However,
  - a regression still remains because the addition of the HDD takes
    about 30...40 seconds instead of only a few seconds like in 2.6.27.
  - The workaround works with OXFW912 and OXUF922, but not with
    OXUF924DSB.  Login retries don't succeed with the latter at all.
  - The workaround would be very hard to implement in the older
    ieee1394/sbp2 driver.
  - Other hardware besides FireWire may be affected.


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 15:12 ` Stefan Richter
@ 2008-11-29 18:35   ` James Bottomley
  0 siblings, 0 replies; 21+ messages in thread
From: James Bottomley @ 2008-11-29 18:35 UTC (permalink / raw)
  To: Stefan Richter; +Cc: linux-scsi, linux1394-devel, bugme-daemon

On Sat, 2008-11-29 at 16:12 +0100, Stefan Richter wrote:
> On 29 Nov, bugme-daemon@bugzilla.kernel.org wrote:
> > http://bugzilla.kernel.org/show_bug.cgi?id=12120
> > 
> >            Summary: [Block layer or SCSI] requests aborted too early during
> >                     check_partition()
> >            Product: IO/Storage
> >            Version: 2.5
> >      KernelVersion: 2.6.28-rc
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: SCSI
> >         AssignedTo: linux-scsi@vger.kernel.org
> >         ReportedBy: stefan-r-bz@s5r6.in-berlin.de
> > OtherBugsDependingO 11808
> >              nThis:
> >         Regression: 1
> > 
> > 
> > Latest working kernel version: 2.6.27.4
> > Earliest failing kernel version: 2.6.28-rc6  (I didn't test earlier -rcs yet)
> > 
> > Hardware Environment: OxSemi 912 and 92x based FireWire harddisks
> > 
> > 
> > See http://marc.info/?l=linux-scsi&m=122774022130397 --
> > Disks may still spin their motor up when the kernel attempts to read
> > partitions. In kernel 2.6.27, this was handled transparently.  But since
> > 2.6.28-rc?, the kernel immediately aborts the associated requests and sets the
> > new scsi_device offline.
> > 
> > 
> 
> I tried a workaround in firewire-sbp2:  Check whether the new sdev is
> offline right after scsi_add_device.  If so, remove the sdev and try
> everything again.  However,
>   - a regression still remains because the addition of the HDD takes
>     about 30...40 seconds instead of only a few seconds like in 2.6.27.
>   - The workaround works with OXFW912 and OXUF922, but not with
>     OXUF924DSB.  Login retries don't succeed with the latter at all.
>   - The workaround would be very hard to implement in the older
>     ieee1394/sbp2 driver.
>   - Other hardware besides FireWire may be affected.

It sounds like the device isn't being spun up, so commands which require
media read are being failed with not ready.  Could we get some traces to
show this; just enable all tracing in the boot up sequence:

echo 0xffffffff > /sys/module/scsi_mod/parameters/scsi_logging_level

James



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
  2008-11-29 15:12 ` Stefan Richter
  2008-11-29 15:12 ` [Bug 12120] " bugme-daemon
@ 2008-11-29 18:36 ` bugme-daemon
  2008-11-29 19:29 ` bugme-daemon
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: bugme-daemon @ 2008-11-29 18:36 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120





------- Comment #2 from anonymous@kernel-bugs.osdl.org  2008-11-29 10:36 -------
Reply-To: James.Bottomley@HansenPartnership.com

On Sat, 2008-11-29 at 16:12 +0100, Stefan Richter wrote:
> On 29 Nov, bugme-daemon@bugzilla.kernel.org wrote:
> > http://bugzilla.kernel.org/show_bug.cgi?id=12120
> > 
> >            Summary: [Block layer or SCSI] requests aborted too early during
> >                     check_partition()
> >            Product: IO/Storage
> >            Version: 2.5
> >      KernelVersion: 2.6.28-rc
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: SCSI
> >         AssignedTo: linux-scsi@vger.kernel.org
> >         ReportedBy: stefan-r-bz@s5r6.in-berlin.de
> > OtherBugsDependingO 11808
> >              nThis:
> >         Regression: 1
> > 
> > 
> > Latest working kernel version: 2.6.27.4
> > Earliest failing kernel version: 2.6.28-rc6  (I didn't test earlier -rcs yet)
> > 
> > Hardware Environment: OxSemi 912 and 92x based FireWire harddisks
> > 
> > 
> > See http://marc.info/?l=linux-scsi&m=122774022130397 --
> > Disks may still spin their motor up when the kernel attempts to read
> > partitions. In kernel 2.6.27, this was handled transparently.  But since
> > 2.6.28-rc?, the kernel immediately aborts the associated requests and sets the
> > new scsi_device offline.
> > 
> > 
> 
> I tried a workaround in firewire-sbp2:  Check whether the new sdev is
> offline right after scsi_add_device.  If so, remove the sdev and try
> everything again.  However,
>   - a regression still remains because the addition of the HDD takes
>     about 30...40 seconds instead of only a few seconds like in 2.6.27.
>   - The workaround works with OXFW912 and OXUF922, but not with
>     OXUF924DSB.  Login retries don't succeed with the latter at all.
>   - The workaround would be very hard to implement in the older
>     ieee1394/sbp2 driver.
>   - Other hardware besides FireWire may be affected.

It sounds like the device isn't being spun up, so commands which require
media read are being failed with not ready.  Could we get some traces to
show this; just enable all tracing in the boot up sequence:

echo 0xffffffff > /sys/module/scsi_mod/parameters/scsi_logging_level

James


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
                   ` (2 preceding siblings ...)
  2008-11-29 18:36 ` bugme-daemon
@ 2008-11-29 19:29 ` bugme-daemon
  2008-11-29 19:32 ` bugme-daemon
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: bugme-daemon @ 2008-11-29 19:29 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120





------- Comment #3 from anonymous@kernel-bugs.osdl.org  2008-11-29 11:29 -------
Reply-To: stefanr@s5r6.in-berlin.de

> It sounds like the device isn't being spun up, so commands which require
> media read are being failed with not ready.  Could we get some traces to
> show this; just enable all tracing in the boot up sequence:
> 
> echo 0xffffffff > /sys/module/scsi_mod/parameters/scsi_logging_level

OK, here is a log from a failing session in 2.6.28-rc6.
The firmware responds to everything without spinning the disk up,
until the first READ(10) is sent.

  # spin disk down artificially with
  # "modprobe -r firewire-sbp2" while manage_start_stop == 1
  #
  # then "modprobe firewire-sbp2"

Nov 29 19:57:05 stein Error handler scsi_eh_20 sleeping
Nov 29 19:57:05 stein scsi20 : SBP-2 IEEE-1394
Nov 29 19:57:06 stein firewire_sbp2: fw3.0: logged in to LUN 0000 (0 retries)
Nov 29 19:57:06 stein scsi 20:0:0:0: scsi scan: INQUIRY pass 1 length 36
Nov 29 19:57:06 stein scsi 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein scsi 20:0:0:0: CDB: Inquiry: 12 00 00 00 24 00
Nov 29 19:57:06 stein buffer = 0xf725f9d0, bufflen = 36, queuecommand
0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein scsi 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein scsi 20:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein scsi 20:0:0:0: CDB: Inquiry: 12 00 00 00 24 00
Nov 29 19:57:06 stein scsi 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein scsi 20:0:0:0: Notifying upper driver of completion
(result 0)
Nov 29 19:57:06 stein 0 sectors total, 36 bytes done.
Nov 29 19:57:06 stein scsi scan: INQUIRY successful with code 0x0
Nov 29 19:57:06 stein scsi 20:0:0:0: Direct-Access-RBC ST340083 2A             
 3.03 PQ: 0 ANSI: 4
Nov 29 19:57:06 stein sd 20:0:0:0: sd_attach
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_revalidate_disk
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 19:57:06 stein 0 sectors total, 0 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00
00 00 00
Nov 29 19:57:06 stein buffer = 0xf725f9d0, bufflen = 8, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00
00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 19:57:06 stein 0 sectors total, 8 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] 781422768 512-byte hardware sectors:
(400 GB/372 GiB)
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 19:57:06 stein buffer = 0xf725f9d0, bufflen = 4, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 19:57:06 stein 0 sectors total, 4 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Write Protect is off
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Mode Sense: 11 00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 19:57:06 stein buffer = 0xf725f9d0, bufflen = 4, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 19:57:06 stein 0 sectors total, 4 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 19:57:06 stein buffer = 0xf725f9d0, bufflen = 22, queuecommand
0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 19:57:06 stein 0 sectors total, 22 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_open
Nov 29 19:57:06 stein scsi_block_when_processing_errors: rtn: 1
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_revalidate_disk
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57984a0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57984a0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 19:57:06 stein 0 sectors total, 0 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf5798550 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00
00 00 00
Nov 29 19:57:06 stein buffer = 0xf725fd68, bufflen = 8, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf5798550 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00
00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 19:57:06 stein 0 sectors total, 8 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] 781422768 512-byte hardware sectors:
(400 GB/372 GiB)
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57981e0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 19:57:06 stein buffer = 0xf725fed8, bufflen = 4, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57981e0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 19:57:06 stein 0 sectors total, 4 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Write Protect is off
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Mode Sense: 11 00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf5798600 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 19:57:06 stein buffer = 0xf725f918, bufflen = 4, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf5798600 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 19:57:06 stein 0 sectors total, 4 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf5798ef0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 19:57:06 stein buffer = 0xe5961b40, bufflen = 22, queuecommand
0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf5798ef0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 19:57:06 stein 0 sectors total, 22 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA

  # partition scanning code gets into gears,
  # issues very first READ command,
  # now the failures start

Nov 29 19:57:06 stein sdd:<6>sd 20:0:0:0: [sdd] sd_init_command: block=0,
count=8
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] block=0
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] reading 8/8 512 byte blocks.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Send: 0xf5798ef0 
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00 00
08 00
Nov 29 19:57:06 stein buffer = 0xe5961b40, bufflen = 4096, queuecommand
0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Done: 0xf5798ef0 FAILED
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00 00
08 00
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Sense Key : Not Ready [current] 
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: Logical unit not ready,
initializing command required
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 0
Nov 29 19:57:06 stein Waking error handler thread
Nov 29 19:57:06 stein Error handler scsi_eh_20 waking up
Nov 29 19:57:06 stein sd 20:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1,
cancel: 0
Nov 29 19:57:06 stein Total of 1 commands on 1 devices require eh work
Nov 29 19:57:06 stein scsi_eh_20: Sending START_UNIT to sdev: 0xe58fc7f0
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Send: 0xf5798ef0 
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
00
Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand 0xf85fb980
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Done: 0xf5798ef0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
00
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Unrecognized sense data (in hex):
Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 19:57:06 stein Sense Key : No Sense [current] 
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: No additional sense
information
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 1
Nov 29 19:57:06 stein scsi_send_eh_cmnd: scmd: f5798ef0, timeleft: 0
Nov 29 19:57:06 stein firewire_sbp2: fw3.0: sbp2_scsi_abort
Nov 29 19:57:06 stein scsi_eh_done scmd: f5798ef0 result: 20000
Nov 29 19:57:06 stein scsi_eh_20: START_UNIT failed to sdev: 0xe58fc7f0
Nov 29 19:57:06 stein scsi_eh_20: Sending BDR sdev: 0xe58fc7f0
Nov 29 19:57:06 stein scsi_eh_20: BDR failed sdev:0xe58fc7f0
Nov 29 19:57:06 stein scsi_eh_20: Sending target reset to target 0
Nov 29 19:57:06 stein scsi_eh_20: Target reset failed target: 0
Nov 29 19:57:06 stein scsi_eh_20: Sending BRST chan: 0
Nov 29 19:57:06 stein scsi_try_bus_reset: Snd Bus RST
Nov 29 19:57:06 stein scsi_eh_20: BRST failed chan: 0
Nov 29 19:57:06 stein scsi_eh_20: Sending HRST
Nov 29 19:57:06 stein scsi_try_host_reset: Snd Host RST
Nov 29 19:57:06 stein scsi_eh_20: HRST failed
Nov 29 19:57:06 stein sd 20:0:0:0: Device offlined - not ready after error
recovery
Nov 29 19:57:06 stein scsi_eh_20: flush finish cmd: f5798ef0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result
2)
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein 8 sectors total, 0 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein end_request: I/O error, dev sdd, sector 0
Nov 29 19:57:06 stein Buffer I/O error on device sdd, logical block 0
Nov 29 19:57:06 stein scsi_restart_operations: waking up host to restart
Nov 29 19:57:06 stein Error handler scsi_eh_20 sleeping
Nov 29 19:57:06 stein sd 20:0:0:0: rejecting I/O to offline device
Nov 29 19:57:06 stein Buffer I/O error on device sdd, logical block 0
Nov 29 19:57:06 stein ldm_validate_partition_table(): Disk read failed.
Nov 29 19:57:06 stein sd 20:0:0:0: rejecting I/O to offline device
Nov 29 19:57:06 stein Buffer I/O error on device sdd, logical block 0
Nov 29 19:57:06 stein sd 20:0:0:0: rejecting I/O to offline device
Nov 29 19:57:06 stein Buffer I/O error on device sdd, logical block 0
Nov 29 19:57:06 stein unable to read partition table
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_release
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Attached SCSI disk
Nov 29 19:57:06 stein sg_alloc: dev=3 
Nov 29 19:57:06 stein sd 20:0:0:0: Attached scsi generic sg3 type 14
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_open
Nov 29 19:57:06 stein scsi_block_when_processing_errors: rtn: 0
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_open
Nov 29 19:57:06 stein scsi_block_when_processing_errors: rtn: 0
Nov 29 19:57:08 stein firewire_sbp2: fw3.0: reconnected to LUN 0000 (0 retries)
Nov 29 19:57:14 stein firewire_sbp2: fw3.0: reconnected to LUN 0000 (0 retries)
Nov 29 19:58:09 stein xinetd[4299]: START: imap2 pid=18913 from=192.168.0.42
Nov 29 19:58:09 stein imapd[18913]: imap service init from 192.168.0.42
Nov 29 19:58:09 stein imapd[18913]: Login user=stefan host=stein [192.168.0.42]
Nov 29 19:58:15 stein imapd[18914]: imap service init from 192.168.0.42
Nov 29 19:58:15 stein xinetd[4299]: START: imap2 pid=18914 from=192.168.0.42
Nov 29 19:58:15 stein imapd[18914]: Login user=stefan host=stein [192.168.0.42]
Nov 29 19:58:19 stein imapd[18913]: Logout user=stefan host=stein
[192.168.0.42]
Nov 29 19:58:19 stein xinetd[4299]: EXIT: imap2 status=0 pid=18913
duration=10(sec)
Nov 29 19:59:07 stein imapd[18914]: Logout user=stefan host=stein
[192.168.0.42]
Nov 29 19:59:07 stein xinetd[4299]: EXIT: imap2 status=0 pid=18914
duration=52(sec)


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
                   ` (3 preceding siblings ...)
  2008-11-29 19:29 ` bugme-daemon
@ 2008-11-29 19:32 ` bugme-daemon
  2008-11-29 19:54   ` James Bottomley
  2008-11-29 19:55 ` bugme-daemon
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: bugme-daemon @ 2008-11-29 19:32 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120





------- Comment #4 from anonymous@kernel-bugs.osdl.org  2008-11-29 11:32 -------
Reply-To: stefanr@s5r6.in-berlin.de

For comparison, here is a log from the very same disk, also spun down
when firewire-sbp2 logs in, from 2.6.27.4:

  # modprobe firewire-sbp2

Nov 29 20:17:01 mini scsi5 : SBP-2 IEEE-1394
Nov 29 20:17:01 mini Error handler scsi_eh_5 sleeping
Nov 29 20:17:01 mini firewire_sbp2: fw1.0: logged in to LUN 0000 (0 retries)
Nov 29 20:17:01 mini scsi 5:0:0:0: scsi scan: INQUIRY pass 1 length 36
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 5500,
(ffffffff80375047)
Nov 29 20:17:01 mini scsi 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini scsi 5:0:0:0: CDB: Inquiry: 12 00 00 00 24 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 36, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini scsi 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini scsi 5:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini scsi 5:0:0:0: CDB: Inquiry: 12 00 00 00 24 00
Nov 29 20:17:01 mini scsi 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini scsi 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:01 mini 0 sectors total, 36 bytes done.
Nov 29 20:17:01 mini scsi scan: INQUIRY successful with code 0x0
Nov 29 20:17:01 mini scsi 5:0:0:0: Direct-Access-RBC ST340083 2A              
3.03 PQ: 0 ANSI: 4
Nov 29 20:17:01 mini sd 5:0:0:0: sd_attach
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] sd_revalidate_disk
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000,
(ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:01 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:01 mini 0 sectors total, 0 bytes done.
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000,
(ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00
00 00 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 8, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00
00 00 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:01 mini 0 sectors total, 8 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] 781422768 512-byte hardware sectors
(400088 MB)
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000,
(ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 4, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:01 mini 0 sectors total, 4 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Write Protect is off
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Mode Sense: 11 00 00 00
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000,
(ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 4, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:01 mini 0 sectors total, 4 bytes done.
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000,
(ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 22, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:01 mini 0 sectors total, 22 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] sd_open
Nov 29 20:17:01 mini scsi_block_when_processing_errors: rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] sd_revalidate_disk
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000,
(ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:01 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:01 mini 0 sectors total, 0 bytes done.
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000,
(ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00
00 00 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 8, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00
00 00 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:01 mini 0 sectors total, 8 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] 781422768 512-byte hardware sectors
(400088 MB)
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000,
(ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 4, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:01 mini 0 sectors total, 4 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Write Protect is off
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Mode Sense: 11 00 00 00
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000,
(ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 4, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:01 mini 0 sectors total, 4 bytes done.
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000,
(ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 22, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:01 mini 0 sectors total, 22 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA

  # coming up: the first READ(10)

Nov 29 20:17:01 mini sdb:<6>sd 5:0:0:0: [sdb] sd_init_command: block=0, count=8
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] block=0
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] reading 8/8 512 byte blocks.
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 60000,
(ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00
08 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 4096, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Done: 0xffff88007d6cbed0 FAILED
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00
08 00
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Sense Key : Not Ready [current] 
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Add. Sense: Logical unit not ready,
initializing command required
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 20:17:01 mini Waking error handler thread
Nov 29 20:17:01 mini Error handler scsi_eh_5 waking up
Nov 29 20:17:01 mini sd 5:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1,
cancel: 0
Nov 29 20:17:01 mini Total of 1 commands on 1 devices require eh work
Nov 29 20:17:01 mini scsi_eh_5: Sending START_UNIT to sdev: 0xffff8800737fe000
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 20:17:01 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:08 mini scsi_eh_done scmd: ffff88007d6cbed0 result: 0
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 20:17:08 mini scsi_send_eh_cmnd: scmd: ffff88007d6cbed0, timeleft: 52572
Nov 29 20:17:08 mini scsi_send_eh_cmnd: scsi_eh_completed_normally 2002
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Send: 0xffff88007d6cbed0 
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:08 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:08 mini scsi_eh_done scmd: ffff88007d6cbed0 result: 0
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 20:17:08 mini scsi_send_eh_cmnd: scmd: ffff88007d6cbed0, timeleft: 10000
Nov 29 20:17:08 mini scsi_send_eh_cmnd: scsi_eh_completed_normally 2002
Nov 29 20:17:08 mini scsi_eh_tur: scmd ffff88007d6cbed0 rtn 2002
Nov 29 20:17:08 mini scsi_eh_5: flush retry cmd: ffff88007d6cbed0
Nov 29 20:17:08 mini Inserting command ffff88007d6cbed0 into mlqueue
Nov 29 20:17:08 mini scsi_restart_operations: waking up host to restart
Nov 29 20:17:08 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 60000,
(ffffffff80375047)
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Send: 0xffff88007d6cbed0 
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00
08 00
Nov 29 20:17:08 mini buffer = 0xffff88007df61eb0, bufflen = 4096, queuecommand
0xffffffffa00d2a8a

  # cutting out some traffic to [sda], the root filesystem disk

Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00
08 00
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 20:17:08 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini 8 sectors total, 4096 bytes done.
Nov 29 20:17:08 mini sdb1
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] sd_release
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Attached SCSI disk
Nov 29 20:17:08 mini sg_alloc: dev=1 
Nov 29 20:17:08 mini sd 5:0:0:0: Attached scsi generic sg1 type 14
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] sd_open
Nov 29 20:17:08 mini scsi_block_when_processing_errors: rtn: 1
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] sd_init_command: block=0, count=32
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] block=0
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] reading 32/32 512 byte blocks.
Nov 29 20:17:08 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 60000,
(ffffffff80375047)
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Send: 0xffff88007d6cbed0 
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00
20 00
Nov 29 20:17:08 mini buffer = 0xffff88007df61eb0, bufflen = 16384, queuecommand
0xffffffffa00d2a8a
Nov 29 20:17:08 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:08 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00
20 00
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 20:17:08 mini sd 5:0:0:0: Notifying upper driver of completion (result
0)
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini 32 sectors total, 16384 bytes done.

  # et cetera


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 19:32 ` bugme-daemon
@ 2008-11-29 19:54   ` James Bottomley
  2008-11-29 21:30     ` Stefan Richter
  0 siblings, 1 reply; 21+ messages in thread
From: James Bottomley @ 2008-11-29 19:54 UTC (permalink / raw)
  To: bugme-daemon; +Cc: linux-scsi

OK, so there are a few problems.  First, by responding OK to the test
unit ready (which is illegal under spec) it avoids the spin up the sd
driver normally does, so we're relying on the eh allow_restart flag to
start the unit on the first failing command.  Then, in the failure case:

> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00 00
> 08 00
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Sense Key : Not Ready [current] 
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: Logical unit not ready,
> initializing command required
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 0
> Nov 29 19:57:06 stein Waking error handler thread
> Nov 29 19:57:06 stein Error handler scsi_eh_20 waking up
> Nov 29 19:57:06 stein sd 20:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1,
> cancel: 0
> Nov 29 19:57:06 stein Total of 1 commands on 1 devices require eh work
> Nov 29 19:57:06 stein scsi_eh_20: Sending START_UNIT to sdev: 0xe58fc7f0
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Send: 0xf5798ef0 
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
> 00
> Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand 0xf85fb980
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Done: 0xf5798ef0 SUCCESS
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK
> driverbyte=DRIVER_OK,SUGGEST_OK
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
> 00
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Unrecognized sense data (in hex):
> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 19:57:06 stein Sense Key : No Sense [current] 
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: No additional sense
> information
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 1

The second start unit is a failure ... I suspect because of our change
to no sense return handling.  What the drive is probably trying to say
is (I'm spinning up) but this gets interpreted as an error because the
sense data for this isn't present (because we didn't ask for it).

Can you try this patch?  It should take the success return of the first
spin up and act on it instead of blindly sending another.

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 3863617..635d8b4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -931,12 +931,15 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
 	if (scmd->device->allow_restart) {
 		int i, rtn = NEEDS_RETRY;
 
-		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
+		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++) {
 			rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
 						scmd->device->timeout, 0);
 
-		if (rtn == SUCCESS)
-			return 0;
+			if (rtn == SUCCESS)
+				return 0;
+			/* if failure, wait before retrying */
+			ssleep(3);
+		}
 	}
 
 	return 1;




^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
                   ` (4 preceding siblings ...)
  2008-11-29 19:32 ` bugme-daemon
@ 2008-11-29 19:55 ` bugme-daemon
  2008-11-29 21:30 ` bugme-daemon
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: bugme-daemon @ 2008-11-29 19:55 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120





------- Comment #5 from anonymous@kernel-bugs.osdl.org  2008-11-29 11:55 -------
Reply-To: James.Bottomley@HansenPartnership.com

OK, so there are a few problems.  First, by responding OK to the test
unit ready (which is illegal under spec) it avoids the spin up the sd
driver normally does, so we're relying on the eh allow_restart flag to
start the unit on the first failing command.  Then, in the failure case:

> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00 00
> 08 00
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Sense Key : Not Ready [current] 
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: Logical unit not ready,
> initializing command required
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 0
> Nov 29 19:57:06 stein Waking error handler thread
> Nov 29 19:57:06 stein Error handler scsi_eh_20 waking up
> Nov 29 19:57:06 stein sd 20:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1,
> cancel: 0
> Nov 29 19:57:06 stein Total of 1 commands on 1 devices require eh work
> Nov 29 19:57:06 stein scsi_eh_20: Sending START_UNIT to sdev: 0xe58fc7f0
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Send: 0xf5798ef0 
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
> 00
> Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand 0xf85fb980
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Done: 0xf5798ef0 SUCCESS
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK
> driverbyte=DRIVER_OK,SUGGEST_OK
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
> 00
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Unrecognized sense data (in hex):
> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 19:57:06 stein Sense Key : No Sense [current] 
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: No additional sense
> information
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 1

The second start unit is a failure ... I suspect because of our change
to no sense return handling.  What the drive is probably trying to say
is (I'm spinning up) but this gets interpreted as an error because the
sense data for this isn't present (because we didn't ask for it).

Can you try this patch?  It should take the success return of the first
spin up and act on it instead of blindly sending another.

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 3863617..635d8b4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -931,12 +931,15 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
        if (scmd->device->allow_restart) {
                int i, rtn = NEEDS_RETRY;

-               for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
+               for (i = 0; rtn == NEEDS_RETRY && i < 2; i++) {
                        rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
                                                scmd->device->timeout, 0);

-               if (rtn == SUCCESS)
-                       return 0;
+                       if (rtn == SUCCESS)
+                               return 0;
+                       /* if failure, wait before retrying */
+                       ssleep(3);
+               }
        }

        return 1;


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 19:54   ` James Bottomley
@ 2008-11-29 21:30     ` Stefan Richter
  2008-11-29 21:45       ` Stefan Richter
  2008-11-30  0:08       ` James Bottomley
  0 siblings, 2 replies; 21+ messages in thread
From: Stefan Richter @ 2008-11-29 21:30 UTC (permalink / raw)
  To: James Bottomley; +Cc: bugme-daemon, linux-scsi, linux1394-devel

On 29 Nov, James Bottomley wrote:
> OK, so there are a few problems.  First, by responding OK to the test
> unit ready (which is illegal under spec) it avoids the spin up the sd
> driver normally does, so we're relying on the eh allow_restart flag to
> start the unit on the first failing command.  Then, in the failure case:
> 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00 00
>> 08 00
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Sense Key : Not Ready [current] 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: Logical unit not ready,
>> initializing command required
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 0
>> Nov 29 19:57:06 stein Waking error handler thread
>> Nov 29 19:57:06 stein Error handler scsi_eh_20 waking up
>> Nov 29 19:57:06 stein sd 20:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1,
>> cancel: 0
>> Nov 29 19:57:06 stein Total of 1 commands on 1 devices require eh work
>> Nov 29 19:57:06 stein scsi_eh_20: Sending START_UNIT to sdev: 0xe58fc7f0
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Send: 0xf5798ef0 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
>> 00
>> Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand 0xf85fb980
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Done: 0xf5798ef0 SUCCESS
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
>> 00
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Unrecognized sense data (in hex):
>> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
>> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
>> Nov 29 19:57:06 stein Sense Key : No Sense [current] 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: No additional sense
>> information
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 1
> 
> The second start unit is a failure ... I suspect because of our change
> to no sense return handling.  What the drive is probably trying to say
> is (I'm spinning up) but this gets interpreted as an error because the
> sense data for this isn't present (because we didn't ask for it).
> 
> Can you try this patch?  It should take the success return of the first
> spin up and act on it instead of blindly sending another.
> 
> James
> 
> ---
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 3863617..635d8b4 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -931,12 +931,15 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
>  	if (scmd->device->allow_restart) {
>  		int i, rtn = NEEDS_RETRY;
>  
> -		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
> +		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++) {
>  			rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
>  						scmd->device->timeout, 0);
>  
> -		if (rtn == SUCCESS)
> -			return 0;
> +			if (rtn == SUCCESS)
> +				return 0;
> +			/* if failure, wait before retrying */
> +			ssleep(3);
> +		}
>  	}
>  
>  	return 1;
> 
> 

Still no luck.  Here is a log from a simpler test:
# sg_start --stop /dev/sdb
# echo 0xfffffff > /sys/module/scsi_mod/parameters/scsi_logging_level
# hdparm -tT /dev/sdb
This works under 2.6.27.y (i.e. disk spins up, hdparm proceeds) but
fails under 2.6.28-rc with disk put offline.

The firmware apparently doesn't like the command abortion (SBP-2 fetch
agent reset) while it is spinning up.


Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] sd_init_command: block=0, count=512
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] block=0
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] reading 512/512 512 byte blocks.
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02 00 00
Nov 29 22:12:53 mini buffer = 0xffff88007b60c5f0, bufflen = 262144, queuecommand 0xffffffffa015d445
Nov 29 22:12:53 mini leaving scsi_dispatch_cmnd()
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 FAILED
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02 00 00
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Sense Key : Not Ready [current] 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Add. Sense: Logical unit not ready, initializing command required
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 22:12:53 mini Waking error handler thread
Nov 29 22:12:53 mini Error handler scsi_eh_2 waking up
Nov 29 22:12:53 mini sd 2:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1, cancel: 0
Nov 29 22:12:53 mini Total of 1 commands on 1 devices require eh work
Nov 29 22:12:53 mini scsi_eh_2: Sending START_UNIT to sdev: 0xffff8800712ef000
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa015d445
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 SUCCESS
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Unrecognized sense data (in hex):
Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:12:53 mini Sense Key : No Sense [current] 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Add. Sense: No additional sense information
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 22:12:53 mini scsi_send_eh_cmnd: scmd: ffff88007b76af08, timeleft: 0
Nov 29 22:12:53 mini ieee1394: sbp2: aborting sbp2 command
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini scsi_eh_done scmd: ffff88007b76af08 result: 50000
Nov 29 22:12:55 mini ieee1394: sbp2: Reconnected to SBP-2 device
Nov 29 22:12:55 mini ieee1394: sbp2: Node 0-01:1023: Max speed [S400] - Max payload [2048]
Nov 29 22:12:56 mini scsi_eh_2: START_UNIT failed to sdev: 0xffff8800712ef000
Nov 29 22:12:56 mini scsi_eh_2: Sending BDR sdev: 0xffff8800712ef000
Nov 29 22:12:56 mini ieee1394: sbp2: reset requested
Nov 29 22:12:56 mini ieee1394: sbp2: generating sbp2 fetch agent reset
Nov 29 22:12:56 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:56 mini sd 2:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 22:12:56 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa015d445
Nov 29 22:13:02 mini ieee1394: sbp2: Reconnected to SBP-2 device
Nov 29 22:13:02 mini ieee1394: sbp2: Node 0-01:1023: Max speed [S400] - Max payload [2048]
Nov 29 22:13:02 mini scsi_eh_done scmd: ffff88007b76af08 result: 20000
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 SUCCESS
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 22:13:02 mini scsi_send_eh_cmnd: scmd: ffff88007b76af08, timeleft: 4840
Nov 29 22:13:02 mini scsi_send_eh_cmnd: scsi_eh_completed_normally 2003
Nov 29 22:13:02 mini scsi_eh_tur: scmd ffff88007b76af08 rtn 2003
Nov 29 22:13:02 mini scsi_eh_2: Sending target reset to target 0
Nov 29 22:13:02 mini scsi_eh_2: Target reset failed target: 0
Nov 29 22:13:02 mini scsi_eh_2: Sending BRST chan: 0
Nov 29 22:13:02 mini scsi_try_bus_reset: Snd Bus RST
Nov 29 22:13:02 mini scsi_eh_2: BRST failed chan: 0
Nov 29 22:13:02 mini scsi_eh_2: Sending HRST
Nov 29 22:13:02 mini scsi_try_host_reset: Snd Host RST
Nov 29 22:13:02 mini scsi_eh_2: HRST failed
Nov 29 22:13:02 mini sd 2:0:0:0: Device offlined - not ready after error recovery
Nov 29 22:13:02 mini scsi_eh_2: flush finish cmd: ffff88007b76af08
Nov 29 22:13:02 mini sd 2:0:0:0: Notifying upper driver of completion (result 2)
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini 512 sectors total, 0 bytes done.
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini end_request: I/O error, dev sdb, sector 0
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 1
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 2
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 3
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 4
Nov 29 22:13:02 mini sd 2:0:0:0: rejecting I/O to offline device
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 5
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 6
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 7
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 8
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:13:02 mini scsi_restart_operations: waking up host to restart
Nov 29 22:13:02 mini Error handler scsi_eh_2 sleeping

-- 
Stefan Richter
-=====-==--- =-== ===-=
http://arcgraph.de/sr/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
                   ` (5 preceding siblings ...)
  2008-11-29 19:55 ` bugme-daemon
@ 2008-11-29 21:30 ` bugme-daemon
  2008-11-29 21:45 ` bugme-daemon
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: bugme-daemon @ 2008-11-29 21:30 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120





------- Comment #6 from anonymous@kernel-bugs.osdl.org  2008-11-29 13:30 -------
Reply-To: stefanr@s5r6.in-berlin.de

On 29 Nov, James Bottomley wrote:
> OK, so there are a few problems.  First, by responding OK to the test
> unit ready (which is illegal under spec) it avoids the spin up the sd
> driver normally does, so we're relying on the eh allow_restart flag to
> start the unit on the first failing command.  Then, in the failure case:
> 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00 00
>> 08 00
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Sense Key : Not Ready [current] 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: Logical unit not ready,
>> initializing command required
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 0
>> Nov 29 19:57:06 stein Waking error handler thread
>> Nov 29 19:57:06 stein Error handler scsi_eh_20 waking up
>> Nov 29 19:57:06 stein sd 20:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1,
>> cancel: 0
>> Nov 29 19:57:06 stein Total of 1 commands on 1 devices require eh work
>> Nov 29 19:57:06 stein scsi_eh_20: Sending START_UNIT to sdev: 0xe58fc7f0
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Send: 0xf5798ef0 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
>> 00
>> Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand 0xf85fb980
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Done: 0xf5798ef0 SUCCESS
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
>> 00
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Unrecognized sense data (in hex):
>> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
>> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
>> Nov 29 19:57:06 stein Sense Key : No Sense [current] 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: No additional sense
>> information
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 1
> 
> The second start unit is a failure ... I suspect because of our change
> to no sense return handling.  What the drive is probably trying to say
> is (I'm spinning up) but this gets interpreted as an error because the
> sense data for this isn't present (because we didn't ask for it).
> 
> Can you try this patch?  It should take the success return of the first
> spin up and act on it instead of blindly sending another.
> 
> James
> 
> ---
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 3863617..635d8b4 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -931,12 +931,15 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
>  	if (scmd->device->allow_restart) {
>  		int i, rtn = NEEDS_RETRY;
>  
> -		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
> +		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++) {
>  			rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
>  						scmd->device->timeout, 0);
>  
> -		if (rtn == SUCCESS)
> -			return 0;
> +			if (rtn == SUCCESS)
> +				return 0;
> +			/* if failure, wait before retrying */
> +			ssleep(3);
> +		}
>  	}
>  
>  	return 1;
> 
> 

Still no luck.  Here is a log from a simpler test:
# sg_start --stop /dev/sdb
# echo 0xfffffff > /sys/module/scsi_mod/parameters/scsi_logging_level
# hdparm -tT /dev/sdb
This works under 2.6.27.y (i.e. disk spins up, hdparm proceeds) but
fails under 2.6.28-rc with disk put offline.

The firmware apparently doesn't like the command abortion (SBP-2 fetch
agent reset) while it is spinning up.


Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] sd_init_command: block=0, count=512
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] block=0
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] reading 512/512 512 byte blocks.
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02
00 00
Nov 29 22:12:53 mini buffer = 0xffff88007b60c5f0, bufflen = 262144,
queuecommand 0xffffffffa015d445
Nov 29 22:12:53 mini leaving scsi_dispatch_cmnd()
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 FAILED
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02
00 00
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Sense Key : Not Ready [current] 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Add. Sense: Logical unit not ready,
initializing command required
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 22:12:53 mini Waking error handler thread
Nov 29 22:12:53 mini Error handler scsi_eh_2 waking up
Nov 29 22:12:53 mini sd 2:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1,
cancel: 0
Nov 29 22:12:53 mini Total of 1 commands on 1 devices require eh work
Nov 29 22:12:53 mini scsi_eh_2: Sending START_UNIT to sdev: 0xffff8800712ef000
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand
0xffffffffa015d445
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 SUCCESS
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Unrecognized sense data (in hex):
Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:12:53 mini Sense Key : No Sense [current] 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Add. Sense: No additional sense
information
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 22:12:53 mini scsi_send_eh_cmnd: scmd: ffff88007b76af08, timeleft: 0
Nov 29 22:12:53 mini ieee1394: sbp2: aborting sbp2 command
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini scsi_eh_done scmd: ffff88007b76af08 result: 50000
Nov 29 22:12:55 mini ieee1394: sbp2: Reconnected to SBP-2 device
Nov 29 22:12:55 mini ieee1394: sbp2: Node 0-01:1023: Max speed [S400] - Max
payload [2048]
Nov 29 22:12:56 mini scsi_eh_2: START_UNIT failed to sdev: 0xffff8800712ef000
Nov 29 22:12:56 mini scsi_eh_2: Sending BDR sdev: 0xffff8800712ef000
Nov 29 22:12:56 mini ieee1394: sbp2: reset requested
Nov 29 22:12:56 mini ieee1394: sbp2: generating sbp2 fetch agent reset
Nov 29 22:12:56 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:56 mini sd 2:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 22:12:56 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand
0xffffffffa015d445
Nov 29 22:13:02 mini ieee1394: sbp2: Reconnected to SBP-2 device
Nov 29 22:13:02 mini ieee1394: sbp2: Node 0-01:1023: Max speed [S400] - Max
payload [2048]
Nov 29 22:13:02 mini scsi_eh_done scmd: ffff88007b76af08 result: 20000
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 SUCCESS
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_BUS_BUSY
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 22:13:02 mini scsi_send_eh_cmnd: scmd: ffff88007b76af08, timeleft: 4840
Nov 29 22:13:02 mini scsi_send_eh_cmnd: scsi_eh_completed_normally 2003
Nov 29 22:13:02 mini scsi_eh_tur: scmd ffff88007b76af08 rtn 2003
Nov 29 22:13:02 mini scsi_eh_2: Sending target reset to target 0
Nov 29 22:13:02 mini scsi_eh_2: Target reset failed target: 0
Nov 29 22:13:02 mini scsi_eh_2: Sending BRST chan: 0
Nov 29 22:13:02 mini scsi_try_bus_reset: Snd Bus RST
Nov 29 22:13:02 mini scsi_eh_2: BRST failed chan: 0
Nov 29 22:13:02 mini scsi_eh_2: Sending HRST
Nov 29 22:13:02 mini scsi_try_host_reset: Snd Host RST
Nov 29 22:13:02 mini scsi_eh_2: HRST failed
Nov 29 22:13:02 mini sd 2:0:0:0: Device offlined - not ready after error
recovery
Nov 29 22:13:02 mini scsi_eh_2: flush finish cmd: ffff88007b76af08
Nov 29 22:13:02 mini sd 2:0:0:0: Notifying upper driver of completion (result
2)
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini 512 sectors total, 0 bytes done.
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini end_request: I/O error, dev sdb, sector 0
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 1
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 2
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 3
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 4
Nov 29 22:13:02 mini sd 2:0:0:0: rejecting I/O to offline device
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 5
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 6
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 7
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 8
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:13:02 mini scsi_restart_operations: waking up host to restart
Nov 29 22:13:02 mini Error handler scsi_eh_2 sleeping


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 21:30     ` Stefan Richter
@ 2008-11-29 21:45       ` Stefan Richter
  2008-11-30  0:08       ` James Bottomley
  1 sibling, 0 replies; 21+ messages in thread
From: Stefan Richter @ 2008-11-29 21:45 UTC (permalink / raw)
  To: James Bottomley; +Cc: bugme-daemon, linux-scsi, linux1394-devel

I wrote:
> On 29 Nov, James Bottomley wrote:
>> Can you try this patch?  It should take the success return of the first
>> spin up and act on it instead of blindly sending another.
...
> Still no luck.  Here is a log from a simpler test:
> # sg_start --stop /dev/sdb
> # echo 0xfffffff > /sys/module/scsi_mod/parameters/scsi_logging_level
> # hdparm -tT /dev/sdb
> This works under 2.6.27.y (i.e. disk spins up, hdparm proceeds) but
> fails under 2.6.28-rc with disk put offline.
> 
> The firmware apparently doesn't like the command abortion (SBP-2 fetch
> agent reset) while it is spinning up.

Same with firewire-sbp2 instead of ieee1394/sbp2:

Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] sd_init_command: block=0, count=512
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] block=0
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] reading 512/512 512 byte blocks.
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Send: 0xffff88007169cf08 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02 00 00
Nov 29 22:36:31 mini buffer = 0xffff88007c8e75f0, bufflen = 262144, queuecommand 0xffffffffa013c9df
Nov 29 22:36:31 mini leaving scsi_dispatch_cmnd()
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Done: 0xffff88007169cf08 FAILED
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02 00 00
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Sense Key : Not Ready [current] 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Add. Sense: Logical unit not ready, initializing command required
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 22:36:31 mini Waking error handler thread
Nov 29 22:36:31 mini Error handler scsi_eh_2 waking up
Nov 29 22:36:31 mini sd 2:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1, cancel: 0
Nov 29 22:36:31 mini Total of 1 commands on 1 devices require eh work
Nov 29 22:36:31 mini scsi_eh_2: Sending START_UNIT to sdev: 0xffff88006e373000
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Send: 0xffff88007169cf08 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:36:31 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa013c9df
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Done: 0xffff88007169cf08 SUCCESS
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Unrecognized sense data (in hex):
Nov 29 22:36:31 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:36:31 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:36:31 mini Sense Key : No Sense [current] 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Add. Sense: No additional sense information
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 22:36:31 mini scsi_send_eh_cmnd: scmd: ffff88007169cf08, timeleft: 0
Nov 29 22:36:31 mini firewire_sbp2: fw1.0: sbp2_scsi_abort
Nov 29 22:36:31 mini scsi_eh_done scmd: ffff88007169cf08 result: 20000
Nov 29 22:36:33 mini firewire_sbp2: fw1.0: reconnected to LUN 0000 (0 retries)
Nov 29 22:36:34 mini scsi_eh_2: START_UNIT failed to sdev: 0xffff88006e373000
Nov 29 22:36:34 mini scsi_eh_2: Sending BDR sdev: 0xffff88006e373000
Nov 29 22:36:34 mini scsi_eh_2: BDR failed sdev:0xffff88006e373000
Nov 29 22:36:34 mini scsi_eh_2: Sending target reset to target 0
Nov 29 22:36:34 mini scsi_eh_2: Target reset failed target: 0
Nov 29 22:36:34 mini scsi_eh_2: Sending BRST chan: 0
Nov 29 22:36:34 mini scsi_try_bus_reset: Snd Bus RST
Nov 29 22:36:34 mini scsi_eh_2: BRST failed chan: 0
Nov 29 22:36:34 mini scsi_eh_2: Sending HRST
Nov 29 22:36:34 mini scsi_try_host_reset: Snd Host RST
Nov 29 22:36:34 mini scsi_eh_2: HRST failed
Nov 29 22:36:34 mini sd 2:0:0:0: Device offlined - not ready after error recovery
Nov 29 22:36:34 mini scsi_eh_2: flush finish cmd: ffff88007169cf08
Nov 29 22:36:34 mini sd 2:0:0:0: Notifying upper driver of completion (result 2)
Nov 29 22:36:34 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:34 mini 512 sectors total, 0 bytes done.
Nov 29 22:36:34 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:34 mini end_request: I/O error, dev sdb, sector 0
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:36:34 mini sd 2:0:0:0: rejecting I/O to offline device
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 1
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 2
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 3
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 4
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 5
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 6
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 7
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 8
Nov 29 22:36:34 mini scsi_restart_operations: waking up host to restart
Nov 29 22:36:34 mini Error handler scsi_eh_2 sleeping
Nov 29 22:36:37 mini sd 2:0:0:0: [sdb] sd_release
Nov 29 22:36:39 mini firewire_sbp2: fw1.0: reconnected to LUN 0000 (0 retries)

-- 
Stefan Richter
-=====-==--- =-== ===-=
http://arcgraph.de/sr/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
                   ` (6 preceding siblings ...)
  2008-11-29 21:30 ` bugme-daemon
@ 2008-11-29 21:45 ` bugme-daemon
  2008-11-30  0:08 ` bugme-daemon
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: bugme-daemon @ 2008-11-29 21:45 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120





------- Comment #7 from anonymous@kernel-bugs.osdl.org  2008-11-29 13:45 -------
Reply-To: stefanr@s5r6.in-berlin.de

I wrote:
> On 29 Nov, James Bottomley wrote:
>> Can you try this patch?  It should take the success return of the first
>> spin up and act on it instead of blindly sending another.
...
> Still no luck.  Here is a log from a simpler test:
> # sg_start --stop /dev/sdb
> # echo 0xfffffff > /sys/module/scsi_mod/parameters/scsi_logging_level
> # hdparm -tT /dev/sdb
> This works under 2.6.27.y (i.e. disk spins up, hdparm proceeds) but
> fails under 2.6.28-rc with disk put offline.
> 
> The firmware apparently doesn't like the command abortion (SBP-2 fetch
> agent reset) while it is spinning up.

Same with firewire-sbp2 instead of ieee1394/sbp2:

Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] sd_init_command: block=0, count=512
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] block=0
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] reading 512/512 512 byte blocks.
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Send: 0xffff88007169cf08 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02
00 00
Nov 29 22:36:31 mini buffer = 0xffff88007c8e75f0, bufflen = 262144,
queuecommand 0xffffffffa013c9df
Nov 29 22:36:31 mini leaving scsi_dispatch_cmnd()
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Done: 0xffff88007169cf08 FAILED
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02
00 00
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Sense Key : Not Ready [current] 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Add. Sense: Logical unit not ready,
initializing command required
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 22:36:31 mini Waking error handler thread
Nov 29 22:36:31 mini Error handler scsi_eh_2 waking up
Nov 29 22:36:31 mini sd 2:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1,
cancel: 0
Nov 29 22:36:31 mini Total of 1 commands on 1 devices require eh work
Nov 29 22:36:31 mini scsi_eh_2: Sending START_UNIT to sdev: 0xffff88006e373000
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Send: 0xffff88007169cf08 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:36:31 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand
0xffffffffa013c9df
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Done: 0xffff88007169cf08 SUCCESS
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Unrecognized sense data (in hex):
Nov 29 22:36:31 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:36:31 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:36:31 mini Sense Key : No Sense [current] 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Add. Sense: No additional sense
information
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 22:36:31 mini scsi_send_eh_cmnd: scmd: ffff88007169cf08, timeleft: 0
Nov 29 22:36:31 mini firewire_sbp2: fw1.0: sbp2_scsi_abort
Nov 29 22:36:31 mini scsi_eh_done scmd: ffff88007169cf08 result: 20000
Nov 29 22:36:33 mini firewire_sbp2: fw1.0: reconnected to LUN 0000 (0 retries)
Nov 29 22:36:34 mini scsi_eh_2: START_UNIT failed to sdev: 0xffff88006e373000
Nov 29 22:36:34 mini scsi_eh_2: Sending BDR sdev: 0xffff88006e373000
Nov 29 22:36:34 mini scsi_eh_2: BDR failed sdev:0xffff88006e373000
Nov 29 22:36:34 mini scsi_eh_2: Sending target reset to target 0
Nov 29 22:36:34 mini scsi_eh_2: Target reset failed target: 0
Nov 29 22:36:34 mini scsi_eh_2: Sending BRST chan: 0
Nov 29 22:36:34 mini scsi_try_bus_reset: Snd Bus RST
Nov 29 22:36:34 mini scsi_eh_2: BRST failed chan: 0
Nov 29 22:36:34 mini scsi_eh_2: Sending HRST
Nov 29 22:36:34 mini scsi_try_host_reset: Snd Host RST
Nov 29 22:36:34 mini scsi_eh_2: HRST failed
Nov 29 22:36:34 mini sd 2:0:0:0: Device offlined - not ready after error
recovery
Nov 29 22:36:34 mini scsi_eh_2: flush finish cmd: ffff88007169cf08
Nov 29 22:36:34 mini sd 2:0:0:0: Notifying upper driver of completion (result
2)
Nov 29 22:36:34 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:34 mini 512 sectors total, 0 bytes done.
Nov 29 22:36:34 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:34 mini end_request: I/O error, dev sdb, sector 0
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:36:34 mini sd 2:0:0:0: rejecting I/O to offline device
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 1
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 2
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 3
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 4
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 5
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 6
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 7
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 8
Nov 29 22:36:34 mini scsi_restart_operations: waking up host to restart
Nov 29 22:36:34 mini Error handler scsi_eh_2 sleeping
Nov 29 22:36:37 mini sd 2:0:0:0: [sdb] sd_release
Nov 29 22:36:39 mini firewire_sbp2: fw1.0: reconnected to LUN 0000 (0 retries)


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 21:30     ` Stefan Richter
  2008-11-29 21:45       ` Stefan Richter
@ 2008-11-30  0:08       ` James Bottomley
  2008-11-30  9:44         ` Stefan Richter
  1 sibling, 1 reply; 21+ messages in thread
From: James Bottomley @ 2008-11-30  0:08 UTC (permalink / raw)
  To: Stefan Richter; +Cc: bugme-daemon, linux-scsi, linux1394-devel

On Sat, 2008-11-29 at 22:30 +0100, Stefan Richter wrote:
> Nov 29 22:12:53 mini scsi_eh_2: Sending START_UNIT to sdev: 0xffff8800712ef000
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
> Nov 29 22:12:53 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa015d445
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 SUCCESS
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Unrecognized sense data (in hex):
> Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 22:12:53 mini Sense Key : No Sense [current] 
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Add. Sense: No additional sense information
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
> Nov 29 22:12:53 mini scsi_send_eh_cmnd: scmd: ffff88007b76af08, timeleft: 0

This is suggestive:  timeleft shouldn't be zero since according to the
logging ticks virtually no time had elapsed.

> Nov 29 22:12:53 mini ieee1394: sbp2: aborting sbp2 command
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
> Nov 29 22:12:53 mini scsi_eh_done scmd: ffff88007b76af08 result: 50000

And this is the actual command completion occurring after timeout.

Theory number two is a block timeout cockup.  It looks like
sdev->timeout is vestigial and has zero value, so we shouldn't be using
it in the start command, so try this.

If this works, the correct fix will be to kill the sdev timeout
parameter so we get a compile failure where anything tries to use it.

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 3863617..de3f6d0 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -933,7 +933,7 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
 
 		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
 			rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
-						scmd->device->timeout, 0);
+						3*SENSE_TIMEOUT, 0);
 
 		if (rtn == SUCCESS)
 			return 0;



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
                   ` (7 preceding siblings ...)
  2008-11-29 21:45 ` bugme-daemon
@ 2008-11-30  0:08 ` bugme-daemon
  2008-11-30  9:44 ` bugme-daemon
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: bugme-daemon @ 2008-11-30  0:08 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120





------- Comment #8 from anonymous@kernel-bugs.osdl.org  2008-11-29 16:08 -------
Reply-To: James.Bottomley@HansenPartnership.com

On Sat, 2008-11-29 at 22:30 +0100, Stefan Richter wrote:
> Nov 29 22:12:53 mini scsi_eh_2: Sending START_UNIT to sdev: 0xffff8800712ef000
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
> Nov 29 22:12:53 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa015d445
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 SUCCESS
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Unrecognized sense data (in hex):
> Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 22:12:53 mini Sense Key : No Sense [current] 
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Add. Sense: No additional sense information
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
> Nov 29 22:12:53 mini scsi_send_eh_cmnd: scmd: ffff88007b76af08, timeleft: 0

This is suggestive:  timeleft shouldn't be zero since according to the
logging ticks virtually no time had elapsed.

> Nov 29 22:12:53 mini ieee1394: sbp2: aborting sbp2 command
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
> Nov 29 22:12:53 mini scsi_eh_done scmd: ffff88007b76af08 result: 50000

And this is the actual command completion occurring after timeout.

Theory number two is a block timeout cockup.  It looks like
sdev->timeout is vestigial and has zero value, so we shouldn't be using
it in the start command, so try this.

If this works, the correct fix will be to kill the sdev timeout
parameter so we get a compile failure where anything tries to use it.

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 3863617..de3f6d0 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -933,7 +933,7 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)

                for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
                        rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
-                                               scmd->device->timeout, 0);
+                                               3*SENSE_TIMEOUT, 0);

                if (rtn == SUCCESS)
                        return 0;


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-30  0:08       ` James Bottomley
@ 2008-11-30  9:44         ` Stefan Richter
  2008-11-30 16:16           ` James Bottomley
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Richter @ 2008-11-30  9:44 UTC (permalink / raw)
  To: James Bottomley; +Cc: bugme-daemon, linux-scsi, linux1394-devel

James Bottomley wrote:
> Theory number two is a block timeout cockup.  It looks like
> sdev->timeout is vestigial and has zero value, so we shouldn't be using
> it in the start command, so try this.
> 
> If this works, the correct fix will be to kill the sdev timeout
> parameter so we get a compile failure where anything tries to use it.
> 
> James
> 
> ---
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 3863617..de3f6d0 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -933,7 +933,7 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
>  
>  		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
>  			rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
> -						scmd->device->timeout, 0);
> +						3*SENSE_TIMEOUT, 0);
>  
>  		if (rtn == SUCCESS)
>  			return 0;
> 
> 

Yes, this works with all of the SBP-2 bridges here which exhibit this
false TEST UNIT READY "GOOD" status.

BTW, 3 * SENSE_TIMEOUT == 30 seconds may actually be a little bit narrow
as a START UNIT timeout.  The disks which I tested here --- a few 7200
RPM IDE or SATA disks behind the SBP-2 bridges --- usually take about 7
seconds to spin up in single-disk enclosures and 14 seconds in dual-disk
enclosures, since the bridge spins them up serially.  But a dual-disk
enclosure with Hitachi Deskstar 7K400 took 28 seconds.  Maybe this
should be used in scsi_eh_try_stu:
include/scsi/scsi.h:#define START_STOP_TIMEOUT          (60 * HZ)

Thanks a lot,
-- 
Stefan Richter
-=====-==--- =-== ====-
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
                   ` (8 preceding siblings ...)
  2008-11-30  0:08 ` bugme-daemon
@ 2008-11-30  9:44 ` bugme-daemon
  2008-11-30 16:16 ` bugme-daemon
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: bugme-daemon @ 2008-11-30  9:44 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120





------- Comment #9 from anonymous@kernel-bugs.osdl.org  2008-11-30 01:44 -------
Reply-To: stefanr@s5r6.in-berlin.de

James Bottomley wrote:
> Theory number two is a block timeout cockup.  It looks like
> sdev->timeout is vestigial and has zero value, so we shouldn't be using
> it in the start command, so try this.
> 
> If this works, the correct fix will be to kill the sdev timeout
> parameter so we get a compile failure where anything tries to use it.
> 
> James
> 
> ---
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 3863617..de3f6d0 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -933,7 +933,7 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
>  
>  		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
>  			rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
> -						scmd->device->timeout, 0);
> +						3*SENSE_TIMEOUT, 0);
>  
>  		if (rtn == SUCCESS)
>  			return 0;
> 
> 

Yes, this works with all of the SBP-2 bridges here which exhibit this
false TEST UNIT READY "GOOD" status.

BTW, 3 * SENSE_TIMEOUT == 30 seconds may actually be a little bit narrow
as a START UNIT timeout.  The disks which I tested here --- a few 7200
RPM IDE or SATA disks behind the SBP-2 bridges --- usually take about 7
seconds to spin up in single-disk enclosures and 14 seconds in dual-disk
enclosures, since the bridge spins them up serially.  But a dual-disk
enclosure with Hitachi Deskstar 7K400 took 28 seconds.  Maybe this
should be used in scsi_eh_try_stu:
include/scsi/scsi.h:#define START_STOP_TIMEOUT          (60 * HZ)

Thanks a lot,


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-30  9:44         ` Stefan Richter
@ 2008-11-30 16:16           ` James Bottomley
  0 siblings, 0 replies; 21+ messages in thread
From: James Bottomley @ 2008-11-30 16:16 UTC (permalink / raw)
  To: Stefan Richter; +Cc: bugme-daemon, linux-scsi, linux1394-devel

On Sun, 2008-11-30 at 10:44 +0100, Stefan Richter wrote:
> James Bottomley wrote:
> > Theory number two is a block timeout cockup.  It looks like
> > sdev->timeout is vestigial and has zero value, so we shouldn't be using
> > it in the start command, so try this.
> > 
> > If this works, the correct fix will be to kill the sdev timeout
> > parameter so we get a compile failure where anything tries to use it.
> > 
> > James
> > 
> > ---
> > 
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index 3863617..de3f6d0 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -933,7 +933,7 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
> >  
> >  		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
> >  			rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
> > -						scmd->device->timeout, 0);
> > +						3*SENSE_TIMEOUT, 0);
> >  
> >  		if (rtn == SUCCESS)
> >  			return 0;
> > 
> > 
> 
> Yes, this works with all of the SBP-2 bridges here which exhibit this
> false TEST UNIT READY "GOOD" status.
> 
> BTW, 3 * SENSE_TIMEOUT == 30 seconds may actually be a little bit narrow
> as a START UNIT timeout.  The disks which I tested here --- a few 7200
> RPM IDE or SATA disks behind the SBP-2 bridges --- usually take about 7
> seconds to spin up in single-disk enclosures and 14 seconds in dual-disk
> enclosures, since the bridge spins them up serially.  But a dual-disk
> enclosure with Hitachi Deskstar 7K400 took 28 seconds.  Maybe this
> should be used in scsi_eh_try_stu:
> include/scsi/scsi.h:#define START_STOP_TIMEOUT          (60 * HZ)

Actually, unfortunately not.  This feature isn't designed to spin up non
standards compliant devices, it's designed to restart automatically
stopping media.  As such, the timeout is going to have to go back to
being the queue timeout ... it's just that this exposes a few more
problems than firewire.

James



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
                   ` (9 preceding siblings ...)
  2008-11-30  9:44 ` bugme-daemon
@ 2008-11-30 16:16 ` bugme-daemon
  2008-12-04  1:15 ` bugme-daemon
  2008-12-04  6:38 ` bugme-daemon
  12 siblings, 0 replies; 21+ messages in thread
From: bugme-daemon @ 2008-11-30 16:16 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120





------- Comment #10 from anonymous@kernel-bugs.osdl.org  2008-11-30 08:16 -------
Reply-To: James.Bottomley@HansenPartnership.com

On Sun, 2008-11-30 at 10:44 +0100, Stefan Richter wrote:
> James Bottomley wrote:
> > Theory number two is a block timeout cockup.  It looks like
> > sdev->timeout is vestigial and has zero value, so we shouldn't be using
> > it in the start command, so try this.
> > 
> > If this works, the correct fix will be to kill the sdev timeout
> > parameter so we get a compile failure where anything tries to use it.
> > 
> > James
> > 
> > ---
> > 
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index 3863617..de3f6d0 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -933,7 +933,7 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
> >  
> >  		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
> >  			rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
> > -						scmd->device->timeout, 0);
> > +						3*SENSE_TIMEOUT, 0);
> >  
> >  		if (rtn == SUCCESS)
> >  			return 0;
> > 
> > 
> 
> Yes, this works with all of the SBP-2 bridges here which exhibit this
> false TEST UNIT READY "GOOD" status.
> 
> BTW, 3 * SENSE_TIMEOUT == 30 seconds may actually be a little bit narrow
> as a START UNIT timeout.  The disks which I tested here --- a few 7200
> RPM IDE or SATA disks behind the SBP-2 bridges --- usually take about 7
> seconds to spin up in single-disk enclosures and 14 seconds in dual-disk
> enclosures, since the bridge spins them up serially.  But a dual-disk
> enclosure with Hitachi Deskstar 7K400 took 28 seconds.  Maybe this
> should be used in scsi_eh_try_stu:
> include/scsi/scsi.h:#define START_STOP_TIMEOUT          (60 * HZ)

Actually, unfortunately not.  This feature isn't designed to spin up non
standards compliant devices, it's designed to restart automatically
stopping media.  As such, the timeout is going to have to go back to
being the queue timeout ... it's just that this exposes a few more
problems than firewire.

James


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
                   ` (10 preceding siblings ...)
  2008-11-30 16:16 ` bugme-daemon
@ 2008-12-04  1:15 ` bugme-daemon
  2008-12-04  6:38 ` bugme-daemon
  12 siblings, 0 replies; 21+ messages in thread
From: bugme-daemon @ 2008-12-04  1:15 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120





------- Comment #11 from rjw@sisk.pl  2008-12-03 17:15 -------
On Thursday, 4 of December 2008, James Bottomley wrote:
> On Wed, 2008-12-03 at 22:49 +0100, Rafael J. Wysocki wrote:
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12120
> > Subject         : [Block layer or SCSI] requests aborted too early
> > during check_partition()
> > Submitter       : Stefan Richter <stefan-r-bz@s5r6.in-berlin.de>
> > Date            : 2008-11-29 05:19 (5 days old)
> 
> Fixed by this patch:
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9728c0814ecb505546696a659858fdb761375544


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()
  2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
                   ` (11 preceding siblings ...)
  2008-12-04  1:15 ` bugme-daemon
@ 2008-12-04  6:38 ` bugme-daemon
  12 siblings, 0 replies; 21+ messages in thread
From: bugme-daemon @ 2008-12-04  6:38 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12120


stefan-r-bz@s5r6.in-berlin.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |CLOSED
         Resolution|                            |CODE_FIX




-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2008-12-04  6:38 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-29 13:19 [Bug 12120] New: [Block layer or SCSI] requests aborted too early during check_partition() bugme-daemon
2008-11-29 15:12 ` Stefan Richter
2008-11-29 18:35   ` James Bottomley
2008-11-29 15:12 ` [Bug 12120] " bugme-daemon
2008-11-29 18:36 ` bugme-daemon
2008-11-29 19:29 ` bugme-daemon
2008-11-29 19:32 ` bugme-daemon
2008-11-29 19:54   ` James Bottomley
2008-11-29 21:30     ` Stefan Richter
2008-11-29 21:45       ` Stefan Richter
2008-11-30  0:08       ` James Bottomley
2008-11-30  9:44         ` Stefan Richter
2008-11-30 16:16           ` James Bottomley
2008-11-29 19:55 ` bugme-daemon
2008-11-29 21:30 ` bugme-daemon
2008-11-29 21:45 ` bugme-daemon
2008-11-30  0:08 ` bugme-daemon
2008-11-30  9:44 ` bugme-daemon
2008-11-30 16:16 ` bugme-daemon
2008-12-04  1:15 ` bugme-daemon
2008-12-04  6:38 ` bugme-daemon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox