linux-scsi.vger.kernel.org archive mirror
* [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device
       [not found] <bug-9010-11613@http.bugzilla.kernel.org/>
@ 2008-03-19  8:43 ` bugme-daemon
  2008-03-19 20:24   ` James Bottomley
  2008-03-19 20:24 ` bugme-daemon
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: bugme-daemon @ 2008-03-19  8:43 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=9010


protasnb@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |linux-scsi@vger.kernel.org




------- Comment #24 from protasnb@gmail.com  2008-03-19 01:43 -------
Since bug analysis shows this is a SCSI layer problem, copying to SCSI devel.


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device
  2008-03-19  8:43 ` bugme-daemon
@ 2008-03-19 20:24   ` James Bottomley
  0 siblings, 0 replies; 13+ messages in thread
From: James Bottomley @ 2008-03-19 20:24 UTC (permalink / raw)
  To: bugme-daemon; +Cc: linux-scsi

On Wed, 2008-03-19 at 01:43 -0700, bugme-daemon@bugzilla.kernel.org
wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9010
> 
> 
> protasnb@gmail.com changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |linux-scsi@vger.kernel.org
> 
> 
> 
> 
> ------- Comment #24 from protasnb@gmail.com  2008-03-19 01:43 -------
> Since bug analysis shows this is a SCSI layer problem, copying to SCSI devel.

That's rather a long bug report; what I can't seem to find is a
description of what the problem actually is.  If I had to guess I'd say
that unplugging and replugging a different SATA drive in the same slot
doesn't properly get the parameters of the new device; is that it?

James



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device
       [not found] <bug-9010-11613@http.bugzilla.kernel.org/>
  2008-03-19  8:43 ` bugme-daemon
  2008-03-19 20:24 ` bugme-daemon
@ 2008-03-21 13:35 ` bugme-daemon
  2008-03-21 14:30   ` James Bottomley
  2008-03-21 14:30 ` bugme-daemon
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: bugme-daemon @ 2008-03-21 13:35 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=9010





------- Comment #26 from lkmlist@gmail.com  2008-03-21 06:35 -------
To make it short:

Attach drive for the first time: --> sdb1
The disk works, I can access it.

When I remove it, it is removed... somehow... but it looks like a ghost
disk is added (still with the kernel name sdb1) that is not accessible (of
course, since I hold the disk in my hand...).

Replugging the same device doesn't fix the problem; the disk still does not work.

Here's a short version of the above dmesg:

#######################################################################
#######################################################################
###########1) Hotplugging a disk for the first time####################
#######################################################################
#######################################################################

Oct 30 02:33:09 freax [  951.885385] ata2: exception Emask 0x10 SAct 0x0 SErr
0x10000 action 0x2 frozen
Oct 30 02:33:09 freax [  951.885390] ata2: irq_stat 0x00400000, PHY RDY changed
Oct 30 02:33:09 freax [  951.885396] ata2: hard resetting port
Oct 30 02:33:09 freax [  952.607499] ata2: SATA link down (SStatus 0 SControl
.
.
.
Oct 30 02:33:23 freax [  965.906251] ata2: EH complete
Oct 30 02:33:23 freax [  965.906264] ata2.00: detaching (SCSI 1:0:0:0)
Oct 30 02:33:23 freax [  965.906524] sd 1:0:0:0: [sdb] Stopping disk
Oct 30 02:33:23 freax [  966.062277] scsi 1:0:0:0: Direct-Access     ATA     
External Disk 0  RGL1 PQ: 0 ANSI: 5
Oct 30 02:33:23 freax [  966.062415] sd 1:0:0:0: [sdb] 145226112 512-byte
hardware sectors (74356 MB)
Oct 30 02:33:23 freax [  966.062467] sd 1:0:0:0: [sdb] Write Protect is off
Oct 30 02:33:23 freax [  966.062502] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Oct 30 02:33:23 freax [  966.062557] sd 1:0:0:0: [sdb] Write cache: enabled,
read cache: enabled, doesn't support DPO or FUA
Oct 30 02:33:23 freax [  966.062646] sd 1:0:0:0: [sdb] 145226112 512-byte
hardware sectors (74356 MB)
Oct 30 02:33:23 freax [  966.062695] sd 1:0:0:0: [sdb] Write Protect is off
Oct 30 02:33:23 freax [  966.062729] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Oct 30 02:33:23 freax [  966.062781] sd 1:0:0:0: [sdb] Write cache: enabled,
read cache: enabled, doesn't support DPO or FUA
Oct 30 02:33:24 freax [  966.062956]  sdb: sdb1
Oct 30 02:33:24 freax [  966.984134] sd 1:0:0:0: [sdb] Attached SCSI disk
Oct 30 02:33:24 freax [  966.984218] sd 1:0:0:0: Attached scsi generic sg1 type
0

#######################################################################
#######################################################################
###########2) removing the disk shows the following:###################
#######################################################################
#######################################################################

Oct 30 02:34:47 freax [ 1050.239909] ata2: exception Emask 0x10 SAct 0x0 SErr
0x10000 action 0x2 frozen
Oct 30 02:34:47 freax [ 1050.239914] ata2: irq_stat 0x00400000, PHY RDY changed
Oct 30 02:34:47 freax [ 1050.239920] ata2: hard resetting port
Oct 30 02:34:48 freax [ 1050.961633] ata2: SATA link down (SStatus 0 SControl
300)
Oct 30 02:34:48 freax [ 1050.961642] ata2: failed to recover some devices,
retrying in 5 secs
Oct 30 02:34:53 freax [ 1055.952117] ata2: hard resetting port
Oct 30 02:34:53 freax [ 1056.256843] ata2: SATA link down (SStatus 0 SControl
.
.
.
Feb 17 16:30:44 freax [ 4312.842519]          res
51/04:00:01:01:80/00:00:00:00:00/a0 Emask 0x1 (device error)
Feb 17 16:30:44 freax [ 4312.842521] ata2.00: status: { DRDY ERR }
Feb 17 16:30:44 freax [ 4312.842523] ata2.00: error: { ABRT }
Feb 17 16:30:44 freax [ 4312.842589] ata2.00: device is on DMA blacklist,
disabling DMA
Feb 17 16:30:44 freax [ 4312.842662] ata2.00: configured for PIO4
Feb 17 16:30:44 freax [ 4312.842666] ata2: EH complete
Feb 17 16:30:44 freax [ 4312.842685] ata2.00: exception Emask 0x0 SAct 0x0 SErr
0x0 action 0x0
Feb 17 16:30:44 freax [ 4312.842687] ata2.00: irq_stat 0x40000001
Feb 17 16:30:44 freax [ 4312.842690] ata2.00: cmd
e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0
.
.
.
Feb 17 16:30:47 freax [ 4315.384346] ata2.00: device is on DMA blacklist,
disabling DMA
Feb 17 16:30:47 freax [ 4315.384425] ata2.00: configured for PIO4
Feb 17 16:30:47 freax [ 4315.384430] ata2: EH complete
Feb 17 16:30:47 freax [ 4315.384437] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_SENSE,SUGGEST_OK
Feb 17 16:30:47 freax [ 4315.384440] sd 1:0:0:0: [sdb] Sense Key : Aborted
Command [current] [descriptor]
Feb 17 16:30:47 freax [ 4315.384456] sd 1:0:0:0: [sdb] Add. Sense: No
additional sense information
Feb 17 16:30:47 freax [ 4315.384469] sd 1:0:0:0: [sdb] Stopping disk
Feb 17 16:30:47 freax [ 4315.384614] scsi 1:0:0:0: Direct-Access     ATA     
Config  Disk     RGL1 PQ: 0 ANSI: 5
Feb 17 16:30:47 freax [ 4315.384699] sd 1:0:0:0: [sdb] 640 512-byte hardware
sectors (0 MB)
Feb 17 16:30:47 freax [ 4315.384710] sd 1:0:0:0: [sdb] Write Protect is off
Feb 17 16:30:47 freax [ 4315.384712] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Feb 17 16:30:47 freax [ 4315.384731] sd 1:0:0:0: [sdb] Write cache: disabled,
read cache: enabled, doesn't support DPO or FUA
Feb 17 16:30:47 freax [ 4315.384796] sd 1:0:0:0: [sdb] 640 512-byte hardware
sectors (0 MB)
Feb 17 16:30:47 freax [ 4315.384816] sd 1:0:0:0: [sdb] Write Protect is off
Feb 17 16:30:47 freax [ 4315.384827] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Feb 17 16:30:47 freax [ 4315.384853] sd 1:0:0:0: [sdb] Write cache: disabled,
read cache: enabled, doesn't support DPO or FUA
Feb 17 16:30:47 freax [ 4315.384872]  sdb: unknown partition table
Feb 17 16:30:47 freax [ 4315.385908] sd 1:0:0:0: [sdb] Attached SCSI disk
Feb 17 16:30:47 freax [ 4315.385954] sd 1:0:0:0: Attached scsi generic sg1 type
0
Feb 17 16:30:47 freax [ 4315.385988] sd 1:0:0:0: [sdb] 640 512-byte hardware
sectors (0 MB)
Feb 17 16:30:47 freax [ 4315.385999] sd 1:0:0:0: [sdb] Write Protect is off
Feb 17 16:30:47 freax [ 4315.386001] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Feb 17 16:30:47 freax [ 4315.386020] sd 1:0:0:0: [sdb] Write cache: disabled,
read cache: enabled, doesn't support DPO or FUA
Feb 17 16:30:47 freax [ 4315.921044] ata2.00: exception Emask 0x10 SAct 0x0
SErr 0x10000 action 0xa frozen
.
.
.
Feb 17 16:31:04 freax [ 4332.745067] Buffer I/O error on device sdb, logical
block 79
Feb 17 16:31:04 freax [ 4332.745074] ata2.00: detaching (SCSI 1:0:0:0)
Feb 17 16:31:04 freax [ 4332.745342] sd 1:0:0:0: [sdb] Stopping disk
Feb 17 16:31:04 freax [ 4332.745690] scsi 1:0:0:0: Direct-Access     ATA     
Config  Disk     RGL1 PQ: 0 ANSI: 5
Feb 17 16:31:04 freax [ 4332.745768] sd 1:0:0:0: [sdb] 640 512-byte hardware
sectors (0 MB)
Feb 17 16:31:04 freax [ 4332.745779] sd 1:0:0:0: [sdb] Write Protect is off
Feb 17 16:31:04 freax [ 4332.745781] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Feb 17 16:31:04 freax [ 4332.745800] sd 1:0:0:0: [sdb] Write cache: disabled,
read cache: enabled, doesn't support DPO or FUA
Feb 17 16:31:04 freax [ 4332.745845] sd 1:0:0:0: [sdb] 640 512-byte hardware
sectors (0 MB)
Feb 17 16:31:04 freax [ 4332.745855] sd 1:0:0:0: [sdb] Write Protect is off
Feb 17 16:31:04 freax [ 4332.745857] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Feb 17 16:31:04 freax [ 4332.745875] sd 1:0:0:0: [sdb] Write cache: disabled,
read cache: enabled, doesn't support DPO or FUA
Feb 17 16:31:04 freax [ 4332.745878]  sdb: unknown partition table
Feb 17 16:31:04 freax [ 4332.745959] sd 1:0:0:0: [sdb] Attached SCSI disk
Feb 17 16:31:04 freax [ 4332.745998] sd 1:0:0:0: Attached scsi generic sg1 type




#######################################################################
#######################################################################
###########3) replugging the SAME disk:################################
#######################################################################
#######################################################################

Feb 17 16:41:50 freax [ 4977.840074] ata2: exception Emask 0x10 SAct 0x0 SErr
0x10000 action 0xa frozen
Feb 17 16:41:50 freax [ 4977.840080] ata2: irq_stat 0x00400000, PHY RDY changed
Feb 17 16:41:50 freax [ 4977.840083] ata2: SError: { PHYRdyChg }
Feb 17 16:41:50 freax [ 4977.840091] ata2: hard resetting link
Feb 17 16:41:51 freax [ 4978.576507] ata2: SATA link down (SStatus 0 SControl
300)
Feb 17 16:41:51 freax [ 4978.576516] ata2: failed to recover some devices,
retrying in 5 secs
Feb 17 16:41:56 freax [ 4983.678479] ata2: hard resetting link
Feb 17 16:41:57 freax [ 4983.989543] ata2: SATA link down (SStatus 0 SControl
300)
Feb 17 16:41:57 freax [ 4983.989553] ata2: failed to recover some devices,
retrying in 5 secs
Feb 17 16:42:02 freax [ 4989.090894] ata2: hard resetting link
Feb 17 16:42:04 freax [ 4991.638546] ata2: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Feb 17 16:42:04 freax [ 4991.638619] ata2.00: model number mismatch 'Config 
Disk' != 'External Disk 0'
Feb 17 16:42:04 freax [ 4991.638622] ata2.00: revalidation failed (errno=-19)
Feb 17 16:42:04 freax [ 4991.638624] ata2.00: disabled
Feb 17 16:42:05 freax [ 4992.149526] ata2: hard resetting link
Feb 17 16:42:07 freax [ 4994.695718] ata2: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Feb 17 16:42:07 freax [ 4994.695802] ata2.00: ATA-6: External Disk 0, RGL10364,
max UDMA/133
Feb 17 16:42:07 freax [ 4994.695805] ata2.00: 1 sectors, multi 1: LBA48
Feb 17 16:42:07 freax [ 4994.695914] ata2.00: configured for UDMA/133
Feb 17 16:42:07 freax [ 4994.695922] ata2: exception Emask 0x10 SAct 0x0 SErr
0x0 action 0x9 t4
Feb 17 16:42:07 freax [ 4994.695924] ata2: irq_stat 0x00000040, connection
status changed
Feb 17 16:42:07 freax [ 4994.696084] ata2.00: configured for UDMA/133
Feb 17 16:42:07 freax [ 4994.696087] ata2: EH complete
Feb 17 16:42:07 freax [ 4994.696095] ata2.00: detaching (SCSI 1:0:0:0)
Feb 17 16:42:07 freax [ 4994.696345] sd 1:0:0:0: [sdb] Stopping disk
Feb 17 16:42:08 freax [ 4995.139171] scsi 1:0:0:0: Direct-Access     ATA     
External Disk 0  RGL1 PQ: 0 ANSI: 5
Feb 17 16:42:08 freax [ 4995.139390] sd 1:0:0:0: [sdb] 1 512-byte hardware
sectors (0 MB)
Feb 17 16:42:08 freax [ 4995.139402] sd 1:0:0:0: [sdb] Write Protect is off
Feb 17 16:42:08 freax [ 4995.139405] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Feb 17 16:42:08 freax [ 4995.139423] sd 1:0:0:0: [sdb] Write cache: enabled,
read cache: enabled, doesn't support DPO or FUA
Feb 17 16:42:08 freax [ 4995.139472] sd 1:0:0:0: [sdb] 1 512-byte hardware
sectors (0 MB)
Feb 17 16:42:08 freax [ 4995.139483] sd 1:0:0:0: [sdb] Write Protect is off
Feb 17 16:42:08 freax [ 4995.139485] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Feb 17 16:42:08 freax [ 4995.139503] sd 1:0:0:0: [sdb] Write cache: enabled,
read cache: enabled, doesn't support DPO or FUA
Feb 17 16:42:09 freax [ 4995.139506]  sdb: sdb1
Feb 17 16:42:09 freax [ 4996.266625]  sdb: p1 exceeds device capacity
Feb 17 16:42:09 freax [ 4996.266734] sd 1:0:0:0: [sdb] Attached SCSI disk
Feb 17 16:42:09 freax [ 4996.266791] sd 1:0:0:0: Attached scsi generic sg1 type
0
Feb 17 16:42:09 freax [ 4996.281486] attempt to access beyond end of device
Feb 17 16:42:09 freax [ 4996.281491] sdb: rw=0, want=65, limit=1
Feb 17 16:42:09 freax [ 4996.281494] Buffer I/O error on device sdb1, logical
block 0
.
.
.
Feb 17 16:42:09 freax [ 4996.287815] attempt to access beyond end of device
Feb 17 16:42:09 freax [ 4996.287816] sdb: rw=0, want=71, limit=1


##################################################################
##################################################################

That's it.
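
The change of identity is easiest to see in the capacity lines. The following is a minimal sketch, not part of the original report, that pulls the reported sector counts out of log lines like the ones above and converts them to decimal MB. The function name and regular expression are illustrative assumptions, and the pattern stops at "-byte" because the mail client wrapped some lines before "hardware".

```python
import re

# Matches e.g. "sd 1:0:0:0: [sdb] 145226112 512-byte hardware sectors (74356 MB)".
# The pattern ends at "-byte" so it also works on lines wrapped by the mailer.
CAPACITY = re.compile(r"\[(\w+)\] (\d+) (\d+)-byte")

def reported_capacities(lines):
    """Return (disk, sectors, size in decimal MB) per capacity line,
    dropping consecutive repeats of the same value."""
    out = []
    for line in lines:
        m = CAPACITY.search(line)
        if not m:
            continue
        disk = m.group(1)
        sectors = int(m.group(2))
        sector_size = int(m.group(3))
        entry = (disk, sectors, round(sectors * sector_size / 1_000_000))
        if not out or out[-1] != entry:
            out.append(entry)
    return out
```

Fed the three phases of the log, this gives 145226112 sectors (74356 MB) for the first plug, 640 sectors (0 MB) for the ghost "Config Disk", and 1 sector (0 MB) after replugging. For these values the rounded figures agree with what the kernel printed; the sketch does not claim to reproduce the kernel's own size calculation.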


Regards
Bjoern



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device
  2008-03-21 13:35 ` bugme-daemon
@ 2008-03-21 14:30   ` James Bottomley
  0 siblings, 0 replies; 13+ messages in thread
From: James Bottomley @ 2008-03-21 14:30 UTC (permalink / raw)
  To: bugme-daemon; +Cc: linux-scsi

On Fri, 2008-03-21 at 06:35 -0700, bugme-daemon@bugzilla.kernel.org
wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9010
> 
> 
> 
> 
> 
> ------- Comment #26 from lkmlist@gmail.com  2008-03-21 06:35 -------
> To make it short:
> 
> Attach drive for the first time: --> sdb1
> The disk works, I can access it.
> 
> When I remove it, it is removed... somehow... but it looks like a ghost
> disk is added (still with the kernel name sdb1) that is not accessible (of
> course, since I hold the disk in my hand...).
> 
> Replugging the same device doesn't fix the problem; the disk still does not work.
> 
> Here's a short version of the above dmesg:
[...]

All of this seems to show a hotplug failure in libata.  The SCSI
mid-layer handles this reasonably well (there are problems with
unplugging and replugging a device very rapidly).  All of our hotplug
busses (SAS, FC, iSCSI) work just fine.  For the non-hotplug busses like
SPI, you have to tell the kernel you've removed the disk manually, but
otherwise even that works.
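
For readers unfamiliar with the manual step on non-hotplug buses: it is normally done through sysfs. The sketch below is an illustration added to this thread, not something from the original mail. It only builds the sysfs paths and the strings to write; the helper names are hypothetical, and actually performing the write needs root on a live system.

```python
from pathlib import Path

SYSFS = Path("/sys")

def delete_request(hctl, sysfs=SYSFS):
    """Return (path, data) asking the SCSI mid-layer to remove one device.

    hctl is the host:channel:target:lun address, e.g. "1:0:0:0" as in the log.
    """
    return sysfs / "class" / "scsi_device" / hctl / "device" / "delete", "1"

def rescan_request(host, sysfs=SYSFS):
    """Return (path, data) asking a SCSI host to rescan for devices.

    "- - -" is the wildcard form: any channel, any target, any LUN.
    """
    return sysfs / "class" / "scsi_host" / f"host{host}" / "scan", "- - -"

def send(request):
    """Perform the write. Requires root; deliberately not called here."""
    path, data = request
    path.write_text(data)
```

For the device in this report, `delete_request("1:0:0:0")` names the file to write "1" into before pulling the disk, and `rescan_request(1)` names the file to write to after plugging a new one in.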

This seems to be the place where the trouble is:


> Feb 17 16:30:47 freax [ 4315.384346] ata2.00: device is on DMA blacklist,
> disabling DMA
> Feb 17 16:30:47 freax [ 4315.384425] ata2.00: configured for PIO4
> Feb 17 16:30:47 freax [ 4315.384430] ata2: EH complete
> Feb 17 16:30:47 freax [ 4315.384437] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE,SUGGEST_OK
> Feb 17 16:30:47 freax [ 4315.384440] sd 1:0:0:0: [sdb] Sense Key : Aborted
> Command [current] [descriptor]
> Feb 17 16:30:47 freax [ 4315.384456] sd 1:0:0:0: [sdb] Add. Sense: No
> additional sense information
> Feb 17 16:30:47 freax [ 4315.384469] sd 1:0:0:0: [sdb] Stopping disk

This last message is from sd just before it tries to do the final put of
the device.  This is the tricky one: it's a special path only used by
libata (which sets the manage_start_stop flag).  After finishing this,
the device should be dead and gone.
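
To check whether that flag is set for a given disk, it can be read back from sysfs. This is a small sketch added for illustration, assuming the sd driver exposes the attribute under /sys/class/scsi_disk/<H:C:T:L>/manage_start_stop as kernels of this period did; the function name is hypothetical and newer kernels may lay this out differently.

```python
from pathlib import Path

def manage_start_stop(hctl, sysfs=Path("/sys")):
    """Return True or False for the flag, or None if the attribute is absent.

    hctl is the host:channel:target:lun address, e.g. "1:0:0:0".
    The attribute path is an assumption about the running kernel.
    """
    attr = sysfs / "class" / "scsi_disk" / hctl / "manage_start_stop"
    try:
        return attr.read_text().strip() == "1"
    except OSError:
        # Attribute missing, device gone, or not readable.
        return None
```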

> Feb 17 16:30:47 freax [ 4315.384614] scsi 1:0:0:0: Direct-Access     ATA     
> Config  Disk     RGL1 PQ: 0 ANSI: 5
> Feb 17 16:30:47 freax [ 4315.384699] sd 1:0:0:0: [sdb] 640 512-byte hardware
> sectors (0 MB)
> Feb 17 16:30:47 freax [ 4315.384710] sd 1:0:0:0: [sdb] Write Protect is off
> Feb 17 16:30:47 freax [ 4315.384712] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> Feb 17 16:30:47 freax [ 4315.384731] sd 1:0:0:0: [sdb] Write cache: disabled,
> read cache: enabled, doesn't support DPO or FUA
> Feb 17 16:30:47 freax [ 4315.384796] sd 1:0:0:0: [sdb] 640 512-byte hardware
> sectors (0 MB)
> Feb 17 16:30:47 freax [ 4315.384816] sd 1:0:0:0: [sdb] Write Protect is off
> Feb 17 16:30:47 freax [ 4315.384827] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> Feb 17 16:30:47 freax [ 4315.384853] sd 1:0:0:0: [sdb] Write cache: disabled,
> read cache: enabled, doesn't support DPO or FUA
> Feb 17 16:30:47 freax [ 4315.384872]  sdb: unknown partition table
> Feb 17 16:30:47 freax [ 4315.385908] sd 1:0:0:0: [sdb] Attached SCSI disk
> Feb 17 16:30:47 freax [ 4315.385954] sd 1:0:0:0: Attached scsi generic sg1 type
> 0
> Feb 17 16:30:47 freax [ 4315.385988] sd 1:0:0:0: [sdb] 640 512-byte hardware
> sectors (0 MB)
> Feb 17 16:30:47 freax [ 4315.385999] sd 1:0:0:0: [sdb] Write Protect is off
> Feb 17 16:30:47 freax [ 4315.386001] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> Feb 17 16:30:47 freax [ 4315.386020] sd 1:0:0:0: [sdb] Write cache: disabled,
> read cache: enabled, doesn't support DPO or FUA
> Feb 17 16:30:47 freax [ 4315.921044] ata2.00: exception Emask 0x10 SAct 0x0
> SErr 0x10000 action 0xa frozen

This is pretty bad ... SCSI has been told to re-add the disk somehow, so
it has to do a rescan.  This must have come from some piece of
libata ... it's definitely using the cached data in libata to
manufacture the INQUIRY that makes SCSI think something is there.
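
The pattern described here can be picked out of a log mechanically. The following is a rough heuristic sketch added for illustration, not a tool from this thread: it flags a SCSI address whose INQUIRY line reappears after "Stopping disk" when no "SATA link up" was seen since the previous attach. The regular expressions and function name are assumptions based only on the message formats quoted above, and any port's link-up counts, so it is only meaningful on a log with a single active port.

```python
import re

STOP = re.compile(r"sd (\d+:\d+:\d+:\d+): \[\w+\] Stopping disk")
INQUIRY = re.compile(r"scsi (\d+:\d+:\d+:\d+): Direct-Access")
LINK_UP = re.compile(r"ata\d+: SATA link up")

def ghost_reattachments(lines):
    """Yield addresses re-added after 'Stopping disk' with no link-up
    seen since the previous attach (heuristic, single-port logs only)."""
    link_up_seen = False
    stopped = set()
    for line in lines:
        if LINK_UP.search(line):
            # A device really answered on the link.
            link_up_seen = True
        elif (m := STOP.search(line)):
            stopped.add(m.group(1))
        elif (m := INQUIRY.search(line)):
            addr = m.group(1)
            if addr in stopped and not link_up_seen:
                yield addr
            # Start a fresh observation window after each attach.
            stopped.discard(addr)
            link_up_seen = False
```

On the removal excerpt above this yields 1:0:0:0, since the disk is stopped and then re-added with no link-up in between. On the replug excerpt it yields nothing, because the link came up before the old device was detached.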

Then your log actually repeats this sequence:

> Feb 17 16:31:04 freax [ 4332.745067] Buffer I/O error on device sdb, logical
> block 79
> Feb 17 16:31:04 freax [ 4332.745074] ata2.00: detaching (SCSI 1:0:0:0)
> Feb 17 16:31:04 freax [ 4332.745342] sd 1:0:0:0: [sdb] Stopping disk
> Feb 17 16:31:04 freax [ 4332.745690] scsi 1:0:0:0: Direct-Access     ATA     
> Config  Disk     RGL1 PQ: 0 ANSI: 5
> Feb 17 16:31:04 freax [ 4332.745768] sd 1:0:0:0: [sdb] 640 512-byte hardware
> sectors (0 MB)
> Feb 17 16:31:04 freax [ 4332.745779] sd 1:0:0:0: [sdb] Write Protect is off
> Feb 17 16:31:04 freax [ 4332.745781] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> Feb 17 16:31:04 freax [ 4332.745800] sd 1:0:0:0: [sdb] Write cache: disabled,
> read cache: enabled, doesn't support DPO or FUA
> Feb 17 16:31:04 freax [ 4332.745845] sd 1:0:0:0: [sdb] 640 512-byte hardware
> sectors (0 MB)
> Feb 17 16:31:04 freax [ 4332.745855] sd 1:0:0:0: [sdb] Write Protect is off
> Feb 17 16:31:04 freax [ 4332.745857] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> Feb 17 16:31:04 freax [ 4332.745875] sd 1:0:0:0: [sdb] Write cache: disabled,
> read cache: enabled, doesn't support DPO or FUA
> Feb 17 16:31:04 freax [ 4332.745878]  sdb: unknown partition table
> Feb 17 16:31:04 freax [ 4332.745959] sd 1:0:0:0: [sdb] Attached SCSI disk
> Feb 17 16:31:04 freax [ 4332.745998] sd 1:0:0:0: Attached scsi generic sg1 type

So, the bottom line is that hotplug does work in SCSI (I can even
demonstrate it with SATA as long as I use a SAS controller), so this
does look to be a libata issue.  The complicating factor is that libata
does have special shutdown paths in SCSI ... they don't look like they
could be causing this, but it's not impossible.

James



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device
       [not found] <bug-9010-11613@http.bugzilla.kernel.org/>
                   ` (3 preceding siblings ...)
  2008-03-21 14:30 ` bugme-daemon
@ 2008-03-22 11:52 ` bugme-daemon
  2008-03-22 22:25   ` James Bottomley
  2008-03-22 22:26 ` bugme-daemon
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: bugme-daemon @ 2008-03-22 11:52 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=9010





------- Comment #28 from lkmlist@gmail.com  2008-03-22 04:52 -------
Hotplug works fine on my ICH7 ports; it's just that damn JMicron controller. So
it must be an effect triggered by a hardware or firmware "feature".

So in the end, someone from libata has to fix this problem?

thanks for the explanation

regards
Bjoern



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device
  2008-03-22 11:52 ` bugme-daemon
@ 2008-03-22 22:25   ` James Bottomley
  0 siblings, 0 replies; 13+ messages in thread
From: James Bottomley @ 2008-03-22 22:25 UTC (permalink / raw)
  To: bugme-daemon; +Cc: linux-scsi

On Sat, 2008-03-22 at 04:52 -0700, bugme-daemon@bugzilla.kernel.org
wrote:
> Hotplug works fine on my ICH7 ports; it's just that damn JMicron controller. So
> it must be an effect triggered by a hardware or firmware "feature".
> 
> So in the end, someone from libata has to fix this problem?

I think so ... I don't rule out it's being caused by one of the libata
specific pieces in SCSI, but the analysis of the problem definitely has
to begin in libata.

> thanks for the explanation

You're welcome.

James



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device
       [not found] <bug-9010-11613@http.bugzilla.kernel.org/>
                   ` (4 preceding siblings ...)
  2008-03-22 11:52 ` bugme-daemon
@ 2008-03-22 22:26 ` bugme-daemon
  2008-10-21 18:25 ` bugme-daemon
  2008-10-22 14:07 ` bugme-daemon
  7 siblings, 0 replies; 13+ messages in thread
From: bugme-daemon @ 2008-03-22 22:26 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=9010





------- Comment #29 from anonymous@kernel-bugs.osdl.org  2008-03-22 15:26 -------
Reply-To: James.Bottomley@HansenPartnership.com

On Sat, 2008-03-22 at 04:52 -0700, bugme-daemon@bugzilla.kernel.org
wrote:
> Hotplug works fine on my ICH7 ports, it's just that damn jmicron controller. So
> it must be an effect trigered by a hardware or firmware "feature".
> 
> So finally someone from libata has to fix this problem?

I think so ... I don't rule out it's being caused by one of the libata
specific pieces in SCSI, but the analysis of the problem definitely has
to begin in libata.

> thanks for the explanation

You're welcome.

James



* [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device
       [not found] <bug-9010-11613@http.bugzilla.kernel.org/>
                   ` (5 preceding siblings ...)
  2008-03-22 22:26 ` bugme-daemon
@ 2008-10-21 18:25 ` bugme-daemon
  2008-10-22 14:07 ` bugme-daemon
  7 siblings, 0 replies; 13+ messages in thread
From: bugme-daemon @ 2008-10-21 18:25 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=9010





------- Comment #30 from bugzilla.kernel.org@beej.org  2008-10-21 11:25 -------
jeff?



* [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device
       [not found] <bug-9010-11613@http.bugzilla.kernel.org/>
                   ` (6 preceding siblings ...)
  2008-10-21 18:25 ` bugme-daemon
@ 2008-10-22 14:07 ` bugme-daemon
  7 siblings, 0 replies; 13+ messages in thread
From: bugme-daemon @ 2008-10-22 14:07 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=9010





------- Comment #31 from lkmlist@gmail.com  2008-10-22 07:07 -------
Did anyone take care of this?

If you want, I can try to reproduce it with 2.6.27.*

Kind regards
Bjoern



* [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device
       [not found] <bug-9010-11613@https.bugzilla.kernel.org/>
@ 2012-05-17 14:42 ` bugzilla-daemon
  2012-05-17 14:43 ` bugzilla-daemon
  1 sibling, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2012-05-17 14:42 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=9010


Alan <alan@lxorguk.ukuu.org.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
                 CC|                            |alan@lxorguk.ukuu.org.uk
         Resolution|                            |OBSOLETE
         Regression|---                         |No




--- Comment #32 from Alan <alan@lxorguk.ukuu.org.uk>  2012-05-17 14:42:54 ---
Should be obsolete now; if not, please reopen against a recent kernel.


* [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device
       [not found] <bug-9010-11613@https.bugzilla.kernel.org/>
  2012-05-17 14:42 ` [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device bugzilla-daemon
@ 2012-05-17 14:43 ` bugzilla-daemon
  1 sibling, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2012-05-17 14:43 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=9010


Alan <alan@lxorguk.ukuu.org.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED





end of thread, other threads:[~2012-05-17 14:43 UTC | newest]

Thread overview: 13+ messages
     [not found] <bug-9010-11613@https.bugzilla.kernel.org/>
2012-05-17 14:42 ` [Bug 9010] SCSI device is not offlined properly and tries to cache data from previous device bugzilla-daemon
2012-05-17 14:43 ` bugzilla-daemon
     [not found] <bug-9010-11613@http.bugzilla.kernel.org/>
2008-03-19  8:43 ` bugme-daemon
2008-03-19 20:24   ` James Bottomley
2008-03-19 20:24 ` bugme-daemon
2008-03-21 13:35 ` bugme-daemon
2008-03-21 14:30   ` James Bottomley
2008-03-21 14:30 ` bugme-daemon
2008-03-22 11:52 ` bugme-daemon
2008-03-22 22:25   ` James Bottomley
2008-03-22 22:26 ` bugme-daemon
2008-10-21 18:25 ` bugme-daemon
2008-10-22 14:07 ` bugme-daemon
